Schedule - PGConf.EU 2022
Database branching to scale and speed up application development and performance optimization
Date: 2022-10-26
Time: 17:20–18:10
Room: Zurich
Level: Intermediate
Over the last decade, clouds, containers, modern CI tools, and Kubernetes drastically boosted development and operational tasks.
However, if you deal with heavily-loaded multi-terabyte databases, there is still one major roadblock on your DevOps path: it is not easy to develop and test because getting a database copy takes hours, if not days, and it is may cost a lot.
To solve this, my team and I developed an open-source tool called Database Lab Engine (DLE, https://github.com/postgres-ai/database-lab-engine) that addresses this very issue. With copy-on-write capabilities provided by filesystem or storage, and with a properly tuned Postgres server, it becomes possible to have a 10 TiB database cloned in less than 10 seconds passed since the request to clone till readiness to serve queries ("thin cloning"). Moreover, dozens of such clones can run on a single machine, with just 10-15 TiB disk space, supporting dozens of experiments conducted simultaneously. Due to the proper tuning, DLE also ensures the Postgres planner works as close to production as possible, choosing the same plans and allowing SQL optimization to be performed in such an environment.
We will discuss how this new approach and the corresponding tooling improve and speed up all the development processes related to Postgres databases in fast-growing and big teams, covering the following:
- database branches for each development branch in Git – a low-cost infrastructure for multiple non-production environments;
- testing on thin clones in CI, including automated verification of database migrations and massive data changes,
- "serverless EXPLAIN": a Slack bot / artificial DBA Joe (https://gitlab.com/postgres-ai/joe) helping developers to optimize SQL queries using thin clones of multi-terabyte databases without providing direct access to the data,
- building the knowledge base around micro-experiments for SQL optimization based on thin clones, improving interoperability between Infrastructure and Development teams.