What We Learned from Containerizing Databases
Our production services now run on top of “Container Fabric,” a deployment platform that lets our engineers reliably meet ever-increasing scale and velocity requirements. Container Fabric separates our hardware from our stateless applications (those that don’t need the server to track sessions or prior requests) and makes both considerably easier to administer. After seeing many of the benefits of containers firsthand, we wanted to keep using Docker for our stateful apps as well: applications that maintain and persist data, such as databases. We use databases extensively, both for internal engineering projects and to store customer information, and we needed to be able to deploy them more quickly, consistently, and efficiently.
Four inherent challenges of containerizing databases
Putting databases in containers, however, comes with several drawbacks. Databases have a few essential characteristics that make containerizing them difficult:
– They require high-throughput, low-latency networking.
– In our case, they require persistent data storage.
– They require layers of complex configuration.
– They are less portable, since storing significant volumes of data requires a lot of disk space.
Challenge 1: Creating the ideal networking environment
Docker containers are built on two Linux kernel capabilities: control groups (cgroups) and namespaces. These features provide memory and CPU resource limits, but on their own they don’t provide practical storage and network resource isolation, both of which are crucial for databases.
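As a concrete illustration, the sketch below shows how cgroup-backed memory and CPU limits can be applied when launching a database container through the Docker Python SDK (docker-py). The image name, credentials, and limit values are illustrative assumptions, not our production settings.

```python
# Minimal sketch: applying cgroup-backed CPU and memory limits when
# launching a database container with docker-py. Values are illustrative.
import docker

client = docker.from_env()

container = client.containers.run(
    "postgres:13",                                 # illustrative image
    detach=True,
    environment={"POSTGRES_PASSWORD": "example"},  # required by this image
    mem_limit="8g",                                # cgroup memory limit
    nano_cpus=4_000_000_000,                       # 4 CPUs, in units of 1e-9 CPUs
)
print(container.short_id)
```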
Docker currently supports four native network drivers: host, bridge, overlay, and MACVLAN. We weighed the benefits and drawbacks of each option:
Host networking is the simplest and fastest option: containers are placed in the same Linux network namespace as the host they run on. Host networking has minimal overhead, but it provides no isolation from the underlying hardware or from other containers.
In bridge mode, Docker creates a default Linux bridge named docker0 on the host for containers to use, and a virtual Ethernet interface connects each container’s interface to the bridge. Bridge mode can isolate container networks via Linux bridging and network namespaces; however, this adds substantial overhead for CPU-bound workloads.
Overlay networking abstracts the container’s networking from the physical network using Linux bridges and kernel Virtual eXtensible LAN (VXLAN) technologies. Overlay networking is robust and offers various features, but implementation intricacies can lead to performance issues that put CPUs under strain.
The MACVLAN driver allows containers to access host interfaces without network address translation (NAT) overhead. Unlike the internal networks used by the bridge and overlay drivers, MACVLAN can assign containers an address on the external physical network, which can make mobility difficult.
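For readers who want to experiment with these drivers, here is a minimal sketch, again using docker-py, that creates a network with each non-host driver; host networking is selected at run time rather than created as a network. The network names, parent interface, and subnet are assumptions for illustration.

```python
# Sketch: creating Docker networks with the bridge, overlay, and MACVLAN
# drivers via docker-py. Names, interfaces, and subnets are illustrative.
import docker

client = docker.from_env()

# Bridge: containers attach to a Linux bridge on the host.
client.networks.create("db-bridge", driver="bridge")

# Overlay: requires the daemon to be part of a swarm.
client.networks.create("db-overlay", driver="overlay")

# MACVLAN: containers get addresses on the external physical network.
client.networks.create(
    "db-macvlan",
    driver="macvlan",
    options={"parent": "eth0"},  # host interface to attach to (illustrative)
    ipam=docker.types.IPAMConfig(
        pool_configs=[docker.types.IPAMPool(subnet="192.168.10.0/24")]
    ),
)

# Host networking is not a created network; it is chosen at run time with
# network_mode="host", as in the next example.
```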
After weighing the trade-offs between isolation, performance, and portability, we picked host networking for the database container project.
We use randomized port allocation and careful container placement to avoid stacking multiple high-throughput services on the same physical host. (For a more in-depth review of the factors that drove our selection, I recommend this reference on Docker networking choices.)
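Below is a minimal sketch of that deployment pattern, assuming docker-py and the official mysql image (which passes extra command-line arguments through to mysqld); the port range, image, and credentials are illustrative.

```python
# Sketch: host networking plus randomized port allocation. The official
# mysql image forwards extra arguments to mysqld, so --port sets the
# listening port. Range, image, and credentials are illustrative.
import random
import docker

client = docker.from_env()

port = random.randint(20000, 29999)  # randomized port allocation

client.containers.run(
    "mysql:8.0",
    command=[f"--port={port}"],
    detach=True,
    network_mode="host",  # shares the host's network namespace: no NAT
    environment={"MYSQL_ROOT_PASSWORD": "example"},
)
print(f"database listening directly on host port {port}")
```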
Challenge 2: Dealing with long-term data storage
“Volumes provide the best and most consistent performance for write-heavy applications… since they bypass the storage driver and do not incur any of the potential overheads caused by thin provisioning and copy-on-write,” according to the Device Mapper storage documentation.
Volumes are used in all of our internal stateful container projects. While this makes sense in terms of performance, it can make container portability more complex.
In some projects, we use the Logical Volume Manager (LVM) to provide an additional layer of storage separation. When deploying a database, we create a new logical volume from a thin pool and allocate it to the container. Because we track each volume as an independent resource, a container that accidentally fills up its volume will not affect neighboring containers, and we can extend the logical volume holding any container’s persistent data and grow the filesystem without disrupting service.
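The sketch below outlines this LVM workflow with standard lvcreate/lvextend commands, assuming an existing volume group vg0 with a thin pool named thinpool; all names, paths, and sizes are illustrative.

```python
# Sketch: one thin logical volume per database container, grown online.
# Assumes volume group "vg0" with thin pool "thinpool"; requires root.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Carve a 50 GiB thin volume for a new database container and mount it
# where the container's data directory will be bind-mounted from.
run(["lvcreate", "--thin", "-V", "50G", "-n", "db-0042", "vg0/thinpool"])
run(["mkfs.ext4", "/dev/vg0/db-0042"])
run(["mount", "/dev/vg0/db-0042", "/srv/containers/db-0042"])

# Later, extend the volume and grow the ext4 filesystem without downtime.
run(["lvextend", "-L", "+20G", "/dev/vg0/db-0042"])
run(["resize2fs", "/dev/vg0/db-0042"])  # ext4 supports online growth
```

The mounted host path is then handed to the container as a volume (for example, via -v /srv/containers/db-0042:/var/lib/mysql).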
Just as with networking, we carefully place high-throughput services so as to minimize “noisy neighbor” problems on hosts with multiple tenants. Ultimately, we must strike a delicate balance between proper resource isolation and resource efficiency. We continuously monitor our fleet to maintain that balance and to satisfy both our cost-efficiency targets and our service-level agreements; observability is critical.
Challenge 3: Configuring the database
According to conventional wisdom, a container image should be an immutable artifact built from a specific version of code and configuration. Building a new image for every database configuration change, however, can result in container image sprawl.
The high-performance needs of databases bring layers of complicated configuration tuning as a secondary consequence. The hardware, kernel, and operating system are usually tuned when a host is provisioned. At deployment time, we size container memory limits and CPU shares to ensure that resources are allocated to each workload without contention, and we pass in environment variables so that the service inside the container can resolve its external dependencies.
Databases frequently have hundreds of tuning parameters. Many of those parameters can be updated dynamically, and such changes must also be reflected on disk so they don’t get rolled back when the database is redeployed or restarted.
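As one concrete, hedged illustration of this persistence concern: MySQL 8’s SET PERSIST applies a dynamic change immediately and also writes it to mysqld-auto.cnf in the data directory, so the change survives a restart as long as the data directory lives on a persistent volume. The connection details below are illustrative, and the sketch assumes mysql-connector-python is installed.

```python
# Sketch: making a dynamic parameter change that also persists to disk.
# MySQL 8's SET PERSIST writes the setting to mysqld-auto.cnf in the data
# directory, so it survives restarts if the data directory is a volume.
import mysql.connector  # assumes mysql-connector-python is installed

conn = mysql.connector.connect(
    host="127.0.0.1", user="root", password="example"  # illustrative
)
cur = conn.cursor()
cur.execute("SET PERSIST max_connections = 500")  # live change + on disk
conn.close()
```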
A frequent solution to this problem is to move database configuration into environment variables, which specify configuration according to the context in which the database will run (a basic example is setting separate environments for dev, staging, and prod). Environment variables make it easy to update configuration during deployment and preserve the settings even if the container is restarted.
While mapping database settings to environment variables helps, it can make it harder to determine a container’s final configuration. To address this, we synchronize database configuration files from our version control system to an object store and pull them in during deployment. Trading some immutability and knowability for less container image sprawl seems acceptable.
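Here is a simplified sketch of that deploy-time flow, assuming boto3 for the object store and docker-py for deployment; the bucket, keys, paths, and image are illustrative stand-ins for our internal tooling.

```python
# Sketch: pull a version-controlled config file from an object store at
# deploy time and bind-mount it into the container read-only.
import boto3
import docker

s3 = boto3.client("s3")
s3.download_file(
    "db-configs",                      # illustrative bucket
    "mysql/prod/my.cnf",               # key synced from version control
    "/srv/containers/db-0042/my.cnf",  # host path for the bind mount
)

client = docker.from_env()
client.containers.run(
    "mysql:8.0",
    detach=True,
    environment={"MYSQL_ROOT_PASSWORD": "example"},  # deploy-time env vars
    volumes={
        "/srv/containers/db-0042/my.cnf": {
            "bind": "/etc/mysql/conf.d/override.cnf",  # read by this image
            "mode": "ro",
        }
    },
)
```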
Challenge 4: Big data isn’t portable
One of the main advantages of containers is application portability: if a host fails, it should be relatively quick and straightforward to redeploy your containerized application to another host. The mechanics of moving large data volumes, however, make portability challenging. Given that we address data redundancy and high availability (HA) at the application level, we found that giving up container-level portability was acceptable for databases with a fault-tolerant, shared-nothing architecture.
Typical relational database management systems (RDBMS) need additional services to provide guarantees such as data redundancy and HA. One option we investigated was delegating state management to a lower-level “storage fabric” that handles data replication across several hosts; unfortunately, this came with a performance penalty.
In the end, we found that embracing “small data,” in the form of microservices, was the best solution to the big data dilemma.
Microservice architecture calls for well-defined services that own their state. Breaking down monolithic databases into smaller, easier-to-manage components fulfills this principle and reduces the “blast radius” of database faults, allowing site reliability engineers (SREs) to carry out operational tasks with less risk of catastrophic service outages. On the other hand, microservices can require significant changes to existing applications; they can also complicate cross-domain data access patterns and put strain on the network.
Adopting a microservices architecture meant we would be maintaining a vast fleet of small databases, so we needed to design a new set of tools.
Delivering value in functional increments
Delivering the solution in stages was crucial to our success in containerizing our databases. We didn’t need to implement a full-fledged container orchestration system to produce benefits; our primary goal was to provide a dependable database solution that was also fast and resource-efficient.
All new database instances are now deployed in Docker containers, regardless of database type or version. As a result, we have achieved a new level of consistency and repeatability across our database tiers. With a single deployment methodology, we can provide new databases to internal teams faster and more reliably, and our resource efficiency has increased considerably even without complete resource isolation. Building fresh images also let us roll out support for a new backup solution, monitoring, and operating systems across our database tiers. See the second post in this series for further information.
We’re keeping a careful eye on upstream open-source activities as we move forward, especially those dealing with resource management and orchestration.
While upstream work to support stateful containers is still in its early stages, we believe it has great potential. The work in the Kubernetes and DC/OS projects to provide stateful services is particularly noteworthy.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.