Why we need a Multi Cloud Database and how to build one

In this blog we explain what a multi-cloud architecture is and why it matters, and we provide a working example of how to deploy CockroachDB across three different cloud providers using Kubernetes and VPNs. This is not an easy task, so the example is meant to get you on the road to a true multi-cloud database for your multi-cloud application.

What is multi-cloud?

Multi-cloud usually refers to a business using more than one cloud to deliver its applications. This lets organizations take advantage of the best-of-breed services offered by each of the cloud providers involved. A multi-cloud deployment can combine public cloud providers, private clouds, or both. In the case of CockroachDB it would be three or more clouds, because the Raft consensus algorithm used to manage data resilience needs a majority of replicas to stay available.
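To see why three clouds matter, here is a minimal Python sketch of the majority arithmetic behind Raft. It is an illustration only, not CockroachDB code: the function names and the example replica placements are assumptions made up for this post.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def survives_cloud_outage(replicas_per_cloud: dict[str, int]) -> bool:
    """True if losing any single cloud still leaves a majority of replicas."""
    total = sum(replicas_per_cloud.values())
    needed = quorum(total)
    return all(total - lost >= needed for lost in replicas_per_cloud.values())


if __name__ == "__main__":
    # Hypothetical placements of one range's replicas; cloud names are illustrative.
    two_clouds = {"aws": 2, "gcp": 1}
    three_clouds = {"aws": 1, "gcp": 1, "azure": 1}
    print("two clouds survive an outage:", survives_cloud_outage(two_clouds))
    print("three clouds survive an outage:", survives_cloud_outage(three_clouds))
```

With three replicas split over two clouds, one cloud always holds two of them, so losing that cloud loses the majority. Spreading one replica to each of three clouds keeps two of three available whichever cloud fails.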

With a multi-cloud application you can:

  • Power a single application with data stored across multiple clouds.
  • Use data that is created in one cloud to perform analysis in another cloud without having to manage or maintain manual data movement.
  • Enhance the mobility of applications by being able to move them from one cloud to another.

Why is multi-cloud important?

Putting all your eggs in one basket with a single provider can be a risky approach. No one is immune to outages, and that includes the big cloud providers. By spreading the risk across multiple cloud providers you reduce the chance of an outage affecting your end customers. Each cloud provider also has its forte, the services it does best. By spreading your data across cloud providers you can take advantage of the best-of-breed services each cloud offers, such as analytics services in GCP.

Manage your own destiny

Having a multi-cloud strategy gives you the flexibility to migrate your applications between clouds with relative ease. Historically this had its challenges, particularly if cloud-provider-specific services had been used. Cloud-agnostic solutions like CockroachDB give you the flexibility to move applications, because the data is available in all locations: CockroachDB presents a single logical database consistently across all of them. This prevents vendor lock-in and puts the power in the hands of the consumer to do things like take advantage of the best prices.

Avoid cloud concentration risk

If organizations, particularly financial institutions or other critical national infrastructure, each choose independently which cloud to use, the result could be a dependence on a single provider. This could be accidental, but the consequences could be catastrophic, as an outage could take down critical elements of the civilized world. As a result, regulators of these industries are becoming nervous about the risk that cloud concentration presents. To avoid your infrastructure being impacted by an outage of a cloud provider, or by a cloud provider going out of business, you need to adopt a multi-cloud strategy where your infrastructure and application are deployed across two or more cloud providers. This can be a challenge, especially when we think about how to distribute the data.

How does CockroachDB enable multi-cloud?

Multi-cloud refers to the practice of using multiple cloud service providers to host different components or instances of an application or service. Instead of relying on a single cloud provider, organizations opt to distribute their workloads across multiple cloud platforms to avoid vendor lock-in, enhance resilience, improve performance, and reduce the risk of downtime due to provider-specific outages. CockroachDB enables multi-cloud applications through the following features and capabilities:

Replication Across Cloud Providers: CockroachDB allows data to be replicated across multiple cloud providers. This means that a single CockroachDB cluster can have nodes running on different cloud platforms, and data is automatically replicated between them to maintain consistency and availability.

Global Data Distribution: CockroachDB supports geo-replication, enabling data to be distributed across various geographical regions hosted by different cloud providers. This enables you to place data closer to end-users, reducing latency and improving the application's performance.

Active-Active Deployments: With CockroachDB's distributed architecture and data replication capabilities, you can set up active-active deployments across different cloud providers. In an active-active setup, read and write operations can be handled by nodes in every cloud simultaneously, offering better load distribution and fault tolerance.

Failover and Disaster Recovery: By deploying CockroachDB nodes across multiple cloud providers, you can implement effective failover and disaster recovery strategies. If one cloud provider experiences an outage or becomes unavailable, the application can fail over to another cloud provider where CockroachDB is still running.

Data Sovereignty and Compliance: Multi-cloud deployments can help organizations adhere to data sovereignty regulations that require certain data to be stored within specific geographic regions. CockroachDB's ability to replicate data across clouds while maintaining strong consistency facilitates compliance with such regulations.

Vendor Lock-In Mitigation: Using multiple cloud providers reduces the risk of vendor lock-in. Organizations can take advantage of competitive pricing, unique features, and specialized services from different cloud providers without being tied to a single vendor.

Load Balancing and Performance Optimization: CockroachDB's automatic load balancing ensures that data and workloads are evenly distributed across the multi-cloud environment, maximizing resource utilization and maintaining optimal performance.

Security and Compliance: CockroachDB provides built-in security features, such as encryption at rest and in transit, ensuring that data remains secure and compliant with regulatory requirements, even in multi-cloud setups.

By combining these multi-cloud capabilities, CockroachDB allows developers and organizations to create highly available, fault-tolerant, and performant applications that can span multiple cloud providers. It provides the flexibility to leverage the strengths of different cloud platforms while mitigating the risks associated with relying solely on one provider.

How to create a multi-cloud SQL Database

In this GitHub repo we have created a working example of a multi-cloud CockroachDB cluster. A Kubernetes cluster is created in each of the three cloud providers using their managed Kubernetes services. These clusters are then connected together using VPN devices, with CockroachDB deployed across all three cloud providers. Sounds simple, right? There are a number of considerations to take into account when deploying such a solution.

Networking

With any infrastructure solution IP addressing is important, so when designing your multi-cloud solution make sure that no address space overlaps across the clouds or across the pod networks of the Kubernetes clusters involved. This keeps routing simple and allows traffic to flow seamlessly across clouds. If address space does overlap, Network Address Translation (NAT) has to take place, which adds a lot of complexity and makes the solution hard to manage and administer.
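A quick way to catch overlaps before building anything is to check the planned ranges with Python's standard ipaddress module. The sketch below is only an illustration: the network names and CIDR ranges are made-up placeholders, not the values used in the demo repo.

```python
import ipaddress
from itertools import combinations


def find_overlaps(networks: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of named CIDR ranges that overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for a, b in combinations(parsed, 2)
        if parsed[a].overlaps(parsed[b])
    ]


if __name__ == "__main__":
    # Hypothetical address plan; replace with your own node and pod ranges.
    plan = {
        "cloud-a-nodes": "10.1.0.0/16",
        "cloud-a-pods": "10.101.0.0/16",
        "cloud-b-nodes": "10.2.0.0/16",
        "cloud-b-pods": "10.102.0.0/16",
        "cloud-c-nodes": "10.3.0.0/16",
        "cloud-c-pods": "10.2.128.0/17",  # deliberately clashes with cloud-b-nodes
    }
    for a, b in find_overlaps(plan):
        print(f"overlap: {a} ({plan[a]}) and {b} ({plan[b]})")
```

Running a check like this against the whole address plan, node networks and pod networks together, is cheaper than discovering a clash after the VPNs are up.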

Connectivity

All of the nodes that make up your CockroachDB cluster need to be able to communicate with each other. Clouds can be joined using Virtual Private Networks (VPNs), which are encrypted tunnels over the internet that connect two or more networks. More private and resilient networking options are available, but they come at a premium. For this reason the demo code sticks to VPNs, which are deployed in Step 2 and Step 3. Another consideration is the cost of networking. Cloud providers tend to charge more for data leaving their cloud, in an attempt to encourage you to use all of their services. If your workload dictates a multi-cloud strategy, these charges need to be factored into your plans and budgets.

Name Resolution

Along with network connectivity, name resolution is also important. CockroachDB nodes running in one cloud need to be able to resolve the names of the nodes running in the other clouds. In Kubernetes there are generally two DNS solutions, CoreDNS and kube-dns. AKS and EKS use CoreDNS, whereas GKE uses kube-dns, which makes a single DNS solution across all three Kubernetes clusters tricky. The solution is to replace kube-dns with CoreDNS in GKE, following the instructions in Step 5. This gives a unified DNS solution across all three clusters and enables cross-cluster name resolution.
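To make it concrete which names have to resolve, the sketch below builds the pod DNS names that each CockroachDB node would need to reach, and joins them into a list of the kind passed to the --join flag of cockroach start. It relies on several assumptions that may not match the demo repo: each cluster runs CockroachDB in its own namespace, the StatefulSet and its headless service are both called cockroachdb, the cluster domain is cluster.local, and nodes listen on the default port 26257. The namespace names are hypothetical.

```python
def pod_dns_names(
    namespace: str,
    replicas: int,
    statefulset: str = "cockroachdb",   # assumed StatefulSet name
    service: str = "cockroachdb",       # assumed headless service name
    cluster_domain: str = "cluster.local",
) -> list[str]:
    """Fully qualified DNS names of the StatefulSet pods in one cluster."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def join_list(namespaces: list[str], replicas: int = 3, port: int = 26257) -> str:
    """Comma-separated host:port list covering the pods in every cluster."""
    hosts = []
    for namespace in namespaces:
        hosts.extend(pod_dns_names(namespace, replicas))
    return ",".join(f"{host}:{port}" for host in hosts)


if __name__ == "__main__":
    # Hypothetical namespaces, one per cloud.
    print(join_list(["aws-crdb", "gcp-crdb", "azure-crdb"], replicas=1))
```

Every name in that list has to resolve from every cluster, which is why a DNS setup that behaves the same way in all three clouds matters.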

Deploying Multi-region CockroachDB in Kubernetes

To deploy CockroachDB in a multi-region setup, the manifests are the recommended approach, as documented in Step 5. CockroachDB is best deployed in a secure configuration, which means that all traffic between nodes and clients is encrypted using TLS. To configure this, use the cockroach binary, whose cert commands can create a certificate authority (CA) along with the node and client certificates. The certificates need to be shared by all the nodes in the cluster, so they have to be added as secrets in all three Kubernetes clusters. This ensures that all the nodes trust each other and are able to communicate. Follow the remaining steps to deploy CockroachDB across the three Kubernetes clusters.

Multi-cloud considerations

No cloud provider is immune from outages. We have seen this in recent months, with both AWS and GCP suffering major outages. If you are running mission-critical applications that require this level of resilience, CockroachDB lets you create true multi-cloud applications with a common data plane that extends beyond the traditional boundaries of a single cloud, protecting your workload from a cloud outage.

By adopting a multi-cloud strategy and using CockroachDB, you put the decision about where best to run your application back into your own hands. With CockroachDB as the data plane you can achieve true mobility for your application. Whether you are moving from on-prem to cloud or between clouds, CockroachDB makes it possible, helping you avoid vendor lock-in and take advantage of the best-of-breed services delivered by the various providers.

There are still considerations to be made, however, such as the cost of data transfers between clouds and the additional complexity of such a solution. Even so, a multi-cloud application can provide the competitive advantage you are looking for to put you ahead of the pack.