How to get started with CockroachDB - Binary, Docker or Kubernetes

Want to get started with CockroachDB? This is the perfect blog post for you. Today I am going to show you how to get started with a distributed SQL database so you can start developing your application locally on your laptop.

Method 1 - Binary

Installing CockroachDB

  • Download the Binary and add it to your path
curl https://binaries.cockroachdb.com/cockroach-v20.2.8.darwin-10.9-amd64.tgz | tar -xz
cp -i cockroach-v20.2.8.darwin-10.9-amd64/cockroach /usr/local/bin/
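
Before moving on, it is worth confirming that the shell can actually find the binary. The sketch below is one way to do that; `require_on_path` is a helper name I made up for this post, not something that ships with CockroachDB.

```shell
# Report whether a command is reachable through PATH.
# `require_on_path` is a hypothetical helper, not part of CockroachDB.
require_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 is not on PATH" >&2
    return 1
  fi
}

# Usage: check for the binary, then print its version details.
# require_on_path cockroach && cockroach version
```

If the check fails, make sure /usr/local/bin is on your PATH and that the copy step completed.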

Starting a Cluster

To start a cluster, use the cockroach start command. It starts a single node and takes that node's configuration options as flags.

  • Start the first node in our three node cluster
cockroach start \
--insecure \
--store=node1 \
--listen-addr=localhost:26257 \
--http-addr=localhost:8080 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background

Let us talk a little bit about each of these flags.

--insecure - This tells CockroachDB to start the cluster in insecure mode, which means we can connect to the instance without a password or certificates. This is perfect for local development, but in production you would run a secure cluster with TLS certificates (the --certs-dir flag) instead.

--store - This specifies the store directory, which holds all the database's data files. As we are deploying locally, we have to give each of the three nodes its own directory.

--listen-addr - This flag tells the node to listen only on localhost, with port 26257 used for internal and client traffic. In production, you would specify an IP address or FQDN here. As we are deploying locally, we need a different port for each node.

--http-addr - Tells the node to listen on localhost port 8080 for HTTP requests from the DB Console. Again, in production you would use an IP address or FQDN here. As we are deploying locally, each node needs a unique port. In production you can use the same port on every node, as the nodes will have different IP addresses.

--join - When we add nodes to a CockroachDB cluster, the new nodes need to know which existing nodes to contact for cluster information. The join flag provides that list. Typically in production, you would want at least one node per region in the join list. For single-region deployments, 3 nodes will be sufficient.

--background - Tells the cockroach process to start in the background, which frees up the terminal.
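
Since only the store directory and the two ports change from node to node, the per-node flags can be derived from the node number. Here is a minimal sketch of that idea; `node_flags` is a hypothetical helper, and the port arithmetic simply mirrors the numbering used in this post.

```shell
# Print the `cockroach start` flags for local node N (1, 2, or 3).
# `node_flags` is a hypothetical helper; the port scheme mirrors this post.
node_flags() {
  n="$1"
  sql_port=$((26256 + n))   # 26257, 26258, 26259
  http_port=$((8079 + n))   # 8080, 8081, 8082
  echo "--insecure"
  echo "--store=node${n}"
  echo "--listen-addr=localhost:${sql_port}"
  echo "--http-addr=localhost:${http_port}"
  echo "--join=localhost:26257,localhost:26258,localhost:26259"
  echo "--background"
}

# Usage: start whichever nodes are not running yet, for example node 2.
# Word splitting is intentional here; no flag contains spaces.
# cockroach start $(node_flags 2)
```

The join list stays identical on every node, which is why it is hard-coded rather than computed.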

  • Start the other 2 Nodes in the cluster
cockroach start \
--insecure \
--store=node2 \
--listen-addr=localhost:26258 \
--http-addr=localhost:8081 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
cockroach start \
--insecure \
--store=node3 \
--listen-addr=localhost:26259 \
--http-addr=localhost:8082 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
  • Lastly, we have to initialize the cluster.
    Initialization is a one-time process when creating a new cluster. When adding new nodes later, initialization is not required. To initialize the cluster, run the cockroach init command against any of the nodes in the cluster.
cockroach init --insecure --host=localhost:26257
  • To see the startup parameters you can use the below command
grep 'node starting' node1/logs/cockroach.log -A 11

The output shows the node's startup details, such as its addresses, store location, and node ID.

That's it, you now have a local CockroachDB cluster on your laptop. As a next step, you could try setting up a load balancer like HAProxy and connecting an application to the cluster.
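
To confirm that all three nodes joined, you can ask any node for the cluster's status and run a quick SQL statement. The sketch below wraps both in a helper; `check_cluster` is a name I made up, while `cockroach node status` and `cockroach sql --execute` are the standard CLI commands.

```shell
# Hypothetical helper: report node status and run one SQL statement
# against a node of the insecure local cluster.
check_cluster() {
  host="${1:-localhost:26257}"
  # List every node the cluster knows about.
  cockroach node status --insecure --host="$host" || return 1
  # Run a single statement through the built-in SQL client, then exit.
  cockroach sql --insecure --host="$host" --execute="SHOW DATABASES;"
}

# Usage:
# check_cluster                   # defaults to the first node
# check_cluster localhost:26258   # any node can answer
```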

Method 2 - Docker

Create a Network Bridge

  • Create a bridge network
    The first thing we need to do is create a bridge network. As we are running multiple containers on a single machine, we need a way for these containers to communicate, and in Docker we can do this by creating a bridge network. I am going to use the name roachnet here and throughout this post, but feel free to call the network whatever you like.
docker network create -d bridge roachnet

Start the Cluster

  • Create the volumes for each container
    Volumes are important for persisting data to your local laptop
docker volume create roach1
docker volume create roach2
docker volume create roach3
  • Start the CockroachDB first node/container
    Next up is a similar process to the binary installation method. A lot of the flags are the same, just in a different format.
docker run -d \
--name=roach1 \
--hostname=roach1 \
--net=roachnet \
-p 26257:26257 -p 8080:8080  \
-v "roach1:/cockroach/cockroach-data"  \
cockroachdb/cockroach:v20.2.8 start \
--insecure \
--join=roach1,roach2,roach3

Let's break down this command
docker run: The Docker command to start a new container.

-d: This flag runs the container in the background so you can continue the next steps in the same shell.

--name: The name for the container. This is optional, but a custom name makes it significantly easier to reference the container in other commands, for example, when opening a Bash session in the container or stopping the container.

--hostname: The hostname for the container. You will use this to join other containers/nodes to the cluster.

--net: The bridge network for the container to join. See the Create a Network Bridge section above for more details.

-p 26257:26257 -p 8080:8080: These flags map the default port for inter-node and client-node communication (26257) and the default port for HTTP requests to the DB Console (8080) from the container to the host. This lets clients on your laptop connect to the node and makes it possible to open the DB Console from a browser. Traffic between the containers themselves goes over the bridge network.

-v "roach1:/cockroach/cockroach-data": This flag mounts the roach1 Docker volume at /cockroach/cockroach-data inside the container. This means that data and logs for this node will be stored in the roach1 volume on the host and will persist after the container is stopped or deleted. For more details, see Docker's volumes topic.

cockroachdb/cockroach:v20.2.8 start --insecure --join: The CockroachDB command to start a node in the container in insecure mode. The --join flag specifies the hostname of each node that will initially comprise your cluster. Otherwise, all cockroach start defaults are accepted. Note that since each node is in a unique container, using identical default ports won’t cause conflicts.
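
Because every container uses the same default ports, the only things that vary between nodes are the name, hostname, and volume, plus the published ports on the first node. A rough sketch of that pattern follows; `start_roach_node` is a hypothetical helper, and the names, network, and image tag are the ones used in this post.

```shell
# Start container roachN on the roachnet bridge network.
# `start_roach_node` is a hypothetical helper built from this post's commands.
start_roach_node() {
  n="$1"
  publish=""
  # Only the first node publishes ports to the host. The other nodes are
  # reached over the bridge network, so their default ports never collide.
  if [ "$n" -eq 1 ]; then
    publish="-p 26257:26257 -p 8080:8080"
  fi
  # $publish is left unquoted on purpose so it expands to separate arguments.
  docker run -d \
    --name="roach${n}" \
    --hostname="roach${n}" \
    --net=roachnet \
    $publish \
    -v "roach${n}:/cockroach/cockroach-data" \
    cockroachdb/cockroach:v20.2.8 start \
    --insecure \
    --join=roach1,roach2,roach3
}

# Usage: start any node that is not running yet, for example node 2.
# start_roach_node 2
```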

  • Start the other 2 Nodes
docker run -d \
--name=roach2 \
--hostname=roach2 \
--net=roachnet \
-v "roach2:/cockroach/cockroach-data" \
cockroachdb/cockroach:v20.2.8 start \
--insecure \
--join=roach1,roach2,roach3
docker run -d \
--name=roach3 \
--hostname=roach3 \
--net=roachnet \
-v "roach3:/cockroach/cockroach-data" \
cockroachdb/cockroach:v20.2.8 start \
--insecure \
--join=roach1,roach2,roach3
  • Initialise the cluster
    As with the binary installation, a one-time initialization is required.
docker exec -it roach1 ./cockroach init --insecure
  • Again, you can check the startup parameters with the command below
docker exec -it roach1 grep 'node starting' cockroach-data/logs/cockroach.log -A 11

That's it, you now have a local deployment of CockroachDB running in Docker. As a next step, try the docker-compose files found here, which make it easier to automate a lot of this deployment. You can also add a load balancer and connect your applications to CockroachDB.
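
When you are finished experimenting, it helps to have a way to check the cluster and then clean everything up. The sketch below assumes the container, volume, and network names used in this post; `roach_status` and `roach_teardown` are hypothetical helper names.

```shell
# Hypothetical helpers for the roach1..roach3 containers from this post.

# Ask the first node for the status of every node in the cluster.
roach_status() {
  docker exec roach1 ./cockroach node status --insecure
}

# Stop and remove the containers, then the volumes and the network.
# Warning: removing the volumes deletes the cluster's data.
roach_teardown() {
  docker stop roach1 roach2 roach3
  docker rm roach1 roach2 roach3
  docker volume rm roach1 roach2 roach3
  docker network rm roachnet
}

# Usage:
# roach_status
# roach_teardown
```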

Method 3 - Kubernetes

Install a Kubernetes Distribution

CockroachDB can be deployed on most Kubernetes distributions. If you're doing this locally, Minikube is one of the easiest distributions to get started with. Please follow the instructions to set up Minikube here. Consider giving the Minikube VM more memory and CPU.
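
As a rough illustration of giving Minikube more resources, the `minikube start` command accepts `--cpus` and `--memory` flags. The wrapper name `start_minikube` and the default values of 4 CPUs and 8192 MB are my own assumptions, so adjust them to what your laptop can spare.

```shell
# Hypothetical wrapper: start Minikube with explicit CPU and memory limits.
# The defaults (4 CPUs, 8192 MB) are illustrative assumptions only.
start_minikube() {
  cpus="${1:-4}"
  memory_mb="${2:-8192}"
  minikube start --cpus="$cpus" --memory="$memory_mb"
}

# Usage:
# start_minikube          # 4 CPUs, 8192 MB
# start_minikube 2 4096   # a smaller VM
```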

Start the Cluster

There are 3 ways to deploy CockroachDB on Kubernetes.

  • An Operator
  • Helm Charts
  • Configuration Files

I am going to use configuration files as it's the most universal of all the deployment methods.

  • Create the CockroachDB pods using the StatefulSet and configuration files provided by Cockroach Labs
kubectl create -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cockroachdb-statefulset.yaml
  • Ensure the pods are running. If they are not, give them a minute or two to be created. Please note that they will not become "Ready" until we complete the required initialization
kubectl get pods

  • It's also a good idea to check that the persistent volumes have been created
kubectl get pv
  • Next up is the one-time initialization that is required.
kubectl create \
-f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cluster-init.yaml
  • Confirm the Cluster job has completed successfully
kubectl get job cluster-init
  • Now you have a working CockroachDB cluster deployed in Kubernetes on your laptop. Just check that the cockroachdb pods are in a ready state. Note that the cluster-init pod will not be ready but should show as completed.
kubectl get pods
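
Instead of polling kubectl get pods by hand, you can let kubectl wait block until things are ready and then open a SQL shell inside one of the pods. This is a sketch under a few assumptions: the pod names cockroachdb-0 to cockroachdb-2 and the /cockroach/cockroach binary path are what I expect from the StatefulSet manifest used above, and the helper names are made up.

```shell
# Hypothetical helpers; pod names and binary path are assumptions based on
# the cockroachdb-statefulset.yaml manifest used in this post.

# Block until the init job has completed and all three pods report Ready.
wait_for_cluster() {
  kubectl wait --for=condition=complete job/cluster-init --timeout=300s || return 1
  kubectl wait --for=condition=Ready \
    pod/cockroachdb-0 pod/cockroachdb-1 pod/cockroachdb-2 --timeout=300s
}

# Open an interactive SQL shell using the binary inside the first pod.
open_sql_shell() {
  kubectl exec -it cockroachdb-0 -- /cockroach/cockroach sql --insecure
}

# Usage:
# wait_for_cluster && open_sql_shell
```

Running the client inside the pod avoids having to match a local cockroach binary to the version the manifest deploys.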

Summary

As you can see, getting started with CockroachDB locally is super simple. You can use any of these 3 methods to get a fully working cluster up and running in under 5 minutes. All this information is also available on the official CockroachDB website. Check out the website for more fun things to do while getting started with CockroachDB.