Tutorial: Getting Started

This tutorial walks you through creating a database cluster from scratch, covering initialization, infrastructure provisioning, and database configuration. The examples below use Cassandra, but the same infrastructure supports ClickHouse, OpenSearch, and Spark.

Prerequisites

Before starting, ensure you've completed the Setup process by running easy-db-lab setup-profile.

Part 1: Initialize Your Cluster

The init command creates local configuration files for your cluster. It does not provision AWS resources yet.

easy-db-lab init my-cluster

This creates a 3-node Cassandra cluster by default.

Init Options

| Option | Description | Default |
|---|---|---|
| `--db`, `--cassandra`, `-c` | Number of Cassandra instances | 3 |
| `--app`, `--stress`, `-s` | Number of stress/application instances | 0 |
| `--instance`, `-i` | Cassandra instance type | r3.2xlarge |
| `--stress-instance`, `-si` | Stress instance type | c7i.2xlarge |
| `--azs`, `-z` | Availability zones (e.g., a,b,c) | all available |
| `--arch`, `-a` | CPU architecture (AMD64, ARM64) | AMD64 |
| `--ebs.type` | EBS volume type (NONE, gp2, gp3, io1, io2) | NONE |
| `--ebs.size` | EBS volume size in GB | 256 |
| `--ebs.iops` | EBS IOPS (gp3 only) | 0 |
| `--ebs.throughput` | EBS throughput (gp3 only) | 0 |
| `--until` | When instances can be deleted | tomorrow |
| `--tag` | Custom tags (key=value, repeatable) | - |
| `--vpc` | Use existing VPC ID | - |
| `--up` | Auto-provision after init | false |
| `--clean` | Remove existing config first | false |

Examples

Basic 3-node cluster:

easy-db-lab init my-cluster

5-node cluster with 2 stress nodes:

easy-db-lab init my-cluster --db 5 --stress 2

Production-like cluster with EBS storage:

easy-db-lab init prod-test --db 5 --ebs.type gp3 --ebs.size 500 --ebs.iops 3000

ARM64 cluster for Graviton instances:

easy-db-lab init my-cluster --arch ARM64 --instance r7g.2xlarge

Initialize and provision in one step:

easy-db-lab init my-cluster --up

Part 2: Launch Infrastructure

Once initialized, provision the AWS infrastructure:

easy-db-lab up

This command creates:

  • S3 Storage: Cluster data stored under a dedicated prefix in the account S3 bucket
  • VPC: With subnets and security groups
  • EC2 Instances: Cassandra nodes, stress nodes, and a control node
  • K3s Cluster: Lightweight Kubernetes across all nodes

What Happens During up

  1. Configures account S3 bucket with cluster prefix
  2. Creates VPC with public subnets in your availability zones
  3. Provisions EC2 instances in parallel
  4. Waits for SSH availability
  5. Configures K3s cluster on all nodes
  6. Writes SSH config and environment files

Up Options

| Option | Description |
|---|---|
| `--no-setup`, `-n` | Skip K3s setup and AxonOps configuration |

Environment Setup

After up completes, source the environment file:

source env.sh

This configures your shell with:

  • SSH shortcuts: ssh db0, ssh db1, ssh stress0, etc.
  • Cluster aliases: c0, c-all, c-status
  • SOCKS proxy configuration

See Shell Aliases for all available shortcuts.
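As a rough sketch, the shortcuts above could be defined like this. The real env.sh is generated by easy-db-lab, so the exact definitions (and the SOCKS proxy port) are assumptions for illustration:

```shell
# Hypothetical sketch of definitions env.sh provides -- the real file is
# generated by easy-db-lab; these mirror the shortcuts listed above.
alias c0='ssh db0'                          # shell into the first Cassandra node
alias c-status='ssh db0 nodetool status'    # cluster status from db0
# SOCKS proxy via the control node (port 1080 is an assumption):
# ssh -D 1080 -N control0
```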

Part 3: Configure Cassandra 5.0

With infrastructure running, configure and start Cassandra.

Step 1: Select Cassandra Version

easy-db-lab cassandra use 5.0

This command:

  • Sets the active Cassandra version on all nodes
  • Downloads configuration files to your local directory
  • Applies any existing patch configuration

Available versions: 3.0, 3.11, 4.0, 4.1, 5.0, 5.0-HEAD, trunk

Step 2: Customize Configuration (Optional)

Edit cassandra.patch.yaml to customize settings:

# Example: Change token count
vim cassandra.patch.yaml

Common customizations:

| Setting | Description | Default |
|---|---|---|
| `num_tokens` | Virtual nodes per instance | 4 |
| `concurrent_reads` | Max concurrent read operations | 64 |
| `concurrent_writes` | Max concurrent write operations | 64 |
| `endpoint_snitch` | Network topology snitch | Ec2Snitch |
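For example, a patch file overriding a few of the settings above might look like this (the values are illustrative, not recommendations):

```yaml
# cassandra.patch.yaml -- overrides merged into cassandra.yaml on each node
# (keys from the table above; values here are illustrative)
num_tokens: 16
concurrent_reads: 128
concurrent_writes: 128
endpoint_snitch: Ec2Snitch
```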

Step 3: Apply Configuration

easy-db-lab cassandra update-config

This uploads and applies the patch to all Cassandra nodes.

To apply and restart Cassandra in one command:

easy-db-lab cassandra update-config --restart

Step 4: Start Cassandra

easy-db-lab cassandra start

Step 5: Verify Cluster

Check cluster status:

ssh db0 nodetool status

Or use the shell alias (after sourcing env.sh):

c-status

You should see all nodes in UN (Up/Normal) state.
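If you want to check this from a script, one way is to count the lines of nodetool status output that start with UN. The sample output below is illustrative, not from a real cluster:

```shell
# Sketch: count nodes reporting UN (Up/Normal) in nodetool status output.
# In practice you would capture this with: status=$(ssh db0 nodetool status)
status='Datacenter: us-east
==================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns   Host ID  Rack
UN  10.0.1.10  1.2 GiB  4       33.3%  aaaa     1a
UN  10.0.1.11  1.1 GiB  4       33.4%  bbbb     1b
UN  10.0.1.12  1.3 GiB  4       33.3%  cccc     1c'

up_normal=$(printf '%s\n' "$status" | awk '$1 == "UN"' | wc -l)
echo "nodes in UN state: $up_normal"
```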

Part 4: Working with Your Cluster

SSH Access

After sourcing env.sh:

ssh db0          # First Cassandra node
ssh db1          # Second Cassandra node
ssh stress0      # First stress node (if provisioned)
ssh control0     # Control node

Cassandra Management

# Stop Cassandra on all nodes
easy-db-lab cassandra stop

# Start Cassandra on all nodes
easy-db-lab cassandra start

# Restart Cassandra on all nodes
easy-db-lab cassandra restart

Filter to Specific Hosts

Most commands support the --hosts filter:

# Apply config only to db0 and db1
easy-db-lab cassandra update-config --hosts db0,db1

# Restart only db2
easy-db-lab cassandra restart --hosts db2

Download Configuration Files

To download the current configuration from nodes:

easy-db-lab cassandra download-config

This saves configuration files to a local directory named after the version (e.g., 5.0/).

Part 5: Shut Down

When finished, destroy the cluster infrastructure:

easy-db-lab down

Warning

This permanently destroys all EC2 instances, the VPC, and associated resources. S3 data under the cluster prefix is scheduled for expiration (default: 1 day).
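Expiration on S3 is typically done with a lifecycle rule scoped to the prefix. Conceptually, a rule of this shape would produce the behavior described (the rule ID and prefix are illustrative, not necessarily what the tool creates):

```json
{
  "Rules": [
    {
      "ID": "expire-cluster-data",
      "Filter": { "Prefix": "my-cluster/" },
      "Status": "Enabled",
      "Expiration": { "Days": 1 }
    }
  ]
}
```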

Quick Reference

| Task | Command |
|---|---|
| Initialize cluster | `easy-db-lab init <name> [options]` |
| Provision infrastructure | `easy-db-lab up` |
| Initialize and provision | `easy-db-lab init <name> --up` |
| Select Cassandra version | `easy-db-lab cassandra use <version>` |
| Apply configuration | `easy-db-lab cassandra update-config` |
| Start Cassandra | `easy-db-lab cassandra start` |
| Stop Cassandra | `easy-db-lab cassandra stop` |
| Restart Cassandra | `easy-db-lab cassandra restart` |
| Check cluster status | `ssh db0 nodetool status` |
| Download config | `easy-db-lab cassandra download-config` |
| Destroy cluster | `easy-db-lab down` |
| Display hosts | `easy-db-lab hosts` |
| Clean local files | `easy-db-lab clean` |

Next Steps