Tutorial: Getting Started
This tutorial walks you through creating a database cluster from scratch, covering initialization, infrastructure provisioning, and database configuration. The examples below use Cassandra, but the same infrastructure supports ClickHouse, OpenSearch, and Spark.
Before starting, ensure you've completed the Setup process by running easy-db-lab setup-profile.
Part 1: Initialize Your Cluster
The init command creates local configuration files for your cluster. It does not provision AWS resources yet.
easy-db-lab init my-cluster
This creates a 3-node Cassandra cluster by default.
Init Options
| Option | Description | Default |
|---|---|---|
| --db, --cassandra, -c | Number of Cassandra instances | 3 |
| --app, --stress, -s | Number of stress/application instances | 0 |
| --instance, -i | Cassandra instance type | r3.2xlarge |
| --stress-instance, -si | Stress instance type | c7i.2xlarge |
| --azs, -z | Availability zones (e.g., a,b,c) | all available |
| --arch, -a | CPU architecture (AMD64, ARM64) | AMD64 |
| --ebs.type | EBS volume type (NONE, gp2, gp3, io1, io2) | NONE |
| --ebs.size | EBS volume size in GB | 256 |
| --ebs.iops | EBS IOPS (gp3 only) | 0 |
| --ebs.throughput | EBS throughput (gp3 only) | 0 |
| --until | When instances can be deleted | tomorrow |
| --tag | Custom tags (key=value, repeatable) | - |
| --vpc | Use existing VPC ID | - |
| --up | Auto-provision after init | false |
| --clean | Remove existing config first | false |
Examples
Basic 3-node cluster:
easy-db-lab init my-cluster
5-node cluster with 2 stress nodes:
easy-db-lab init my-cluster --db 5 --stress 2
Production-like cluster with EBS storage:
easy-db-lab init prod-test --db 5 --ebs.type gp3 --ebs.size 500 --ebs.iops 3000
ARM64 cluster for Graviton instances:
easy-db-lab init my-cluster --arch ARM64 --instance r7g.2xlarge
Initialize and provision in one step:
easy-db-lab init my-cluster --up
Part 2: Launch Infrastructure
Once initialized, provision the AWS infrastructure:
easy-db-lab up
This command creates:
- S3 Storage: Cluster data stored under a dedicated prefix in the account S3 bucket
- VPC: With subnets and security groups
- EC2 Instances: Cassandra nodes, stress nodes, and a control node
- K3s Cluster: Lightweight Kubernetes across all nodes
What Happens During up
- Configures account S3 bucket with cluster prefix
- Creates VPC with public subnets in your availability zones
- Provisions EC2 instances in parallel
- Waits for SSH availability
- Configures K3s cluster on all nodes
- Writes SSH config and environment files
Up Options
| Option | Description |
|---|---|
| --no-setup, -n | Skip K3s setup and AxonOps configuration |
Environment Setup
After up completes, source the environment file:
source env.sh
This configures your shell with:
- SSH shortcuts: ssh db0, ssh db1, ssh stress0, etc.
- Cluster aliases: c0, c-all, c-status
- SOCKS proxy configuration
See Shell Aliases for all available shortcuts.
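The ssh db0 style shortcuts work because up writes host entries into the generated SSH config. The exact contents depend on your cluster, but an entry typically resembles the following sketch. The IP address, user name, and key path shown here are placeholders for illustration, not values the tool is guaranteed to produce:

```
Host db0
    HostName 203.0.113.10                # public IP assigned at provision time (placeholder)
    User ubuntu                          # login user; depends on the AMI (assumption)
    IdentityFile ~/.ssh/my-cluster.pem   # key written during setup (placeholder path)
```

Because the entries live in a config file rather than shell functions, plain ssh, scp, and rsync all resolve the short host names the same way.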
Part 3: Configure Cassandra 5.0
With infrastructure running, configure and start Cassandra.
Step 1: Select Cassandra Version
easy-db-lab cassandra use 5.0
This command:
- Sets the active Cassandra version on all nodes
- Downloads configuration files to your local directory
- Applies any existing patch configuration
Available versions: 3.0, 3.11, 4.0, 4.1, 5.0, 5.0-HEAD, trunk
Step 2: Customize Configuration (Optional)
Edit cassandra.patch.yaml to customize settings:
# Example: Change token count
vim cassandra.patch.yaml
Common customizations:
| Setting | Description | Default |
|---|---|---|
| num_tokens | Virtual nodes per instance | 4 |
| concurrent_reads | Max concurrent read operations | 64 |
| concurrent_writes | Max concurrent write operations | 64 |
| endpoint_snitch | Network topology snitch | Ec2Snitch |
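The settings above can be combined in a single patch file. A minimal cassandra.patch.yaml might look like the sketch below; the values are illustrative, not tuning recommendations:

```yaml
# Illustrative patch: these overrides are merged into cassandra.yaml
# on each node when you run update-config.
num_tokens: 16              # raise vnode count from the default of 4
concurrent_reads: 128       # more read concurrency for a read-heavy test
concurrent_writes: 128
endpoint_snitch: Ec2Snitch  # default snitch for EC2 deployments
```

After editing the patch, run easy-db-lab cassandra update-config (Step 3) to push it to all nodes.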
Step 3: Apply Configuration
easy-db-lab cassandra update-config
This uploads and applies the patch to all Cassandra nodes.
To apply and restart Cassandra in one command:
easy-db-lab cassandra update-config --restart
Step 4: Start Cassandra
easy-db-lab cassandra start
Step 5: Verify Cluster
Check cluster status:
ssh db0 nodetool status
Or use the shell alias (after sourcing env.sh):
c-status
You should see all nodes in UN (Up/Normal) state.
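For a healthy three-node cluster the output looks broadly like the following. Addresses and load figures are placeholders, the elided columns vary between Cassandra versions, and with Ec2Snitch the datacenter and rack map to your AWS region and availability zones:

```
Datacenter: us-east-1
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID  Rack
UN  10.0.1.10  1.1 MiB  4       ...   ...      1a
UN  10.0.1.11  1.1 MiB  4       ...   ...      1b
UN  10.0.1.12  1.2 MiB  4       ...   ...      1c
```

The first two letters of each node line encode status (U = up, D = down) and state (N = normal). Any node showing DN is down and should be investigated before you run workloads.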
Part 4: Working with Your Cluster
SSH Access
After sourcing env.sh:
ssh db0 # First Cassandra node
ssh db1 # Second Cassandra node
ssh stress0 # First stress node (if provisioned)
ssh control0 # Control node
Cassandra Management
# Stop Cassandra on all nodes
easy-db-lab cassandra stop
# Start Cassandra on all nodes
easy-db-lab cassandra start
# Restart Cassandra on all nodes
easy-db-lab cassandra restart
Filter to Specific Hosts
Most commands support the --hosts filter:
# Apply config only to db0 and db1
easy-db-lab cassandra update-config --hosts db0,db1
# Restart only db2
easy-db-lab cassandra restart --hosts db2
Download Configuration Files
To download the current configuration from nodes:
easy-db-lab cassandra download-config
This saves configuration files to a local directory named after the version (e.g., 5.0/).
Part 5: Shut Down
When finished, destroy the cluster infrastructure:
easy-db-lab down
This permanently destroys all EC2 instances, the VPC, and associated resources. S3 data under the cluster prefix is scheduled for expiration (default: 1 day).
Quick Reference
| Task | Command |
|---|---|
| Initialize cluster | easy-db-lab init <name> [options] |
| Provision infrastructure | easy-db-lab up |
| Initialize and provision | easy-db-lab init <name> --up |
| Select Cassandra version | easy-db-lab cassandra use <version> |
| Apply configuration | easy-db-lab cassandra update-config |
| Start Cassandra | easy-db-lab cassandra start |
| Stop Cassandra | easy-db-lab cassandra stop |
| Restart Cassandra | easy-db-lab cassandra restart |
| Check cluster status | ssh db0 nodetool status |
| Download config | easy-db-lab cassandra download-config |
| Destroy cluster | easy-db-lab down |
| Display hosts | easy-db-lab hosts |
| Clean local files | easy-db-lab clean |
Next Steps
- Kubernetes Access - Access K3s cluster with kubectl and k9s
- Shell Aliases - All available CLI shortcuts
- ClickHouse - Deploy ClickHouse for analytics
- Spark - Set up Apache Spark via EMR