Tutorial: From Setup to Running Cassandra¶
This tutorial walks you through creating a Cassandra cluster from scratch, covering initialization, infrastructure provisioning, and Cassandra configuration.
Prerequisites
Before starting, ensure you've completed the Setup process by running easy-db-lab setup-profile.
Part 1: Initialize Your Cluster¶
The init command creates local configuration files for your cluster. It does not provision AWS resources yet.
easy-db-lab init my-cluster
This creates a 3-node Cassandra cluster by default.
Init Options¶
| Option | Description | Default |
|---|---|---|
--db, --cassandra, -c |
Number of Cassandra instances | 3 |
--app, --stress, -s |
Number of stress/application instances | 0 |
--instance, -i |
Cassandra instance type | r3.2xlarge |
--stress-instance, -si |
Stress instance type | c7i.2xlarge |
--azs, -z |
Availability zones (e.g., a,b,c) |
all available |
--arch, -a |
CPU architecture (AMD64, ARM64) | AMD64 |
--ebs.type |
EBS volume type (NONE, gp2, gp3, io1, io2) | NONE |
--ebs.size |
EBS volume size in GB | 256 |
--ebs.iops |
EBS IOPS (gp3 only) | 0 |
--ebs.throughput |
EBS throughput (gp3 only) | 0 |
--until |
When instances can be deleted | tomorrow |
--tag |
Custom tags (key=value, repeatable) | - |
--vpc |
Use existing VPC ID | - |
--up |
Auto-provision after init | false |
--clean |
Remove existing config first | false |
Examples¶
Basic 3-node cluster:
easy-db-lab init my-cluster
5-node cluster with 2 stress nodes:
easy-db-lab init my-cluster --db 5 --stress 2
Production-like cluster with EBS storage:
easy-db-lab init prod-test --db 5 --ebs.type gp3 --ebs.size 500 --ebs.iops 3000
ARM64 cluster for Graviton instances:
easy-db-lab init my-cluster --arch ARM64 --instance r7g.2xlarge
Initialize and provision in one step:
easy-db-lab init my-cluster --up
Part 2: Launch Infrastructure¶
Once initialized, provision the AWS infrastructure:
easy-db-lab up
This command creates:
- S3 Bucket: Stores cluster state and backups
- VPC: With subnets and security groups
- EC2 Instances: Cassandra nodes, stress nodes, and a control node
- K3s Cluster: Lightweight Kubernetes across all nodes
What Happens During up¶
- Creates S3 bucket for cluster state
- Creates VPC with public subnets in your availability zones
- Provisions EC2 instances in parallel
- Waits for SSH availability
- Configures K3s cluster on all nodes
- Writes SSH config and environment files
Up Options¶
| Option | Description |
|---|---|
--no-setup, -n |
Skip K3s setup and AxonOps configuration |
Environment Setup¶
After up completes, source the environment file:
source env.sh
This configures your shell with:
- SSH shortcuts:
ssh db0,ssh db1,ssh stress0, etc. - Cluster aliases:
c0,c-all,c-status - SOCKS proxy configuration
See Shell Aliases for all available shortcuts.
Part 3: Configure Cassandra 5.0¶
With infrastructure running, configure and start Cassandra.
Step 1: Select Cassandra Version¶
easy-db-lab cassandra use 5.0
This command:
- Sets the active Cassandra version on all nodes
- Downloads configuration files to your local directory
- Applies any existing patch configuration
Available versions: 3.0, 3.11, 4.0, 4.1, 5.0, 5.0-HEAD, trunk
Step 2: Generate Configuration¶
easy-db-lab cassandra write-config
This creates cassandra.patch.yaml with cluster-specific settings:
cluster_name: "my-cluster"
num_tokens: 4
seed_provider:
class_name: "org.apache.cassandra.locator.SimpleSeedProvider"
parameters:
seeds: "10.0.1.28"
hints_directory: "/mnt/db1/cassandra/hints"
data_file_directories:
- "/mnt/db1/cassandra/data"
commitlog_directory: "/mnt/db1/cassandra/commitlog"
concurrent_reads: 64
concurrent_writes: 64
trickle_fsync: true
endpoint_snitch: "Ec2Snitch"
Auto-Injected Fields
listen_address and rpc_address are automatically injected per-node with each host's private IP address. You don't need to specify them in the patch file.
Step 3: Customize Configuration (Optional)¶
Edit cassandra.patch.yaml to customize settings:
# Example: Change token count
vim cassandra.patch.yaml
Common customizations:
| Setting | Description | Default |
|---|---|---|
num_tokens |
Virtual nodes per instance | 4 |
concurrent_reads |
Max concurrent read operations | 64 |
concurrent_writes |
Max concurrent write operations | 64 |
endpoint_snitch |
Network topology snitch | Ec2Snitch |
Step 4: Apply Configuration¶
easy-db-lab cassandra update-config
This uploads and applies the patch to all Cassandra nodes.
To apply and restart Cassandra in one command:
easy-db-lab cassandra update-config --restart
Step 5: Start Cassandra¶
easy-db-lab cassandra start
Step 6: Verify Cluster¶
Check cluster status:
ssh db0 nodetool status
Or use the shell alias (after sourcing env.sh):
c-status
You should see all nodes in UN (Up/Normal) state.
Part 4: Working with Your Cluster¶
SSH Access¶
After sourcing env.sh:
ssh db0 # First Cassandra node
ssh db1 # Second Cassandra node
ssh stress0 # First stress node (if provisioned)
ssh control0 # Control node
Cassandra Management¶
# Stop Cassandra on all nodes
easy-db-lab cassandra stop
# Start Cassandra on all nodes
easy-db-lab cassandra start
# Restart Cassandra on all nodes
easy-db-lab cassandra restart
Filter to Specific Hosts¶
Most commands support the --hosts filter:
# Apply config only to db0 and db1
easy-db-lab cassandra update-config --hosts db0,db1
# Restart only db2
easy-db-lab cassandra restart --hosts db2
Download Configuration Files¶
To download the current configuration from nodes:
easy-db-lab cassandra download-config
This saves configuration files to a local directory named after the version (e.g., 5.0/).
Part 5: Shut Down¶
When finished, destroy the cluster infrastructure:
easy-db-lab down
Warning
This permanently destroys all EC2 instances, the VPC, and associated resources. S3 buckets are retained for state recovery.
Quick Reference¶
| Task | Command |
|---|---|
| Initialize cluster | easy-db-lab init <name> [options] |
| Provision infrastructure | easy-db-lab up |
| Initialize and provision | easy-db-lab init <name> --up |
| Select Cassandra version | easy-db-lab cassandra use <version> |
| Generate config patch | easy-db-lab cassandra write-config |
| Apply configuration | easy-db-lab cassandra update-config |
| Start Cassandra | easy-db-lab cassandra start |
| Stop Cassandra | easy-db-lab cassandra stop |
| Restart Cassandra | easy-db-lab cassandra restart |
| Check cluster status | ssh db0 nodetool status |
| Download config | easy-db-lab cassandra download-config |
| Destroy cluster | easy-db-lab down |
| Display hosts | easy-db-lab hosts |
| Clean local files | easy-db-lab clean |
Next Steps¶
- Kubernetes Access - Access K3s cluster with kubectl and k9s
- Shell Aliases - All available CLI shortcuts
- ClickHouse - Deploy ClickHouse for analytics
- Spark - Set up Apache Spark via EMR