Clustering
True peer-to-peer distributed architecture. All nodes are equal — no leader, no single point of failure.
Architecture
- Master-Replica: one master, optional replicas
- CRC16 Sharding: 16384 hash slots
- Gossip Protocol: node discovery & state sync
- MOVED Redirect: automatic key routing
Redis Cluster-Compatible Mode
Enabling Cluster Mode
SoliKV supports Redis Cluster-compatible sharding with 16384 hash slots:
```shell
# Start a node in cluster mode
solikv --port 7000 --cluster-enabled
```
CLUSTER Commands
| Command | Description |
|---|---|
| CLUSTER INFO | Get cluster state and slot information |
| CLUSTER NODES | Get all nodes and their slot assignments |
| CLUSTER SLOTS | Get slot ranges with owner info |
| CLUSTER MEET ip port | Join another cluster node |
| CLUSTER ADDSLOTS slot [slot...] | Claim ownership of hash slots |
| CLUSTER DELSLOTS slot [slot...] | Remove ownership of hash slots |
| CLUSTER KEYSLOT key | Get hash slot for a key |
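The CLUSTER KEYSLOT mapping follows the Redis Cluster convention: CRC16-CCITT (XMODEM) of the key, modulo 16384, with {hash tags} letting related keys land on the same slot. A client-side sketch in Python (illustrative only, not SoliKV's implementation):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): poly 0x1021, init 0, MSB-first, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def keyslot(key: str) -> int:
    """Map a key to one of 16384 hash slots, honoring non-empty {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

Keys sharing a hash tag, such as {user1000}.following and {user1000}.followers, hash to the same slot, which is what makes multi-key operations on related keys possible in a sharded cluster.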
MOVED Redirect
When a key's slot is owned by a different node, SoliKV returns a MOVED redirect:
```
GET mykey
# If slot 1234 is owned by node 7001:
MOVED 1234 127.0.0.1:7001
```
Clients should redirect to the correct node and retry the command.
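The redirect-and-retry step starts with parsing the MOVED error. A minimal sketch of that parsing in Python; the function name parse_moved is illustrative, not part of any client library:

```python
def parse_moved(err: str):
    """Parse a MOVED error like 'MOVED 1234 127.0.0.1:7001' into (slot, host, port)."""
    kind, slot, addr = err.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED error: {err!r}")
    # rpartition tolerates hosts that themselves contain colons
    host, _, port = addr.rpartition(":")
    return int(slot), host, int(port)
```

A client would then open (or reuse) a connection to that host:port, resend the command, and typically cache the slot-to-node mapping to avoid future redirects.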
Internal Sharding (--shards)
--shards Option
Controls internal key distribution within a single node:
| Value | Behavior |
|---|---|
| --shards 0 | Auto - use number of CPU cores (default) |
| --shards 1 | Single shard (no internal partitioning) |
| --shards N | N internal shards for parallel processing |
Why shards?
- Parallelize writes across CPU cores
- Reduce lock contention for concurrent keys
- Each shard has its own mutex & memory
When to adjust?
- High write throughput: increase shards
- Low memory: decrease shards
- Default (0) is usually optimal
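The idea behind internal sharding can be modeled as a hash-partitioned map with one lock per shard, so writes to keys in different shards never contend. An illustrative Python sketch, not SoliKV's actual implementation:

```python
import threading

class ShardedStore:
    """Hash-partitioned key-value map: each shard owns its own dict and mutex."""

    def __init__(self, num_shards: int = 4):
        self.shards = [{} for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def _index(self, key: str) -> int:
        # Pick a shard deterministically from the key
        return hash(key) % len(self.shards)

    def set(self, key, value):
        i = self._index(key)
        with self.locks[i]:
            self.shards[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self.locks[i]:
            return self.shards[i].get(key, default)
```

With one global lock (the --shards 1 case), every write serializes; with N shards, writes to keys in different shards proceed in parallel, which is why more shards help write-heavy workloads.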
Master-Replica Replication (Optional)
By default, SoliKV operates in peer-to-peer mode where all nodes are equal. Master-replica replication is an optional feature for specific use cases like read scalability or manual data distribution. It's not required for clustering — nodes automatically discover and communicate with peers.
Replication Factor
You can have multiple replicas of a master for read scalability and HA:
| Factor | Setup |
|---|---|
| 1 (default) | No replicas - single node |
| 2 | 1 master + 1 replica |
| N+1 | 1 master + N replicas |
REPLICAOF Command
Make a node a replica of another master:
```
REPLICAOF 127.0.0.1 6379
# Node becomes a replica of the master at 127.0.0.1:6379

REPLICAOF NO ONE
# Node stops being a replica and becomes a master
```
ROLE Command
Check the replication role of a node:
```
ROLE
# Master: ["master", "12345", []]
# Replica: ["slave", "127.0.0.1:6379", "connect"]
```
Auto-Replication with Keyfile
Nodes can automatically become replicas on startup using a keyfile. This is useful for containerized deployments:
```shell
# Generate keyfile on master (creates {dir}/master.keyfile)
solikv --port 6379 --dir /data/master --generate-master-keyfile

# Copy keyfile to slave's data directory
cp /data/master/master.keyfile /data/slave/

# Start slave - it will auto-connect using the keyfile
solikv --port 6380 --dir /data/slave
```
The keyfile format is: host:port:auth_key
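Assuming the host:port:auth_key format above, the replica-side parsing could look like this sketch (illustrative only; hosts that themselves contain colons, such as IPv6 addresses, would need different handling):

```python
def parse_keyfile(text: str):
    """Split a 'host:port:auth_key' keyfile line into its three parts."""
    # maxsplit=2 keeps any ':' characters inside the auth key intact
    host, port, auth_key = text.strip().split(":", 2)
    return host, int(port), auth_key
```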
CLI Options
| Option | Description |
|---|---|
| --replicaof HOST:PORT | Become replica of specified master on startup |
| --generate-master-keyfile | Generate master.keyfile and exit |
Conflict Resolution
CRDTs provide automatic conflict-free resolution without coordination between nodes.
| CRDT | Merge Strategy | Use case |
|---|---|---|
| LWW-Register | Most recent write by timestamp wins | Strings, hashes, general key-value |
| OR-Set | Add wins over concurrent remove | Sets, membership tracking |
| PN-Counter | Increments and decrements are additive | Counters, INCR/DECR operations |
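As an example of coordination-free merging, a PN-Counter keeps per-node increment and decrement tallies and merges by element-wise max, which makes merges commutative, associative, and idempotent. An illustrative sketch of the general technique, not SoliKV's internal representation:

```python
def merge_tallies(a: dict, b: dict) -> dict:
    """Merge two per-node tally maps by taking the element-wise max."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}

class PNCounter:
    """Positive-Negative counter: separate grow-only tallies for incr and decr."""

    def __init__(self, node_id: str):
        self.node = node_id
        self.inc = {}  # per-node increment totals
        self.dec = {}  # per-node decrement totals

    def incr(self, n: int = 1):
        self.inc[self.node] = self.inc.get(self.node, 0) + n

    def decr(self, n: int = 1):
        self.dec[self.node] = self.dec.get(self.node, 0) + n

    def value(self) -> int:
        return sum(self.inc.values()) - sum(self.dec.values())

    def merge(self, other: "PNCounter"):
        self.inc = merge_tallies(self.inc, other.inc)
        self.dec = merge_tallies(self.dec, other.dec)
```

Because each node only ever grows its own tallies, two replicas that merge each other's state in any order converge to the same value with no locks or coordination.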
Failover & Health Checks
Each node periodically checks peers. Unhealthy nodes are marked and traffic is rerouted to healthy peers.
Detection
- Periodic health checks on all peers
- Peers marked unhealthy on failed response
- Slot routing skips unhealthy nodes
Recovery
- Recovered nodes rejoin automatically
- CRDT merge catches up missed writes
- Slot map rebalanced, broadcast to cluster
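One detection round can be modeled as two steps: record probe results, then route requests around owners marked unhealthy. The function names and the fallback-peer ordering here are illustrative assumptions, not SoliKV internals:

```python
def update_health(status: dict, probe_results: dict) -> dict:
    """Apply one round of probe results: mark each peer healthy or unhealthy."""
    for peer, ok in probe_results.items():
        status[peer] = "healthy" if ok else "unhealthy"
    return status

def route_slot(owner: str, status: dict, fallback_peers: list):
    """Return the slot owner if healthy, otherwise the first healthy fallback peer."""
    if status.get(owner) == "healthy":
        return owner
    for peer in fallback_peers:
        if status.get(peer) == "healthy":
            return peer
    return None  # no healthy node can serve this slot right now
```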
Testing & Validation
SoliKV includes a chaos cluster test script to validate cluster behavior under failure scenarios.
Chaos Cluster Test
Run chaos testing to validate cluster resilience:
```shell
# Run the chaos test
./tests/chaos_cluster_test.sh

# With custom key count
NUM_KEYS=10000 ./tests/chaos_cluster_test.sh
```
The test:
- Starts multiple independent SoliKV nodes
- Distributes keys across nodes using consistent hashing
- Kills nodes and measures data availability
- Tests master-replica replication failover