
Caution

You're viewing documentation for a previous version. Switch to the latest stable version.

Best Practices for Running Scylla on Docker

This article explains how to use the ScyllaDB Docker image to start a Scylla node, access the nodetool and cqlsh utilities, start a cluster of Scylla nodes, configure a data volume for storage, configure resource limits for the Docker container, pass additional command line flags, and override scylla.yaml settings. A final section covers some basic usage of Scylla within Docker.

See also the image description on Docker Hub or our original blog.

Please note that these instructions assume that you have configured Docker so that you can run it as a regular user. Usually, this is done by adding the user to a Docker group. See your platform-specific Docker installation documentation on how to do that (see, for example, instructions for Fedora and Ubuntu). If you have not configured a Docker group, you need to prefix the Docker commands with sudo to have sufficient permissions to run them.

NOTE: You should allocate a minimum of 1.5 GB of RAM per container.

Basic Operations

Starting a Single Scylla Node

To start a single Scylla node instance in a Docker container, run:

docker run --name some-scylla -d scylladb/scylla

The docker run command starts a new Docker container named some-scylla in the background, running the Scylla server. You can confirm that it is running with docker ps:

docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                          NAMES
616ee646cb9d        scylladb/scylla     "/docker-entrypoint.p"   4 seconds ago       Up 4 seconds        7000-7001/tcp, 9042/tcp, 9160/tcp, 10000/tcp   some-scylla

As seen from the docker ps output, the image exposes ports 7000-7001 (Inter-node RPC), 9042 (CQL), 9160 (Thrift), and 10000 (REST API).

Viewing Scylla Server Logs

To access Scylla server logs, you can use the docker logs command:

docker logs some-scylla | tail

INFO  2016-11-09 10:27:48,191 [shard 6] database - Setting compaction strategy of system_traces.node_slow_log to SizeTieredCompactionStrategy
INFO  2016-11-09 10:27:48,191 [shard 4] database - Setting compaction strategy of system_traces.node_slow_log to SizeTieredCompactionStrategy
INFO  2016-11-09 10:27:48,191 [shard 3] database - Setting compaction strategy of system_traces.node_slow_log to SizeTieredCompactionStrategy
INFO  2016-11-09 10:27:48,191 [shard 1] database - Setting compaction strategy of system_traces.node_slow_log to SizeTieredCompactionStrategy
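
When the log is long, it can help to look only at warnings and errors. The sketch below assumes that each log line starts with its level, as in the sample output above, and that the levels of interest are named WARN and ERROR. The helper name filter_problems is hypothetical.

```shell
# Hypothetical helper: keep only log lines whose first field is WARN or ERROR.
# Assumes the level is the first whitespace-separated field, as in the sample above.
filter_problems() {
  awk '$1 == "WARN" || $1 == "ERROR"'
}

# Usage (2>&1 merges the container's stderr stream into the pipe):
#   docker logs some-scylla 2>&1 | filter_problems
```

Because the helper reads from standard input, it works equally well on a saved log file.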

Checking Server Status with Nodetool

The Docker image also has Scylla’s utilities installed. Nodetool is a command line tool for querying and managing a ScyllaDB cluster. The simplest nodetool command is nodetool status, which displays information about the cluster state:

docker exec -it some-scylla nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.17.0.2  125 KB     256     100.0%            c1906b2b-ce0c-4890-a9d4-8c360f111ad0  rack1

Using cqlsh

The cqlsh tool (CQL Shell) is an interactive Cassandra Query Language (CQL) shell for querying and manipulating data in the Scylla cluster.

To start an interactive session, run the following command:

docker exec -it some-scylla cqlsh
Connected to Test Cluster at 172.17.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.8 | CQL spec 3.2.1 | Native protocol v3]
Use HELP for help.

Then run CQL queries against the cluster:

cqlsh> SELECT cluster_name FROM system.local;

cluster_name
--------------
Test Cluster

(1 rows)
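
For scripts, an interactive session is often unnecessary. The following is a minimal sketch of a wrapper that runs a single statement and exits, relying on cqlsh's -e (execute) option. The function name run_cql is hypothetical.

```shell
# Hypothetical wrapper: run one CQL statement in container $1 and exit.
# Uses cqlsh's -e (execute) option; -it is omitted because no terminal is needed.
run_cql() {
  docker exec "$1" cqlsh -e "$2"
}

# Usage:
#   run_cql some-scylla "SELECT cluster_name FROM system.local;"
```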

Starting a Cluster

With a single some-scylla instance running, joining new nodes to form a cluster is easy:

docker run --name some-scylla2 -d scylladb/scylla --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla)"

To check whether the new node is up and running (and to view the status of the entire cluster), use the nodetool status command:

docker exec -it some-scylla nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.17.0.3  177.48 KB  256     100.0%            097caff5-892d-412f-af78-11d572795d6f  rack1
UN  172.17.0.2  125 KB     256     100.0%            c1906b2b-ce0c-4890-a9d4-8c360f111ad0  rack1
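
A new node takes some time to join, so a script may need to wait for it. The sketch below polls nodetool status until the expected number of nodes report UN (Up/Normal), as in the output above. The function name wait_for_un_nodes and the POLL_INTERVAL variable are hypothetical, and the retry count and delay are arbitrary defaults.

```shell
# Hypothetical helper: wait until at least $2 nodes are UN as seen from container $1.
# $3 is the maximum number of attempts (default 30); POLL_INTERVAL is the delay in seconds.
wait_for_un_nodes() {
  container=$1
  want=$2
  tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    # Count status lines whose first field is UN; prints 0 if the command fails.
    have=$(docker exec "$container" nodetool status 2>/dev/null |
      awk '$1 == "UN" { n++ } END { print n + 0 }')
    [ "$have" -ge "$want" ] && return 0
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-2}"
  done
  return 1
}

# Usage:
#   wait_for_un_nodes some-scylla 2 && echo "cluster is ready"
```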

Restarting Scylla from within the Running Node

The Docker image uses supervisord to manage Scylla processes. You can restart Scylla in a Docker container using:

docker exec -it some-scylla supervisorctl restart scylla

Configuring a Data Volume for Storage

Docker’s default filesystem is adequate only for testing Scylla. For better storage performance, use a Docker volume.

To use a data volume, first make sure the host directory is on a Scylla-supported filesystem such as XFS, then create the Scylla data directory /var/lib/scylla on the host. The Scylla container uses this directory to store all of its data:

sudo mkdir -p /var/lib/scylla/data /var/lib/scylla/commitlog

Then launch the Scylla instance with Docker’s --volume command line option, which mounts the host directory as a data volume in the container, and disable Scylla’s developer mode so that I/O tuning runs before the Scylla node starts:

docker run --name some-scylla --volume /var/lib/scylla:/var/lib/scylla -d scylladb/scylla --developer-mode=0
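
Before disabling developer mode, it may be worth confirming that the host directory really is on XFS. The following sketch assumes GNU coreutils stat, where -f reports on the filesystem and -c %T prints its type name. The function name is_on_xfs is hypothetical.

```shell
# Hypothetical helper: succeed only if the filesystem holding path $1 is XFS.
# Assumes GNU coreutils stat (-f: filesystem status, -c %T: filesystem type name).
is_on_xfs() {
  fstype=$(stat -f -c %T "$1") || return 2
  [ "$fstype" = "xfs" ]
}

# Usage:
#   is_on_xfs /var/lib/scylla || echo "warning: /var/lib/scylla is not on XFS" >&2
```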

Overriding scylla.yaml with a Master File

Sometimes it’s not possible to adjust Scylla-specific settings (including non-network properties, like cluster_name) directly from the command line when Scylla is running within Docker.

Instead, it may be necessary to override scylla.yaml settings by passing an external, master scylla.yaml file when starting the Docker container for the node.

To do this, use the --volume (-v) option as before to mount the overriding .yaml file:

NOTE: You can create master_scylla.yaml on the host by copying the file from https://github.com/scylladb/scylla/blob/master/conf/scylla.yaml. The command below expects it in your home directory.

  1. On the host, create and edit master_scylla.yaml. For example, uncomment and change the cluster_name parameter.

  2. Start the Scylla node with the following command, which overrides scylla.yaml with master_scylla.yaml:

docker run --name some-scylla --volume ~/master_scylla.yaml:/etc/scylla/scylla.yaml -d scylladb/scylla

NOTE: You can use this technique to start a Docker node with any other parameter configured in scylla.yaml.

  3. Finally, check that the setting was changed:

docker exec -it some-scylla nodetool describecluster

Cluster Information:
       Name: Doobie Snarf
       Snitch: org.apache.cassandra.locator.SimpleSnitch
       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
       Schema versions:
               34259144-0f3f-305f-a777-2811e30e17b3: [172.17.0.2]
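
The edit in step 1 can also be scripted. The sketch below assumes the copied file contains a single cluster_name line, possibly commented out with a leading #, and rewrites it in place. The function name set_cluster_name is hypothetical, and the new name must not contain a slash or a single quote because it is substituted into a sed expression.

```shell
# Hypothetical helper: set cluster_name in YAML file $1 to the value $2.
# Matches an optionally commented-out top-level cluster_name line.
set_cluster_name() {
  file=$1
  name=$2
  tmp=$(mktemp) || return 1
  # Write to a temp file, then copy back so the original file's permissions are kept.
  sed "s/^#* *cluster_name:.*/cluster_name: '$name'/" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage:
#   set_cluster_name ~/master_scylla.yaml "Doobie Snarf"
```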

Getting Performance out of your Docker Container

By default, our Docker image runs in a mode where Scylla’s architectural optimizations are not enabled. The command-line settings described below let you enable them incrementally to improve Scylla’s performance on Docker.

Configuring Resource Limits

Scylla uses all CPUs and memory by default. To configure resource limits for your Docker container, you can use the --smp, --memory, and --cpuset command line options documented in the section “Network and command-line settings” below.

The recommended way to run multiple Scylla instances on the same physical hardware is by statically partitioning all resources. For example, using the --cpuset option to assign cores 0 and 1 to one instance, and 2 and 3 to another.
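
As an illustration of static partitioning, the sketch below computes the --cpuset range for each instance when every instance gets the same number of consecutive cores. The function name cpuset_for_instance is hypothetical, and the even, consecutive split is an assumption rather than a requirement.

```shell
# Hypothetical helper: print the CPU range for instance $1 (0-based)
# when each instance is given $2 consecutive cores.
cpuset_for_instance() {
  first=$(( $1 * $2 ))
  last=$(( first + $2 - 1 ))
  if [ "$first" -eq "$last" ]; then
    echo "$first"
  else
    echo "$first-$last"
  fi
}

# Usage (two instances with two cores each get 0-1 and 2-3):
#   docker run --name some-scylla  -d scylladb/scylla --cpuset "$(cpuset_for_instance 0 2)"
#   docker run --name some-scylla2 -d scylladb/scylla --cpuset "$(cpuset_for_instance 1 2)"
```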

In scenarios where static partitioning is not desired, such as a mostly idle cluster without hard latency requirements, the --overprovisioned command-line option is recommended. This enables certain optimizations for Scylla to run efficiently in an overprovisioned environment.

NOTE: You should allocate a minimum of 1.5 GB of RAM per container.

Network and Command-Line Settings

The Scylla image supports many command line options that are passed to the docker run command after the image name. Keep in mind that these command-line settings override the corresponding settings in your scylla.yaml.

--seeds SEEDS

The --seeds command line option configures Scylla’s seed nodes. If no --seeds option is specified, Scylla uses its own IP address as the seed.

For example, to configure Scylla to run with two seed nodes, 192.168.0.100 and 192.168.0.200:

docker run --name some-scylla -d scylladb/scylla --seeds 192.168.0.100,192.168.0.200
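
When the seed addresses come from a script, they have to be joined into the comma-separated form shown above. A minimal sketch, with a hypothetical function name join_seeds:

```shell
# Hypothetical helper: join all arguments with commas for use with --seeds.
# The body runs in a subshell so the IFS change does not leak to the caller.
join_seeds() (
  IFS=,
  printf '%s\n' "$*"
)

# Usage:
#   seeds=$(join_seeds 192.168.0.100 192.168.0.200)
#   docker run --name some-scylla -d scylladb/scylla --seeds "$seeds"
```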

--listen-address ADDR

The --listen-address command line option configures the IP address on which the Scylla instance listens for client connections.

For example, to configure Scylla to use listen address 10.0.0.5:

docker run --name some-scylla -d scylladb/scylla --listen-address 10.0.0.5

--broadcast-address ADDR

The --broadcast-address command line option configures the IP address the Scylla instance tells other Scylla nodes in the cluster to connect to.

For example, to configure Scylla to use broadcast address 10.0.0.5:

docker run --name some-scylla -d scylladb/scylla --broadcast-address 10.0.0.5

--broadcast-rpc-address ADDR

The --broadcast-rpc-address command line option configures the IP address the Scylla instance tells clients to connect to.

For example, to configure Scylla to use broadcast RPC address 10.0.0.5:

docker run --name some-scylla -d scylladb/scylla --broadcast-rpc-address 10.0.0.5

--smp COUNT

The --smp command line option restricts Scylla to COUNT number of CPUs. The option does not, however, mandate a specific placement of CPUs. See the --cpuset command line option if you need Scylla to run on specific CPUs.

For example, to restrict Scylla to 2 CPUs:

docker run --name some-scylla -d scylladb/scylla --smp 2

--memory AMOUNT

The --memory command line option restricts Scylla to use up to AMOUNT of memory. The AMOUNT value supports both M unit for megabytes and G unit for gigabytes.

For example, to restrict Scylla to 4 GB of memory:

docker run --name some-scylla -d scylladb/scylla --memory 4G

NOTE: You should allocate a minimum of 1.5 GB of RAM per container.
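
A launch script can check an AMOUNT value against this minimum before starting the container. The sketch below handles only whole numbers with an M or G suffix and treats 1.5 GB as 1536 MB, which is an assumption about how the minimum should be interpreted. The function name memory_ok is hypothetical.

```shell
# Hypothetical helper: return 0 if AMOUNT ($1, e.g. 4G or 2048M) is at least 1536 MB,
# 1 if it is smaller, and 2 if it is not a whole number followed by M or G.
memory_ok() {
  num=${1%[GM]}
  unit=${1#"$num"}
  case $num in
    '' | *[!0-9]*) return 2 ;;
  esac
  case $unit in
    G) mb=$(( num * 1024 )) ;;
    M) mb=$num ;;
    *) return 2 ;;
  esac
  [ "$mb" -ge 1536 ]
}

# Usage:
#   memory_ok 4G && docker run --name some-scylla -d scylladb/scylla --memory 4G
```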

--overprovisioned ENABLE

The --overprovisioned command line option enables or disables optimizations for running Scylla in an overprovisioned environment. If no --overprovisioned option is specified, Scylla defaults to running with optimizations disabled.

For example, to enable optimizations for running in an overprovisioned environment:

docker run --name some-scylla -d scylladb/scylla --overprovisioned 1

--cpuset CPUSET

The --cpuset command line option restricts Scylla to run only on the CPUs specified by CPUSET. The CPUSET value is either a single CPU (e.g. --cpuset 1), a range (e.g. --cpuset 2-3), a list (e.g. --cpuset 1,2,5), or a combination of the last two options (e.g. --cpuset 1-2,5).

For example, to restrict Scylla to run on physical CPUs 0 to 2 and 4:

docker run --name some-scylla -d scylladb/scylla --cpuset 0-2,4
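
To see which CPUs a CPUSET value covers, the following sketch expands it into an explicit list. It assumes the seq utility is available, and the function name expand_cpuset is hypothetical.

```shell
# Hypothetical helper: expand a CPUSET such as 0-2,4 into a space-separated CPU list.
expand_cpuset() {
  echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
    # A single CPU has no upper bound, so reuse the lower bound.
    seq "$lo" "${hi:-$lo}"
  done | tr '\n' ' ' | sed 's/ $//'
}

# Usage:
#   expand_cpuset 0-2,4
```

For the value used in the example above, 0-2,4, the result is the four CPUs 0, 1, 2 and 4.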

--developer-mode ENABLE

The --developer-mode command line option enables Scylla’s developer mode, which relaxes checks for things like XFS and enables Scylla to run on unsupported configurations (which usually results in suboptimal performance). If no --developer-mode command line option is defined, Scylla defaults to running with developer mode enabled.

It is highly recommended to disable developer mode for production deployments to ensure Scylla is able to run with maximum performance.

To disable developer mode:

docker run --name some-scylla -d scylladb/scylla --developer-mode 0

--experimental ENABLE

The --experimental command line option enables Scylla’s experimental mode. If no --experimental command line option is defined, Scylla defaults to running with experimental mode disabled.

It is highly recommended to disable experimental mode for production deployments.

For example, to enable experimental mode:

docker run --name some-scylla -d scylladb/scylla --experimental 1

Other Useful Tips and Tricks

Checking the Current Version of Scylla on the Node

docker exec -it some-scylla scylla --version

Using a Local CSV File to Import Data into Scylla

First, download the file to the node:

docker exec -it some-scylla curl -o file.csv https://<url>.com/<path>/<path>/<file>.csv

Once the .csv file is downloaded, you can use the cqlsh COPY FROM command to load the data into Scylla.

Such a copy command might look like this:

cqlsh:my_keyspace> COPY <table_name> FROM 'file.csv' WITH HEADER=true;
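
If the CSV file is already on the Docker host rather than behind a URL, docker cp can place it in the container instead of curl. A minimal sketch, where the function name copy_csv_to_container and the /tmp destination are assumptions:

```shell
# Hypothetical helper: copy host file $1 into /tmp of container $2 using docker cp.
copy_csv_to_container() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  docker cp "$1" "$2:/tmp/$(basename "$1")"
}

# Usage:
#   copy_csv_to_container ./file.csv some-scylla
```

With this approach, the COPY command would refer to the file as '/tmp/file.csv'.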

Searching for a setting in scylla.yaml

Inside the container, scylla.yaml is located at /etc/scylla/scylla.yaml, so you can search the file for a specific entry. For example, to find out whether a setup is experimental, search for experimental in the file:

docker exec -it some-scylla grep -H 'experimental' /etc/scylla/scylla.yaml
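
The grep above also matches commented-out lines. To read only the active value of a setting, a small filter can help. The sketch below handles only uncommented, top-level key: value lines, and the function name yaml_get is hypothetical.

```shell
# Hypothetical helper: print the value of top-level key $1 from YAML read on stdin.
# Commented-out lines are ignored because they do not start with the key itself.
yaml_get() {
  sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

# Usage:
#   docker exec some-scylla cat /etc/scylla/scylla.yaml | yaml_get cluster_name
```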

Copyright

© 2016, The Apache Software Foundation.

Apache®, Apache Cassandra®, Cassandra®, the Apache feather logo and the Apache Cassandra® Eye logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

© 2025, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 08 May 2025.