
Caution

You're viewing documentation for a previous version. Switch to the latest stable version.

Glossary¶

Anti-entropy¶

A state where data is in order and organized. Scylla has processes in place to make sure that data is antientropic where all replicas contain the most recent data and that data is consistent between replicas. See Scylla Anti-Entropy.

Bootstrap¶

When a new node is added to a cluster, the bootstrap process ensures that the data in the cluster is automatically redistributed to the new node. A new node in this case is an empty node without system tables or data. See bootstrap.

CAP Theorem¶

The CAP Theorem is the notion that C (Consistency), A (Availability) and P (Partition Tolerance) of data are mutually dependent in a distributed system. Increasing any 2 of these factors will reduce the third. Scylla chooses availability and partition tolerance over consistency. See Fault Tolerance.

Cluster¶

One or multiple Scylla nodes, acting in concert, which own a single contiguous token range. State is communicated between nodes in the cluster via the Gossip protocol. See Ring Architecture.

Clustering Key¶

A single or multi-column clustering key determines a row’s uniqueness and sort order on disk within a partition. See Ring Architecture.

Column Family¶

See table.

Compaction¶

The process of reading several SSTables, comparing the data and time stamps and then writing one SSTable containing the merged, most recent, information. See Compaction Strategies.
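The merge step described above can be sketched in a few lines. This is only an illustration, not Scylla's implementation: the `SSTable` alias and the `compact` function are hypothetical names, and each SSTable is modeled as a plain dictionary mapping a key to a (timestamp, value) pair.

```python
from typing import Dict, List, Tuple

# Hypothetical model: each "SSTable" maps a key to (write_timestamp, value).
SSTable = Dict[str, Tuple[int, str]]


def compact(sstables: List[SSTable]) -> SSTable:
    """Merge several SSTables, keeping the newest version of each key."""
    merged: SSTable = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            # Keep the entry with the highest timestamp for each key.
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    # Real SSTables are sorted by key; mimic that in the output.
    return dict(sorted(merged.items()))


old = {"a": (1, "apple"), "b": (2, "bean")}
new = {"b": (5, "berry"), "c": (3, "corn")}
print(compact([old, new]))
# {'a': (1, 'apple'), 'b': (5, 'berry'), 'c': (3, 'corn')}
```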

Compaction Strategy¶

Determines which of the SSTables will be compacted, and when. See Compaction Strategies.

Consistency Level (CL)¶

A dynamic value which dictates the number of replicas (in a cluster) that must acknowledge a read or write operation. This value is set by the client on a per operation basis. For the CQL Shell, the consistency level defaults to ONE for read and write operations. See Consistency Levels.

Date-tiered compaction strategy (DTCS)¶

DTCS is designed for time series data, but should not be used. Use Time-Window Compaction Strategy. See Compaction Strategies.

Dummy Rows¶

Cache dummy rows are entries in the row set that have a clustering position but do not represent CQL rows written by users. The Scylla cache uses them to mark the boundaries of populated ranges: they record that a whole range is complete, so a scan does not need to go to SSTables to read the gaps between existing row entries.

Entropy¶

A state where data is not consistent. This is the result when replicas are not synced and data is random. Scylla has measures in place to be antientropic. See Scylla Anti-Entropy.

Eventual Consistency¶

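A consistency model in which replicas may temporarily hold different data but converge to the same data over time. In Scylla, when considering the CAP Theorem, availability and partition tolerance are considered a higher priority than consistency.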

Hint¶

A short record of a write request that is held by the coordinator until the unresponsive node becomes responsive again, at which point the write request data in the hint is written to the replica node. See Hinted Handoff.

Hinted Handoff¶

Reduces data inconsistency which can occur when a node is down or there is network congestion. In Scylla, when data is written and there is an unresponsive replica, the coordinator writes itself a hint. When the node recovers, the coordinator sends the node the pending hints to ensure that it has the data it should have received. See Hinted Handoff.
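The flow described above can be pictured with a toy simulation. It is a sketch under simplifying assumptions (in-memory dictionaries, no timeouts or hint expiry), and the `Replica` and `Coordinator` classes are hypothetical names, not Scylla APIs.

```python
from collections import defaultdict


class Replica:
    """A toy replica node that is either up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """A toy coordinator that stores hints for unresponsive replicas."""

    def __init__(self, replicas):
        self.replicas = replicas
        # replica name -> list of pending (key, value) writes
        self.hints = defaultdict(list)

    def write(self, key, value):
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
            else:
                # The replica missed the write: keep a hint for later.
                self.hints[replica.name].append((key, value))

    def replay_hints(self, replica):
        # Called once the replica is responsive again.
        for key, value in self.hints.pop(replica.name, []):
            replica.data[key] = value


node_a, node_b = Replica("a"), Replica("b")
coordinator = Coordinator([node_a, node_b])
node_b.up = False
coordinator.write("user:1", "Ada")   # node_b misses this write
node_b.up = True
coordinator.replay_hints(node_b)     # node_b catches up
print(node_b.data)                   # {'user:1': 'Ada'}
```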

Idempotent¶

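Denoting an element of a set which is unchanged in value when multiplied or otherwise operated on by itself. For a database operation, this means that applying the operation more than once has the same effect as applying it once. Scylla Counters are not idempotent because in the case of a write failure, the client cannot safely retry the request.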
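The difference matters for retries. The following sketch contrasts a plain overwrite with a counter-style increment; `set_value` and `add_to_counter` are hypothetical helpers operating on an in-memory dictionary, not Scylla operations.

```python
def set_value(row, column, value):
    """Overwrite a column: applying it twice gives the same row."""
    updated = dict(row)
    updated[column] = value
    return updated


def add_to_counter(row, column, delta):
    """Increment a column: applying it twice changes the row twice."""
    updated = dict(row)
    updated[column] = updated.get(column, 0) + delta
    return updated


row = {"views": 10}

once = set_value(row, "views", 11)
retried = set_value(once, "views", 11)
print(once == retried)   # True: the retry is harmless

once = add_to_counter(row, "views", 1)
retried = add_to_counter(once, "views", 1)
print(once == retried)   # False: the retry double-counts
```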

JBOD¶

JBOD, or Just a Bunch Of Disks, is a non-RAID storage system using a server with multiple disks in order to instantiate a separate file system per disk. The benefit is that if a single disk fails, only it needs to be replaced and not the whole disk array. The disadvantage is that free space and load may not be evenly distributed. See the FAQ.

Key Management Interoperability Protocol (KMIP)¶

KMIP is a communication protocol that defines message formats for storing keys on a key management server (KMIP server). You can use a KMIP server to protect your keys when using Encryption at Rest. See Encryption at Rest.

Keyspace¶

A collection of tables with attributes which define how data is replicated on nodes. See Ring Architecture.

Leveled compaction strategy (LCS)¶

LCS uses small, fixed-size (by default 160 MB) SSTables divided into different levels. See Compaction Strategies.

Log-structured-merge (LSM)¶

A technique of keeping sorted files and merging them. LSM is a data structure that maintains key-value pairs. See Compaction.

Logical Core (lcore)¶

A hyperthreaded core on a hyperthreaded system, or a physical core on a system without hyperthreading.

MemTable¶

An in-memory data structure servicing both reads and writes. Once full, the MemTable is flushed to an SSTable. See Compaction Strategies.

Mutation¶

A change to data such as column or columns to insert, or a deletion. See Hinted Handoff.

Node¶

A single installed instance of Scylla. See Ring Architecture.

Nodetool¶

A simple command-line interface for administering a Scylla node. A nodetool command can display a given node’s exposed operations and attributes. Scylla’s nodetool contains a subset of these operations. See Ring Architecture.

Partition¶

A subset of data that is stored on a node and replicated across nodes. There are two ways to consider a partition. In CQL, a partition appears as a group of sorted rows, and is the unit of access for queried data, given that most queries access a single partition. On the physical layer, a partition is a unit of data stored on a node and is identified by a partition key. See Ring Architecture.

Partition Key¶

The unique identifier for a partition. A partition key may be hashed from the first column in the primary key, or from a set of columns, in which case it is often referred to as a composite partition key. A partition key determines which virtual node gets the first partition replica. See Ring Architecture.

Partitioner¶

A hash function for computing which data is stored on which node in the cluster. The partitioner takes a partition key as an input, and returns a ring token as an output. By default, Scylla uses the 64-bit MurmurHash3 function, and this hash range is numerically represented as a signed 64-bit integer. See Ring Architecture.
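The key-to-token mapping can be sketched as follows. Note the assumption: Python's standard library has no MurmurHash3, so this example substitutes an 8-byte BLAKE2b digest as a stand-in. It shows the shape of the mapping (bytes in, signed 64-bit integer out) but does not reproduce the tokens Scylla computes. `token_for` is a hypothetical name.

```python
import hashlib


def token_for(partition_key: bytes) -> int:
    """Map a partition key to a signed 64-bit token.

    Stand-in hash: BLAKE2b with an 8-byte digest, NOT MurmurHash3.
    """
    digest = hashlib.blake2b(partition_key, digest_size=8).digest()
    return int.from_bytes(digest, byteorder="big", signed=True)


token = token_for(b"user:42")
print(-2**63 <= token < 2**63)            # True: fits a signed 64-bit integer
print(token == token_for(b"user:42"))     # True: same key, same token
```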

Primary Key¶

In a CQL table definition, the primary key clause specifies the partition key and optional clustering key. These keys uniquely identify each partition and row within a partition. See Ring Architecture.

Quorum¶

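Quorum is a consistency level that requires a majority of replicas, floor(RF / 2) + 1, to acknowledge a read or write operation. It is a global setting that counts replicas across the entire cluster, including all data centers. See Consistency Levels.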
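As a worked example, the number of replicas a quorum needs can be computed from the replication factor of each data center. This is a minimal sketch; the `quorum` function and the data center names are hypothetical.

```python
def quorum(rf_per_dc):
    """Replicas needed for a quorum, counting replicas in all data centers."""
    total_replicas = sum(rf_per_dc.values())
    return total_replicas // 2 + 1


print(quorum({"dc1": 3}))             # 2
print(quorum({"dc1": 3, "dc2": 3}))   # 4
```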

Read Amplification¶

Excessive read requests which require many SSTables. RA is calculated by the number of disk reads per query. High RA occurs when there are many pages to read in order to answer a query. See Compaction Strategies.

Read Operation¶

A read operation occurs when an application gets information from an SSTable and does not change that information in any way. See Fault Tolerance.

Read Repair¶

An anti-entropy mechanism for read operations ensuring that replicas are updated with the most recently updated data. These repairs run automatically, asynchronously, and in the background. See Scylla Read Repair.

Reconciliation¶

A verification phase during a data migration where the target data is compared against original source data to ensure that the migration architecture has transferred the data correctly. See Scylla Read Repair.

Repair¶

A process which runs in the background and synchronizes the data between nodes, so that eventually, all the replicas hold the same data. See Scylla Repair.

Replication¶

The process of replicating data across nodes in a cluster. See Fault Tolerance.

Replication Factor (RF)¶

The number of nodes in a cluster that hold a copy (replica) of each piece of data. An RF of 1 means that the data will only exist on a single node in the cluster and will not have any fault tolerance. The RF is a setting defined for each keyspace, and it can be specified separately for each data center (DC). All replicas share equal priority; there are no primary or master replicas. See Fault Tolerance.

Reshape¶

Rewriting a set of SSTables to satisfy a compaction strategy’s criteria. For example, when restoring data from an old backup or from before a compaction strategy update.

Reshard¶

Splitting an SSTable, that is owned by more than one shard (core), into SSTables that are owned by a single shard. For example: when restoring data from a different server, importing SSTables from Apache Cassandra, or changing the number of cores in a machine (upscale).
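The idea can be illustrated with a toy split. This sketch assumes a placeholder mapping from token to shard (a simple modulo); Scylla's actual token-to-shard algorithm is different. `shard_of` and `reshard` are hypothetical names, and an SSTable is modeled as a dictionary keyed by token.

```python
from collections import defaultdict


def shard_of(token, shard_count):
    """Placeholder token-to-shard mapping (an assumption, not Scylla's)."""
    return token % shard_count


def reshard(sstable, shard_count):
    """Split one multi-shard SSTable into one SSTable per owning shard."""
    per_shard = defaultdict(dict)
    for token, row in sstable.items():
        per_shard[shard_of(token, shard_count)][token] = row
    return dict(per_shard)


shared = {0: "row-a", 1: "row-b", 2: "row-c", 5: "row-d"}
print(reshard(shared, shard_count=2))
# {0: {0: 'row-a', 2: 'row-c'}, 1: {1: 'row-b', 5: 'row-d'}}
```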

Shard¶

Each Scylla node is internally split into shards, each of which is an independent thread bound to a dedicated core. Each shard of data is allotted CPU, RAM, persistent storage, and networking resources which it uses as efficiently as possible. See Scylla Shard per Core Architecture for more information.

Shedding¶

Dropping requests to protect the system. This will occur if the request is too large or exceeds the max number of concurrent requests per shard.

Size-tiered compaction strategy¶

Triggers when the system has enough (four by default) similarly sized SSTables. See Compaction Strategies.
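The trigger condition can be sketched as a bucketing problem. The threshold of four comes from the definition above; the rule for "similarly sized" (within 0.5x to 1.5x of a bucket's average size) is an assumption chosen for illustration, and `buckets` and `compaction_candidates` are hypothetical names.

```python
def buckets(sizes, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of similar size.

    Assumption: a size joins a bucket if it lies within
    [low * avg, high * avg] of that bucket's current average.
    """
    result = []
    for size in sorted(sizes):
        for bucket in result:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough: start a new one.
            result.append([size])
    return result


def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets that hold enough SSTables to trigger compaction."""
    return [b for b in buckets(sizes) if len(b) >= min_threshold]


sstable_sizes_mb = [10, 11, 12, 9, 100, 105]
print(compaction_candidates(sstable_sizes_mb))   # [[9, 10, 11, 12]]
```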

Snapshot¶

Snapshots in Scylla are an essential part of the backup and restore mechanism. Whereas in other databases a backup starts with creating a copy of a data file (cold backup, hot backup, shadow copy backup), in Scylla the process starts with creating a table or keyspace snapshot. See Scylla Snapshots.

Snitch¶

The mapping from the IP addresses of nodes to physical and virtual locations, such as racks and data centers. There are several types of snitches. The type of snitch affects the request routing mechanism. See Scylla Snitches.

Space amplification¶

Excessive disk space usage which requires that the disk be larger than a perfectly-compacted representation of the data (i.e., all the data in one single SSTable). SA is calculated as the ratio of the size of database files on a disk to the actual data size. High SA occurs when there is more disk space being used than the size of the data. See Compaction Strategies.
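The ratio above is simple arithmetic. In this sketch the function name and the byte counts are made up for illustration.

```python
def space_amplification(disk_bytes, data_bytes):
    """Ratio of the size of database files on disk to the actual data size."""
    return disk_bytes / data_bytes


# 150 GB of SSTable files holding 100 GB of live data.
print(space_amplification(150e9, 100e9))   # 1.5
```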

SSTable¶

A concept borrowed from Google Bigtable, SSTables or Sorted String Tables store a series of immutable rows where each row is identified by its row key. See Compaction Strategies. The SSTable format is a persistent file format. See Scylla SSTable Format.

Table¶

A collection of columns fetched by row. Columns are ordered by Clustering Key. See Ring Architecture.

Time-window compaction strategy¶

TWCS is designed for time series data and replaced Date-tiered compaction. See Compaction Strategies.

Token¶

A value in a range, used to identify both nodes and partitions. Each node in a Scylla cluster is given an (initial) token, which defines the end of the range a node handles. See Ring Architecture.

Token Range¶

The total range of potential unique identifiers supported by the partitioner. By default, each Scylla node in the cluster handles 256 token ranges. Each token range corresponds to a Vnode. Each range of hashes in turn is a segment of the total range of a given hash function. See Ring Architecture.

Tombstone¶

A marker that indicates that data has been deleted. A large number of tombstones may impact read performance and disk usage, so an efficient tombstone garbage collection strategy should be employed. See Tombstones GC options.
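A rough sketch of tombstone garbage collection follows. It assumes the only purge condition is age: a tombstone older than a grace period (here 864000 seconds, the usual default for `gc_grace_seconds`) is dropped. Real purging has further conditions, and `TOMBSTONE` and `purge_tombstones` are hypothetical names.

```python
TOMBSTONE = object()   # sentinel marking a deleted value


def purge_tombstones(rows, now, gc_grace_seconds=864000):
    """Drop tombstones older than the grace period.

    rows maps key -> (write_time_in_seconds, value or TOMBSTONE).
    Simplification: age is the only condition checked here.
    """
    return {
        key: (ts, value)
        for key, (ts, value) in rows.items()
        if not (value is TOMBSTONE and now - ts > gc_grace_seconds)
    }


rows = {
    "a": (0, TOMBSTONE),          # deleted long ago
    "b": (900000, TOMBSTONE),     # deleted recently
    "c": (0, "live value"),       # never deleted
}
print(sorted(purge_tombstones(rows, now=1000000)))   # ['b', 'c']
```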

Tunable Consistency¶

The possibility for unique, per-query, Consistency Level settings. These are incremental and override fixed database settings intended to enforce data consistency. Such settings may be set directly from a CQL statement when response speed for a given query or operation is more important. See Fault Tolerance.
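One common way to reason about per-query consistency levels is the overlap rule: if the number of replicas acknowledging a read plus the number acknowledging a write exceeds the Replication Factor, every read touches at least one replica that holds the latest write. The sketch below encodes only that arithmetic; the function name is hypothetical.

```python
def read_overlaps_write(read_replicas, write_replicas, rf):
    """True if every read set must intersect every write set."""
    return read_replicas + write_replicas > rf


# RF = 3: QUORUM reads (2) with QUORUM writes (2) always overlap.
print(read_overlaps_write(2, 2, 3))   # True
# RF = 3: ONE reads (1) with ONE writes (1) may miss the latest write.
print(read_overlaps_write(1, 1, 3))   # False
```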

Virtual node¶

A range of tokens owned by a single Scylla node. Scylla nodes are configurable and support a set of Vnodes. In legacy token selection, each node owns one token (or token range). With Vnodes, a node can own many tokens or token ranges; within a cluster, these may be selected randomly from a non-contiguous set. In a Vnode configuration, each token falls within a specific token range which in turn is represented as a Vnode. Each Vnode is then allocated to a physical node in the cluster. See Ring Architecture.
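A small ring lookup makes the relationship between tokens, token ranges, and nodes concrete. This is a sketch with made-up tokens and node names; it relies on the Token definition above, where a node's token marks the end of the range it handles, and it wraps around at the end of the ring.

```python
import bisect

# Hypothetical ring: (token, owning node), sorted by token.
# Each node owns several non-contiguous ranges, as with Vnodes.
ring = [
    (-600, "node-A"),
    (-200, "node-B"),
    (100, "node-C"),
    (500, "node-A"),
    (800, "node-B"),
]
tokens = [token for token, _ in ring]


def owner(token):
    """Find the node whose range ends at the first ring token >= token."""
    index = bisect.bisect_left(tokens, token)
    # Tokens past the last entry wrap around to the first range.
    return ring[index % len(ring)][1]


print(owner(0))     # node-C
print(owner(100))   # node-C (a range includes its end token)
print(owner(900))   # node-A (wraps around the ring)
```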

Workload¶

A database category that allows you to manage different sources of database activities, such as requests or administrative activities. By defining workloads, you can specify how ScyllaDB will process those activities. For example, you can prioritize one workload over another (e.g., user requests over administrative activities). See Workload Prioritization.

Write Amplification¶

Excessive compaction of the same data. WA is calculated by the ratio of bytes written to storage versus bytes written to the database. High WA occurs when there are more bytes/second written to storage than are actually written to the database. See Compaction Strategies.
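As a worked example of the ratio above, consider data that is flushed once and then rewritten by compaction. The function name and byte counts are made up for illustration.

```python
def write_amplification(storage_bytes_written, database_bytes_written):
    """Ratio of bytes written to storage to bytes written to the database."""
    return storage_bytes_written / database_bytes_written


client_writes = 1e9    # 1 GB written to the database by clients
flush = 1e9            # the same data flushed to an SSTable once
compactions = 2e9      # then rewritten twice by compaction
print(write_amplification(flush + compactions, client_writes))   # 3.0
```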

Write Operation¶

A write operation occurs when information is added to or removed from an SSTable. See Fault Tolerance.

© 2025, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 08 May 2025.