Cassandra is a NoSQL database

Apache Cassandra: Distributed management of large databases

As a truly distributed system, Cassandra does not use a master. All nodes in a cluster have equal rights and can process any database request, which significantly increases performance. The data is distributed across the nodes. The system also scales easily by adding more nodes: after installation, it is only necessary to distribute the configuration files to the new node. Cassandra provides suitable tools for this.
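How data can be spread over equal-ranking nodes is easiest to see with a toy token ring. The following Python sketch is an illustration under stated assumptions, not Cassandra's actual implementation: the class `Ring`, the function `token`, the node names and the use of MD5 are all hypothetical choices, and real clusters add refinements such as virtual nodes.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a position on the ring (MD5 is used only for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy token ring: each node owns the keys up to and including its own token."""

    def __init__(self, nodes):
        self._tokens = []   # sorted list of node tokens
        self._owners = {}   # node token -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def owner(self, key):
        # First node token at or after the key's token, wrapping around the ring.
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        return self._owners[self._tokens[i]]


ring = Ring(["node-a", "node-b", "node-c"])
keys = ["user:%d" % i for i in range(200)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")  # scale out by one node
moved = [k for k in keys if ring.owner(k) != before[k]]
print(len(moved), "of", len(keys), "keys moved to the new node")
```

Because any node can compute `owner(key)` on its own, no master is needed to route a request. When a node is added, only keys that now fall into its range change owner; all other keys stay where they were.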

To provide resilience and to guarantee that data can be restored in an emergency, Apache Cassandra has a replication system that can be configured as required. Fault tolerance is achieved by automatically replicating the data between the nodes. Failed nodes can be replaced easily. The system remains available for queries throughout.
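The effect of replication on fault tolerance can be sketched in a few lines of Python. The example assumes a simple placement rule in which a key is stored on its owning node and on the next nodes clockwise on the ring. The ring contents, the token values and the function names `replicas` and `live_replicas` are hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical four-node ring: token -> node name (values chosen for illustration).
RING = {0: "node-a", 25: "node-b", 50: "node-c", 75: "node-d"}


def replicas(key_token, replication_factor):
    """Return the nodes holding a key: the owner plus the next nodes clockwise."""
    tokens = sorted(RING)
    start = bisect.bisect_left(tokens, key_token) % len(tokens)
    count = min(replication_factor, len(tokens))
    return [RING[tokens[(start + i) % len(tokens)]] for i in range(count)]


def live_replicas(key_token, replication_factor, down):
    """Replicas that can still answer when the nodes in `down` have failed."""
    return [n for n in replicas(key_token, replication_factor) if n not in down]


# With three copies, losing the owning node still leaves two nodes that hold the data.
print(replicas(30, 3))
print(live_replicas(30, 3, down={"node-c"}))
```

A higher replication factor tolerates more simultaneous node failures, at the cost of storing and updating more copies of each row.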

Cassandra also offers high availability and partition tolerance. According to the CAP theorem of computer science, a distributed system can never guarantee consistency, availability, and partition tolerance all at the same time. Cassandra therefore makes concessions on consistency, that is, on the guarantee that all nodes see the same data at all times, as is the case with many big data systems. After a failure, consistency can be restored promptly through data recovery, whereas the other two properties must hold at all times.
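The trade-off between consistency and availability can be made concrete with a small calculation. Cassandra lets a client choose how many replicas must acknowledge a read or a write. A read is certain to see the latest write only if the set of replicas read and the set of replicas written overlap. The Python sketch below is an illustration, and the function names `quorum` and `read_sees_latest_write` are hypothetical.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def read_sees_latest_write(write_acks, read_acks, replication_factor):
    """True if every read set must share at least one replica with every write set."""
    return write_acks + read_acks > replication_factor


rf = 3
# Majority writes and majority reads overlap in at least one replica.
print(read_sees_latest_write(quorum(rf), quorum(rf), rf))
# One acknowledgement each is faster and more available, but a read may be stale.
print(read_sees_latest_write(1, 1, rf))
```

Requiring fewer acknowledgements keeps requests answerable while replicas are down or unreachable. Requiring more gives stronger consistency but makes requests fail sooner when nodes are missing.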

Cassandra supports MapReduce, the programming model developed by Google for computations over large amounts of data on distributed systems. The database query language CQL (Cassandra Query Language) is specially adapted to Cassandra's data structures.
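The idea behind the MapReduce model can be shown with a minimal, single-process Python sketch. It does not use Cassandra or any MapReduce framework. The sample rows and the function names `map_phase`, `shuffle` and `reduce_phase` are assumptions made for illustration. In a real deployment the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit one (key, value) pair per row; here, one count per city."""
    for row in rows:
        yield row["city"], 1


def shuffle(pairs):
    """Group all values that share a key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the grouped values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical rows, standing in for the contents of a table.
rows = [
    {"user": "anna", "city": "Berlin"},
    {"user": "ben", "city": "Hamburg"},
    {"user": "cem", "city": "Berlin"},
]
counts = reduce_phase(shuffle(map_phase(rows)))
print(counts)
```

Each map call looks at a single row and each reduce call at a single key, so the work can be split across nodes that each hold part of the data.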