Apache Cassandra is a distributed database system designed to manage massive amounts of data quickly and reliably. If your data needs call for clusters of servers and you don’t want to wrestle with the scaling difficulties of traditional relational databases, Cassandra may be your solution. Proven at dozens of high-profile companies, it provides highly available, replicated data storage on commodity hardware or cloud infrastructure. It also integrates with existing monitoring and logging frameworks, both essential components of a resilient cloud deployment.
This guide will get you up and running with a Cassandra database instance which you can later scale up. It expects that you have the following:
• 1 server (Cloud Server or Dedicated Server) running Ubuntu 14.
• Root access
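Before starting, it can help to confirm both prerequisites from the shell. The following is a minimal sketch; the helper name is_root is hypothetical, and it assumes the standard /etc/os-release file is present on the server.

```shell
# Hypothetical helper: succeeds only when the given user ID (default: the
# current user's ID) is 0, which is root.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

if is_root; then
    echo "running as root"
else
    echo "not root: prefix the commands in this guide with sudo"
fi

# Print the distribution name and release, if os-release is readable.
if [ -r /etc/os-release ]; then
    grep -E '^(NAME|VERSION_ID)=' /etc/os-release || true
fi
```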
Cassandra requires Java 1.8, which unfortunately is not shipped with this distribution. Begin by adding a package repository that provides the necessary Java version.
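The repository itself is not named at this point in the text. One plausible candidate is the WebUpd8 Java PPA, which historically shipped the oracle-java8-set-default package installed in the next step. This is an assumption, not something the guide confirms. Under that assumption, adding it might look like the sketch below; the function name add_java_repo is hypothetical.

```shell
# Assumption: the WebUpd8 Java PPA is the repository meant here; the guide
# does not name it. Run as root, or prefix each command with sudo.
add_java_repo() {
    # add-apt-repository is provided by the software-properties-common package.
    apt-get install -y software-properties-common &&
    add-apt-repository -y ppa:webupd8team/java &&
    apt-get update
}

# Call the function once you have confirmed the repository is the one you want:
# add_java_repo
```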
With the repository added, Java 1.8 should be available. Install it next.
apt-get install oracle-java8-set-default
To confirm that the installation succeeded, check the Java version by running the command below. Its output should resemble the lines that follow.
java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
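If you would rather script this check than read the output by eye, a small sketch follows. The helper name is_java_8 is hypothetical; it simply looks for a 1.8.x version string in whatever text it is given.

```shell
# Hypothetical helper: succeeds when the given text contains a 1.8.x
# version string of the form printed by `java -version`.
is_java_8() {
    printf '%s\n' "$1" | grep -q 'version "1\.8\.'
}

# `java -version` writes to stderr, so redirect it to capture the text.
if command -v java >/dev/null 2>&1; then
    if is_java_8 "$(java -version 2>&1)"; then
        echo "Java 1.8 detected"
    else
        echo "Java is installed, but it is not 1.8"
    fi
else
    echo "java not found on PATH"
fi
```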
With the correct Java version installed, we’ll next add the DataStax repository. This makes Cassandra available through your package manager, which eases both the initial installation and future upgrades.
echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
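Because tee -a appends, running the command above a second time adds a duplicate entry to the file. If you expect to re-run these steps, a guard along the following lines avoids that. This is only a sketch, and the helper name append_once is hypothetical.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so repeated runs do not create duplicates.
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (as root), with the entry and path from the step above:
# append_once "deb http://debian.datastax.com/community stable main" \
#     /etc/apt/sources.list.d/cassandra.sources.list
```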
You’ll also need the trusted key for this repository. The key lets your package manager confirm that the packages it downloads are the ones actually built by the DataStax maintainers.
curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -
With these pieces in place, we can now update your package list and install the Cassandra package.
apt-get update
apt-get install cassandra -y
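The package normally starts the Cassandra service as part of installation, but the node can take a little while before it accepts client connections. A sketch of a wait loop is shown below. The helper name wait_for_port is hypothetical, and it assumes an nc (netcat) build that supports the -z flag. Port 9042 is the native protocol port that cqlsh connects to.

```shell
# Hypothetical helper: poll a TCP port once per second until it accepts
# connections or the number of tries runs out. Assumes `nc` supports -z.
wait_for_port() {
    host=$1
    port=$2
    tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        if nc -z "$host" "$port" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage: wait up to 60 seconds for the local node, then report.
# wait_for_port 127.0.0.1 9042 60 && echo "Cassandra is accepting connections"
```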
If you want to enter the Cassandra command line, use the cqlsh command as shown:
[root@cassandra ~]# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.8 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
You can also check the status of your Cassandra nodes with the “nodetool” command:
[root@cassandra ~]# nodetool status
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  102.78 KB  256     100.0%            05d9accc-9404-438c-b03b-869432439042  rack1
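The first column combines the node's status (U for up, D for down) with its state (N for normal, L for leaving, J for joining, M for moving), so UN is what a healthy node reports. To check this from a script, something like the sketch below could work; the helper name all_nodes_up is hypothetical.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if at least one node is listed and every listed node is in state UN.
all_nodes_up() {
    awk '
        # Node rows begin with a two-letter code: U/D, then N/L/J/M.
        /^[UD][NLJM] / { total++; if ($1 != "UN") bad++ }
        END { if (total > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Usage:
# nodetool status | all_nodes_up && echo "all nodes are up and normal"
```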
Cassandra is up and running, ready to be used directly or scaled out across a cluster. If you know anyone looking for a quality solution for their big data needs, share this article so they too can enjoy the benefits of this robust database solution.