voxpupuli/puppet-cassandra

Cassandra

Overview

Module to install, configure and manage Cassandra.

Setup

What Cassandra affects

What the Cassandra class affects

  • Optionally configures a repository to install the Cassandra packages.
  • Installs the Cassandra package.
  • Installs the Cassandra support tools.
  • Optionally configures settings in ${config_path}/cassandra.yaml.
  • Optionally installs a JRE/JDK package (e.g. java-1.8.0-openjdk-headless) and the Java Native Access (JNA).
  • Optionally ensures that the Cassandra service is enabled and running.

Requirements

You need a compatible version of Java installed. You can either use the puppetlabs/java module or set cassandra::java_package.
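
The two ways of meeting the Java requirement might look like the sketch below. The node names are hypothetical, the package name is only the example used earlier in this document, and the first node assumes the puppetlabs/java module is already installed in your environment.

```puppet
# Hypothetical node: Java comes from the puppetlabs/java module.
node 'cassandra-a.example.com' {
  include java

  class { 'cassandra':
    # baseline_settings / settings omitted for brevity
  }

  # Make sure Java is in place before Cassandra is managed.
  Class['java'] -> Class['cassandra']
}

# Hypothetical node: this module installs the Java package itself.
node 'cassandra-b.example.com' {
  class { 'cassandra':
    # Example package name only; pick one valid for your platform.
    java_package => 'java-1.8.0-openjdk-headless',
  }
}
```

Use one approach per node rather than both, so that only one resource manages the Java package.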

On Debian systems there is a soft dependency on the puppetlabs/apt module.

When using cassandra::schema resources you also need a compatible version of Python installed.

Beginning with Cassandra

Create a Cassandra cluster called MyCassandraCluster which uses the SimpleSnitch and has authentication and authorization disabled (AllowAllAuthenticator and AllowAllAuthorizer). In this very basic example the node itself becomes a seed for the cluster. If you switch the authenticator to PasswordAuthenticator, the credentials default to a user called cassandra with a password of cassandra.

class { 'cassandra':
  baseline_settings => {
    authenticator               => 'AllowAllAuthenticator',
    authorizer                  => 'AllowAllAuthorizer',
    cluster_name                => 'MyCassandraCluster',
    commitlog_sync              => 'periodic',
    commitlog_sync_period_in_ms => 10000,
    listen_interface            => $facts['networking']['primary'],
    endpoint_snitch             => 'SimpleSnitch',
    partitioner                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
    seed_provider               => [
      {
        class_name => 'org.apache.cassandra.locator.SimpleSeedProvider',
        parameters => [
          {
            seeds => $facts['networking']['ip']
          },
        ],
      },
    ],
  },
}

However, please note that this is the absolute minimum configuration needed to get Cassandra up and running, and it will probably give you a rather badly configured node. Please see Suggested Baseline Settings for details on making your configuration a lot more robust.

Hiera

In your top-level node classification (usually common.yaml), add the baseline_settings hash and all the tweaks you want every cluster to use:

cassandra::baseline_settings:
  authenticator: AllowAllAuthenticator
  authorizer: AllowAllAuthorizer
  auto_bootstrap: true
  auto_snapshot: true
  ...

Then, in the individual node classification add the parts which define the cluster:

cassandra::settings:
  cluster_name: developer playground cassandra cluster
cassandra::dc: Onsite1
cassandra::rack: RAC1
cassandra::package_ensure: 3.0.5-1
cassandra::package_name: cassandra30

Usage

Set up a keyspace and users

We assume that authentication has been enabled for the Cassandra cluster and that we are connecting with the default user name and password ('cassandra/cassandra').

In this example, we create a keyspace (mykeyspace) with a table called 'users' and an index called 'users_lname_idx'.

We also add four users (to Cassandra, not the mykeyspace.users table) called spillman, akers, boone and gbennet, grant each of them permissions, and ensure that a user called lucan is absent.

class { 'cassandra':
  ...
}

class { 'cassandra::schema':
  cqlsh_password => 'cassandra',
  cqlsh_user     => 'cassandra',
  cqlsh_host     => $facts['networking']['ip'],
  indexes        => {
    'users_lname_idx' => {
      table    => 'users',
      keys     => 'lname',
      keyspace => 'mykeyspace',
    },
  },
  keyspaces      => {
    'mykeyspace' => {
      durable_writes  => false,
      replication_map => {
        keyspace_class     => 'SimpleStrategy',
        replication_factor => 1,
      },
    }
  },
  permissions    => {
    'Grant select permissions to spillman to all keyspaces' => {
      permission_name => 'SELECT',
      user_name       => 'spillman',
    },
    'Grant modify to keyspace mykeyspace to akers'          => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'MODIFY',
      user_name       => 'akers',
    },
    'Grant alter permissions to mykeyspace to boone'        => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'ALTER',
      user_name       => 'boone',
    },
    'Grant ALL permissions to mykeyspace.users to gbennet'  => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'ALL',
      table_name      => 'users',
      user_name       => 'gbennet',
    },
  },
  tables         => {
    'users' => {
      columns  => {
        user_id       => 'int',
        fname         => 'text',
        lname         => 'text',
        'PRIMARY KEY' => '(user_id)',
      },
      keyspace => 'mykeyspace',
    },
  },
  users          => {
    'spillman' => {
      password => 'Niner27',
    },
    'akers'    => {
      password  => 'Niner2',
      superuser => true,
    },
    'boone'    => {
      password => 'Niner75',
    },
    'gbennet'  => {
      'password' => 'foobar',
    },
    'lucan'    => {
      'ensure' => absent
    },
  },
}

Create a Cluster in a Single Data Center

This is a basic example of a six-node cluster with two seeds, created in a single data center spanning two racks. The nodes in the cluster are:

Node Name        IP Address
node0 (seed 1)   110.82.155.0
node1            110.82.155.1
node2            110.82.155.2
node3 (seed 2)   110.82.156.3
node4            110.82.156.4
node5            110.82.156.5

Each node is configured to use the GossipingPropertyFileSnitch and 256 virtual nodes (vnodes). The name of the cluster is 'MyCassandraCluster'. Also, while building the initial cluster, we set auto_bootstrap to false.

The following manifest applies these settings to all six nodes:

node /^node\d+$/ {
  class { 'cassandra':
    settings => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '110.82.155.0,110.82.156.3',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}

The default value for the num_tokens is already 256, but it is included in the example for clarity. Do not forget to either set auto_bootstrap to true or not set the attribute at all after initializing the cluster.
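
One way to make that follow-up step harder to forget is to drive auto_bootstrap from a single flag. The sketch below is only an illustration: $initial_build and $bootstrap_settings are hypothetical names, not parameters of this module, and the settings hash is abbreviated.

```puppet
# Hypothetical toggle: set to true only while the initial cluster is built.
$initial_build = false

# During the initial build, force auto_bootstrap to false; afterwards
# contribute nothing, so the attribute is not set at all.
$bootstrap_settings = $initial_build ? {
  true    => { 'auto_bootstrap' => false },
  default => {},
}

node /^node\d+$/ {
  class { 'cassandra':
    settings => {
      'cluster_name' => 'MyCassandraCluster',
      # ... remaining settings as in the example above,
      # minus the 'auto_bootstrap' entry ...
    } + $bootstrap_settings,
  }
}
```

The `+` operator merges the two hashes, so flipping $initial_build is the only change needed once the cluster is up.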

Create a Cluster in Multiple Data Centers

Node Name        IP Address      Data Center   Rack
node0 (seed 1)   10.168.66.41    DC1           RAC1
node1            10.176.43.66    DC1           RAC1
node2            10.168.247.41   DC1           RAC1
node3 (seed 2)   10.176.170.59   DC2           RAC1
node4            10.169.61.170   DC2           RAC1
node5            10.169.30.138   DC2           RAC1

For the sake of simplicity, we will confine this example to the nodes:

node /^node[012]$/ {
  class { 'cassandra':
    dc             => 'DC1',
    settings       => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '10.168.66.41,10.176.170.59',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}

node /^node[345]$/ {
  class { 'cassandra':
    dc             => 'DC2',
    settings       => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '10.168.66.41,10.176.170.59',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}

We don't need to specify the rack name (with the rack attribute) as RAC1 is the default value. Again, do not forget to either set auto_bootstrap to true or not set the attribute at all after initializing the cluster.
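
If a node does live in a different rack, the rack attribute can be set alongside dc. This is a minimal sketch: the rack name RAC2 is a made-up value for illustration, and the settings hash is left empty here where a real node would carry the same settings as the other DC2 nodes.

```puppet
# An exact node name takes precedence over a regex node definition,
# so this block can override the rack for node5 alone.
node 'node5' {
  class { 'cassandra':
    dc       => 'DC2',
    rack     => 'RAC2', # hypothetical rack name
    settings => {
      # ... same settings hash as the DC2 nodes above ...
    },
  }
}
```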

Reference

The reference documentation is generated using the puppet-strings tool. To see all of it, please go to http://voxpupuli.github.io/puppet-cassandra.

Limitations

  • When creating keyspaces, indexes, cql_types and users, the settings are only used to create a new resource if it does not already exist. If a change is made to the Puppet manifest but the resource already exists, the change will not be applied to it.

Migrating to new module version

Newer versions (>3.1) of this module drop some built-in resources, such as firewall and sysctl, but they can be replaced. See the examples.

Development

Contributions will be gratefully accepted. Please go to the project page, fork the project, make your changes locally and then raise a pull request. Details on how to do this are available at https://guides.github.com/activities/contributing-to-open-source.

Please also see the CONTRIBUTING.md page for project specific requirements.

Additional Contributors

For a list of contributors see CONTRIBUTING.md and https://github.com/voxpupuli/puppet-cassandra/graphs/contributors