Starting with Cassandra 0.7 the schema management in Cassandra is very easy. It is as good as centralized schema management with no SPoF . Typically schema operations involve loading schema initially, making changes to existing schema like adding CF and/or modifying existing CF attributes, and dropping schema elements like CFs and Keyspaces.
There are 3 ways these operations can be performed:
Load schema from cassandra.yaml using schematool or JMX Console: This option can be used to load schema only once. Running it twice in a cluster won't have any impact. So this is good for loading initial schema.
schematool import
OR
JConsole:MBeans->org.apache.cassandra.db->StorageService -> Operations -> loadSchemaFromYAML
Create/Modify schema using Thrift APIs: This provides high flexiibility and good for applications that wish to create/drop Keyspaces and ColumnFamilies on fly. You cannot modify existing ColumnFamilies using the APIs. Refer to Cassandra Wiki - API for details of the APIs available. Following APIs are available:
- describe_keyspace
- describe_keyspaces
- system_add_column_family
- system_drop_column_family
- system_add_keyspace
- system_drop_keyspace
Create/Modify schema using cassandra-cli: This is the most flexible option available. It allow practically everything that option #1 and #2 allow collectively. Following commands are supported. You can see the commands by entering "help;" command on cassandra-cli. For details of specific command type "help ;". For eg "help create keyspace;".
- Describe keyspace
- Show list of keyspaces
- Add a new keyspace with the specified attribute(s) and value(s)
- Update a keyspace with the specified attribute(s) and value(s)
- Create a new column family with the specified attribute(s) and value(s)
- Update a column family with the specified attribute(s) and value(s)
- Delete a keyspace
- Delete a column family
Under the hood
The Cassandra Wiki - Schema Updates describes the operations in good details. Following is the high level summary:
- Cassandra uses Schema and Migrations ColumnFamily in system keyspace for maintaining schema and changes to schema respectively.
- Schema changes done on one node are propagated on other nodes in the cluster
- Migrations CF tracks individual changes to schema. Schema CF contains reference to the latest version in use
- Some manual cleanup may be needed if node crashes while schema changes are being applied to the cluster
- To avoid concurrency issues always push schema changes through one node
Examples
Dropping a Keyspace
- Connect to cassandra-cli on a node and run drop keyspace command.
[root@rwc-sb6240-1 bin]# ./cassandra-cli
Welcome to cassandra CLI.
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown] connect 20.17.221.19/9160;
Connected to: "NarenCluster072" on 20.17.221.19/9160
[default@unknown] drop keyspace KeyspaceMigration;
5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
[default@unknown] exit;
[root@rwc-sb6240-1 bin]#
- The logs on the node will show following events (DEBUG MODE)
DEBUG [pool-1-thread-151] 2011-03-09 11:21:03,334 CassandraServer.java (line 759) drop_keyspace
DEBUG [MigrationStage:1] 2011-03-09 11:21:03,343 Table.java (line 397) applying mutation of row 35666261336631662d346138322d313165302d623865652d663930663861336635653166
...
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,146 CompactionManager.java (line 109) Checking to see if compaction of Schema would be useful DEBUG [MigrationStage:1] 2011-03-09 11:21:04,146 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,147 CompactionManager.java (line 109) Checking to see if compaction of Migrations would be useful
DEBUG [ReadStage:14] 2011-03-09 11:21:04,150 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:15] 2011-03-09 11:21:04,151 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
...
DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 628) My version is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 659) Schemas are in agreement.
- On the other nodes the log entries will look like
DEBUG [ReadStage:9] 2011-03-09 11:12:19,250 MigrationManager.java (line 82) My data definitions are old. Asking for updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:9] 2011-03-09 11:12:19,253 MigrationManager.java (line 106) Announcing my schema is d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [MigrationStage:1] 2011-03-09 11:12:19,273 SchemaCheckVerbHandler.java (line 36) Received schema check request.
...
DEBUG [MigrationStage:1] 2011-03-09 11:12:20,681 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
3 comments:
Didn't understand a thing in this post, but I hope to see more blogging from you in future. :-)
By the way, you should really mask user/machine names and IP addresses from your post.
>>By the way, you should really mask user/machine names and IP addresses from your post.
Point taken. Thanks for the suggestion.
Would you please share a method on how you change schema in cassandra?
Post a Comment