VoltDB FAQ

Courtesy: VoltDB FAQ

What is VoltDB?

VoltDB is a SQL RDBMS for database applications that have ultra-high throughput requirements.  VoltDB offers:

  • Orders of magnitude better performance than conventional DBMSs
  • Equally fast for read and write work
  • Linear scalability
  • SQL as the DBMS interface
  • ACID transactions to ensure data consistency and accuracy
  • Built-in high availability

What is the architecture of VoltDB?

VoltDB is an in-memory database distributed on a scalable cluster of shared-nothing servers. Transactions are defined as Java stored procedures, using ANSI-standard SQL for data access. Parallel single-threaded processing ensures transactional consistency while avoiding the overhead of locking, latching, and resource management seen in traditional database systems.
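The idea of parallel single-threaded processing can be sketched in plain Java. This is a hypothetical illustration, not VoltDB code: the class name PartitionedCounters, its methods, and the use of String.hashCode for routing are all assumptions made for the example. Each partition owns its data and a single worker thread, so transactions on one partition run one at a time without locks, while different partitions run in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is not VoltDB code or its API.
final class PartitionedCounters {
    private final List<ExecutorService> workers = new ArrayList<>();
    // Each map is read and written only by its partition's thread, so no locks are needed.
    private final List<Map<String, Long>> data = new ArrayList<>();

    PartitionedCounters(int partitions) {
        for (int i = 0; i < partitions; i++) {
            workers.add(Executors.newSingleThreadExecutor());
            data.add(new HashMap<>());
        }
    }

    // Assumption: String.hashCode stands in for the real hashing function.
    private int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), workers.size());
    }

    // Queues a transaction on the partition that owns the key; it runs to completion there.
    private Future<Long> enqueue(String key, Callable<Long> txn) {
        return workers.get(partitionFor(key)).submit(txn);
    }

    Future<Long> increment(String key) {
        Map<String, Long> partition = data.get(partitionFor(key));
        return enqueue(key, () -> partition.merge(key, 1L, Long::sum));
    }

    long read(String key) {
        Map<String, Long> partition = data.get(partitionFor(key));
        try {
            return enqueue(key, () -> partition.getOrDefault(key, 0L)).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }

    // Several client threads hammer one key; the final count is exact without any locking.
    static long demo(int partitions, int clients, int callsPerClient) {
        PartitionedCounters db = new PartitionedCounters(partitions);
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < clients; c++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < callsPerClient; i++) {
                    db.increment("hot-key");
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = db.read("hot-key"); // queued after every increment on the same partition
        db.shutdown();
        return total;
    }
}
```

Calling PartitionedCounters.demo(4, 8, 1000) returns 8000: every increment on the key is serialized by the owning partition's thread, so no update is lost even though the clients never take a lock.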

How does VoltDB achieve ACID compliance?

ACID stands for Atomicity, Consistency, Isolation, and Durability — the cornerstones of database transaction processing.

  • Atomicity: VoltDB defines a transaction as a stored procedure, which either succeeds or rolls back on failure.
  • Consistency: VoltDB enforces schema and datatype constraints in all database queries.
  • Isolation: VoltDB transactions are globally ordered and run to completion on all affected partitions without interleaving.
  • Durability: VoltDB provides both replication of partitions (known as K-safety) and periodic database snapshots to ensure the availability of the data.
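The atomicity guarantee can be illustrated with a small Java sketch. It is not how VoltDB implements rollback, which this FAQ does not describe; the class AtomicProcedureRunner and the copy-then-swap approach are assumptions chosen to keep the example short. A procedure runs against a working copy of the state, and the copy replaces the live state only if the procedure finishes without an error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration of all-or-nothing procedures; not VoltDB's mechanism.
final class AtomicProcedureRunner {
    private Map<String, Long> balances = new HashMap<>();

    void put(String account, long amount) {
        balances.put(account, amount);
    }

    long get(String account) {
        return balances.getOrDefault(account, 0L);
    }

    // Runs the procedure on a working copy; the copy becomes live only on success.
    boolean run(Consumer<Map<String, Long>> procedure) {
        Map<String, Long> working = new HashMap<>(balances);
        try {
            procedure.accept(working);
        } catch (RuntimeException e) {
            return false; // rollback: the working copy is simply discarded
        }
        balances = working; // commit
        return true;
    }

    static Consumer<Map<String, Long>> transfer(String from, String to, long amount) {
        return state -> {
            // Credit first, so a later failure shows that partial work is undone.
            state.merge(to, amount, Long::sum);
            long remaining = state.getOrDefault(from, 0L) - amount;
            if (remaining < 0) {
                throw new IllegalStateException("insufficient funds");
            }
            state.put(from, remaining);
        };
    }
}
```

With alice at 100 and bob at 0, running transfer("alice", "bob", 150) returns false and leaves both balances unchanged, even though the credit to bob had already been applied inside the procedure.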

What is VoltDB’s scaling model?

VoltDB automatically partitions frequently accessed database tables across the available cluster nodes. Adding nodes to the cluster increases both the capacity and the performance of the database. When the cluster size changes, VoltDB automatically redistributes the partitions to fit the new configuration when you reload the data. Tables whose data changes infrequently can also be replicated to every node to further optimize performance.

How does VoltDB partition a database?

For partitioned tables, VoltDB distributes the rows across the partitions using a hash scheme. The user identifies, for each partitioned table, which column is used as input to the internal hashing function.
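A minimal Java sketch of hash-based row placement follows. VoltDB's internal hashing function is not described here, so the example substitutes Java's own hashCode; the class HashPartitioner and its method names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of hash partitioning; the real hashing function is internal to VoltDB.
final class HashPartitioner {
    private final int partitionCount;

    HashPartitioner(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // Maps the value of the partitioning column to a partition index.
    // floorMod keeps the result in [0, partitionCount) even for negative hash codes.
    int partitionFor(Object partitioningColumnValue) {
        return Math.floorMod(partitioningColumnValue.hashCode(), partitionCount);
    }

    // Counts how many of the given column values land in each partition.
    int[] distribution(List<?> columnValues) {
        int[] counts = new int[partitionCount];
        for (Object value : columnValues) {
            counts[partitionFor(value)]++;
        }
        return counts;
    }
}
```

Because the partition depends only on the column value, every row with the same value lands in the same partition, which is what lets a transaction that touches a single value run entirely within one partition.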

How are VoltDB partitions different from sharding a traditional database?

In sharding, database partitions are actually separate, unrelated databases, and it is the responsibility of the application code to know which shard contains specific data as well as to manage the complexities of any queries that require data from multiple shards. More importantly, there is no guarantee of data or transactional consistency within the database system. All consistency logic must be written into your application. With VoltDB, the database engine transparently provides partition management, cross-partition data access, and full ACID compliance across the entire database and all partitions.

Another cost of sharding is the complexity of managing the individual database instances. Backup, recovery, and all other management tasks must be performed separately for every node. With VoltDB, these management operations are coordinated centrally.

What are good use cases for VoltDB?

VoltDB is used today for traditional high-performance applications such as capital markets data feeds, financial trading, telco record streams, and sensor-based network systems. It is also used in emerging applications such as wireless, online gaming, fraud detection, digital ad exchanges, and microtransaction systems. Any application requiring high database throughput, linear scaling, and uncompromising data accuracy will benefit from VoltDB.

How does VoltDB differ from MySQL used with memcached?

Memcached is a distributed in-memory cache. It provides none of the reliability or consistency of an ACID-compliant SQL database. Memcached is often used as a cache in front of MySQL to improve performance of read operations. This requires the client application to manage the hash algorithms for both memcached and MySQL, as well as to handle the chores of cache synchronization.

VoltDB automates all of these functions with none of the penalties, while providing far higher performance. In addition, caching can improve read performance for products such as MySQL, but it does not help scale write performance. VoltDB scales linearly for both read and write operations.
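The cache synchronization chores mentioned above can be made concrete with a short Java sketch of the cache-aside pattern. Two in-memory maps stand in for MySQL and memcached; the class CacheAsideStore and its methods are hypothetical and use no real client library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache-aside sketch; plain maps stand in for MySQL and memcached.
final class CacheAsideStore {
    private final Map<String, String> database = new HashMap<>(); // stand-in for MySQL
    private final Map<String, String> cache = new HashMap<>();    // stand-in for memcached
    private int databaseReads = 0;

    // Reads go to the cache first and fall back to the database on a miss.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String value = database.get(key);
        if (value != null) {
            cache.put(key, value);
        }
        return value;
    }

    // Every write still hits the database, and the application must invalidate
    // the cached copy itself, or later reads return stale data.
    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    int databaseReads() {
        return databaseReads;
    }
}
```

The sketch shows both points made above: repeated reads of one key reach the database only once, but each write goes to the database regardless, so the cache does nothing for write throughput.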

How does VoltDB differ from Key-Value stores?

Key-Value stores are a mechanism for storing arbitrary data (i.e., values) based on individual keys. Distributing Key-Value stores is simple, since there is only one key. However, there is no structure within the data store and no transactional reliability provided by the system.

VoltDB provides the ability to store either structured or unstructured data, while at the same time providing full transactional consistency and reliability. VoltDB can even define a transaction that includes reads and writes across multiple keys. Finally, VoltDB provides comparable or better performance in terms of throughput.

What server platforms does VoltDB support?

The target platform for VoltDB is CentOS V5.4. However, the product is designed for compatibility with any 64-bit POSIX-compliant Linux platform. Development kits are also available for Ubuntu (9.04 and later) and Macintosh (OS X 10.5 and 10.6). Source code and build scripts are available for those who want to build the software on other platforms.

What SQL does VoltDB support?

For Version 1, VoltDB supports a basic subset of ANSI-standard SQL, including the CREATE INDEX, CREATE TABLE, and CREATE VIEW statements for schema definition and SELECT, INSERT, UPDATE, and DELETE for data manipulation. Additional SQL syntax will be added over time as the needs of users and customers dictate.

See the Using VoltDB manual for details on the specific SQL syntax that the current version of VoltDB supports.

Why doesn’t VoltDB support JDBC/ODBC?

JDBC and ODBC are conversational interfaces that require a significant number of network interactions to service database requests. This is fine for low to average database traffic, but it is poorly suited to high-velocity data sources and massive database throughput.

The primary interaction with a VoltDB database is through stored procedures, and the stored procedure interface (callProcedure) is easy to understand and interpret for those who are familiar with procedure calls through ODBC.

How do you increase the size of a VoltDB database cluster?

The size of the database cluster is defined when you compile the application catalog and start the database. To increase the cluster size, you simply need to:

  1. Save the current database contents to disk (using the @SnapshotSave system procedure).
  2. Edit the deployment file, specifying the increased number of cluster nodes (in the hostcount attribute).
  3. Restart the database cluster, using the new deployment file.
  4. Reload the data from disk (using the @SnapshotRestore system procedure).
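The effect of the four steps above can be simulated with a toy in-memory model in Java. Nothing here calls VoltDB: ToyCluster, snapshotSave, and snapshotRestore are hypothetical stand-ins for the cluster and its snapshot procedures, and the model assumes one partition per node and simple modulo placement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of save, resize, restart, reload. Not VoltDB code; assumes one partition per node.
final class ToyCluster {
    private final List<Map<Integer, String>> partitions = new ArrayList<>();

    ToyCluster(int hostCount) {
        for (int i = 0; i < hostCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    void insert(int key, String value) {
        partitions.get(Math.floorMod(key, partitions.size())).put(key, value);
    }

    // Step 1: copy every row out of the running cluster.
    Map<Integer, String> snapshotSave() {
        Map<Integer, String> snapshot = new HashMap<>();
        for (Map<Integer, String> p : partitions) {
            snapshot.putAll(p);
        }
        return snapshot;
    }

    // Step 4: reloading re-hashes each row against the new partition count.
    void snapshotRestore(Map<Integer, String> snapshot) {
        for (Map.Entry<Integer, String> row : snapshot.entrySet()) {
            insert(row.getKey(), row.getValue());
        }
    }

    int rowsIn(int partition) {
        return partitions.get(partition).size();
    }

    int rowCount() {
        int total = 0;
        for (Map<Integer, String> p : partitions) {
            total += p.size();
        }
        return total;
    }

    // Steps 2 and 3: a new cluster is started with the larger host count.
    static ToyCluster resize(ToyCluster old, int newHostCount) {
        Map<Integer, String> snapshot = old.snapshotSave();
        ToyCluster bigger = new ToyCluster(newHostCount);
        bigger.snapshotRestore(snapshot);
        return bigger;
    }
}
```

Loading keys 0 through 11 into a two-node ToyCluster puts six rows on each node; after resize to three nodes, the same twelve rows are spread four per node. The redistribution happens as a side effect of the reload, which mirrors why the procedure above ends with a restore.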

What is the maximum cluster size that VoltDB supports?

There is no architectural limit to the number of nodes in a VoltDB cluster. That said, people often think of performance as a comparative value proportional to cluster size. And although it is true that VoltDB’s throughput scales linearly, it is also true that VoltDB’s initial performance on a single node is 30 to 50 times greater than any comparable database product. As a consequence, it is possible to achieve throughput rates of over a million transactions per second on a cluster with as few as 12 nodes.

VoltDB is regularly tested on and tuned for clusters of 6 to 12 nodes and has scaled linearly to 3.4 million TPS on 30 nodes. If you are considering running a large VoltDB cluster, please contact us – we’d love to help users design high-scaling VoltDB infrastructures.
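The figures above can be checked with a little arithmetic, sketched here in Java. The only input taken from the text is the 3.4 million TPS on 30 nodes; treating scaling as perfectly linear from that single data point is an assumption, and the class LinearScalingEstimate is hypothetical. It is a back-of-the-envelope estimate, not a benchmark.

```java
// Back-of-the-envelope estimate assuming perfectly linear scaling; not a benchmark.
final class LinearScalingEstimate {
    // Average throughput contributed by each node.
    static double perNodeTps(double clusterTps, int nodes) {
        return clusterTps / nodes;
    }

    // Projected cluster throughput for a given node count.
    static double estimateTps(double perNodeTps, int nodes) {
        return perNodeTps * nodes;
    }

    // Smallest node count whose projected throughput reaches the target.
    static int nodesNeeded(double targetTps, double perNodeTps) {
        return (int) Math.ceil(targetTps / perNodeTps);
    }
}
```

Under that assumption, 3.4 million TPS on 30 nodes is roughly 113,000 TPS per node. Twelve nodes would then deliver about 1.36 million TPS, and nine nodes would be enough to pass one million, which is consistent with the claim of over a million transactions per second on as few as 12 nodes.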

Does VoltDB come with management tools?

The VoltDB Enterprise Edition includes a browser-based management tool called the VoltDB Enterprise Manager. The VoltDB Enterprise Manager helps you deploy and control a VoltDB database in a cluster environment. You can start and stop the database, update the schema and stored procedures, and manage disk-based snapshots of the data from a single console interface. See the VoltDB Management Guide for details.

What database monitoring tools are available?

The VoltDB Enterprise Manager provides performance and activity monitoring, in addition to its management and control functionality. The browser-based console interface provides real-time statistics on the number of records in each partition, as well as measurements of throughput and latency. For those using the Ganglia monitoring tool, the VoltDB Enterprise Manager also exports performance data to Ganglia automatically.

For those using the VoltDB Community Edition, there are system procedures that can provide similar information through the callable interface, such as @Statistics and @SystemInformation.
