What do you get when you take a SQL database and start a new implementation from scratch, taking advantage of the latest research and modern hardware? Mike Stonebraker, the sword wielding Johnny Appleseed of the database world, hopes you get something like his new database, VoltDB: a pure SQL, pure ACID, pure OLTP, shared nothing, sharded, scalable, lockless, open source, in-memory DBMS, purpose-built for running hundreds of thousands of transactions a second. VoltDB claims to be 100 times faster than MySQL, up to 13 times faster than Cassandra, and 45 times faster than Oracle, with near-linear scaling.
Will VoltDB kill off the new NoSQL upstarts? Will VoltDB cause a mass extinction of ancient databases? Probably no and no to both questions, but it’s a product with a definite point-of-view and is worth a look as the transaction component in your system. But will it be right for you? Let’s see…
I first heard the details about VoltDB at Gluecon, where Mr. Stonebraker presented his highly entertaining Urban Myths About SQL (slides) talk. The hallways were buzzing long afterwards debating his in-your-face challenge of the existing order. With a refreshing take no prisoners style he lambastes existing SQL products as not good enough, saying they should either get better or die. NoSQL products fair no better as they simply aren’t necessary, their whole existence based on a perception of the weakness of current SQL implementations rather than what SQL could be, if properly implemented.
As an argument against NoSQL, VoltDB simply asks: if you can get a relational database with all the scalable ACIDy goodness, why would you ever stoop to using a NoSQL database that might only ever be eventually reliable?
The attack against existing relational databases systems is to position them as legacy systems that suffer from archaic architectures that are slow by design. They have to deal with disks, threads, and other performance killing constructs that must not be so much evolved as transcended. You see, VoltDB isn’t just competing against NoSQL, it’s aiming squarely at existing relational database vendors by using the patented technological leap play.
It’s a bold two prong strategy, but will the compromises that are part of the reconceptualization of SQL engine architectures prove too limiting for the majority?
The bulk of the rest of this article is about the SQL Myths, but I think touching a bit on Volt’s architecture before we address the myths will help frame the discussion a little better.
John Hugg, from VoltDB Engineering, says:
VoltDB is designed to solve OLTP at internet scale. If you don’t need the scale of VoltDB, then of course you’re going to be much happier with a general system like Postgres that offers so many features and is compatible with so much.
Some other quotes indicating the spirit behind VoltDB:
VoltDB is designed to make difficult or impossible problems manageable.
When we set out to build VoltDB, we figured it wasn’t worth the tradeoffs unless it’s MUCH faster. So it is.
John is not kidding. What matters to VoltDB is: speed at scale, speed at scale, speed at scale, SQL, and ACID. If that matches your priorities then you’ll probably be happy. Otherwise, as you’ll see, everything is sacrificed for speed at scale and what is sacrificed is often ease of use, generality, and error checking. It’s likely we’ll see ease of use improve over time, but for now it looks like rough going, unless of course, you are a going for speed at scale.
Some of the more interesting features are:
- Main-memory storage. VoltDB stores all its data in RAM. This means there are no buffer pools to manage, so that source of overhead is removed, and there are no blocking disk stalls.
- Run transactions to completion –single threaded –in timestamp order. Based on the model that 200 record updates is a hefty transaction, you might as well run them to completion. By single threading all locking overhead is removed. On multi-core systems they allocate chunks of memory to each CPU and run each CPU single threaded.
- Replicas. Persistence is guaranteed by having the data reside in multiple main memories. There are no logs, so there are no disks, which removes the disk overhead. For high availability an active-active architecture is used. Each transaction is run twice, once on each node in the same time-stamp order, which means the replicas are ACID consistent. Data can be asynchronously shipped to disk for a 5% performance hit. VoltDB replication is not a master/slave or primary/backup replication system. Each replica is a first class, fully capable instance of a partition.
- Tables are partitioned across multiple servers. Partitions are defined in a project XML configuration file that defines the partition keys. Clients can connect through any node. Partitioning is automatic, but you have to decide on the sharding key. They are working on an automated system to give advice on which key to use. If new nodes are added then the data must reloaded to cause the new nodes to be used, the data is not automatically distributed to the new nodes. Not all tables need to be partitioned. Small, mostly read-only tables can be replicated across all of the partitions of a VoltDB database.
- Stored procedures, written in Java, are the unit of transaction. Only data changed within a single invocation of a stored procedure is in a transaction, transactions can’t span multiple rounds of communication with a client. Also, from the same source as the previous link, The vast majority (almost 100%) of your stored procedure invocations must be single partition for VoltDB to be useful to you. If, for example, you partitioned by graph node, updating multiple nodes in a single transaction/procedure will not be a single partition transaction. These and other restrictions show the tradeoff between speed and generality.
- A limited subset of SQL ’99 is supported. DDL operations like ALTER and DROP aren’t supported. Operations having to do with users, groups and security have been moved into XML configuration files. Updating table structure on the fly is convenient, but it’s not fast, so it’s out. You are also discouraged from doing SUM operations because it would take a long time and block other transactions. Single threading means you must quantize your work into small enough chunks that don’t stall the work pipeline. The goal is to have transactions run in under 50 milliseconds. This is all done for speed.
- Design a schema and workflow to use single-sited procedures. Data for a table is stored in a partition that is split onto different nodes. Each set of data on a node is called a slice. When a query can run on a single node it it is said to be single-sited. Performance is clearly best when a query can run on just one node against a limited set of data.
- Challenging operations model. Changing the database schema or reconfiguring the cluster hardware requires first saving and shutting down the database. An exception are stored procedures which can be updated on the fly. In general, choosing speed as the primary design point has made the development and deployment process complicated and limiting. VoltDB, for example, does not support bringing a node back into the cluster while the database is running. All clients must be stopped, the database must be snapshotted, the database must be restarted in a special mode, the data is reloaded, and then clients can be restarted. See Using VoltDB for more details.
- No WAN support. In the case of a network partition VoltDB chooses consistency over availability, so you will see a hiccup until connectivity can be restored. Out of all the possible failures, Mr. Stonebraker argues, network partitioning is one of the least likely failures, especially compared to programmer error, so choosing strong consistency over availability is the right engineering call. Future versions of VoltDB will do more to address this single-data-center catastrophe scenario.
- OLAP is purposefully kept separate. VoltDB is only for OLTP. It’s not for reporting or OLAP because those uses require locks be taken which destroys performance. Every update is spooled to a (optional) companion warehouse system that is a second or two behind the main transaction system (yet is perfectly consistent). The companion system could be something like Vertica, an analytics RDBMS also built by Mr. Stonebraker. The justification for this split of responsibilities is that one size does not fit all. You should run transactions on something that is good at transactions and run reporting on something that’s good at reporting. An specialized transaction architecture will run circles around (50 times to 100 times faster) a one size fits all solution.
VoltDB is different because it has consciously been architected to remove what research shows are the four common sources of overhead in database management systems: logging (19%), latching (19%), locking (17%), B-tree, and buffer management operations (35%). By removing all removing all four sources overhead VoltDB can be really fast while still being ACID. How fast?
VoltDB claims to be 100 times faster than MySQL, up to 13 times faster than Cassandra, and 45 times faster than Oracle, with near-linear scaling. Though I think linear scaling only applies when you are not using distributed transactions. Two-phase transactions with an in-memory database will be relatively fast, but they will still be slow given the protocol overhead.
Keep in mind VoltDB is in-memory, so it should be fast, that should be a given. But VoltDB says they have gone beyond what other in-memory databases have done, they haven’t just improved buffer management. By removing locks, latching, and threading overhead it’s that much faster than other in-memory databases. You could argue that it’s a waste of RAM, that only hot data should be kept in RAM, but the contention is that RAM can hold the entire data set, so there’s no reason to compromise anymore.
The performance comparison against databases like Cassandra is somewhat of a strawman as they are designed for a much different purpose. Cassandra can store petabytes of data, across hundreds of nodes, across multiple data centers, and new nodes can be added at will. Operationally there’s no comparison. Though I realize the purpose of the benchmarks is to show SQL is not natively slow, can work well for key-value usage patterns, and compares favorably with other industry leaders.
I really like the parallelism of the origin of relational theory with the origin of VoltDB’s architecture. Relational theory was invented to remove update anomalies that occur when storing duplicate data. When data is duplicated, either within a record or in different tables, then it’s easy to cause inconsistencies when performing updates, deletes, and adds. The normalization process makes it much harder to have inconsistencies because facts are stored once and only once. It may be a stretch, but I think of the process of creating a SQL engine architecture based on removing performance anomalies as fascinatingly analogous.