SQL or NoSQL?

Dave Kellogg:

We have two major forces disrupting the comfortable stasis that has developed over the past 30 years.

One force is DBMS specialization: While the general-purpose RDBMS is useful for a broad range of applications, it is optimal for few of them.  The RDBMS has slowly become expensive bloatware that is functionally a jack of all trades, master of none.  MIT’s Michael Stonebraker calls the RDBMS a one size fits all solution.

Table-oriented, 1960s-era database technology:  RDBMSs were designed for handling data and short-text fields, necessitate mapping programmatic objects to tables (i.e., the impedance mismatch), and require the use of an increasingly stone-age query language, SQL.

Scalability:  relational databases were not designed to handle and do not generally cope well with Internet-scale, “big data” applications.  Most of the big Internet companies (e.g., Google, Yahoo, Facebook) do not rely on RDBMS technology for this reason.
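To make the impedance mismatch mentioned above concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Order` class, the `orders` and `order_items` tables, and the plain dict standing in for a document-style store. It does not use the API of Cassandra or any other NoSQL product. It only shows that one in-memory object has to be split across two tables and reassembled on the relational side, while a document-style store can keep it in one piece.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Order:
    """Hypothetical application object with a nested list."""
    order_id: int
    customer: str
    items: List[str] = field(default_factory=list)


def make_connection() -> sqlite3.Connection:
    """Create an in-memory SQLite database with the two tables the object needs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE order_items (order_id INTEGER, position INTEGER, item TEXT)"
    )
    return conn


def save_relational(conn: sqlite3.Connection, order: Order) -> None:
    """Split one object into a parent row and one child row per item."""
    conn.execute(
        "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
        (order.order_id, order.customer),
    )
    conn.executemany(
        "INSERT INTO order_items (order_id, position, item) VALUES (?, ?, ?)",
        [(order.order_id, i, item) for i, item in enumerate(order.items)],
    )


def load_relational(conn: sqlite3.Connection, order_id: int) -> Order:
    """Reassemble the object from two queries."""
    row = conn.execute(
        "SELECT customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    items = [
        r[0]
        for r in conn.execute(
            "SELECT item FROM order_items WHERE order_id = ? ORDER BY position",
            (order_id,),
        )
    ]
    return Order(order_id, row[0], items)


def save_document(store: Dict[int, str], order: Order) -> None:
    """Store the whole object as one JSON document under its key."""
    store[order.order_id] = json.dumps(asdict(order))


def load_document(store: Dict[int, str], order_id: int) -> Order:
    """Read the object back from a single lookup."""
    return Order(**json.loads(store[order_id]))
```

The relational path needs a schema, two inserts, and two queries for a single object; the document path needs one write and one read. What the sketch leaves out is what the relational side gives in return, such as ad hoc queries across orders and enforced constraints.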

The other force is NoSQL, an organic and rapidly-growing industry movement away from relational databases, driven by a number of factors including both technology and cost.

Facebook explained the rationale for developing the NoSQL storage system Cassandra:

Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure. Reliability at massive scale is a very big challenge. Outages in the service can have significant negative impact. Hence Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different datacenters). At this scale, small and large components fail continuously; the way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. Cassandra has achieved several goals – scalability, high performance, high availability and applicability. In many ways Cassandra resembles a database and shares many design and implementation strategies with databases. Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format.

Not to be outdone, Twitter and Digg have also moved to Cassandra, which Facebook released as open source.

SQL or NoSQL? That question is similar to “Open source or proprietary software?” Dave’s answer is simple:

Take a breath.  Look at all your alternatives.  Study total costs and technology applicability.  And make your best decision.

Update:  The H has an interesting speed guide to NoSQL.

Update 2: Read Linux Journal’s post “SQL vs. NoSQL”:

The rivalry between SQL and NoSQL has been building during the past year to the point where some people are predicting the end of the SQL era. Actually, the two camps are largely complementary, because they’re designed to solve different problems.
