Wednesday, 19 March 2014

Why and what are Next Generation Databases?

Currently, organizations rely on relational databases to store their structured data. But we are entering the big data era, in which we have to store and handle huge amounts of unstructured data, and relational databases are not sufficient for that. Relational databases revolutionized computing, but they lack some characteristics that matter in the big data era.

When it comes to unstructured data, the lack of structure makes working with it a time-, money- and energy-consuming task.

As per Wikipedia:
·      In 1998, Merrill Lynch quoted a rule of thumb that somewhere around 80-90% of all potentially usable business information may originate in unstructured form.

·      More recently, multiple analysts have estimated that data will grow 800% over the next five years.

·      Computer World states that unstructured information might account for more than 70%–80% of all data in organizations.

Next Generation Databases (also called NoSQL databases) address many of the problems above. They represent a completely new way of thinking about databases.

Next Generation Databases have some important benefits: they are schema-less, fast, agile, distributed, open source and horizontally scalable, and they can work with non-relational, distributed and unstructured data. Nowadays many organizations collect unstructured data from different sources, including emails of different types; corporate contracts with multiple vendors, employees, customers and more; human resource files; medical records; financial reports; and corporate memos.

Relational databases give you too much. They force you to twist your data to fit into an RDBMS, whereas NoSQL-based alternatives "just give you what you need."

How did it start?
The movement was started by Web and Java developers who wanted to build their own data storage solutions, emulating those built by Google Inc. and Amazon.com Inc. They began releasing these products as open source. Now these open source data stores manage hundreds of terabytes or even petabytes of data for Web 2.0 and cloud computing vendors.

Facebook, for instance, created its Cassandra data store to power a new search feature on its Web site rather than use its existing database, MySQL. According to a presentation by Facebook engineer Avinash Lakshman (PDF document), Cassandra can write to a data store taking up 50GB on disk in just 0.12 milliseconds, more than 2,500 times faster than MySQL.

There are different types of NoSQL databases and each focuses on different applications:
·         Document databases
·         Graph store Databases
·         Key-value Databases
·         Wide-column stores
·         Multi-model Databases
·         Object Databases
·         Grid & Cloud Database Solutions
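
To make the first few categories more concrete, here is a minimal sketch that uses plain Python data structures instead of any real database client. All names and sample records (kv_store, doc_store, wide_column, graph_edges and the two helper functions) are invented for illustration; real products add persistence, indexing and distribution on top of shapes like these.

```python
# Illustrative only: plain Python structures standing in for four NoSQL data models.
# Every name and sample record below is hypothetical.

# Key-value: an opaque value looked up by a single key.
kv_store = {"session:42": b"serialized-session-bytes"}

# Document: self-describing records; two documents need not share the same fields.
doc_store = {
    "user:1": {"name": "Ada", "emails": ["ada@example.com"]},
    "user:2": {"name": "Bob", "address": {"city": "Pune"}},
}

# Wide-column: row key -> column family -> column -> value.
# Different rows may carry different columns.
wide_column = {
    "user:1": {"profile": {"name": "Ada"}, "activity": {"last_login": "2014-03-19"}},
    "user:2": {"profile": {"name": "Bob"}},
}

# Graph: nodes connected by labeled edges (source, label, destination).
graph_edges = [("user:1", "FOLLOWS", "user:2")]


def followers_of(node, edges):
    """Return the nodes that have a FOLLOWS edge pointing at `node`."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node]


def fields_used(docs):
    """Return the sorted field names of each document, showing the absence of a fixed schema."""
    return {key: sorted(doc) for key, doc in docs.items()}
```

Calling fields_used(doc_store) shows that the two user documents carry different fields, which is what "schema-less" means in practice, while followers_of("user:2", graph_edges) walks the relationship directly instead of joining tables.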

The number of available NoSQL databases is growing rapidly and currently there are, as this website shows, over 150 of them.

Today’s world is changing rapidly, and new tools are constantly being developed that take full advantage of the new data opportunities and can deal with vast amounts of unstructured data. Researchers come up with new ways of managing data to satisfy special requirements: either the need to handle data relationships that don't fit the relational model, or requirements of volume or speed that demand data processing be done on distributed collections of servers instead of a central database server.
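
As a rough illustration of spreading data across a collection of servers, the sketch below assigns each key to one of several servers by hashing it. This is a simplified assumption for illustration and not how any particular product works; the function names shard_for and distribute are invented, and production systems often use more elaborate schemes such as consistent hashing so that adding a server moves fewer keys.

```python
import hashlib


def shard_for(key: str, num_servers: int) -> int:
    """Map a key to a server index in [0, num_servers) using a stable hash."""
    # md5 is used only as a stable, well-distributed hash, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def distribute(keys, num_servers):
    """Group keys by the server index that shard_for assigns to them."""
    shards = {i: [] for i in range(num_servers)}
    for key in keys:
        shards[shard_for(key, num_servers)].append(key)
    return shards


if __name__ == "__main__":
    # Hypothetical keys spread over four servers.
    placement = distribute([f"user:{i}" for i in range(20)], 4)
    for server, keys in placement.items():
        print(server, len(keys))
```

Because the hash depends only on the key, any client can compute where a record lives without asking a central server, which is the basic idea behind horizontal scaling.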

Even though these advanced technologies do a great job of solving the specialized problems they were designed for, relational databases are still a good general-purpose solution for most business needs. SQL isn't going away.

The big question is: will enterprises take open-source alternatives seriously?

A few interesting articles on Next Generation Databases:
