Most organizations currently use relational databases to store their structured
data. But we are entering the big data era, in which huge amounts of
unstructured data must be stored and handled, and relational databases alone
are not sufficient for that. Relational databases revolutionized computing, but
they lack some characteristics that the big data era demands.
When you talk about unstructured data, the lack of structure makes processing
it a time-, money- and energy-consuming task.
As per Wikipedia:
· In 1998, Merrill Lynch cited a rule of thumb that somewhere around 80-90% of
all potentially usable business information may originate in unstructured form.
· More recently, multiple analysts have estimated that data will grow 800% over
the next five years.
· Computerworld states that unstructured information might account for more
than 70%–80% of all data in organizations.
Next Generation Databases (also called NoSQL databases) are an answer to many
of the above problems. They represent a completely new way of thinking about
databases.
Next Generation Databases have some important benefits: they are schema-less,
fast, agile, distributed, open source and horizontally scalable, and they can
work with non-relational, distributed and unstructured data. Nowadays many
organizations collect unstructured data from different sources, including email
of different types; corporate contracts with multiple vendors, employees,
customers and more; human resource files; medical records; financial reports;
and corporate memos.
Relational databases give you too much. They force you to twist your data to
fit into an RDBMS, whereas NoSQL-based alternatives "just give you what you
need."
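To make the "twist your data" point concrete, here is a minimal sketch in Python. It uses plain tuples and dicts as stand-ins for table rows and schema-less documents; no real database API is involved, and the field names and the find helper are hypothetical, chosen only for illustration.

```python
# Illustrative only: plain Python containers stand in for a relational
# table and a schema-less document store. All names are hypothetical.

# Relational style: every row must fit the same fixed columns, so fields
# that do not apply are padded with None, and extra fields have no home.
COLUMNS = ("id", "kind", "sender", "vendor", "body")
rows = [
    (1, "email", "alice@example.com", None, "Quarterly numbers attached"),
    (2, "contract", None, "Acme Corp", "Supply agreement"),
]

# Schema-less style: each record carries only the fields it needs,
# and new fields can appear without altering any table definition.
documents = [
    {"id": 1, "kind": "email", "sender": "alice@example.com",
     "body": "Quarterly numbers attached", "attachments": ["q3.xlsx"]},
    {"id": 2, "kind": "contract", "vendor": "Acme Corp",
     "body": "Supply agreement", "signed": True},
]


def find(docs, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [d for d in docs
            if all(d.get(key) == value for key, value in criteria.items())]


emails = find(documents, kind="email")
signed_contracts = find(documents, kind="contract", signed=True)
```

The trade-off is that the application, not the database, now has to cope with records that are missing a field, which is why find uses d.get rather than direct indexing.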
How did it start?
The movement was started by Web and Java developers who wanted to build their
own data storage solutions, emulating those being built by Google Inc. and
Amazon.com Inc. They began releasing these products as open source. These
open-source data stores now manage hundreds of terabytes or even petabytes of
data for Web 2.0 and cloud computing vendors.
Facebook, for instance, created its Cassandra data store to power a new search
feature on its Web site rather than use its existing database, MySQL. According
to a presentation by Facebook engineer Avinash Lakshman (PDF document),
Cassandra can write to a data store taking up 50GB on disk in just 0.12
milliseconds, more than 2,500 times faster than MySQL.
There are different types of NoSQL databases, each focused on different
applications:
· Document databases
· Graph databases
· Key-value databases
· Wide-column stores
· Multi-model databases
· Object databases
· Grid & cloud database solutions
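As a rough illustration of how four of these data models differ, the sketch below expresses the same hypothetical customer data in key-value, document, wide-column and graph form. Plain Python containers stand in for the stores; the keys, field names and the neighbours helper are assumptions made for this example and do not reflect any particular product's API.

```python
import json

# Hypothetical customer data expressed in four NoSQL data models.
# Plain Python containers stand in for the actual stores.

# Key-value: an opaque value looked up by a single key. The store does
# not interpret the value, so it is kept here as a JSON string.
key_value = {"customer:42": json.dumps({"name": "Ada", "city": "London"})}

# Document: a nested, self-describing record that can be queried by field.
document = {
    "_id": 42,
    "name": "Ada",
    "city": "London",
    "orders": [{"sku": "A1", "qty": 2}],
}

# Wide-column: row key -> column family -> columns. Different rows may
# hold different columns within the same family.
wide_column = {
    "42": {
        "profile": {"name": "Ada", "city": "London"},
        "orders": {"2024-01-05": "A1 x2"},
    },
}

# Graph: nodes plus labelled edges; relationships are first-class data.
nodes = {"ada": {"type": "customer"}, "a1": {"type": "product"}}
edges = [("ada", "BOUGHT", "a1")]


def neighbours(node, label):
    """Follow edges with the given label that start at the given node."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]


name_from_kv = json.loads(key_value["customer:42"])["name"]
bought = neighbours("ada", "BOUGHT")
```

Which model fits best depends on the access pattern: lookups by a single key favour key-value stores, while questions about relationships favour graph databases.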
The number of available NoSQL databases is growing rapidly and currently there
are, as this website shows, over 150 of them.
Today’s world is changing rapidly, and new tools are constantly being developed
that take full advantage of the new data opportunities and are capable of
dealing with vast amounts of unstructured data. Researchers come up with new
ways of managing data to satisfy special requirements: either the requirement
to handle data relationships that don't fit into the relational model, or
requirements of high volume or speed that demand data processing be done on
distributed collections of servers instead of central database servers.
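The idea of spreading data over a collection of servers can be sketched with simple hash partitioning. This is a minimal illustration, not how any specific NoSQL product works: the server names, the server_for function and the ShardedStore class are hypothetical, and in-memory dicts stand in for the machines.

```python
import hashlib

SERVERS = ["node-a", "node-b", "node-c"]  # hypothetical server names


def server_for(key, servers=SERVERS):
    """Pick a server for a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto a server.
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


class ShardedStore:
    """Toy key-value store whose data is partitioned across several shards."""

    def __init__(self, servers=SERVERS):
        self.servers = list(servers)
        # One in-memory dict per server stands in for a remote machine.
        self.shards = {name: {} for name in self.servers}

    def put(self, key, value):
        self.shards[server_for(key, self.servers)][key] = value

    def get(self, key):
        return self.shards[server_for(key, self.servers)].get(key)


store = ShardedStore()
for i in range(100):
    store.put(f"user:{i}", {"id": i})
```

Each read or write touches only the one shard that owns the key, so capacity grows by adding servers. The modulo scheme shown here remaps most keys when the server list changes; distributed stores commonly use consistent hashing or a similar technique to limit that reshuffling.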
Even though these advanced technologies do great things to solve the
specialized problems they were designed for, relational databases are still a
good general-purpose solution for most business needs. SQL isn't going away.
The big question is: will enterprises take open-source alternatives seriously?
A few interesting articles on Next Generation Databases: