Discuss the impact of big data on databases and database design (Hadoop). Give examples of applications.
Databases are the traditional way to handle data. Traditional databases such as a Relational Database Management System (RDBMS) perform very well for transactional updates on small data sets. These days data sizes have grown into the range of petabytes (1 PB = 1024 TB), and it is challenging for an RDBMS to handle such huge volumes. RDBMS vendors try to keep up by adding more central processing units (CPUs) and more memory to the database server, but scaling up a single system in this way only goes so far. The big data problem is usually described by three V's:
Volume: storing large amounts of data cost-effectively.
Velocity: ingesting data at high rates.
Variety: data of different types and structures. Most of the data arrives in semi-structured or unstructured form from social media, audio, video and email.
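The variety problem can be made concrete with a small sketch. The three records below are invented for illustration (the field names are assumptions, not taken from any real system). Each has a different shape, so forcing them into one fixed relational table would leave many cells empty:

```python
import json

# Hypothetical semi-structured records; every field name here is made up.
raw_records = [
    '{"type": "tweet", "user": "a", "text": "hello", "hashtags": ["big", "data"]}',
    '{"type": "email", "from": "b@example.com", "subject": "report", "attachments": 2}',
    '{"type": "video", "user": "c", "duration_s": 94}',
]

records = [json.loads(line) for line in raw_records]

# A single fixed table would need one column per field that ever appears.
all_fields = sorted({field for record in records for field in record})

# Count how many cells of that wide table would have to be NULL.
null_cells = sum(len(all_fields) - len(record) for record in records)
total_cells = len(all_fields) * len(records)

print(f"{null_cells} of {total_cells} cells would be empty")
```

With only three records, more than half of the table is already empty, which hints at why schema-flexible stores are attractive for this kind of data.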
Relational databases and data warehouses are too slow and too expensive to work in such an environment.
Big data is generated at a very high velocity, whereas an RDBMS is designed for steady data retention. Hence it is too expensive for an RDBMS to handle such data.
Example:
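NoSQL and Hadoop refer to cloud-friendly platforms that are distributed across many machines, potentially all over the world, to handle semi-structured data. They do not rely on the traditional relational approach of SQL over fixed schemas, as used by MySQL and similar systems. Hadoop is an open-source framework (strictly speaking a distributed storage and processing platform rather than a database) that stores and processes big data across several distributed nodes.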
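Hadoop's classic processing model is MapReduce. The sketch below imitates its three stages (map, shuffle, reduce) in a single Python process to show the idea. It is not Hadoop's API, and the function names and input strings are assumptions made for illustration; each string stands in for a block of data held on a different node.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Hypothetical input: one string per node-local block.
chunks = ["big data needs big storage", "data arrives fast", "storage is cheap"]

# In a real cluster each chunk would be mapped on the node that stores it.
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))

print(word_counts)
```

Because each map call looks only at its own chunk, the map work can run in parallel on many machines, and only the grouped intermediate pairs need to travel over the network.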
Google and a few other companies run search engines. They have to process each query within milliseconds to produce search results, while dealing with huge amounts of data from millions of users. Hence, to deal with this type of unstructured data, they developed distributed databases spread across multiple servers all over the world.
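One common way to spread data over multiple servers is hash partitioning: a stable hash of the key decides which server owns it, so a lookup only has to contact one machine. The following is a minimal sketch under that assumption. The server names, the in-memory dictionaries and the sample index entries are all hypothetical, and it does not describe how any particular search engine is built.

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical node names

def server_for(key):
    """Pick the owning server from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

# One in-memory dict per server stands in for that server's local store.
stores = {name: {} for name in SERVERS}

def put(key, value):
    """Write the value to the server that owns the key."""
    stores[server_for(key)][key] = value

def get(key):
    """Read from the owning server only; the other servers are not contacted."""
    return stores[server_for(key)].get(key)

# Made-up index entries: search term -> pages containing it.
for term, pages in [("hadoop", ["p1", "p7"]), ("nosql", ["p2"]), ("rdbms", ["p3", "p4"])]:
    put(term, pages)

print(get("hadoop"), "stored on", server_for("hadoop"))
```

Simple modulo hashing like this moves most keys when a server is added or removed; production systems typically use more elaborate schemes to limit that reshuffling.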