The data lake is a hot topic in big data nowadays. Several major vendors already have their own offerings, such as AWS Lake Formation and Databricks Delta Lake. Delta Lake is a table format that sits between Spark and the storage platform (HDFS, S3, Azure, etc.). Lake Formation is positioned as a middle layer between S3 and EMR. However, many people are still confused about what a data lake actually is, because each company defines it in its own way.
Many Hadoop vendors even consider their Hadoop product to be a data lake, because it already solves the storage and computation problems for structured, semi-structured, and unstructured data. Cloud companies (AWS, for example) instead think of a data lake as a method or tool for managing data that is already stored on the cloud platform.
So the data lake still lacks a standard definition.
When we talk about the data lake, it is natural to compare it with the traditional data warehouse. Some people may wonder why we need a data lake when data warehouses are already so widely used.
In general, there are some differences between a data lake and a data warehouse.
| Item | Data lake | Data warehouse |
|---|---|---|
| Computing | Strong; supports transforming and computing data in all formats | Weak; computes on structured data only |
| Data model | Better, more diverse | Normal, simple |
In fact, the data lake is a revolutionary step beyond the data warehouse. A traditional data warehouse puts more emphasis on formatting data and normalizing it to fit a predefined model, whereas a data lake emphasizes inclusiveness: the ability to hold all kinds of data.
Hence a data lake is quicker and more flexible than a data warehouse at landing data and at adapting to business changes. Judged by the data lake concept alone, Hadoop may well meet all the requirements of a data lake. However, for a general-purpose platform, storage is just one aspect; it also needs many other capabilities, such as data governance and data discovery, to improve the user experience. So the data lake has the advantage over the data warehouse.
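The difference in data landing described above can be illustrated with a small sketch. This is not how any particular product works; it is a minimal, standard-library-only Python example that assumes a hypothetical two-column warehouse schema (`user_id`, `amount`) and three made-up input records. The warehouse-style loader checks the schema when data is written and rejects anything that does not fit, while the lake-style loader stores every raw line and applies structure only when the data is read.

```python
import json

# Hypothetical fixed schema for the warehouse-style table (an assumption for illustration).
WAREHOUSE_SCHEMA = {"user_id": int, "amount": float}


def load_into_warehouse(records):
    """Schema-on-write: accept only records that match the fixed schema exactly."""
    accepted, rejected = [], []
    for rec in records:
        ok = (
            isinstance(rec, dict)
            and set(rec) == set(WAREHOUSE_SCHEMA)
            and all(isinstance(rec[k], t) for k, t in WAREHOUSE_SCHEMA.items())
        )
        (accepted if ok else rejected).append(rec)
    return accepted, rejected


def land_in_lake(raw_lines):
    """Lake-style landing: keep every raw line untouched, whatever its format."""
    return list(raw_lines)


def read_from_lake(raw_lines, field):
    """Schema-on-read: parse at query time and collect one field where it exists."""
    values = []
    for line in raw_lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # unstructured text stays stored but is skipped by this query
        if isinstance(rec, dict) and field in rec:
            values.append(rec[field])
    return values


# Made-up input: structured, semi-structured (extra field), and unstructured data.
raw = [
    '{"user_id": 1, "amount": 9.5}',
    '{"user_id": 2, "amount": 3.0, "coupon": "SPRING"}',
    "free-form log line: user 3 clicked checkout",
]

# The warehouse path has to parse before loading; unparseable lines stay as strings.
parsed = []
for line in raw:
    try:
        parsed.append(json.loads(line))
    except json.JSONDecodeError:
        parsed.append(line)

accepted, rejected = load_into_warehouse(parsed)
lake = land_in_lake(raw)

print(len(accepted), len(rejected))    # 1 2
print(len(lake))                       # 3
print(read_from_lake(lake, "coupon"))  # ['SPRING']
```

In this sketch, the new `coupon` field only becomes usable in the warehouse path after the schema is changed, whereas the lake path already holds it and can expose it as soon as a query asks for it. That is the sense in which the data lake adapts more quickly to business changes.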
If you want to reprint this article, please credit the original author. Please let me know if you have any questions about the article. You are welcome to comment here or email email@example.com.