The Artima Developer Community

Java Buzz Forum
How to create a data lake for fun and profit

News Manager


News Manager is the force behind the news at Artima.com.
How to create a data lake for fun and profit Posted: Jul 24, 2014 5:19 PM

This post originated from an RSS feed registered with Java Buzz by News Manager.
Original Post: How to create a data lake for fun and profit
Feed Title: JavaWorld
Feed URL: http://www.javaworld.com/index.rss
Feed Description: JavaWorld.com: Fueling Innovation

Most credit James Dixon of the open source BI vendor Pentaho with coining the phrase "data lake." Think of a data lake as an unstructured data warehouse, a place where you pull in all of your different sources into one large "pool" of data.

In contrast to a data mart, a data lake doesn't "wash" the data, impose a structure on it, or limit the use cases. Sure, you should have some use cases in mind, but the architecture of a data lake is simple: a Hadoop Distributed File System (HDFS) with lots of directories and files on it.
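
To make "lots of directories and files" concrete, here is a minimal sketch of landing a raw file in a lake-style directory tree without cleaning or restructuring it. It is illustrative only: it uses `java.nio.file` on the local file system as a stand-in for HDFS, and the `RawZone` class, the `land` method, and the `raw/<source>/<yyyy>/<MM>/<dd>/` layout are assumptions made for this example, not anything the article prescribes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;

// Hypothetical helper: stores source files untouched under a per-source, per-day directory.
final class RawZone {

    // Assumed layout: <root>/raw/<source>/<yyyy>/<MM>/<dd>/<fileName>
    static Path land(Path root, String source, LocalDate day,
                     String fileName, byte[] payload) throws IOException {
        Path dir = root.resolve("raw")
                .resolve(source)
                .resolve(String.format("%04d", day.getYear()))
                .resolve(String.format("%02d", day.getMonthValue()))
                .resolve(String.format("%02d", day.getDayOfMonth()));
        Files.createDirectories(dir);                        // no-op if the directory already exists
        return Files.write(dir.resolve(fileName), payload);  // bytes are written as-is, no "washing"
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the lake root (an assumption for this demo).
        Path root = Files.createTempDirectory("lake");
        byte[] csv = "id,name\n1,Acme\n".getBytes(StandardCharsets.UTF_8);
        Path stored = land(root, "crm", LocalDate.of(2014, 7, 24), "accounts.csv", csv);
        System.out.println("Stored at: " + root.relativize(stored));
    }
}
```

The point of the sketch is that the only "schema" is the path: each source system gets its own subtree, and any structure is imposed later by whoever reads the files. On a real cluster the same idea would go through Hadoop's own file system client rather than `java.nio.file`.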

Why would you want a data lake?
The answers are both technical and political. Usually, when you start up any new project that involves analyzing your company's data -- especially when the data is stored across functional areas -- you're in for trouble. For example, if the business unit that wants the data isn't part of the unit providing the data, what kind of priority do you think the unit providing the data will likely assign to the effort? How is it budgeted? Who does the integration, and how much needs to be done? How do you structure the data, and for what purposes?


Read: How to create a data lake for fun and profit


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use