The Artima Developer Community

Java Buzz Forum
The 7 most common Hadoop and Spark projects

News Manager


News Manager is the force behind the news at Artima.com.
Posted: Aug 18, 2015 2:15 PM

This post originated from an RSS feed registered with Java Buzz by News Manager.
Original Post: The 7 most common Hadoop and Spark projects
Feed Title: JavaWorld
Feed URL: http://www.javaworld.com/index.rss
Feed Description: JavaWorld.com: Fueling Innovation

There's an old axiom that goes something like this: If you offer someone your full support and financial backing to do something different and innovative, they’ll end up doing what everyone else is doing.

So it goes with Hadoop, Spark, and Storm. Everyone thinks they're doing something special with these new big data technologies, but it doesn't take long to encounter the same patterns over and over. Specific implementations may differ somewhat, but based on my experience, here are the seven most common projects.

Project No. 1: Data consolidation

Call it an "enterprise data hub" or "data lake." The idea is that you have disparate data sources and you want to perform analysis across them. This type of project consists of getting feeds from all the sources (either in real time or as a batch) and shoving them into Hadoop. Sometimes this is step one to becoming a "data-driven company"; sometimes you simply want pretty reports. Data lakes usually materialize as files on HDFS and tables in Hive or Impala. There's a bold new world where much of this shows up in HBase, and in the future in Phoenix, because Hive is slow.
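
As a rough illustration of the batch side of this pattern, the sketch below lands feeds from two sources into one directory tree and then reads across them. It is not from the original article: a local temporary directory stands in for HDFS, the `source=.../dt=...` layout follows the `key=value` partition-directory convention that Hive uses, and every name here (`land_feed`, `scan_lake`, the `crm` and `billing` feeds) is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def land_feed(lake_root, source, batch_date, records):
    """Write one batch from one source into its own partition directory."""
    # Hypothetical layout: <lake>/source=<name>/dt=<date>/part-00000.json
    part_dir = Path(lake_root) / f"source={source}" / f"dt={batch_date}"
    part_dir.mkdir(parents=True, exist_ok=True)
    out = part_dir / "part-00000.json"
    # Opening with "w" means re-landing the same batch overwrites it.
    with out.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return out


def scan_lake(lake_root):
    """Yield every record in the lake, tagged with its partition values."""
    for path in sorted(Path(lake_root).glob("source=*/dt=*/*.json")):
        # Recover {"source": ..., "dt": ...} from the two directory names.
        partition = dict(p.split("=", 1) for p in path.parts[-3:-1])
        with path.open(encoding="utf-8") as f:
            for line in f:
                yield {**json.loads(line), **partition}


def demo():
    # A temporary local directory plays the role of the lake root.
    with tempfile.TemporaryDirectory() as lake:
        land_feed(lake, "crm", "2015-08-18", [{"customer": "a1", "region": "east"}])
        land_feed(lake, "billing", "2015-08-18", [{"customer": "a1", "amount": 40}])
        return list(scan_lake(lake))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Keeping each source and batch date in its own directory means a new feed can be added without touching existing data, and a query layer can treat `source` and `dt` as ordinary columns.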

This is an excerpt; the full article is available from JavaWorld.
