The 7 most common Hadoop and Spark projects
There's an old axiom that goes something like this: If you offer someone your full support and financial backing to do something different and innovative, they’ll end up doing what everyone else is doing.
So it goes with Hadoop, Spark, and Storm. Everyone thinks they're doing something special with these new big data technologies, but it doesn't take long to encounter the same patterns over and over. Specific implementations may differ somewhat, but based on my experience, here are the seven most common projects.
Project No. 1: Data consolidation
Call it an "enterprise data hub" or a "data lake." The idea is that you have disparate data sources and you want to perform analysis across them. This type of project consists of taking feeds from all the sources (in real time or in batches) and shoving them into Hadoop. Sometimes this is step one toward becoming a "data-driven company"; sometimes you simply want pretty reports. Data lakes usually materialize as files on HDFS and tables in Hive or Impala. There's a bold new world where much of this shows up in HBase -- and, in the future, Phoenix, because Hive is slow.
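To make the ingestion step concrete, here's a minimal sketch of a Spark batch job that lands one feed in the lake and registers it as a Hive table. The paths and table name (hdfs:///landing/crm/..., lake.crm_contacts) are hypothetical stand-ins, and the sketch assumes a cluster where Spark is configured with Hive support:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class FeedIngest {
    public static void main(String[] args) {
        // Hive support lets the job register landed data as a queryable table.
        SparkSession spark = SparkSession.builder()
                .appName("data-lake-ingest")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical source: CSV feeds dropped by an upstream system.
        Dataset<Row> feed = spark.read()
                .option("header", "true")
                .csv("hdfs:///landing/crm/2015-09-01/*.csv");

        // Land the batch as Parquet and expose it through Hive/Impala.
        feed.write()
                .mode(SaveMode.Append)
                .format("parquet")
                .saveAsTable("lake.crm_contacts");

        spark.stop();
    }
}

Parquet appears here because a columnar format keeps the resulting Hive or Impala tables reasonably fast to scan; substitute whatever format your lake standardizes on.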