The biggest thing you need to know about Hadoop is that it isn’t Hadoop anymore.
Between Cloudera sometimes swapping out HDFS for Kudu while declaring Spark the center of its universe (thus replacing MapReduce everywhere it is found) and Hortonworks joining the Spark party, the only item you can be sure of in a “Hadoop” cluster is YARN. Oh, but Databricks, aka the Spark people, prefer Mesos over YARN -- and by the way, Spark doesn’t require HDFS.
Yet distributed filesystems are still useful. Business intelligence is a great use case for Cloudera’s Impala and Kudu, a distributed columnar store, is optimized for it. Spark is great for many tasks, but sometimes you need an MPP (massively parallel processing) solution like Impala to do the trick -- and Hive remains a useful file-to-table management system. Even when you’re not using Hadoop because you’re focused on in-memory, real-time analytics with Spark, you still may end up using pieces of Hadoop here and there.