Apache Spark, the big data processing framework that is a fixture of many Hadoop installs, has reached its 1.4 incarnation. With it comes support for R and Python 3 -- two languages in wide use by data crunchers -- as well as better leveraging of the containers and cluster managers used to distribute work.
The R programming language, mainly used for statistical analysis and data science, is a natural fit for driving a data-processing framework like Spark. SparkR, the Spark 1.4 package that adds R support, allows R programmers to write code that scales out across multiple cores or Spark nodes, and to read and write all the data formats supported in Spark. (Spark SQL is also accessible from R, allowing SQL queries against Spark data.)
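To make that concrete, here is a minimal sketch of what a SparkR session looks like in the 1.4 API. It assumes a local Spark installation; the use of R's built-in `faithful` data set is illustrative, and the query is a hypothetical example rather than anything from the release itself.

```r
# Sketch of SparkR in Spark 1.4 -- assumes Spark is installed locally.
library(SparkR)

sc <- sparkR.init(master = "local")   # connect R to a Spark context
sqlContext <- sparkRSQL.init(sc)      # enable Spark SQL from R

# Turn a local R data frame into a distributed Spark DataFrame
df <- createDataFrame(sqlContext, faithful)

# Familiar R-style operations, executed by Spark rather than in-process
head(filter(df, df$waiting < 50))

# Spark SQL from R: register the DataFrame as a table and query it
registerTempTable(df, "faithful")
sql(sqlContext, "SELECT count(*) FROM faithful WHERE eruptions > 3")

sparkR.stop()
```

The appeal for R users is that the code reads like ordinary R data-frame manipulation, while the work itself runs on Spark executors, so the same script can move from a laptop to a cluster by changing the `master` setting.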