Parallel R: Data Analysis In The Distributed World

by Q. Ethan McCallum

2021-01-31 19:09:19

It’s tough to argue with R as a high-quality, cross-platform, open source statistical software product—unless you’re in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets, including three chapters on using R and Hadoop together. You’ll learn the basics of Snow, Multicore, Parallel, Segue, RHIPE, and Hadoop Streaming, including how to find them, how to use them, when they work well, and when they don’t.

With these packages, you can overcome R’s single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R’s memory barrier.

  • Snow: works well in a traditional cluster environment
  • Multicore: popular for multiprocessor and multicore computers
  • Parallel: part of the upcoming R 2.14.0 release
  • R+Hadoop: provides low-level access to a popular form of cluster computing
  • RHIPE: uses Hadoop’s power with R’s language and interactive shell
  • Segue: lets you use Elastic MapReduce as a backend for lapply-style operations

Book Details

ISBN: 9781449309923

Compare Prices

Store | Availability | Book Format | Condition | Price
Indigo Books & Music | In Stock | | | CAD 23.99
Available Discount
No Discount available
