Big Data analytics is the process of examining large and complex data sets that often exceed the available computational capabilities. R is a leading programming language for data science, offering powerful functions for tackling a wide range of problems related to Big Data processing.
The book begins with a brief introduction to the Big Data world and its current industry standards, followed by an introduction to the R language that covers its development, structure, real-world applications, and shortcomings. The book then revises the major R functions for data management and transformation. You'll be introduced to cloud-based Big Data solutions (e.g. Amazon EC2 instances and Amazon RDS, Microsoft Azure and its HDInsight clusters), and given guidance on R connectivity with relational databases and with non-relational databases such as MongoDB and HBase. In addition, Big Data Analytics with R covers Big Data tools such as the Apache Hadoop ecosystem with its HDFS and MapReduce frameworks, along with other R-compatible tools such as Apache Spark, its machine learning library Spark MLlib, and H2O.
Simon Walkowiak is a cognitive neuroscientist and a managing director of Mind Project Ltd, a Big Data and Predictive Analytics consultancy based in London, United Kingdom. As a former data curator at the UK Data Service (UKDS, University of Essex), Europe's largest socio-economic data repository, Simon has extensive experience in processing and managing large-scale datasets such as censuses, sensor and smart meter data, telecommunication data, and well-known governmental and social surveys such as the British Social Attitudes survey, Labour Force surveys, Understanding Society, and the National Travel survey, as well as many other socio-economic datasets collected and deposited by Eurostat, the World Bank, the Office for National Statistics, the Department for Transport, NatCen, and the International Energy Agency, to mention just a few. Simon has delivered numerous data science and R training courses at public institutions and international companies. He has also taught a course in Big Data Methods in R at major UK universities and at the prestigious Big Data and Analytics Summer School organized by the Institute of Analytics and Data Science (IADS).