Book: Introducing Microsoft Azure HDInsight Technical Overview

Microsoft has released a free book titled: "Introducing Microsoft Azure HDInsight Technical Overview". Microsoft Azure HDInsight is Microsoft’s 100 percent compliant distribution of Apache Hadoop on Microsoft Azure. This means that standard Hadoop concepts and technologies apply, so learning the Hadoop stack helps you learn the HDInsight service. At the time of this writing, HDInsight (version 3.0) uses Hadoop version 2.2 and Hortonworks Data Platform 2.0.

clip_image001

The book contains the following chapters:

  • Introduction
  • Big data, quick intro
  • Getting started with HDInsight
  • Programming HDInsight
  • Working with HDInsight data
  • What’s next

This book consists of one conceptual chapter and four hands-on chapters. Chapter 1, “Big data, quick overview,” introduces the topic of big data, with definitions of terms and descriptions of tools and technologies. Chapter 2, “Getting started with HDInsight,” takes you through the steps to deploy a cluster and shows you how to use the HDInsight Emulator. After your cluster is deployed, it’s time for Chapter 3, “Programming HDInsight.” Chapter 3 continues where Chapter 2 left off, showing you how to run MapReduce jobs and turn your data into insights. Chapter 4, “Working with HDInsight data,” teaches you how to work more effectively with your data with the help of Apache Hive, Apache Pig, Excel and Power BI, and Sqoop. Finally, Chapter 5, “What next?,” covers practical topics such as integrating HDInsight into the rest of your stack and the different options for Hadoop deployment on Windows. Chapter 5 finishes up with a discussion of future plans for HDInsight and provides links to additional learning resources.