Take your programming skills to the next level with our Java e-learning programme

Hadoop for Java Developers

The quickest and easiest way to learn Hadoop
  • This Hadoop training course is the easiest and quickest way to learn to program using the map-reduce programming model.
  • If you are a Java developer looking to learn how to design and build big-data applications, this course will both get you up and running quickly, and provide you with the core skills to produce production-quality functioning applications.
  • The course contains approx. 13 hours of video tutorials, together with guidance notes and lots of sample code.
  • You will work through two real-world case studies, processing sizeable amounts of data as you progress through the training material.
  • All the software you will need is either included, or we’ll show you where to download it.
  • The tutorials cover how to install and configure Hadoop for a typical development environment – all you need to get started with the training is a working computer capable of running Java, the Eclipse IDE and watching videos.

Pre-requisites

The course is designed to be accessible to anyone with a reasonable knowledge of basic Java. You will need to be able to write classes and create objects. Our Java Fundamentals course covers all the Java knowledge you need for this course.

Important note for Windows users: Hadoop is difficult to install on Windows, so in the course we show you how to set up a virtual machine running Linux. No prior knowledge of Linux is needed.

Contents - contains over 13 hours of video - equivalent to 4 days of live training.

 

Having problems? Check the errata for this course.

1

Welcome


10 m 49 s
A brief overview chapter, with a preview of the work we're going to be doing.

2

Introducing Hadoop


16 m 12 s
An overview of what Hadoop is and an introduction to the concept of map-reduce.

3

The map-reduce programming model


20 m 45 s
A deeper look at the map-reduce programming model.
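
To give a flavour of the model, here is a minimal sketch of a word count written as separate map, shuffle and reduce steps. It uses plain Java collections rather than the Hadoop API, and the class name, method names and sample lines are all made up for illustration; a real Hadoop job distributes these steps across a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map-reduce model; no Hadoop classes are used.
class MapReduceSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single result.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over the given lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        List<String> lines = Arrays.asList("the quick brown fox", "the lazy dog");
        System.out.println(wordCount(lines));
        // prints {brown=1, dog=1, fox=1, lazy=1, quick=1, the=2}
    }
}
```

The point of the split is that each map call sees only one line and each reduce call sees only one key, which is what lets the work be spread over many machines.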

4

Operating modes & installation environment


25 m 10 s
Understanding the operating modes of Hadoop and getting ready to install (including setting up a virtual machine if needed).

5

Installing Hadoop


40 m 0 s
Installing Hadoop and configuring for both standalone and pseudo-distributed modes.

6

Writing our first map-reduce job


52 m 36 s
Using a generic map-reduce template to create a real Hadoop job.

7

HDFS


24 m 49 s
Understanding the Hadoop file system and how to copy files into and out of it from the command line.

8

Running in Pseudo-Distributed Mode


11 m 26 s
Running larger jobs in pseudo-distributed mode. Viewing the Hadoop Web User Interface.

9

Map-reduce process flow 1


40 m 36 s
A look at the steps in a map-reduce job in more detail, including the shuffle process and adding a combiner class.

10

Map-reduce process flow 2


14 m 38 s
An exercise to practise the full map-reduce workflow.

11

Enhancing Map and Reduce


23 m 41 s
An overview of the built-in map and reduce functions, and learning to create custom key and value data types.

12

Job Configuration


25 m 11 s
Understanding Hadoop file formats, and using the tool runner template to set command-line parameters.

13

Case Study 1 - Part 1


53 m 8 s
An explanation of the first major case study, using real-world data, together with a walk-through of the first two tasks.

14

Case Study 1 - Part 2


9 m 16 s
A walk-through of task 3 in our case study.

15

Case Study 1 - Part 3


9 m 13 s
A walk-through of task 4 in our case study.

16

Chaining Multiple Map-Reduce Jobs


27 m 27 s
Learning to automate the chaining of jobs with the JobControl object, and using the sequence file format.

17

Pre and Post Processing


47 m 39 s
Using the ChainMapper and ChainReducer objects to add extra map steps.

18

Optimising Map-Reduce jobs


29 m 46 s
Looking at several ways to improve the efficiency of map-reduce jobs.

19

Log Files & Counters


36 m 28 s
Learning to use log files and counters as a tool to debug map-reduce code.

20

Working with relational databases


56 m 11 s
Reading from and writing to relational databases using JDBC.

21

Unit testing


40 m 56 s
Using JUnit to test map-reduce code with the MRUnit library.

22

Secondary Sorting


36 m 11 s
Understanding how to sort the values before the reduce phase.
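
As a rough sketch of the idea, the usual approach is to sort on a composite key (the natural key plus the value to order by) and then group on the natural key alone, so each group's values arrive already sorted. The example below shows this in plain Java without the Hadoop API; the class, field and method names are hypothetical and the data is invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of secondary sorting; no Hadoop classes are used.
class SecondarySortSketch {

    // Composite key: the natural key plus the field the values are ordered by.
    static final class CompositeKey {
        final String naturalKey;
        final int value;

        CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Sort order: natural key first, then value within each natural key.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.value);

    // Sort on the full composite key, then group on the natural key only,
    // so the values in each group are already in ascending order.
    static Map<String, List<Integer>> sortAndGroup(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey key : sorted) {
            groups.computeIfAbsent(key.naturalKey, n -> new ArrayList<>()).add(key.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch.
        List<CompositeKey> keys = Arrays.asList(
                new CompositeKey("b", 7), new CompositeKey("a", 9),
                new CompositeKey("a", 3), new CompositeKey("b", 1));
        System.out.println(sortAndGroup(keys)); // prints {a=[3, 9], b=[1, 7]}
    }
}
```

In a Hadoop job the same two roles are played by separate sort and grouping comparators, so the framework does the ordering instead of the reducer.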

23

Joining data


51 m 56 s
Joining two data sets together with a reduce-side join.
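
A minimal sketch of the technique, assuming two small comma-separated data sets keyed on their first field: the map step tags each record with the data set it came from, the shuffle brings records with the same key together, and the reduce step pairs left records with right records. This is plain Java rather than the Hadoop API, and every name and sample record here is invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of a reduce-side inner join; no Hadoop classes are used.
class ReduceSideJoinSketch {

    // A record tagged with the data set it came from, as a mapper would emit it.
    static final class Tagged {
        final String source;  // "L" for the left data set, "R" for the right
        final String payload;

        Tagged(String source, String payload) {
            this.source = source;
            this.payload = payload;
        }
    }

    // Map step: split "key,payload" lines and file each record under its join key.
    // Assumes every line contains at least one comma.
    static void map(String source, List<String> lines, Map<String, List<Tagged>> shuffled) {
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            shuffled.computeIfAbsent(parts[0], k -> new ArrayList<>())
                    .add(new Tagged(source, parts[1]));
        }
    }

    // Reduce step: for one key, pair every left record with every right record.
    static List<String> reduce(String key, List<Tagged> values) {
        List<String> left = new ArrayList<>();
        List<String> right = new ArrayList<>();
        for (Tagged tagged : values) {
            ("L".equals(tagged.source) ? left : right).add(tagged.payload);
        }
        List<String> joined = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                joined.add(key + "," + l + "," + r);
            }
        }
        return joined;
    }

    // Runs map over both inputs, then reduce over each key in sorted order.
    static List<String> join(List<String> leftLines, List<String> rightLines) {
        Map<String, List<Tagged>> shuffled = new TreeMap<>();
        map("L", leftLines, shuffled);
        map("R", rightLines, shuffled);
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> entry : shuffled.entrySet()) {
            output.addAll(reduce(entry.getKey(), entry.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch.
        List<String> customers = Arrays.asList("1,Alice", "2,Bob");
        List<String> orders = Arrays.asList("1,book", "1,pen", "3,lamp");
        System.out.println(join(customers, orders)); // prints [1,Alice,book, 1,Alice,pen]
    }
}
```

Keys that appear in only one data set (customer 2 and order 3 above) produce no output, which makes this an inner join.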

24

Using Amazon Elastic Map Reduce


40 m 38 s
Using the Amazon EMR cloud-based Hadoop platform to run map-reduce jobs.

25

Case Study 2


42 m 45 s
Our second major case study, based on a real-world use of Hadoop.

26

Course Summary


14 m 47 s
A review of what we've learned, and ideas for where to go next.
