Only 3 days
Classroom
31/12/2024 (Tuesday)
Overview
Get the skills you need to operate and manage Hadoop clusters. The Cloudera Certified Administrator for Apache Hadoop (CCAH) certification proves that you have this knowledge.
From installation and configuration to load balancing and tuning your cluster, this accelerated course covers everything 33% faster than traditional training.
What will you learn?
You'll get the knowledge you need to pass the CCAH exam. Through lectures and hands-on exercises, you'll cover the following topics:
- The internals of YARN, MapReduce, and Hadoop Distributed File System (HDFS)
- Determining the correct hardware and infrastructure for your cluster
- Proper cluster configuration and deployment to integrate with the data centre
- How to load data into the cluster from dynamically generated files using Flume and from a Relational Database Management System (RDBMS) using Sqoop
- Configuring the FairScheduler to provide service-level agreements for multiple users of a cluster
- Best practices for preparing and maintaining Apache Hadoop in production
- Troubleshooting, diagnosing, tuning, and solving Hadoop issues
Note: this course covers the content and practice tests needed to prepare for the exam. Firebrand cannot deliver the exam at our centre. Students will be provided with an exam voucher to take the exam.
Curriculum
The Case for Apache Hadoop
- Why Hadoop?
- Core Hadoop components
- Fundamental HDFS concepts
HDFS Features
- Writing and reading Files
- NameNode memory considerations
- Overview of HDFS security
- Using the NameNode Web UI
- Using the Hadoop File Shell
Getting Data into HDFS
- Ingesting Data from external sources with Flume
- Ingesting Data from relational databases with Sqoop
- REST Interfaces
- Best practices for importing data
YARN and MapReduce
- What Is MapReduce?
- Basic MapReduce concepts
- YARN cluster architecture
- Resource allocation
- Failure recovery
- Using the YARN Web UI
- MapReduce Version 1
Planning Your Hadoop Cluster
- General planning considerations
- Choosing the right hardware
- Network considerations
- Configuring nodes
- Planning for cluster management
Hadoop Installation and Initial Configuration
- Deployment Types
- Installing Hadoop
- Specifying the Hadoop configuration
- Performing Initial HDFS configuration
- Performing Initial YARN and MapReduce configuration
- Hadoop Logging
Installing and Configuring Hive, Impala, and Pig
- Hive
- Impala
- Pig
Hadoop Clients
- What is a Hadoop Client?
- Installing and configuring Hadoop Clients
- Installing and configuring Hue
- Hue authentication and authorization
Cloudera Manager
- The motivation for Cloudera Manager
- Cloudera Manager features
- Express and Enterprise versions
- Cloudera Manager Topology
- Installing Cloudera Manager
- Installing Hadoop using Cloudera Manager
- Performing basic administration tasks using Cloudera Manager
Advanced Cluster Configuration
- Advanced configuration parameters
- Configuring Hadoop Ports
- Explicitly including and excluding hosts
- Configuring HDFS for rack awareness
- Configuring HDFS high availability
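As an illustration of the rack awareness topic above: Hadoop can resolve racks by calling an executable script named in the `net.topology.script.file.name` property. It passes host addresses as arguments and reads one rack path per address from standard output. The sketch below is a minimal example of such a script, not the course's own material. The subnet prefixes and rack names in `RACKS` are hypothetical and would need to match your data centre layout.

```python
#!/usr/bin/env python3
"""Minimal rack topology script sketch (hypothetical mapping)."""
import sys

# Hypothetical mapping from address prefix to rack path; replace with real values.
RACKS = {
    "10.0.1.": "/dc1/rack1",
    "10.0.2.": "/dc1/rack2",
}

# Rack path returned for hosts that match no known prefix.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host: str) -> str:
    """Return the rack path for a host address, or the default rack."""
    for prefix, rack in RACKS.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK


def main(hosts: list[str]) -> None:
    # Print one rack path per address, in the order the addresses were given.
    for host in hosts:
        print(resolve_rack(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Returning a default rack for unknown hosts keeps the script from failing when a new node is added before the mapping is updated.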
Hadoop Security
- Why Hadoop security is important
- Hadoop’s security system concepts
- What Kerberos is and how it works
- Securing a Hadoop Cluster with Kerberos
Managing and Scheduling Jobs
- Managing running jobs
- Scheduling Hadoop jobs
- Configuring the FairScheduler
- Impala query scheduling
Cluster Maintenance
- Checking HDFS Status
- Copying data between clusters
- Adding and removing cluster nodes
- Rebalancing the cluster
- Cluster upgrading
Cluster Monitoring and Troubleshooting
- General system monitoring
- Monitoring Hadoop clusters
- Troubleshooting Hadoop clusters
- Common misconfigurations
Exam Track
As part of this accelerated course, you'll receive the following exam voucher:
- Cloudera Certified Administrator for Apache Hadoop CCAH CDH 5 (CCA-500)
The exam consists of 60 questions and must be completed within 90 minutes. You need a score of at least 70% to pass and earn your certification.
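To put those figures in concrete terms, the short sketch below works out the minimum number of correct answers and the average time available per question. It assumes every question carries equal weight, which this page does not state, so treat the result as a rough guide.

```python
def min_correct_answers(total_questions: int, pass_percent: int) -> int:
    """Smallest number of correct answers that reaches the pass mark.

    Assumption: all questions are weighted equally.
    """
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-total_questions * pass_percent // 100)


def seconds_per_question(total_questions: int, minutes: int) -> float:
    """Average time available per question, in seconds."""
    return minutes * 60 / total_questions


if __name__ == "__main__":
    print(min_correct_answers(60, 70))   # 42
    print(seconds_per_question(60, 90))  # 90.0
```

Under that assumption, you would need at least 42 correct answers out of 60, with an average of 90 seconds per question.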
Note: this course covers the content and practice tests needed to prepare for the exam. Firebrand cannot deliver the exam at our centre. Delegates will be provided with an exam voucher to take the exam.
What's included
- Official Cloudera courseware
Prerequisites
This course is best suited to Systems Administrators and IT managers who have basic Linux experience. Prior knowledge of Apache Hadoop is not required.
Benefits
Seven reasons why you should sit your course with Firebrand Training
- Two training options. Choose between residential classroom-based or online courses
- You'll be certified fast. With us, you’ll be trained in record time
- Our course is all-inclusive. A one-off fee covers all course materials, exams**, accommodation* and meals*. No hidden extras.
- Pass the first time or train again for free. This is our guarantee. We’re confident you’ll pass your course the first time. But if not, come back within a year and only pay for accommodation, exams and incidental costs
- You’ll learn more. A day with a traditional training provider generally runs from 9 am – 5 pm, with a nice long break for lunch. With Firebrand Training you’ll get at least 12 hours/day of quality learning time with your instructor
- You’ll learn faster. Chances are, you’ll have a different learning style to those around you. We combine visual, auditory and tactile styles to deliver the material in a way that ensures you will learn faster and more easily
- You’ll be studying with the best. We’ve been named in the Training Industry’s “Top 20 IT Training Companies of the Year” every year since 2010. As well as winning many more awards, we’ve trained and certified over 135,000 professionals
*For residential training only. Does not apply to online courses
**Some exceptions apply. Please refer to the Exam Track or speak with our experts
Think you are ready for the course? Take a FREE practice test to assess your knowledge!