Cloudera

Administrator Training: CDP Private Cloud Base

28 hours
1840,00 €
Classroom or Live Virtual Class

Description

Cloudera's administrator training course for CDP Private Cloud Base provides participants with a comprehensive understanding of all the steps necessary to operate and maintain on-premises clusters using Cloudera Manager in production or development environments.

From installation and configuration through load balancing and tuning, this Cloudera training course is the best preparation for the real-world challenges faced by administrators who run CDP Private Cloud Base.

PUE, a Cloudera Strategic Partner, is authorized by Cloudera to provide official training in Cloudera technologies.

PUE is also accredited to provide consulting and mentoring services for implementing Cloudera solutions in business settings, and it brings that practical, business-oriented experience to its official courses.

Audience and prerequisites

This course is best suited to systems administrators and IT managers who have basic Linux experience.

Prior knowledge of Apache Hadoop is not required.

Objectives

This course teaches participants the following skills:

  • How a typical Cloudera cluster is laid out and what role each major component plays in it
  • How to install Cloudera Manager and CDP
  • How to use Cloudera Manager to create, configure, deploy, and monitor a cluster
  • What tools Cloudera provides to ingest data from outside sources into a cluster
  • How to configure cluster components for optimal performance
  • What routine tasks are necessary to maintain a cluster, including updating to a new version of CDP
  • How to detect, troubleshoot, and repair problems
  • Key Cloudera security features

Topics

Module 1: Cloudera Data Platform

  • Industry Trends for Big Data
  • The Challenge to Become Data-Driven
  • The Enterprise Data Cloud
  • CDP Overview
  • CDP Form Factors
  • Hands-On Exercise: Configure the Exercise Network

Module 2: CDP Private Cloud Base Installation

  • Installation Overview
  • Cloudera Manager Installation
  • Hands-On Exercise: Installing Cloudera Manager Server
  • CDP Runtime Overview
  • Cloudera Manager Introduction
  • Instructor-Led Demonstration: Cloudera Manager
  • Hands-On Exercise: Cluster Installation

Module 3: Cluster Configuration

  • Overview
  • Configuration Settings
  • Modifying Service Configurations
  • Configuration Files
  • Managing Role Instances
  • Adding New Services
  • Adding and Removing Hosts
  • Hands-On Exercise: Configuring a Hadoop Cluster

Module 4: Data Storage

  • Overview
  • HDFS Topology and Roles
  • HDFS Performance and Fault Tolerance
  • HDFS and Hadoop Security Overview
  • Working with the NameNode UI
  • Instructor-Led Demonstration: NameNode User Interface
  • Working with HDFS
  • Hands-On Exercise: Working with HDFS
  • HBase Overview
  • Kudu Overview
  • Cloud Storage Overview
  • Hands-On Exercise: Storing Data in Amazon S3

Module 5: Data Ingest

  • Data Ingest Overview
  • File Formats
  • Ingesting Data using File Transfer or REST Interfaces
  • Importing Data from Relational Databases with Apache Sqoop
  • Hands-On Exercise: Importing Data Using Sqoop
  • Ingesting Data Using NiFi
  • Instructor-Led Demonstration: NiFi User Interface
  • Best Practices for Importing Data
  • Hands-On Exercise: NiFi Verification

Module 6: Data Flow

  • Overview of Cloudera Flow Management and NiFi
  • NiFi Architecture
  • Cloudera Edge Flow Management and MiNiFi
  • Instructor-Led Demonstration: NiFi Usage
  • Apache Kafka Overview
  • Apache Kafka Cluster Architecture
  • Apache Kafka Command Line Tools
  • Hands-On Exercise: Working with Kafka

Module 7: Data Access and Discovery

  • Apache Hive
  • Apache Impala
  • Apache Impala Tuning
  • Hands-On Exercise: Install Impala and Hue
  • Search Overview
  • Hue Overview
  • Managing and Configuring Hue
  • Hue Authentication and Authorization
  • CDSW Overview
  • Hands-On Exercise: Using Hue, Hive and Impala

Module 8: Data Compute

  • YARN Overview
  • Running Applications on YARN
  • Viewing YARN Applications
  • YARN Application Logs
  • MapReduce Applications
  • YARN Memory and CPU Settings
  • Hands-On Exercise: Running YARN Applications
  • Tez Overview
  • ACID for Hive
  • Spark Overview
  • How Spark Applications Run on YARN
  • Monitoring Spark Applications
  • Hands-On Exercise: Running Spark Applications

Module 9: Managing Resources

  • Managing Resources Overview
  • Node Labels
  • Configuring cgroups
  • The Capacity Scheduler
  • Managing Queues
  • Impala Query Scheduling
  • Hands-On Exercise: Using the Capacity Scheduler

Module 10: Planning Your Cluster

  • General Planning Considerations
  • Choosing the Right Hardware
  • Network Considerations
  • CDP Private Cloud Considerations
  • Configuring Nodes

Module 11: Advanced Cluster Configuration

  • Configuring Service Ports
  • Tuning HDFS and MapReduce
  • Managing Cluster Growth
  • Erasure Coding
  • Enabling High Availability for HDFS and YARN
  • Hands-On Exercise: Configuring HDFS for High Availability

Module 12: Cluster Maintenance

  • Checking HDFS Status
  • Copying Data Between Clusters
  • Rebalancing Data in HDFS
  • HDFS Directory Snapshots
  • Hands-On Exercise: Creating and Using a Snapshot
  • Host Maintenance
  • Upgrading a Cluster
  • Hands-On Exercise: Upgrade the Cluster

Module 13: Cluster Monitoring

  • Cloudera Manager Monitoring Features
  • Health Tests
  • Hands-On Exercise: Breaking the Cluster
  • Events and Alerts
  • Charts and Reports
  • Monitoring Recommendations
  • Hands-On Exercise: Confirming Cluster Healing and Configuring Email Alerts

Module 14: Cluster Troubleshooting

  • Overview
  • Troubleshooting Tools
  • Misconfiguration Examples
  • Hands-On Exercise: Troubleshooting a Cluster

Module 15: Security

  • Data Governance with SDX
  • Hadoop Security Concepts
  • Hadoop Authentication Using Kerberos
  • Hadoop Authorization
  • Hadoop Encryption
  • Securing a Hadoop Cluster
  • Apache Ranger
  • Apache Atlas
  • Backup and Recovery

Module 16: Private Cloud / Public Cloud

  • CDP Overview
  • Private Cloud Capabilities
  • Public Cloud Capabilities
  • What is Kubernetes?
  • Workload XM Overview
  • Auto-scaling

Module 17: Conclusion

Module 18: Appendix: Cloudera Manager API

  • Cloudera Manager API
  • Installation and Setup
  • Code Examples

Module 19: Appendix: Ozone Overview

  • Ozone Overview
  • Working with Ozone
