Description
Cloudera's administrator training course for CDP Private Cloud Base provides participants with a comprehensive understanding of all the steps necessary to operate and maintain on-premises clusters using Cloudera Manager in production or development environments.
From installation and configuration through load balancing and tuning, this Cloudera training course is the best preparation for the real-world challenges faced by administrators who run CDP Private Cloud Base.
PUE, a Cloudera Strategic Partner, is authorized by Cloudera to provide official training in Cloudera technologies.
PUE is also accredited and recognized to provide consulting and mentoring services for implementing Cloudera solutions in business settings. Its official courses carry over that practical, business-oriented approach.
Audience and prerequisites
This course is best suited to systems administrators and IT managers who have basic Linux experience.
Prior knowledge of Apache Hadoop is not required.
Objectives
This course teaches participants the following skills:
- The topology of a typical Cloudera cluster and the roles its major components play
- How to install Cloudera Manager and CDP
- How to use Cloudera Manager to create, configure, deploy, and monitor a cluster
- What tools Cloudera provides to ingest data from outside sources into a cluster
- How to configure cluster components for optimal performance
- What routine tasks are necessary to maintain a cluster, including updating to a new version of CDP
- How to detect, troubleshoot, and repair problems
- Key Cloudera security features
Topics
Module 1: Cloudera Data Platform
- Industry Trends for Big Data
- The Challenge to Become Data-Driven
- The Enterprise Data Cloud
- CDP Overview
- CDP Form Factors
- Hands-On Exercise: Configure the Exercise Network
Module 2: CDP Private Cloud Base Installation
- Installation Overview
- Cloudera Manager Installation
- Hands-On Exercise: Installing Cloudera Manager Server
- CDP Runtime Overview
- Cloudera Manager Introduction
- Instructor-Led Demonstration: Cloudera Manager
- Hands-On Exercise: Cluster Installation
Module 3: Cluster Configuration
- Overview
- Configuration Settings
- Modifying Service Configurations
- Configuration Files
- Managing Role Instances
- Adding New Services
- Adding and Removing Hosts
- Hands-On Exercise: Configuring a Hadoop Cluster
Module 4: Data Storage
- Overview
- HDFS Topology and Roles
- HDFS Performance and Fault Tolerance
- HDFS and Hadoop Security Overview
- Working with the NameNode UI
- Instructor-Led Demonstration: NameNode User Interface
- Working with HDFS
- Hands-On Exercise: Working with HDFS
- HBase Overview
- Kudu Overview
- Cloud Storage Overview
- Hands-On Exercise: Storing Data in Amazon S3
Module 5: Data Ingest
- Data Ingest Overview
- File Formats
- Ingesting Data using File Transfer or REST Interfaces
- Importing Data from Relational Databases with Apache Sqoop
- Hands-On Exercise: Importing Data Using Sqoop
- Ingesting Data Using NiFi
- Instructor-Led Demonstration: NiFi User Interface
- Best Practices for Importing Data
- Hands-On Exercise: NiFi Verification
Module 6: Data Flow
- Overview of Cloudera Flow Management and NiFi
- NiFi Architecture
- Cloudera Edge Flow Management and MiNiFi
- Instructor-Led Demonstration: NiFi Usage
- Apache Kafka Overview
- Apache Kafka Cluster Architecture
- Apache Kafka Command Line Tools
- Hands-On Exercise: Working with Kafka
Module 7: Data Access and Discovery
- Apache Hive
- Apache Impala
- Apache Impala Tuning
- Hands-On Exercise: Install Impala and Hue
- Search Overview
- Hue Overview
- Managing and Configuring Hue
- Hue Authentication and Authorization
- CDSW Overview
- Hands-On Exercise: Using Hue, Hive and Impala
Module 8: Data Compute
- YARN Overview
- Running Applications on YARN
- Viewing YARN Applications
- YARN Application Logs
- MapReduce Applications
- YARN Memory and CPU Settings
- Hands-On Exercise: Running YARN Applications
- Tez Overview
- ACID for Hive
- Spark Overview
- How Spark Applications Run on YARN
- Monitoring Spark Applications
- Hands-On Exercise: Running Spark Applications
Module 9: Managing Resources
- Managing Resources Overview
- Node Labels
- Configuring cgroups
- The Capacity Scheduler
- Managing Queues
- Impala Query Scheduling
- Hands-On Exercise: Using the Capacity Scheduler
Module 10: Planning Your Cluster
- General Planning Considerations
- Choosing the Right Hardware
- Network Considerations
- CDP Private Cloud Considerations
- Configuring Nodes
Module 11: Advanced Cluster Configuration
- Configuring Service Ports
- Tuning HDFS and MapReduce
- Managing Cluster Growth
- Erasure Coding
- Enabling High Availability for HDFS and YARN
- Hands-On Exercise: Configuring HDFS for High Availability
Module 12: Cluster Maintenance
- Checking HDFS Status
- Copying Data Between Clusters
- Rebalancing Data in HDFS
- HDFS Directory Snapshots
- Hands-On Exercise: Creating and Using a Snapshot
- Host Maintenance
- Upgrading a Cluster
- Hands-On Exercise: Upgrade the Cluster
Module 13: Cluster Monitoring
- Cloudera Manager Monitoring Features
- Health Tests
- Hands-On Exercise: Breaking the Cluster
- Events and Alerts
- Charts and Reports
- Monitoring Recommendations
- Hands-On Exercise: Confirm Cluster Healing and Configuring Email Alerts
Module 14: Cluster Troubleshooting
- Overview
- Troubleshooting Tools
- Misconfiguration Examples
- Hands-On Exercise: Troubleshooting a Cluster
Module 15: Security
- Data Governance with SDX
- Hadoop Security Concepts
- Hadoop Authentication Using Kerberos
- Hadoop Authorization
- Hadoop Encryption
- Securing a Hadoop Cluster
- Apache Ranger
- Apache Atlas
- Backup and Recovery
Module 16: Private Cloud / Public Cloud
- CDP Overview
- Private Cloud Capabilities
- Public Cloud Capabilities
- What is Kubernetes?
- Workload XM Overview
- Auto-scaling
Module 17: Conclusion
Module 18: Appendix: Cloudera Manager API
- Cloudera Manager API
- Installation and Setup
- Code Examples
Module 19: Appendix: Ozone Overview
- Ozone Overview
- Working with Ozone