HRDF Funded Course - Apache Hadoop Big Data Analyst Training

Training by Tertiary Infotech
On-Site / Training

Details

This four-day developer training course delivers the key concepts and expertise participants need to create robust data processing applications using Apache Hadoop.

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as: 

  • The internals of MapReduce and HDFS and how to write MapReduce code
  • Best practices for Hadoop development, debugging, and implementation of workflows and common algorithms
  • How to leverage Hive, Pig, Sqoop, Flume, Oozie, Mahout, and other Hadoop ecosystem projects
  • Optimal hardware configurations and network considerations for integrating a Hadoop cluster with the data center
  • Writing and executing joins to link data sets in MapReduce
  • Advanced Hadoop API topics required for real-world data analysis

Upon completion of the course, attendees receive a Cloudera Certified Developer for Apache Hadoop (CCDH) practice test. Certification is a great differentiator; it helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.

For more details, please visit 
https://www.tertiarycourses.com.my/big-data-analyst-administrator-training.html

Outline

Day 1

Module 1: Hadoop Introduction

  • Why we need Hadoop
  • Why Hadoop is in demand in the market nowadays
  • Where expensive SQL-based tools are failing
  • Key points: why Hadoop is a leading tool in the current IT industry
  • Definition of Big Data
  • Hadoop nodes
  • Introduction to Hadoop Release-1
  • Hadoop Daemons in Hadoop Release-1
  • Introduction to Hadoop Release-2
  • Hadoop Daemons in Hadoop Release-2
  • Hadoop Cluster and Racks
  • Hadoop Cluster Demo
  • New projects on Hadoop
  • How open source tools are able to run jobs in less time
  • Hadoop Storage – HDFS (Hadoop Distributed File System)
  • Hadoop Processing Framework (MapReduce / YARN)
  • Alternatives to MapReduce
  • Why NoSQL is in greater demand than SQL
  • Distributed warehouse for HDFS
  • Hadoop Ecosystem and its uses
  • Data import/export tools

Module 2: Hadoop Installation and Hands-on Practice on a Hadoop Machine

  • Hadoop installation
  • Introduction to the Hadoop FS and processing environment UIs
  • How to read and write files
  • Basic Unix commands for Hadoop
  • Hadoop FS shell
  • Hadoop releases practical
  • Hadoop daemons practical

Day 2

Module 3: ETL Tool (Pig) Introduction Level-1 

  • Pig Introduction
  • Why Pig if Map Reduce is there?
  • How Pig is different from programming languages
  • Pig data flow introduction
  • How Schema is optional in Pig
  • Pig Data types
  • Pig Commands – Load, Store, Describe, Dump
  • MapReduce jobs started by Pig commands
  • Execution plan

Module 4: ETL Tool (Pig) Level-2

  • Pig UDFs
  • Pig Use cases
  • Pig Assignment
  • Complex Use cases on Pig
  • Real-time scenarios on Pig
  • When we should use Pig
  • When we shouldn’t use Pig

Day 3

Module 5: Hive Warehouse

  • Hive Introduction
  • Meta storage and meta store
  • Introduction to Derby Database
  • Hive Data types
  • HQL
  • DDL, DML and sub languages of Hive
  • Internal, external and temporary tables in Hive
  • Differences between a SQL-based data warehouse and Hive

Module 6: Hive Level-2

  • Hive releases
  • Why Hive is not the best solution for OLTP
  • OLAP in Hive
  • Partitioning
  • Bucketing
  • Hive Architecture
  • Thrift Server
  • Hue Interface for Hive
  • How to analyze data using Hive scripts
  • Differences between Hive and Impala
  • UDFs in Hive
  • Complex Use cases in Hive
  • Hive Advanced Assignment

Module 7: Introduction to MapReduce

  • How MapReduce works as a processing framework
  • End-to-end execution flow of a MapReduce job
  • Different tasks in a MapReduce job
  • Why the Reducer is optional while the Mapper is mandatory
  • Introduction to Combiner
  • Introduction to Partitioner
  • Programming languages for MapReduce
  • Why Java is preferred for MapReduce programming

Module 8: NoSQL Databases and Introduction to HBase

  • Introduction to NoSQL
  • Why NoSQL when SQL has been in the market for many years
  • NoSQL-based databases in the market
  • CAP Theorem
  • ACID vs. CAP
  • OLTP solutions with different capabilities
  • Which NoSQL-based solutions can handle specific requirements
  • Examples of companies that use NoSQL-based databases
  • HBase Architecture of column families

Day 4

Module 9: ZooKeeper and Sqoop

  • Introduction to ZooKeeper
  • How ZooKeeper helps in the Hadoop ecosystem
  • How to load data from relational storage into Hadoop
  • Sqoop basics
  • Sqoop practical implementation
  • Sqoop alternative
  • Sqoop connector

Module 9: Flume, Oozie and YARN

  • How to load streaming data without a fixed schema
  • How to load unstructured and semi-structured data into Hadoop
  • Introduction to Flume
  • Hands-on with Flume
  • How to load Twitter data into HDFS using Hadoop
  • Introduction to Oozie
  • How to schedule jobs using Oozie
  • What kind of jobs can be scheduled using Oozie
  • How to schedule jobs which are time based
  • Hadoop releases
  • Where to get Hadoop and other components to install
  • Introduction to YARN
  • Significance of YARN


Module 10: Apache Spark Basics

  • Introduction to Spark
  • Basic features of Spark and Scala available in Hue
  • Why demand for Spark is increasing in the market
  • How we can use Spark with the Hadoop ecosystem
  • Datasets for practice purposes

Module 11: Emerging Trends of Big Data

  • YARN
  • Emerging Technologies of Big Data
  • Emerging use cases, e.g. IoT, Industrial Internet, New Applications
  • Certifications and job opportunities

Speaker/s

Jason is a native of Kuala Lumpur, Malaysia, and studied for a Bachelor’s Degree in Accounting and Finance under the London School of Economics Program, University of London. Raised in a typical Chinese family with an entrepreneurial business background in manufacturing and real estate development. Worked as an Executive in the Asset and License Management Department at Standard Chartered, Malaysia, and was promoted to Data Analyst six months later. Later joined Tune Hotels Regional Services, a hotel management company and hotel chain operator, as Senior Revenue Executive. Served as Research Analyst with Wealth-X, a company that provides prospecting, intelligence and wealth due diligence on ultra-high net worth individuals. Thereafter served as Senior Data Analyst with Xchanging Malaysia, a joint venture between Xchanging and YTL Communications to develop and deliver enhanced mobile internet and cloud-based hosting offerings in Malaysia. Currently working as a Data Analyst with GoQuO, a full-service e-commerce solutions provider to airlines and OTAs. Community Organizer of Big Data Malaysia, a professional network for individuals with an interest in all aspects of Big Data, and Member of the Malaysian Chapter of the Founder Institute, the world’s largest entrepreneur training and startup launch program. Occasionally participates in marathons and is an avid off-road cyclist. Passionate about technology and economics, and enjoys social events.
Tertiary Courses Malaysia is an HRDF Approved Training Provider in Malaysia. We offer a wide range of classroom instructor-led technical training courses for working professionals and executives in Malaysia.

All our courses and trainings are funded by HRDF (Human Resources Development Fund Malaysia). Our courses cover the Infocomm, Digital Media, Robotics, Semiconductor, Telecommunication, Life Science and Horticulture industries, as well as Business Administration. Below are some of our popular courses:

  1. Python Programming
  2. R Programming
  3. Tableau
  4. Machine Learning
  5. Raspberry Pi
  6. Arduino
  7. 3D Printing
  8. iOS Apps Development
  9. Android Apps Development
  10. Magento eCommerce
  11. WordPress
  12. Joomla
  13. Search Engine Optimization
  14. Web Design
  15. Google Analytics
  16. Facebook Marketing