Packt Master Big Data Ingestion and Analytics with Flume Sqoop Hive and Spark

English | Size:
Category: Tutorial
Complete course on Sqoop, Flume, and Hive: Great for CCA175 and Hortonworks Spark Certification preparation
Learn
Hadoop Distributed File System (HDFS) and commands
Lifecycle of a Sqoop command
Sqoop import command to migrate data from MySQL to HDFS
Sqoop import command to migrate data from MySQL to Hive
Understand split-by and boundary queries
Use incremental mode to migrate data from MySQL to HDFS
Use Sqoop export to migrate data from HDFS to MySQL (see the sample commands after this list)
Spark DataFrames: working with different file formats and compression
Spark SQL
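
To give a sense of the Sqoop topics above, here is a minimal sketch of an import using a split column and incremental mode, plus a matching export. The JDBC URL, credentials, table names, and HDFS paths are placeholders for illustration, not values taken from the course:

    # Incremental import from MySQL into HDFS, parallelised on the primary key
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/retail_db \
      --username sqoop_user --password sqoop_pass \
      --table orders \
      --target-dir /user/hive/warehouse/orders \
      --split-by order_id \
      --incremental append --check-column order_id --last-value 0

    # Export processed results from HDFS back into a MySQL table
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/retail_db \
      --username sqoop_user --password sqoop_pass \
      --table order_summary \
      --export-dir /user/hive/warehouse/order_summary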

About
In this course, you will start by learning about the Hadoop Distributed File System (HDFS) and the most common Hadoop commands required to work with HDFS. Then, you'll be introduced to Sqoop import, through which you will gain knowledge of the lifecycle of a Sqoop command and how to use the import command to migrate data from MySQL to HDFS and from MySQL to Hive, and much more.
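
By way of illustration, the everyday HDFS commands covered at this stage are of the kind sketched below; the paths and file names are placeholders, not examples from the course:

    # List, upload, download, and inspect files in HDFS
    hdfs dfs -ls /user/training
    hdfs dfs -put orders.csv /user/training/data/
    hdfs dfs -get /user/training/data/orders.csv ./orders_copy.csv
    hdfs dfs -cat /user/training/data/orders.csv | head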

In addition, you will learn about Sqoop export to migrate data effectively, and about Apache Flume to ingest data. The Apache Hive section introduces Hive and covers external and managed tables, working with different file formats, and Parquet and Avro, among other topics. In the final sections, you will learn about Spark DataFrames, Spark SQL, and much more.
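
As a rough sketch of the Spark material, reading a CSV file into a DataFrame, writing it out as compressed Parquet, and querying it with Spark SQL might look like the PySpark snippet below; the file paths, column names, and view name are assumptions made purely for illustration:

    # PySpark sketch: DataFrames, file formats and compression, and Spark SQL
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ingest-demo").getOrCreate()

    # Read a CSV file into a DataFrame, inferring the schema from the data
    orders = spark.read.option("header", "true").option("inferSchema", "true").csv("/data/orders.csv")

    # Write the same data out as Snappy-compressed Parquet
    orders.write.option("compression", "snappy").parquet("/data/orders_parquet")

    # Register a temporary view and query it with Spark SQL
    orders.createOrReplaceTempView("orders")
    spark.sql("SELECT status, COUNT(*) AS cnt FROM orders GROUP BY status").show()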

All the code and supporting files are available at: https://github.com/PacktPublishing/Master-Big-Data-Ingestion-and-Analytics-with-Flume-Sqoop-Hive-and-Spark

Features

Learn Sqoop, Flume, and Hive and prepare successfully for CCA175 and the Hortonworks Spark Certification
Learn about the Hadoop Distributed File System (HDFS) and the Hadoop commands needed to work effectively with HDFS

DOWNLOAD