Apache Spark培训

Apache Spark培训

Apache Spark培训


Spark for Developers

Richard is very calm and methodical, with an analytical insight - exactly the qualities needed to present this sort of course

Kieran Mac Kenna - BAE Systems Applied Intelligence

Apache Spark大纲

代码 名字 期限 概览
sparkdev Spark for Developers 21小时 OBJECTIVE: This course will introduce Apache Spark. The students will learn how  Spark fits  into the Big Data ecosystem, and how to use Spark for data analysis.  The course covers Spark shell for interactive data analysis, Spark internals, Spark APIs, Spark SQL, Spark streaming, and machine learning and graphX. AUDIENCE : Developers / Data Analysts Scala primer A quick introduction to Scala Labs : Getting know Scala Spark Basics Background and history Spark and Hadoop Spark concepts and architecture Spark eco system (core, spark sql, mlib, streaming) Labs : Installing and running Spark First Look at Spark Running Spark in local mode Spark web UI Spark shell Analyzing dataset – part 1 Inspecting RDDs Labs: Spark shell exploration RDDs RDDs concepts Partitions RDD Operations / transformations RDD types Key-Value pair RDDs MapReduce on RDD Caching and persistence Labs : creating & inspecting RDDs;   Caching RDDs Spark API programming Introduction to Spark API / RDD API Submitting the first program to Spark Debugging / logging Configuration properties Labs : Programming in Spark API, Submitting jobs Spark SQL SQL support in Spark Dataframes Defining tables and importing datasets Querying data frames using SQL Storage formats : JSON / Parquet Labs : Creating and querying data frames; evaluating data formats MLlib MLlib intro MLlib algorithms Labs : Writing MLib applications GraphX GraphX library overview GraphX APIs Labs : Processing graph data using Spark Spark Streaming Streaming overview Evaluating Streaming platforms Streaming operations Sliding window operations Labs : Writing spark streaming applications Spark and Hadoop Hadoop Intro (HDFS / YARN) Hadoop + Spark architecture Running Spark on Hadoop YARN Processing HDFS files using Spark Spark Performance and Tuning Broadcast variables Accumulators Memory management & caching Spark Operations Deploying Spark in production Sample deployment templates Configurations Monitoring Troubleshooting


Apache Spark,培训,课程,培训课程, Apache Spark远程教育,Apache Spark课程,企业Apache Spark培训,小组Apache Spark课程,学Apache Spark班,学习Apache Spark ,Apache Spark辅导班,一对一Apache Spark课程,Apache Spark周末培训,Apache Spark老师,Apache Spark讲师,Apache Spark私教,Apache Spark培训师,Apache Spark晚上培训,Apache Sparks辅导,Apache Spark训练,短期Apache Spark培训