
Module M6
The “Real-Time Processing with Spark Streaming” course teaches developers and architects how to process data in real time with Spark Streaming.
This course is aimed at developers with a solid knowledge of Java who are comfortable with Java development tools such as Eclipse, IntelliJ, and Maven.
Real-Time Processing with Spark Streaming
- Quick review of the Hadoop ecosystem
- Quick review of Scala and Spark
- Spark Streaming concepts: StreamingContext, DStreams
- DStream inputs, with hands-on lab
- Transformations on DStreams, with hands-on lab
- Output operations on DStreams, with hands-on lab
- Shared variables: accumulators and broadcast variables
- DataFrames and SQL queries on DataFrames, with hands-on lab
- MLlib operations on DStreams, with hands-on lab
- Checkpointing
- Deploying, monitoring, and optimizing a Spark Streaming application
Prerequisites: Module M1 and a strong knowledge of Java and its associated development environments