The training course “Becoming a Data Scientist in 3 days” is designed to train future analysts, or even “data scientists”, i.e. specialists in analyzing large volumes of data. It introduces trainees to a world of algorithms that is accessible to everyone and can quickly be put into practice thanks to the Spark platform.
It is aimed at people with a technical background (computer scientists, mathematicians, physicists, economists, or any other field) who have at least some development experience in any programming language and are comfortable with mathematics at the final-year scientific secondary-school level (French Terminale S: vectors, matrices, probabilities, etc.).
With very few prerequisites, it is the ideal course for approaching Big Data with ease and seeing its enormous power in action.
Day 1: learning Scala and Spark
- Scala data structures and statements, with hands-on lab
- Spark programming constructs (functions, RDDs, DataFrames), with hands-on lab
- Spark and Hadoop: how to use them together
Days 2 and 3: Data Science with Spark MLlib, TensorFlow, PyTorch
- Basic statistics, with hands-on lab
- Clustering, with hands-on lab
- Classification and regression, with hands-on lab
- Prediction, with hands-on lab
- Collaborative filtering (recommendations), with hands-on lab
- Pattern mining and feature extraction, with hands-on lab
- Frequent item-sets
- Dimensionality reduction, with hands-on lab
- Deep Learning (RNN, CNN, LSTM), with hands-on lab
- Evaluating the performance of a model, with hands-on lab
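
To give a flavor of the last topic, here is a minimal sketch of how a binary classifier can be scored from its predictions. It is written in plain Python purely for illustration and is not taken from the course material or from Spark MLlib; the function name binary_metrics and the sample labels are assumptions made up for this example.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one (label, prediction) pair is required")

    # Count the four cells of the confusion matrix.
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (true_pos + true_neg) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels and predictions, invented for the example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when one class is rare, which is why precision, recall and F1 are usually reported alongside it.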