Data - Data Scientist
A data scientist analyzes and interprets complex data to aid in strategic decision-making and problem-solving.

Flexible 100% online training
Start your new career at any time! Available part-time? No problem, study at your own pace.

Professional projects
You will develop your professional skills by working on concrete projects inspired by real business situations.

Personalized support
Benefit from weekly mentoring sessions with a business expert.

Earn certificates and diplomas
Earning certificates and diplomas can enhance your career, broaden your horizons, and increase your personal satisfaction.
Objectives of Data Scientist training
Operational objective:
Understand the role of the Data Scientist.
Educational objectives:
More concretely, at the end of this Data Scientist Fundamentals training you will have acquired the knowledge and skills necessary to:
- Discover the job of Data Scientist and the main families of problems
- Know how to model a Data Science problem
- Create your first variables
- Build your Data Scientist toolbox
- Participate in a first competition
Who is this training for?
Audience:
This training is aimed at analysts, statisticians, architects, and developers.
Prerequisites:
To follow this course in the best possible conditions, you need some basic knowledge of programming or scripting. Some prior knowledge of statistics is a plus.

A pedagogy based on practice

- Acquire essential skills by validating professional projects.
- Progress with the help of a professional expert.
- Gain real know-how as well as a portfolio to demonstrate it.
Data Scientist Course Content:

Introduction to Big Data:
What is Big Data?
The Big Data Technology Ecosystem














Introduction to Data Science, the job of Data Scientist:
The vocabulary of a Data Science problem
From statistical analysis to machine learning
Overview of the possibilities of machine learning














Modeling a problem:
Input / output of a machine learning problem
“OCR” Practical Work:
How to model the optical character recognition problem.














Identify machine learning algorithm families:
Supervised learning
Unsupervised learning
Classification / regression














Under the hood of algorithms: linear regression:
Some reminders: hypothesis function, convex function, optimization
Construction of the cost function
Minimization method: gradient descent
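
As an illustration of the topics in this module, here is a minimal sketch of gradient descent applied to a one-variable linear regression. The function name, learning rate, step count, and toy data are assumptions chosen for this example and are not taken from the course material.

```python
def gradient_descent(xs, ys, learning_rate=0.05, n_steps=2000):
    """Fit y ~ theta0 + theta1 * x by minimizing the mean squared error cost.

    Hypothetical helper for illustration only.
    """
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(n_steps):
        # Errors of the hypothesis function h(x) = theta0 + theta1 * x
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost J = (1 / 2m) * sum(errors^2)
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Step against the gradient to reduce the cost
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (an assumption for this sketch)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
```

Because the cost function is convex, the parameters should approach the intercept and slope used to generate the toy data, provided the learning rate is small enough.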














Under the hood of algorithms: logistic regression:
Decision boundary
Construction of a convex cost function for classification
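
To make these two notions concrete, here is a minimal sketch of the log-loss (cross-entropy) cost commonly used for logistic regression, together with the decision boundary of a one-variable model. The function names and the sample values are assumptions for this example, not the course's own code.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(theta0, theta1, xs, labels):
    """Average cross-entropy cost, which is convex in theta0 and theta1."""
    total = 0.0
    for x, y in zip(xs, labels):
        p = sigmoid(theta0 + theta1 * x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)


def decision_boundary(theta0, theta1):
    """Value of x where the predicted probability equals 0.5,
    that is, where theta0 + theta1 * x = 0."""
    return -theta0 / theta1


# With all parameters at zero, every prediction is 0.5
initial_cost = log_loss(0.0, 0.0, [1.0, 2.0], [0, 1])
boundary = decision_boundary(-3.0, 2.0)
```

This sketch does not guard against predicted probabilities of exactly 0 or 1, which a production implementation would need to clip before taking the logarithm.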














The Data Scientist's Toolbox:
Introduction to tools
Introduction to Python, Pandas and Scikit-learn
Practical case n°1: “Predicting Titanic survivors”
Statement of the problem
First manipulation in Python














The pitfalls of machine learning
Overfitting
Bias vs. variance
Regularization: Ridge and Lasso regression














Data Cleaning
Data types: categorical, continuous, ordered, temporal
Detecting statistical outliers
Strategy for missing values
Practical work:
“Fill in missing values”
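
One possible strategy for this kind of exercise is median imputation, sketched below with the Python standard library. The helper name and the sample ages are hypothetical, and the course may use a different strategy or a library such as Pandas.

```python
from statistics import median


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column of ages with two missing entries
ages = [22.0, None, 38.0, 26.0, None, 35.0]
filled = fill_missing_with_median(ages)
```

The median is often preferred over the mean for this purpose because it is less sensitive to outliers.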














Feature Engineering
Strategies for non-continuous variables
Detect and create discriminant variables
Practical case n°2: “Predicting Titanic survivors”
Identifying and creating the right variables
Creation of a first model
Submission on Kaggle














Data visualization
Visualization to understand data: histogram, scatter plot, etc.
Visualization to understand algorithms: train/test loss, feature importance, etc.














Introduction to ensemble methods
The basic model: the decision tree, its advantages and its limits
Presentation of the different ensemble strategies: bagging, boosting, etc.
Practical work “Return to the Titanic”:
Using an ensemble method based on the previous model














Unsupervised learning
The major classes of unsupervised algorithms: clustering, PCA, etc.
Practical work “Detecting anomalies in betting”:
How does an unsupervised algorithm detect fraud in betting?
Individual, dedicated mentoring
- Benefit from weekly individual sessions with an expert mentor in the field
- Progress quickly in your projects thanks to your mentor's skill at sharing know-how




The Empire Training community
- Count on a close-knit community of students ready to help you 24/7.
Online pre-registration
Please fill out the form
How does an Empire Training course work?
From the chosen training to their entry into their new career, our students recount each stage of their experience and the support they received.

