Olivier Doubiani

Data Scientist / Data Engineer consultant

39 years old
Driving License
RUEIL MALMAISON (92500) France
Consultant, open to opportunities
Engineer in Medical Informatics and Big Data.
Multidisciplinary skills: Statistics, Computer Science, Biology, and Data Science / Data Engineering

MSc in Data Engineering

DSTI Institute

October 2018 to June 2019
Data Science
Applied Mathematics for Data Science (25 hrs)
– Calculus – Differentiation – Trigonometry & Complex Numbers
Foundations of Statistical Analysis & Machine Learning (25 hrs)
– Probability and distributions – Descriptive Statistics – Introduction to Inference
Big Data Processing (25 hrs)
Statistical Analysis of Massive and High-dimensional Data (25 hrs)
Deep Learning on GPU with PyTorch (25 hrs)
Recurrent Neural Networks – LSTM – Residual Networks
IT Fundamentals
Computer Systems (25 hrs)
Computer Architecture – Operating Systems & Virtualisation – Networking – Storage
Cloud Computing – Amazon AWS (50 hrs)
Preparation for the AWS Certified Solutions Architect – Associate certification – Comparative overview of Microsoft Azure
Cloud Computing – Microsoft Azure (25 hrs)
Comparative overview with Amazon AWS on core services (Networking, Compute, Storage, Database) & focus on Azure “Data Managed Services” (chiefly Azure Machine Learning Studio, Cognitive Services, Data Lake, Databricks, Stream Analytics)
Semantic Web technologies for Data Science developments (25 hrs)
Representing and querying web-rich data (RDF, SPARQL), Introducing Semantics in Data (RDFS, Ontologies), Tracing and following data history (VOiD, DCAT, PROV-O)
Data management
Advanced SQL for Data Wrangling (25 hrs)
Complex joins & subqueries, stored procedures & triggers

Relational Databases Management Systems (25 hrs)
Using MySQL & Microsoft SQL Server: stand-alone and cluster deployments, integration in software, ETL, persistence frameworks

NoSQL databases (25 hrs)
Key-value store, Document store, Graph database, hybrid approaches with Apache Cassandra

The Hadoop & Spark Ecosystem (50 hrs)
HDFS, scheduling & resource management – Workflow management & ETL, Dataflow management, Scalable Enterprise Service Bus – Real-time processing, Machine Learning, Data Exploration & Visualisation

Data Pipeline (25 hrs)
XML dataflow, DTD & Schemas, XSL Transformations, JSON & Transformations – Cloud-based solutions with AWS Glue & AWS Kinesis – Open-source solutions with Apache Kafka & Beam