Unlock the Data – The world of Data Science
About Course
Data is the new oil and Data Science is its combustion engine! While there are many definitions as to what data science really is, we have found it best to describe it as a field revolving around 5 data-related operations.
Description
- Data Science is the art of generating insight, knowledge and predictions by processing data gathered about a system or a process.
- Computational Science is the art of developing validated (simulation) models in order to gain a better understanding of a phenomenon (a system or a process).
- Computational sciences focus on developing causal models using latent patterns in the observed data, rather than only extracting patterns or knowledge from data by statistical methods.
Objective of the Program:
To produce professionals with deep knowledge and innovative analytical and computational research
skills to handle problems in a variety of domains including governance, finance, security, transportation,
healthcare, energy management, agriculture, population studies, weather prediction, economics, social
sciences, predictive maintenance, structural health monitoring, smart manufacturing and computational
structural biology.
What is Data Science?
Data is the new oil and Data Science is its combustion engine! While there are many definitions as to
what data science really is, we have found it best to describe it as a field revolving around 5 data-related
operations.
Collection | Storing | Processing | Describing | Modelling
- Collection
Data Collection is the process of gathering data (numerical, text, video, audio, etc.). It is shaped by two
major factors: the question that the data scientist needs to answer and the environment that the data
scientist is working in.
- Storing
Storing data involves maintaining the collected data for use throughout the data science pipeline. Structured
data is typically stored in relational databases and aggregated in data warehouses. With the advent of
Big Data, data lakes are now used to store multimodal structured and unstructured data.
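As a rough illustration of storing structured data in a relational database, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names and sensor readings are hypothetical and chosen only for the example; a production pipeline would use whatever database and schema the project calls for.

```python
import sqlite3

# Hypothetical sensor readings; names and values are illustrative only.
readings = [
    ("sensor_a", 21.5),
    ("sensor_a", 22.1),
    ("sensor_b", 19.8),
]

# An in-memory SQLite database stands in for a relational store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (sensor TEXT NOT NULL, temperature REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO readings (sensor, temperature) VALUES (?, ?)", readings
)
conn.commit()

# Aggregate per sensor, similar in spirit to a summary table in a warehouse.
rows = conn.execute(
    "SELECT sensor, COUNT(*), AVG(temperature) "
    "FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
conn.close()

for sensor, count, avg_temp in rows:
    print(f"{sensor}: {count} readings, average {avg_temp:.2f}")
```

Parameterized queries (the `?` placeholders) keep the data separate from the SQL text, which is the usual way to insert values safely.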
- Processing
Data Processing consists of 3 main sub-processes: Data Wrangling (extraction, transformation, and
loading of the data), Data Cleaning (handling missing values, outliers, etc.) and Data Scaling,
Normalization and Standardization.
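The cleaning and scaling steps can be sketched with the Python standard library alone. This is a minimal example under stated assumptions: the measurements are made up, missing values are filled with the median, outliers are clipped with the 1.5 x IQR rule, "normalization" is taken to mean min-max scaling to [0, 1], and "standardization" is taken to mean z-scores. Other choices are equally valid, and the helper names are hypothetical.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical raw measurements; None marks a missing value.
raw = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0]

def fill_missing(values):
    """Replace missing entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

def min_max_scale(values):
    """Normalization (assumed meaning): rescale values to [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def standardize(values):
    """Standardization (assumed meaning): zero mean, unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

filled = fill_missing(raw)
clipped = clip_outliers(filled)
normalized = min_max_scale(clipped)
standardized = standardize(clipped)
```

Clipping is only one way to treat outliers; dropping them or leaving them in place may be more appropriate depending on the question being asked.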
- Describing
Data Description has two aspects. Data Visualization involves representing processed data using graphs,
charts, diagrams, and other visualizations. Data Summarization involves calculating summary
statistics such as the mean, median, mode, standard deviation, and variance.
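To make the summary statistics concrete, here is a short sketch using Python's built-in statistics module on a made-up list of values. The population forms of variance and standard deviation are used here as an assumption; for a sample drawn from a larger population, `variance` and `stdev` would be the usual choice. The text bar chart at the end is only a stand-in for a real plotting library.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev, pvariance

# Hypothetical processed data.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Data summarization: the statistics named in the text above.
summary = {
    "mean": mean(values),
    "median": median(values),
    "mode": mode(values),
    "variance": pvariance(values),  # population variance
    "std_dev": pstdev(values),      # population standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value}")

# Data visualization: a very small text bar chart of value frequencies.
counts = Counter(values)
for value in sorted(counts):
    print(f"{value:>5}: {'#' * counts[value]}")
```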
- Modelling
Statistical Modelling of data involves modelling the underlying data distribution and relations in the data
and then making inferences from the model. Algorithmic modelling involves using large volumes of
data and optimization techniques to best estimate the distribution and relations of the data, e.g. Machine
Learning and Deep Learning.
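The two modelling styles can be contrasted in a small sketch. The data (hours studied against exam score) is invented for illustration and is far smaller than anything a real algorithmic model would be trained on. The first part assumes the scores are normally distributed and infers a probability from the fitted distribution; the second fits a straight line by gradient descent, an optimization technique commonly used in machine learning. The function name `fit_line` and its default settings are hypothetical.

```python
from statistics import NormalDist

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [55.0, 60.0, 65.0, 70.0, 75.0]

# Statistical modelling: assume a normal distribution, estimate its
# parameters from the data, then make an inference from the fitted model.
score_model = NormalDist.from_samples(scores)
p_above_70 = 1.0 - score_model.cdf(70.0)

# Algorithmic modelling: fit y = w*x + b by gradient descent on the
# mean squared error.
def fit_line(xs, ys, lr=0.05, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = fit_line(hours, scores)
predicted_for_6_hours = w * 6.0 + b

print(f"P(score > 70) under the normal model: {p_above_70:.3f}")
print(f"Fitted line: score = {w:.2f} * hours + {b:.2f}")
```

The learning rate and step count were picked so that this tiny problem converges; they would need tuning for other data.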
Expected Graduate Attributes:
- Skill set to clean, process, analyze, manage and handle security and privacy aspects of structured and unstructured data.
- Ability to identify, design and apply appropriate pattern recognition and data mining methods for generating relevant insight from data.
- Knowledge and capability to develop and apply machine learning techniques for data driven modelling.
- Ability to develop models and simulation schemes based upon domain knowledge in chosen domains and possible combination with data driven models.
- Capability to follow a uniquely interdisciplinary approach for solving problems, using knowledge of mathematics, statistics, computing and one or more selected domains among physics, chemistry, biology and engineering sciences.
- Skill to use and design appropriate visualization techniques for representation and presentation of insights and solutions.
- Ability to innovate and contribute towards next generation data driven technology development.
- High quality technical communication skills.
- Appreciation of and adherence to norms of professional ethics.
- Ability to plan and manage technical projects.
Learning Outcome:
- Strong understanding of fundamentals of Data Mining, Machine Learning, Modelling & Simulation, Optimization and Numerical Techniques.
- Knowledge about basics and use of visual analytics.
- Skill set to develop applications using Big Data.
- Advanced analytical and data driven modelling and simulation skills to address technological challenges in one or more specialized knowledge domains like physics, chemistry, biology and engineering sciences.
- Demonstrate skills to communicate scientific ideas and/or application systems.
- Acquire project management skills.
What Will I Learn?
- 1. Participants will be able to gain an overview of Data Science, Machine Learning, Deep Learning and Artificial Intelligence.
- 2. Participants will be able to code using Python.
- 3. Participants will be able to understand Data Science concepts like data analysis, data interpretation and data visualization.
- 4. Participants will be able to understand the basics of Exploratory Data Analysis (EDA).
Topics for this course
Module 1: Python for Data Science
Basic building blocks
Conditional statements
Loop statements
String
List
Dictionary
Tuple
File handling
Function
Module 2: Data Science Library and Data Visualization Using Python
Module 3: Maths Behind Data Science: Descriptive Statistics
Module 4: Maths Behind Data Science: Inferential Statistics
Module 5: Hypothesis Testing
Module 6: Exploratory Data Analysis / Data Cleaning Techniques / Data Preparation Techniques
Industry Relevant Projects:
About the instructor
3 Courses
5 students