DATA SCIENCE WITH PYTHON
Introduction to Data Science with Python
This section introduces students to the Python statistical computing environment.
1
8 0
Viewing time: 01:11 hrs 0 MCQ
0 Case Study
An understanding of analytics and data mining concepts
Familiarity with some common terms in analytics
Familiarity with pandas, NumPy, Matplotlib and scikit-learn
An understanding of basic Python data structures such as lists, tuples
and strings
Be able to use lambda functions, map() and filter()
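As an illustration of the last outcome, the following is a minimal sketch of lambda functions used with map() and filter(). The list of scores and the pass mark of 60 are hypothetical values chosen only for this example.

```python
# Hypothetical sample data, used only for illustration.
scores = [52, 87, 91, 45, 78]

# filter() keeps the items for which the lambda returns True.
passing = list(filter(lambda s: s >= 60, scores))

# map() applies the lambda to every remaining item.
as_fractions = list(map(lambda s: s / 100, passing))

print(passing)
print(as_fractions)
```

Both map() and filter() return lazy iterators in Python 3, which is why the results are wrapped in list() before printing.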
Scientific distributions used in Python for Data Science:
This class introduces students to the NumPy, pandas and Matplotlib libraries.
16
Viewing time: 3:19 hrs 1 Non-Graded assignment
Understanding how to work with NumPy arrays, including tasks such as slicing
Introduction to pandas DataFrame and Series objects
Learning how to read different flat files into Python using pandas
Learning how to work with Web APIs, HTML and XML files
Learning how to run SQL queries in Python
Be able to use basic charting functions in pandas and Matplotlib
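One way to picture running SQL queries from Python is the sketch below. It uses the standard-library sqlite3 module with an in-memory database; the "sales" table and its rows are hypothetical and exist only for this example, and the course itself may use a different database or driver.

```python
import sqlite3

# In-memory database with a hypothetical "sales" table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# Aggregate with plain SQL and fetch the result as a list of tuples.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(rows)
```

With pandas installed, the same query could instead be passed to pandas.read_sql_query together with an open connection to get the result as a DataFrame.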
Machine Learning
This class introduces the basic thought process behind machine learning.
3
5
Viewing time: 0:50 hrs No Case Study
Appreciation of the basic thought process behind machine learning tasks
Introduction to common machine learning tasks
Be able to build common machine learning models using the scikit-learn API
An understanding of common classification error metrics
Understand the use of a confusion matrix in the context of a classification task
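To make the confusion matrix and the related error metrics concrete, here is a small sketch in plain Python. The true and predicted labels are hypothetical, and the counts are computed by hand rather than with scikit-learn so that each cell of the matrix is visible.

```python
from collections import Counter

# Hypothetical true and predicted labels for a binary task (1 = positive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each (true, predicted) pair; these are the four confusion-matrix cells.
counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # true positives
fn = counts[(1, 0)]  # false negatives
fp = counts[(0, 1)]  # false positives
tn = counts[(0, 0)]  # true negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found

print(tp, fn, fp, tn)
print(accuracy, precision, recall)
```

In practice the same table is usually obtained from sklearn.metrics.confusion_matrix; the manual version is shown only to expose how each metric follows from the four counts.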
Practical Applications of Machine Learning
This class demonstrates how to build machine learning models.
4
10 No Case Study
Viewing time: 1:44 hrs
Understanding how to pre-process data to build a machine learning model
Be able to use one hot encoding to pre-process categorical variables
Be able to use pipelines to automate steps in model building
Understand the use of ensembles such as Random Forests
Understand the notions of in-sample and out-of-sample error through the bias-variance trade-off
Be able to interpret ROC curves and the AUC metric to compare the performance of different
classifiers
Be able to handle multiclass problems using SVMs and Random Forests
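As a rough illustration of comparing classifiers by AUC, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The helper auc_score, the labels and the two sets of model scores are all hypothetical; this is not scikit-learn's implementation.

```python
def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted scores from two classifiers.
y_true = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
model_b = [0.5, 0.6, 0.4, 0.7, 0.3, 0.45]

auc_a = auc_score(y_true, model_a)
auc_b = auc_score(y_true, model_b)

print(round(auc_a, 3), round(auc_b, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so on this toy data the first set of scores separates the classes better than the second.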