MSc: Machine Learning

From IU

Machine Learning

  • Course name: Machine Learning
  • Code discipline: R-01
  • Subject area:

Short Description

This course covers the following concepts: machine learning paradigms, approaches, and algorithms.

Prerequisites

Prerequisite subjects

  • CSE117 — Data Structures and Algorithms: Python, NumPy, basic object-oriented concepts, memory management
  • CSE201 — Mathematical Analysis I
  • CSE203 — Mathematical Analysis II
  • CSE202 — Analytical Geometry and Linear Algebra I
  • CSE204 — Analytical Geometry and Linear Algebra II

Prerequisite topics

Course Topics

Course Sections and Topics
Section Topics within the section
Supervised Learning
  1. Introduction to Machine Learning
  2. Derivatives and Cost Function
  3. Data Pre-processing
  4. Linear Regression
  5. Multiple Linear Regression
  6. Gradient Descent
  7. Polynomial Regression
  8. Splines
  9. Bias-variance Tradeoff
  10. Difference between classification and regression
  11. Logistic Regression
  12. Naive Bayes
  13. Bayesian Network
  14. KNN
  15. Confusion Matrix
  16. Performance Metrics
  17. Regularization
  18. Hyperplane Based Classification
  19. Perceptron Learning Algorithm
  20. Max-Margin Classification
  21. Support Vector Machines
  22. Slack Variables
  23. Lagrangian Support Vector Machines
  24. Kernel Trick
Decision Trees and Ensemble Methods
  1. Decision Trees
  2. Bagging
  3. Boosting
  4. Random Forest
  5. AdaBoost
Unsupervised Learning
  1. K-means Clustering
  2. K-means++
  3. Hierarchical Clustering
  4. DBSCAN
  5. Mean-shift
Deep Learning
  1. Artificial Neural Networks
  2. Back-propagation
  3. Convolutional Neural Networks
  4. Autoencoder
  5. Variational Autoencoder
  6. Generative Adversarial Networks

Intended Learning Outcomes (ILOs)

What is the main purpose of this course?

There is a growing business need for individuals skilled in artificial intelligence, data analytics, and machine learning. The purpose of this course is therefore to give students an intensive treatment of a cross-section of the key elements of machine learning, with an emphasis on implementing them in modern programming environments and using them to solve real-world data science problems.

ILOs defined at three levels

Level 1: What concepts should a student know/remember/explain?

By the end of the course, the students should know and be able to explain:

  • Different learning paradigms
  • A wide variety of learning approaches and algorithms
  • Various learning settings
  • Performance metrics
  • Popular machine learning software tools

Level 2: What basic practical skills should a student be able to perform?

By the end of the course, the students should be able to explain and work with:

  • Difference between different learning paradigms
  • Difference between classification and regression
  • Concept of learning theory (bias/variance tradeoffs and large margins etc.)
  • Kernel methods
  • Regularization
  • Ensemble Learning
  • Neural or Deep Learning

Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?

By the end of the course, the students should be able to apply:

  • Classification approaches to solve supervised learning problems
  • Clustering approaches to solve unsupervised learning problems
  • Ensemble learning to improve a model’s performance
  • Regularization to improve a model’s generalization
  • Deep learning algorithms to solve real-world problems

Grading

Course grading range

Grade            Range    Description of performance
A. Excellent     90-100   -
B. Good          75-89    -
C. Satisfactory  60-74    -
D. Poor          0-59     -

Course activities and grading breakdown

Activity Type                    Percentage of the overall course grade
Labs/seminar classes             0
Interim performance assessment   40
Exams                            60

Recommendations for students on how to succeed in the course

Resources, literature and reference materials

Open access resources

  • G. James, D. Witten, T. Hastie, and R. Tibshirani. An Introduction to Statistical Learning. Springer, 2013.
  • T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer, 2011.
  • Tom M. Mitchell. Machine Learning. McGraw Hill.
  • Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer.

Closed access resources

Software and tools used within the course

Teaching Methodology: Methods, techniques, & activities

Activities and Teaching Methods

Activities within each section
Learning Activities                                        Section 1  Section 2  Section 3  Section 4
Development of individual parts of software product code   1          1          1          1
Homework and group projects                                1          1          1          1
Midterm evaluation                                         1          1          1          1
Testing (written or computer based)                        1          1          1          1
Discussions                                                1          1          1          1

Formative Assessment and Course Activities

Ongoing performance assessment

Section 1

Activity Type Content Is Graded?
Question Is it true that in simple linear regression the R² statistic and the squared correlation between X and Y are identical? 1
Question What are the two assumptions that the linear regression model makes about the error terms? 1
Question Fit a regression model to a given data problem, and support your choice of the model. 1
Question In a list of given tasks, choose which are regression and which are classification tasks. 1
Question In a given graphical model of binary random variables, how many parameters are needed to define the Conditional Probability Distributions for this Bayes Net? 1
Question Write the mathematical form of the minimization objective of Rosenblatt’s perceptron learning algorithm for a two-dimensional case. 1
Question What is the perceptron learning algorithm? 1
Question Write the mathematical form of its minimization objective for a two-dimensional case. 1
Question What is a max-margin classifier? 1
Question Explain the role of slack variables in SVM. 1
Question How to implement various regression models to solve different regression problems? 0
Question Describe the difference between different types of regression models, their pros and cons, etc. 0
Question Implement various classification models to solve different classification problems. 0
Question Describe the difference between logistic regression and Naive Bayes. 0
Question Implement the perceptron learning algorithm, SVMs, and their variants to solve different classification problems. 0
Question Solve a given optimization problem using the Lagrange multiplier method. 0
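The first question in the table above appears to concern the R² statistic of a simple linear regression. Assuming that reading, the claim can be checked numerically. The sketch below fits a least-squares line to made-up data and compares R² = 1 - RSS/TSS with the squared Pearson correlation. The data values and function names are illustrative assumptions, not course material.

```python
import math

def simple_linear_fit(x, y):
    """Least-squares intercept and slope for y ~ b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def r_squared(x, y):
    """R^2 = 1 - RSS / TSS for the fitted simple linear regression."""
    b0, b1 = simple_linear_fit(x, y)
    my = sum(y) / len(y)
    rss = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - my) ** 2 for b in y)
    return 1 - rss / tss

def pearson_r(x, y):
    """Sample Pearson correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared(x, y), pearson_r(x, y) ** 2)
```

The two printed numbers should agree up to floating-point error. The identity holds only for a single predictor with an intercept; it is not a statement about multiple regression.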

Section 2

Activity Type Content Is Graded?
Question What are the pros and cons of decision trees compared with other classification models? 1
Question Explain how tree-pruning works. 1
Question What is the purpose of ensemble learning? 1
Question What is a bootstrap, and what is its role in Ensemble learning? 1
Question Explain the role of slack variables in SVM. 1
Question Implement different variants of decision trees to solve different classification problems. 0
Question Solve a given classification problem using an ensemble classifier. 0
Question Implement AdaBoost for a given problem. 0
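For the bootstrap question above, a minimal sketch of drawing one bootstrap sample may help. It assumes the usual definition (n draws with replacement from n observations) and uses a hypothetical dataset of row indices. The function name and the seed are arbitrary choices for illustration.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly, with replacement."""
    return [rng.choice(data) for _ in data]

rng = random.Random(0)      # fixed seed, chosen arbitrarily for repeatability
data = list(range(1000))    # hypothetical dataset: 1000 row indices
sample = bootstrap_sample(data, rng)

# Fraction of original rows that never appear in the sample ("out-of-bag").
oob_fraction = 1 - len(set(sample)) / len(data)
print(round(oob_fraction, 3))
```

For large n the out-of-bag fraction tends to (1 - 1/n)^n ≈ 1/e ≈ 0.368. Bagging fits one model per bootstrap sample and aggregates their predictions, and the left-out rows can serve as a validation set for that model.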

Section 3

Activity Type Content Is Graded?
Question Which implicit or explicit objective function does K-means implement? 1
Question Explain the difference between k-means and k-means++. 1
Question What is single-linkage and what are its pros and cons? 1
Question Explain how DBSCAN works. 1
Question Implement different clustering algorithms to solve different clustering problems. 0
Question Implement Mean-shift for video tracking 0
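As a companion to the K-means questions above, the sketch below runs a few Lloyd iterations on made-up 2-D points and records the within-cluster sum of squares (WCSS) after each one. The points, initial centers, and function names are assumptions for illustration; this is not a reference implementation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Mean of each cluster; an empty cluster keeps its old center."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers

def wcss(points, labels, centers):
    """Within-cluster sum of squared distances to the assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

# Hypothetical data: two well-separated groups, deliberately poor initial centers.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centers = [(0.0, 0.0), (1.0, 0.0)]
history = []
for _ in range(5):
    labels = assign(points, centers)
    centers = update(points, labels, centers)
    history.append(wcss(points, labels, centers))
print(history)
```

Neither the assignment step nor the mean update can increase WCSS, so the recorded values never go up. Lloyd's algorithm therefore converges to a local minimum of this objective, not necessarily the global one.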

Section 4

Activity Type Content Is Graded?
Question What is a fully connected feed-forward ANN? 1
Question Explain different hyperparameters of CNNs. 1
Question Calculate KL-divergence between two probability distributions. 1
Question What is a generative model and how is it different from a discriminative model? 1
Question Implement different types of ANNs to solve different classification problems. 0
Question Calculate KL-divergence between two probability distributions. 0
Question Implement different generative models for different problems. 0
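The KL-divergence question above can be illustrated with a short sketch for discrete distributions. It assumes both distributions are given as equal-length lists of probabilities over the same outcomes and uses natural logarithms, so the result is in nats. The function name and example values are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue            # the term 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

# Hypothetical example distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q), kl_divergence(q, p))
```

The two printed values differ, which shows that KL divergence is not symmetric in its arguments. It is zero when the two distributions are identical.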

Final assessment

Section 1

  1. What does it mean for the standard least squares coefficient estimates of linear regression to be scale equivariant?
  2. Given a fitted regression model to a dataset, interpret its coefficients.
  3. Explain which regression model would be a better fit to model the relationship between response and predictor in a given dataset.
  4. If the number of training examples goes to infinity, how will it affect the bias and variance of a classification model?
  5. Given a two-dimensional classification problem, determine whether a linear boundary can be estimated using logistic regression with regularization.
  6. Explain which classification model would be a better fit for a given classification problem.
  7. Consider the Leave-one-out-CV error of standard two-class SVM. Argue that under a given value of slack variable, a given mathematical statement is either correct or incorrect.
  8. How does the choice of slack variable affect the bias-variance tradeoff in SVM?
  9. Explain which kernel would be a better fit to be used in SVM for a given dataset.
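Question 1 in the list above concerns scale equivariance of least squares. A small numerical check, assuming a single predictor and made-up data: multiplying the predictor by a constant c divides the fitted slope by c and leaves the intercept and the fitted values unchanged. All names and values are illustrative assumptions.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y ~ b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
c = 10.0  # arbitrary rescaling factor, e.g. a change of units

b0, b1 = fit_line(x, y)
b0_scaled, b1_scaled = fit_line([c * v for v in x], y)
print(b1, b1_scaled * c)   # slope is divided by c after rescaling
print(b0, b0_scaled)       # intercept is unchanged
```

Because the product of coefficient and predictor is unchanged, predictions do not depend on the units of the predictor. Penalized fits such as ridge regression do not share this property, which is why predictors are usually standardized before regularization.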

Section 2

  1. When a decision tree is grown to full depth, how does it affect the tree’s bias and variance, and its response to noisy data?
  2. Argue if an ensemble model would be a better choice for a given classification problem or not.
  3. Given a particular iteration of boosting and other important information, calculate the weights of the AdaBoost classifier.
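The calculation in question 3 can be sketched for one boosting round. The code assumes the common binary AdaBoost convention alpha = 0.5 · ln((1 - eps) / eps), where eps is the weighted error of the current weak learner. Some textbooks omit the factor 0.5, so check which convention the course uses. The function name and the example weights are hypothetical.

```python
import math

def adaboost_round(weights, misclassified):
    """One AdaBoost round: return (alpha, updated weights normalized to sum to 1)."""
    eps = sum(w for w, m in zip(weights, misclassified) if m)  # weighted error
    if not 0 < eps < 1:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Increase weights of misclassified points, decrease the others.
    raw = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, misclassified)]
    z = sum(raw)
    return alpha, [w / z for w in raw]

# Hypothetical round: five equally weighted points, the third is misclassified.
weights = [0.2] * 5
misclassified = [False, False, True, False, False]
alpha, new_weights = adaboost_round(weights, misclassified)
print(alpha, new_weights)
```

Here eps = 0.2, so alpha = 0.5 · ln 4 = ln 2. After normalization the misclassified point carries weight 0.5 and each correctly classified point carries 0.125, which forces the next weak learner to focus on the earlier mistake.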

Section 3

  1. K-Means does not explicitly use a fitness function. What are the characteristics of the solutions that K-Means finds? Which fitness function does it implicitly minimize?
  2. Suppose we clustered a set of N data points using two different specified clustering algorithms. In both cases we obtained 5 clusters and in both cases the centers of the clusters are exactly the same. Can 3 points that are assigned to different clusters in one method be assigned to the same cluster in the other method?
  3. What are the characteristics of noise points in DBSCAN?
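To make question 3 concrete, the sketch below labels each point as core, border, or noise. It assumes the usual convention that a point's eps-neighborhood includes the point itself. The points, `eps`, and `min_pts` values are made up, and the function covers only the labeling step, not the full clustering.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' under DBSCAN's definitions."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    core = {i for i in range(len(points)) if len(neighbors(i)) >= min_pts}
    labels = []
    for i in range(len(points)):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors(i)):
            labels.append("border")   # not dense itself, but reachable from a core point
        else:
            labels.append("noise")    # neither core nor within eps of any core point
    return labels

# Hypothetical data: a dense run of points, one fringe point, one isolated point.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.8, 0.0), (5.0, 5.0)]
labels = classify_points(points, eps=1.0, min_pts=3)
print(labels)  # ['core', 'core', 'core', 'border', 'noise']
```

A noise point therefore has fewer than `min_pts` neighbors within `eps` and no core point among them, so it is not assigned to any cluster.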

Section 4

  1. Explain what ReLU is, what its variants are, and what their pros and cons are.
  2. Calculate the number of parameters to be learned during training in a CNN, given all important information.
  3. Explain how a VAE can be used as a generative model.
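For question 2, a minimal sketch of parameter counting, assuming standard 2-D convolution layers (one kernel per input/output channel pair plus one bias per output channel) and fully connected layers. The small network below is hypothetical and exists only to show the arithmetic. Pooling layers have no learnable parameters.

```python
def conv2d_params(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Learnable parameters of a standard 2-D convolution layer."""
    return (kernel_h * kernel_w * in_channels + (1 if bias else 0)) * out_channels

def dense_params(in_features, out_features, bias=True):
    """Learnable parameters of a fully connected layer."""
    return (in_features + (1 if bias else 0)) * out_features

# Hypothetical network on a 28x28x1 input:
#   conv 3x3, 8 filters, stride 1, no padding -> 26x26x8
#   max pool 2x2                              -> 13x13x8 (no parameters)
#   dense to 10 outputs
conv = conv2d_params(in_channels=1, out_channels=8, kernel_h=3, kernel_w=3)
dense = dense_params(in_features=13 * 13 * 8, out_features=10)
print(conv, dense, conv + dense)  # 80 13530 13610
```

The convolution count does not depend on the input's spatial size, only on kernel size and channel counts. The dense layer dominates the total because it depends on the flattened feature-map size.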

The retake exam

Section 1

Section 2

Section 3

Section 4