Latest revision as of 17:09, 13 July 2022

Machine Learning

Course name: Machine Learning
Code discipline: R-01
Subject area:

Short Description

This course covers the following concepts: machine learning paradigms; machine learning approaches and algorithms.

Prerequisites

Prerequisite subjects

CSE117 — Data Structures and Algorithms: python, numpy, basic object-oriented concepts, memory management
CSE201 — Mathematical Analysis I
CSE203 — Mathematical Analysis II
CSE202 — Analytical Geometry and Linear Algebra I
CSE204 — Analytic Geometry And Linear Algebra II

Prerequisite topics

Course Topics

Course Sections and Topics
Section	Topics within the section
Supervised Learning	Introduction to Machine Learning; Derivatives and Cost Function; Data Pre-processing; Linear Regression; Multiple Linear Regression; Gradient Descent; Polynomial Regression; Splines; Bias-Variance Tradeoff; Difference between classification and regression; Logistic Regression; Naive Bayes; Bayesian Network; KNN; Confusion Matrix; Performance Metrics; Regularization; Hyperplane Based Classification; Perceptron Learning Algorithm; Max-Margin Classification; Support Vector Machines; Slack Variables; Lagrangian Support Vector Machines; Kernel Trick
Decision Trees and Ensemble Methods	Decision Trees; Bagging; Boosting; Random Forest; AdaBoost
Unsupervised Learning	K-means Clustering; K-means++; Hierarchical Clustering; DBSCAN; Mean-shift
Deep Learning	Artificial Neural Networks; Back-propagation; Convolutional Neural Networks; Autoencoder; Variational Autoencoder; Generative Adversarial Networks

Intended Learning Outcomes (ILOs)

What is the main purpose of this course?

There is a growing business need for individuals skilled in artificial intelligence, data analytics, and machine learning. The purpose of this course is therefore to give students an intensive treatment of a cross-section of the key elements of machine learning, with an emphasis on implementing them in modern programming environments and using them to solve real-world data science problems.

ILOs defined at three levels

Level 1: What concepts should a student know/remember/explain?

By the end of the course, the students should be able to ...

Different learning paradigms
A wide variety of learning approaches and algorithms
Various learning settings
Performance metrics
Popular machine learning software tools

Level 2: What basic practical skills should a student be able to perform?

By the end of the course, the students should be able to ...

Differences between learning paradigms
Difference between classification and regression
Concepts of learning theory (bias/variance tradeoffs, large margins, etc.)
Kernel methods
Regularization
Ensemble Learning
Neural networks and deep learning

Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?

By the end of the course, the students should be able to ...

Classification approaches to solve supervised learning problems
Clustering approaches to solve unsupervised learning problems
Ensemble learning to improve a model’s performance
Regularization to improve a model’s generalization
Deep learning algorithms to solve real-world problems

Grading

Course grading range


Grade	Range	Description of performance
A. Excellent	90-100	-
B. Good	75-89	-
C. Satisfactory	60-74	-
D. Poor	0-59	-

Course activities and grading breakdown


Activity Type	Percentage of the overall course grade
Labs/seminar classes	0
Interim performance assessment	40
Exams	60

Recommendations for students on how to succeed in the course

Resources, literature and reference materials

Open access resources

G. James, D. Witten, T. Hastie, and R. Tibshirani. An Introduction to Statistical Learning. Springer, 2013.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer, 2011.
Tom M. Mitchell. Machine Learning. McGraw-Hill.
Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer.

Closed access resources

Software and tools used within the course

Teaching Methodology: Methods, techniques, & activities

Activities and Teaching Methods

Activities within each section
Learning Activities	Section 1	Section 2	Section 3	Section 4
Development of individual parts of software product code	1	1	1	1
Homework and group projects	1	1	1	1
Midterm evaluation	1	1	1	1
Testing (written or computer based)	1	1	1	1
Discussions	1	1	1	1

Formative Assessment and Course Activities

Ongoing performance assessment

Section 1


Activity Type	Content	Is Graded?
Question	Is it true that in simple linear regression $R^{2}$ and the squared correlation between X and Y are identical?	1
Question	What are the two assumptions that the linear regression model makes about the error terms?	1
Question	Fit a regression model to a given data problem, and support your choice of the model.	1
Question	In a list of given tasks, choose which are regression and which are classification tasks.	1
Question	In a given graphical model of binary random variables, how many parameters are needed to define the Conditional Probability Distributions for this Bayes Net?	1
Question	Write the mathematical form of the minimization objective of Rosenblatt’s perceptron learning algorithm for a two-dimensional case.	1
Question	What is the perceptron learning algorithm?	1
Question	Write the mathematical form of its minimization objective for a two-dimensional case.	1
Question	What is a max-margin classifier?	1
Question	Explain the role of slack variables in SVMs.	1
Question	Implement various regression models to solve different regression problems.	0
Question	Describe the difference between different types of regression models, their pros and cons, etc.	0
Question	Implement various classification models to solve different classification problems.	0
Question	Describe the difference between logistic regression and naive Bayes.	0
Question	Implement the perceptron learning algorithm, SVMs, and their variants to solve different classification problems.	0
Question	Solve a given optimization problem using the Lagrange multiplier method.	0
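
Illustrative sketch for the first question in the table above. It fits a simple (one-predictor) least-squares line and compares $R^{2}$ with the squared Pearson correlation between X and Y. The function name and the sample data are made up for illustration and are not part of the course materials.

```python
def r2_and_squared_correlation(xs, ys):
    """Return (R^2 of the least-squares line, squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares (TSS)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Ordinary least-squares fit of y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - RSS / TSS.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - rss / syy

    # Squared Pearson correlation between X and Y.
    corr_squared = sxy ** 2 / (sxx * syy)
    return r_squared, corr_squared


# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r_squared, corr_squared = r2_and_squared_correlation(xs, ys)
print(r_squared, corr_squared)
```

The two printed values agree up to floating-point rounding. This identity holds for simple linear regression with an intercept; it does not carry over unchanged to multiple regression.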

Section 2


Activity Type	Content	Is Graded?
Question	What are the pros and cons of decision trees compared to other classification models?	1
Question	Explain how tree-pruning works.	1
Question	What is the purpose of ensemble learning?	1
Question	What is a bootstrap, and what is its role in ensemble learning?	1
Question	Explain the role of slack variables in SVMs.	1
Question	Implement different variants of decision trees to solve different classification problems.	0
Question	Solve a given classification problem using an ensemble classifier.	0
Question	Implement AdaBoost for a given problem.	0

Section 3


Activity Type	Content	Is Graded?
Question	Which implicit or explicit objective function does K-means implement?	1
Question	Explain the difference between k-means and k-means++.	1
Question	What is single-linkage and what are its pros and cons?	1
Question	Explain how DBSCAN works.	1
Question	Implement different clustering algorithms to solve different clustering problems.	0
Question	Implement Mean-shift for video tracking.	0
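
Illustrative sketch for the K-means objective question in the table above. It assumes the usual Lloyd's-algorithm formulation, in which K-means alternates an assignment step and a centroid update, and tracks the within-cluster sum of squared distances (WCSS). All function names, the data points, and the starting centers are hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    # Assignment step: each point goes to its nearest center.
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


def wcss(points, centers, labels):
    # Within-cluster sum of squared distances for the given assignment.
    return sum(squared_distance(p, centers[k]) for p, k in zip(points, labels))


def update(points, labels, centers):
    # Update step: each center moves to the mean of its assigned points.
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centers.append(old)  # keep the center of an empty cluster
            continue
        new_centers.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        )
    return new_centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (1.0, 0.0)]  # deliberately poor starting centers
history = []
for _ in range(5):
    labels = assign(points, centers)
    history.append(wcss(points, centers, labels))
    centers = update(points, labels, centers)
print(history)
```

The recorded WCSS values never increase from one iteration to the next, which is the sense in which K-means minimizes this objective, although only to a local optimum that depends on the starting centers.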

Section 4


Activity Type	Content	Is Graded?
Question	What is a fully connected feed-forward ANN?	1
Question	Explain different hyperparameters of CNNs.	1
Question	Calculate KL-divergence between two probability distributions.	1
Question	What is a generative model and how is it different from a discriminative model?	1
Question	Implement different types of ANNs to solve different classification problems.	0
Question	Calculate KL-divergence between two probability distributions.	0
Question	Implement different generative models for different problems.	0
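
Illustrative sketch for the KL-divergence question in the table above, assuming two discrete distributions given as probability lists over the same outcomes and using the natural logarithm (result in nats). The function name and example distributions are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q_i) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
print(kl_pq, kl_qp)
```

The two printed values differ, which shows that KL-divergence is not symmetric and therefore is not a distance metric.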

Final assessment

Section 1

What does it mean for the standard least squares coefficient estimates of linear regression to be scale equivariant?
Given a fitted regression model to a dataset, interpret its coefficients.
Explain which regression model would be a better fit to model the relationship between response and predictor in a given dataset.
If the number of training examples goes to infinity, how will it affect the bias and variance of a classification model?
Given a two dimensional classification problem, determine if by using Logistic regression and regularization, a linear boundary can be estimated or not.
Explain which classification model would be a better fit for a given classification problem.
Consider the Leave-one-out-CV error of standard two-class SVM. Argue that under a given value of slack variable, a given mathematical statement is either correct or incorrect.
How does the choice of slack variable affect the bias-variance tradeoff in SVM?
Explain which kernel would be a better fit to use in an SVM for a given dataset.

Section 2

When a decision tree is grown to full depth, how does it affect the tree’s bias and variance, and its response to noisy data?
Argue if an ensemble model would be a better choice for a given classification problem or not.
Given a particular iteration of boosting and other important information, calculate the weights of the Adaboost classifier.
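
Illustrative sketch for the AdaBoost question above. It assumes the common discrete AdaBoost formulation with labels in {-1, +1}, where the classifier weight is alpha = 0.5 * ln((1 - err) / err) and sample weights are rescaled by exp(-alpha * y * h(x)). Other variants, for example without the factor 0.5, give different numbers, and the course may use one of them. The function name and the example data are hypothetical.

```python
import math


def adaboost_round(weights, labels, predictions):
    """One round of discrete AdaBoost; labels and predictions are in {-1, +1}."""
    # Weighted error of the weak learner (weights are assumed to sum to 1).
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < err < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")

    # Weight of this weak learner in the final vote.
    alpha = 0.5 * math.log((1.0 - err) / err)

    # Raise the weight of misclassified samples, lower the rest, then normalize.
    updated = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, predictions)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Five samples with uniform weights; the weak learner gets the last one wrong.
weights = [0.2] * 5
labels = [1, 1, -1, -1, 1]
predictions = [1, 1, -1, -1, -1]
alpha, new_weights = adaboost_round(weights, labels, predictions)
print(alpha, new_weights)
```

Under these assumptions the single misclassified sample ends up carrying half of the total weight, so the next weak learner is pushed to classify it correctly.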

Section 3

K-Means does not explicitly use a fitness function. What are the characteristics of the solutions that K-Means finds? Which fitness function does it implicitly minimize?
Suppose we clustered a set of N data points using two different specified clustering algorithms. In both cases we obtained 5 clusters and in both cases the centers of the clusters are exactly the same. Can 3 points that are assigned to different clusters in one method be assigned to the same cluster in the other method?
What are the characteristics of noise points in DBSCAN?

Section 4

Explain what ReLU is, what its variants are, and what their pros and cons are.
Calculate the number of parameters to be learned during training in a CNN, given all important information.
Explain how a VAE can be used as a generative model.
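
Illustrative sketch for the CNN parameter-count question above. It assumes standard 2D convolutions and fully connected layers with one bias per output unit, and that pooling layers have no learned parameters. The small network below is hypothetical: a 28x28x1 input, two 5x5 convolutions without padding, each followed by 2x2 pooling, then a 10-way dense layer.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    # Each filter has kernel_h * kernel_w * in_channels weights plus an optional bias.
    per_filter = kernel_h * kernel_w * in_channels + (1 if bias else 0)
    return per_filter * out_channels


def dense_params(in_features, out_features, bias=True):
    # Each output unit has in_features weights plus an optional bias.
    return (in_features + (1 if bias else 0)) * out_features


# Spatial sizes: 28 -> 24 (conv 5x5) -> 12 (pool) -> 8 (conv 5x5) -> 4 (pool).
layer_counts = {
    "conv1": conv2d_params(5, 5, 1, 6),
    "conv2": conv2d_params(5, 5, 6, 16),
    "fc": dense_params(4 * 4 * 16, 10),
}
total_params = sum(layer_counts.values())
print(layer_counts, total_params)
```

The count for a convolutional layer does not depend on the spatial size of its input, only on kernel size and channel counts; the input size matters only for the first dense layer after flattening.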

The retake exam

Section 1

Section 2

Section 3

Section 4

Difference between revisions of "MSc: Machine Learning"

@@ Line 1: / Line 1: @@
 = Machine Learning =
+* '''Course name''': Machine Learning
+* '''Code discipline''': R-01
+* '''Subject area''':
+== Short Description ==
-<span id="C:CourseTitle" label="C:CourseTitle">[C:CourseTitle]</span>
+This course covers the following concepts: Machine learning paradigms; Machine Learning approaches, and algorithms.
-* <span>'''Course name:'''</span> Machine Learning
-* <span>'''Course number:'''</span> R-01
 == Prerequisites ==
-* [https://eduwiki.innopolis.university/index.php/BSc:_Data_Structures_Algorithms CSE117] — Data Structures and Algorithms: python, numpy, basic object-oriented concepts, memory management
-* [https://eduwiki.innopolis.university/index.php/BSc:_Mathematical_Analysis_I CSE201] — Mathematical Analysis I
-* [https://eduwiki.innopolis.university/index.php/BSc:_Mathematical_Analysis_II CSE203] — Mathematical Analysis II
-* [https://eduwiki.innopolis.university/index.php/BSc:_Analytic_Geometry_And_Linear_Algebra_I1 CSE202] — Analytical Geometry and Linear Algebra I
-* [https://eduwiki.innopolis.university/index.php/BSc:_Analytic_Geometry_And_Linear_Algebra_I1 CSE204] — Analytic Geometry And Linear Algebra II
-== Course characteristics ==
-=== Key concepts of the class ===
+=== Prerequisite subjects ===
+* CSE117 — Data Structures and Algorithms: python, numpy, basic object-oriented concepts, memory management
+* CSE201 — Mathematical Analysis I
+* CSE203 — Mathematical Analysis II
+* CSE202 — Analytical Geometry and Linear Algebra I
+* CSE204 — Analytic Geometry And Linear Algebra II
+=== Prerequisite topics ===
-* Machine learning paradigms
-* Machine Learning approaches, and algorithms
-=== What is the purpose of this course? ===
+== Course Topics ==
+{| class="wikitable"
+|+ Course Sections and Topics
+|-
+! Section !! Topics within the section
+|-
+| Supervised Learning ||
+# Introduction to Machine Learning
+# Derivatives and Cost Function
+# Data Pre-processing
+# Linear Regression
+# Multiple Linear Regression
+# Gradient Descent
+# Polynomial Regression
+# Splines
+# Bias-varaince Tradeoff
+# Difference between classification and regression
+# Logistic Regression
+# Naive Bayes
+# Bayesian Network
+# KNN
+# Confusion Metrics
+# Performance Metrics
+# Regularization
+# Hyperplane Based Classification
+# Perceptron Learning Algorithm
+# Max-Margin Classification
+# Support Vector Machines
+# Slack Variables
+# Lagrangian Support Vector Machines
+# Kernel Trick
+|-
+| Decision Trees and Ensemble Methods ||
+# Decision Trees
+# Bagging
+# Boosting
+# Random Forest
+# Adaboost
+|-
+| Unsupervised Learning ||
+# K-means Clustering
+# K-means++
+# Hierarchical Clustering
+# DBSCAN
+# Mean-shift
+|-
+| Deep Learning ||
+# Artificial Neural Networks
+# Back-propagation
+# Convolutional Neural Networks
+# Autoencoder
+# Variatonal Autoencoder
+# Generative Adversairal Networks
+|}
+== Intended Learning Outcomes (ILOs) ==
+=== What is the main purpose of this course? ===
 There is a growing business need of individuals skilled in artificial intelligence, data analytics, and machine learning. Therefore, the purpose of this course is to provide students with an intensive treatment of a cross-section of the key elements of machine learning, with an emphasis on implementing them in modern programming environments, and using them to solve real-world data science problems.
+=== ILOs defined at three levels ===
+==== Level 1: What concepts should a student know/remember/explain? ====
+By the end of the course, the students should be able to ...
-== Course Objectives Based on Bloom’s Taxonomy ==
-==== - What should a student remember at the end of the course? ====
-By the end of the course, the students should be able to recognize and define
 * Different learning paradigms
 * A wide variety of learning approaches and algorithms
@@ Line 36: / Line 89: @@
 * Popular machine learning software tools
-==== What should a student be able to understand at the end of the course? ====
+==== Level 2: What basic practical skills should a student be able to perform? ====
+By the end of the course, the students should be able to ...
-By the end of the course, the students should be able to describe and explain (with examples)
 * Difference between different learning paradigms
 * Difference between classification and regression
@@ Line 48: / Line 99: @@
 * Neural or Deep Learning
-==== What should a student be able to apply at the end of the course? ====
+==== Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios? ====
+By the end of the course, the students should be able to ...
-By the end of the course, the students should be able to apply
 * Classification approaches to solve supervised learning problems
 * Clustering approaches to solve unsupervised learning problems
 * Ensemble learning to improve a model’s performance
 * Regularization to improve a model’s generalization
 * Deep learning algorithms to solve real-world problems
+== Grading ==
-=== Course evaluation ===
+=== Course grading range ===
+{| class="wikitable"
-{|
+|+
-|+ Course grade breakdown
-!
-!
-!align="center"| '''Proposed points'''
 |-
+! Grade !! Range !! Description of performance
-| Labs/seminar classes
-| 20
-|align="center"| 0
 |-
+| A. Excellent || 90-100 || -
-| Interim performance assessment
-| 30
-|align="center"| 40
 |-
+| B. Good || 75-89 || -
-| Exams
-| 50
+|-
+| C. Satisfactory || 60-74 || -
-|align="center"| 60
+|-
+| D. Poor || 0-59 || -
 |}
+=== Course activities and grading breakdown ===
-If necessary, please indicate freely your course’s features in terms of students’ performance assessment: None
+{| class="wikitable"
+|+
-=== Grades range ===
-{|
-|+ Course grading range
-!
-!
-!align="center"| '''Proposed range'''
 |-
+! Activity Type !! Percentage of the overall course grade
-| A. Excellent
-| 90-100
-|align="center"|
 |-
+| Labs/seminar classes || 0
-| B. Good
-| 75-89
-|align="center"|
 |-
+| Interim performance assessment || 40
-| C. Satisfactory
-| 60-74
-|align="center"|
 |-
-| D. Poor
+| Exams || 60
-| 0-59
-|align="center"|
 |}
+=== Recommendations for students on how to succeed in the course ===
-If necessary, please indicate freely your course’s grading features: The semester starts with the default range as proposed in the Table [[#tab:MLCourseGradingRange|[tab:MLCourseGradingRange]]], but it may change slightly (usually reduced) depending on how the semester progresses.
-=== Resources and reference material ===
+== Resources, literature and reference materials ==
-* T. Hastie, R. Tibshirani, D. Witten and G. James. ''<span>An Introduction to Statistical Learning. Springer 2013.</span>''
-* T. Hastie, R. Tibshirani, and J. Friedman. ''<span>The Elements of Statistical Learning. Springer 2011.</span>''
-* Tom M Mitchel. <span>''Machine Learning, McGraw Hill''</span>
-* Christopher M. Bishop. <span>''Pattern Recognition and Machine Learning, Springer''</span>
-== Course Sections ==
+=== Open access resources ===
+* T. Hastie, R. Tibshirani, D. Witten and G. James. An Introduction to Statistical Learning. Springer 2013.
+* T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer 2011.
+* Tom M Mitchel. Machine Learning, McGraw Hill
+* Christopher M. Bishop. Pattern Recognition and Machine Learning, Springer
+=== Closed access resources ===
-The main sections of the course and approximate hour distribution between them is as follows:
-{|
+=== Software and tools used within the course ===
-|+ Course Sections
-!align="center"| '''Section'''
+= Teaching Methodology: Methods, techniques, & activities =
-! '''Section Title'''
-!align="center"| '''Teaching Hours'''
+== Activities and Teaching Methods ==
+{| class="wikitable"
+|+ Activities within each section
+|-
+! Learning Activities !! Section 1 !! Section 2 !! Section 3 !! Section 4
 |-
+| Development of individual parts of software product code || 1 || 1 || 1 || 1
-|align="center"| 1
-| Supervised Learning
-|align="center"| 24
 |-
+| Homework and group projects || 1 || 1 || 1 || 1
-|align="center"| 2
-| Decision Trees and Ensemble Learning
-|align="center"| 8
 |-
+| Midterm evaluation || 1 || 1 || 1 || 1
-|align="center"| 3
-| Unsupervised Learning
-|align="center"| 8
 |-
+| Testing (written or computer based) || 1 || 1 || 1 || 1
-|align="center"| 4
+|-
-| Deep Learning
+| Discussions || 1 || 1 || 1 || 1
-|align="center"| 12
 |}
+== Formative Assessment and Course Activities ==
-=== Section 1 ===
+=== Ongoing performance assessment ===
-==== Section title: ====
+==== Section 1 ====
+{| class="wikitable"
+|+
-Supervised Learning
+|-
+! Activity Type !! Content !! Is Graded?
-==== Topics covered in this section: ====
+|-
+| Question || Is it true that in simple linear regression <math>{\textstyle R^{2}}</math>  and the squared correlation between X and Y are identical? || 1
-* Introduction to Machine Learning
+|-
-* Derivatives and Cost Function
+| Question || What are the two assumptions that the Linear regression model makes about the Error Terms? || 1
-* Data Pre-processing
+|-
-* Linear Regression
+| Question || Fit a regression model to a given data problem, and support your choice of the model. || 1
-* Multiple Linear Regression
+|-
-* Gradient Descent
+| Question || In a list of given tasks, choose which are regression and which are classification tasks. || 1
-* Polynomial Regression
+|-
-* Splines
+| Question || In a given graphical model of binary random variables, how many parameters are needed to define the Conditional Probability Distributions for this Bayes Net? || 1
-* Bias-varaince Tradeoff
+|-
-* Difference between classification and regression
+| Question || Write the mathematical form of the minimization objective of Rosenblatt’s perceptron learning algorithm for a two-dimensional case. || 1
-* Logistic Regression
+|-
-* Naive Bayes
+| Question || What is perceptron learning algorithm? || 1
-* Bayesian Network
+|-
-* KNN
+| Question || Write the mathematical form of its minimization objective for a two-dimensional case. || 1
-* Confusion Metrics
+|-
-* Performance Metrics
+| Question || What is a max-margin classifier? || 1
-* Regularization
+|-
-* Hyperplane Based Classification
+| Question || Explain the role of slack variable in SVM. || 1
-* Perceptron Learning Algorithm
+|-
-* Max-Margin Classification
+| Question || How to implement various regression models to solve different regression problems? || 0
-* Support Vector Machines
+|-
-* Slack Variables
+| Question || Describe the difference between different types of regression models, their pros and cons, etc. || 0
-* Lagrangian Support Vector Machines
+|-
-* Kernel Trick
+| Question || Implement various classification models to solve different classification problems. || 0
+|-
-==== What forms of evaluation were used to test students’ performance in this section? ====
+| Question || Describe the difference between Logistic regression and naive bayes. || 0
+|-
-<div class="tabular">
+| Question || Implement perceptron learning algorithm, SVMs, and its variants to solve different classification problems. || 0
+|-
-<span>|a|c|</span> &amp; '''Yes/No'''<br />
+| Question || Solve a given optimization problem using the Lagrange multiplier method. || 0
-Development of individual parts of software product code &amp; 1<br />
+|}
-Homework and group projects &amp; 1<br />
+==== Section 2 ====
-Midterm evaluation &amp; 1<br />
+{| class="wikitable"
-Testing (written or computer based) &amp; 1<br />
+|+
-Reports &amp; 0<br />
+|-
-Essays &amp; 0<br />
+! Activity Type !! Content !! Is Graded?
-Oral polls &amp; 0<br />
+|-
-Discussions &amp; 1<br />
+| Question || What are pros and cons of decision trees over other classification models? || 1
+|-
+| Question || Explain how tree-pruning works. || 1
+|-
-</div>
+| Question || What is the purpose of ensemble learning? || 1
-==== Typical questions for ongoing performance evaluation within this section ====
+|-
+| Question || What is a bootstrap, and what is its role in Ensemble learning? || 1
-# Is it true that in simple linear regression <math display="inline">R^2</math> and the squared correlation between X and Y are identical?
+|-
-# What are the two assumptions that the Linear regression model makes about the '''Error Terms'''?
+| Question || Explain the role of slack variable in SVM. || 1
-# Fit a regression model to a given data problem, and support your choice of the model.
+|-
-# In a list of given tasks, choose which are regression and which are classification tasks.
+| Question || Implement different variants of decision trees to solve different classification problems. || 0
-# In a given graphical model of binary random variables, how many parameters are needed to define the Conditional Probability Distributions for this Bayes Net?
+|-
-# Write the mathematical form of the minimization objective of Rosenblatt’s perceptron learning algorithm for a two-dimensional case.
+| Question || Solve a given classification problem problem using an ensemble classifier. || 0
-# What is perceptron learning algorithm?
+|-
-# Write the mathematical form of its minimization objective for a two-dimensional case.
+| Question || Implement Adaboost for a given problem. || 0
-# What is a max-margin classifier?
+|}
-# Explain the role of slack variable in SVM.
+==== Section 3 ====
+{| class="wikitable"
-==== Typical questions for seminar classes (labs) within this section ====
+|+
+|-
-# How to implement various regression models to solve different regression problems?
+! Activity Type !! Content !! Is Graded?
-# Describe the difference between different types of regression models, their pros and cons, etc.
+|-
-# Implement various classification models to solve different classification problems.
+| Question || Which implicit or explicit objective function does K-means implement? || 1
-# Describe the difference between Logistic regression and naive bayes.
+|-
-# Implement perceptron learning algorithm, SVMs, and its variants to solve different classification problems.
+| Question || Explain the difference between k-means and k-means++. || 1
-# Solve a given optimization problem using the Lagrange multiplier method.
+|-
+| Question || Whaat is single-linkage and what are its pros and cons? || 1
-==== Test questions for final assessment in this section ====
+|-
+| Question || Explain how DBSCAN works. || 1
-# What does it mean for the standard least squares coefficient estimates of linear regression to be ''scale equivariant''?
+|-
+| Question || Implement different clustering algorithms to solve to solve different clustering problems. || 0
+|-
+| Question || Implement Mean-shift for video tracking || 0
+|}
+==== Section 4 ====
+{| class="wikitable"
+|+
+|-
+! Activity Type !! Content !! Is Graded?
+|-
+| Question || What is a fully connected feed-forward ANN? || 1
+|-
+| Question || Explain different hyperparameters of CNNs. || 1
+|-
+| Question || Calculate KL-divergence between two probability distributions. || 1
+|-
+| Question || What is a generative model and how is it different from a discriminative model? || 1
+|-
+| Question || Implement different types of ANNs to solve to solve different classification problems. || 0
+|-
+| Question || Calculate KL-divergence between two probability distributions. || 0
+|-
+| Question || Implement different generative models for different problems. || 0
+|}
+=== Final assessment ===
+'''Section 1'''
+# What does it mean for the standard least squares coefficient estimates of linear regression to be scale equivariant?
 # Given a fitted regression model to a dataset, interpret its coefficients.
 # Explain which regression model would be a better fit to model the relationship between response and predictor in a given data.
@@ Line 225: / Line 283: @@
 # How does the choice of slack variable affect the bias-variance tradeoff in SVM?
 # Explain which Kernel would be a better fit to be used in SVM for a given data.
+'''Section 2'''
-=== Section 2 ===
-==== Section title: ====
-Decision Trees and Ensemble Methods
-==== Topics covered in this section: ====
-* Decision Trees
-* Bagging
-* Boosting
-* Random Forest
-* Adaboost
-==== What forms of evaluation were used to test students’ performance in this section? ====
-<div class="tabular">
-<span>|a|c|</span> &amp; '''Yes/No'''<br />
-Development of individual parts of software product code &amp; 1<br />
-Homework and group projects &amp; 1<br />
-Midterm evaluation &amp; 1<br />
-Testing (written or computer based) &amp; 1<br />
-Reports &amp; 0<br />
-Essays &amp; 0<br />
-Oral polls &amp; 0<br />
-Discussions &amp; 1<br />
-</div>
-==== Typical questions for ongoing performance evaluation within this section ====
-# What are pros and cons of decision trees over other classification models?
-# Explain how tree-pruning works.
-# What is the purpose of ensemble learning?
-# What is a bootstrap, and what is its role in Ensemble learning?
-# Explain the role of slack variable in SVM.
-==== Typical questions for seminar classes (labs) within this section ====
-# Implement different variants of decision trees to solve different classification problems.
-# Solve a given classification problem problem using an ensemble classifier.
-# Implement Adaboost for a given problem.
-==== Test questions for final assessment in this section ====
 # When a decision tree is grown to full depth, how does it affect tree’s bias and variance, and its response to noisy data?
 # Argue if an ensemble model would be a better choice for a given classification problem or not.
 # Given a particular iteration of boosting and other important information, calculate the weights of the Adaboost classifier.
+'''Section 3'''
-=== Section 3 ===
-==== Section title: ====
-Unsupervised Learning
-==== Topics covered in this section: ====
-* K-means Clustering
-* K-means++
-* Hierarchical Clustering
-* DBSCAN
-* Mean-shift
-==== What forms of evaluation were used to test students’ performance in this section? ====
-<div class="tabular">
-<span>|a|c|</span> &amp; '''Yes/No'''<br />
-Development of individual parts of software product code &amp; 1<br />
-Homework and group projects &amp; 1<br />
-Midterm evaluation &amp; 1<br />
-Testing (written or computer based) &amp; 1<br />
-Reports &amp; 0<br />
-Essays &amp; 0<br />
-Oral polls &amp; 0<br />
-Discussions &amp; 1<br />
-</div>
-==== Typical questions for ongoing performance evaluation within this section ====
-# Which implicit or explicit objective function does K-means implement?
-# Explain the difference between k-means and k-means++.
-# Whaat is single-linkage and what are its pros and cons?
-# Explain how DBSCAN works.
-==== Typical questions for seminar classes (labs) within this section ====
-# Implement different clustering algorithms to solve to solve different clustering problems.
-# Implement Mean-shift for video tracking
-==== Test questions for final assessment in this section ====
 # K-Means does not explicitly use a fitness function. What are the characteristics of the solutions that K-Means finds? Which fitness function does it implicitly minimize?
 # Suppose we clustered a set of N data points using two different specified clustering algorithms. In both cases we obtained 5 clusters and in both cases the centers of the clusters are exactly the same. Can 3 points that are assigned to different clusters in one method be assigned to the same cluster in the other method?
 # What are the characterics of noise points in DBSCAN?
+'''Section 4'''
-=== Section 4 ===
-==== Section title: ====
-Deep Learning
-==== Topics covered in this section: ====
-* Artificial Neural Networks
-* Back-propagation
-* Convolutional Neural Networks
-* Autoencoder
-* Variatonal Autoencoder
-* Generative Adversairal Networks
-==== What forms of evaluation were used to test students’ performance in this section? ====
-<div class="tabular">
-<span>|a|c|</span> &amp; '''Yes/No'''<br />
-Development of individual parts of software product code &amp; 1<br />
-Homework and group projects &amp; 1<br />
-Midterm evaluation &amp; 1<br />
-Testing (written or computer based) &amp; 1<br />
-Reports &amp; 0<br />
-Essays &amp; 0<br />
-Oral polls &amp; 0<br />
-Discussions &amp; 1<br />
-</div>
-==== Typical questions for ongoing performance evaluation within this section ====
-# What is a fully connected feed-forward ANN?
-# Explain different hyperparameters of CNNs.
-# Calculate KL-divergence between two probability distributions.
-# What is a generative model and how is it different from a discriminative model?
-==== Typical questions for seminar classes (labs) within this section ====
-# Implement different types of ANNs to solve to solve different classification problems.
-# Calculate KL-divergence between two probability distributions.
-# Implement different generative models for different problems.
-==== Test questions for final assessment in this section ====
 # Explain what is ReLU, what are its different variants, and what are their pros and cons?
 # Calculate the number of parameters to be learned during training in a CNN, given all important information.
 # Explain how a VAE can be used as a generative model.
-== Exams and retake planning ==
+=== The retake exam ===
+'''Section 1'''
-=== Exam ===
-Exams will be paper-based and will be conducted in a form of problem solving, where the problems will be similar to those mentioned above and will based on the contents taught in lecture slides, lecture discussions (including white-board materials), lab materials, reading materials (including the text books), etc. Students will be given 1-3 hours to complete the exam.
-=== Retake 1 ===
+'''Section 2'''
-First retake will be conducted in the same form as the final exam. The weight of the retake exam will be 5% larger than the passing threshold of the course.
+'''Section 3'''
-=== Retake 2 ===
+'''Section 4'''
-Second retake will be conducted in the same form as the final exam. The weight of the retake exam will be 5% larger than the passing threshold of the course.