Difference between revisions of "IU:TestPage"


Revision as of 12:52, 20 April 2022

Practical Machine Learning and Deep Learning

  • Course name: Practical Machine Learning and Deep Learning
  • Code discipline:
  • Subject area: Practical aspects of deep learning (DL); practical applications of DL in Natural Language Processing, Computer Vision, and generative modelling.

Short Description

Prerequisites

Prerequisite subjects

  • CSE202 — Analytical Geometry and Linear Algebra I / Calculus: Manifolds
  • CSE203 — Mathematical Analysis II: Basics of optimisation
  • CSE201 — Mathematical Analysis I: Integration and differentiation
  • CSE103 — Theoretical Computer Science: Graph theory basics, spectral decomposition
  • CSE206 — Probability And Statistics: Multivariate normal distribution
  • CSE504 — Digital Signal Processing: Convolution, cross-correlation

Prerequisite topics

Course Topics

Course Sections and Topics
Section Topics within the section
Review. CNNs and RNNs
  1. Image processing, FFNs, CNNs
  2. Training Deep NNs
  3. RNNs, LSTM, GRU, Embeddings
  4. Bidirectional RNNs
  5. Seq2seq
  6. Encoder-Decoder Networks
  7. Attention
  8. Memory Networks
Team Data Science Processes
  1. Team Data Science Processes
  2. Team Data Science Roles
  3. Team Data Science Tools (MLFlow, KubeFlow)
  4. CRISP-DM
  5. Productionizing ML systems
VAEs, GANs
  1. Autoencoders
  2. Variational Autoencoders
  3. GANs, DCGAN

Intended Learning Outcomes (ILOs)

What is the main purpose of this course?

The course is about the practical aspects of deep learning. In addition to frontal lectures, flipped classes and student project presentations will be organized. During lab sessions the working language is Python. The primary framework for deep learning is PyTorch; TensorFlow and Keras may also be used, and use of Docker is highly appreciated.

ILOs defined at three levels

Level 1: What concepts should a student know/remember/explain?

By the end of the course, the students should be able to ...

  • apply deep learning methods to effectively solve practical (real-world) problems;
  • work in a data science team;
  • understand the principles and lifecycle of data science projects.

Level 2: What basic practical skills should a student be able to perform?

By the end of the course, the students should be able to ...

  • understand modern deep NN architectures;
  • compare modern deep NN architectures;
  • create a prototype of a data-driven product.

Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?

By the end of the course, the students should be able to ...

  • apply techniques for efficient training of deep NNs;
  • apply methods for data science team organisation;
  • apply deep NNs in NLP and computer vision.

Grading

Course grading range

Grade Range Description of performance
A. Excellent 90-100 -
B. Good 75-89 -
C. Satisfactory 60-74 -
D. Poor 0-59 -
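The grade ranges above can be captured in a small helper; a minimal sketch (the function name `letter_grade` is illustrative, not part of the course materials):

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 overall course score to the letter grades defined above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "A"  # Excellent: 90-100
    if score >= 75:
        return "B"  # Good: 75-89
    if score >= 60:
        return "C"  # Satisfactory: 60-74
    return "D"      # Poor: 0-59
```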

Course activities and grading breakdown

Activity Type Percentage of the overall course grade
Labs/seminar classes 20
Interim performance assessment 30
Exams 50

Recommendations for students on how to succeed in the course

Resources, literature and reference materials

Open access resources

  • Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
  • Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, 2017.
  • Osinga, Douwe. Deep Learning Cookbook: Practical Recipes to Get Started Quickly. O’Reilly Media, 2018.

Closed access resources

Software and tools used within the course

Teaching Methodology: Methods, techniques, & activities

Activities and Teaching Methods

Activities within each section
Learning Activities Section 1 Section 2 Section 3
Development of individual parts of software product code 1 1 1
Homework and group projects 1 1 1
Midterm evaluation 1 1 1
Testing (written or computer based) 1 1 1
Discussions 1 1 1

Formative Assessment and Course Activities

Ongoing performance assessment

Section 1

Activity Type Content Is Graded?
Question Suppose you use Batch Gradient Descent and you plot the validation error at every epoch. If you notice that the validation error consistently goes up, what is likely going on? How can you fix this? 1
Question Is it a good idea to stop Mini-batch Gradient Descent immediately when the validation error goes up? 1
Question List the optimizers you know (other than SGD) and explain one of them. 1
Question Describe Xavier (or Glorot) initialization. Why do you need it? 1
Question Name advantages of the ELU activation function over ReLU. 0
Question Can you name the main innovations in AlexNet, compared to LeNet-5? What about the main innovations in GoogLeNet and ResNet? 0
Question What is the difference between LSTM and GRU cells? 0
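Two of the ungraded questions above (ELU vs. ReLU, Glorot initialization) reduce to short formulas. A minimal pure-Python sketch, with illustrative function names (not course-provided code):

```python
import math
import random

def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: identity for x > 0, smooth exponential for x <= 0.

    Unlike ReLU, it has a nonzero gradient for negative inputs,
    which helps avoid "dead" units.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def glorot_uniform(fan_in: int, fan_out: int) -> float:
    """Draw one weight from the Glorot (Xavier) uniform range
    U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)),
    chosen to keep activation variance roughly stable across layers.
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return random.uniform(-limit, limit)
```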

Section 2

Activity Type Content Is Graded?
Question What is CRISP-DM? 1
Question What is TDSP? 1
Question How to use MLflow? 1
Question What is TensorBoard? 1
Question How to apply Kubeflow in practice? 1
Question Explain issues in distributed learning of deep NNs. 0
Question How do you organize your data science project? 0
Question Recall a checklist for organization of a typical data science project. 0

Section 3

Activity Type Content Is Graded?
Question What is an Autoencoder? Can you list the structure and types of Autoencoders? 1
Question Can you describe ways to train Stacked AEs? 1
Question What is a denoising AE? Can you describe what sparsity loss is and why it can be useful? 1
Question Can you make a distinction between AE and VAE? 1
Question If an autoencoder perfectly reconstructs the inputs, is it necessarily a good autoencoder? How can you evaluate the performance of an autoencoder? 0
Question How do you tie weights in a stacked autoencoder? What is the point of doing so? 0
Question What is the main risk of an overcomplete autoencoder? 0
Question How is the loss function for a VAE defined? What is the ELBO? 0
Question Can you list the structure and types of a GAN? 0
Question How would you train a GAN? 0
Question How would you estimate the quality of a GAN? 0
Question Can you describe the cost function of the discriminator? 0
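The ELBO question above has a closed-form ingredient worth spelling out: for a diagonal-Gaussian encoder and a standard normal prior, the KL regularization term of the VAE loss is analytic. A minimal pure-Python sketch (function name illustrative):

```python
import math

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    This is the regularization term of the VAE loss; the other term of
    the ELBO is the expected reconstruction log-likelihood. Formula:
    0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2).
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mu, log_var)
    )
```

With `mu = 0` and `log_var = 0` the posterior equals the prior, so the KL term vanishes.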

Final assessment

Section 1

  1. Explain what teacher forcing is.
  2. Why do people use encoder–decoder RNNs rather than plain sequence-to-sequence RNNs for automatic translation?
  3. How could you combine a convolutional neural network with an RNN to classify videos?
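The teacher-forcing question can be grounded in a toy decoding loop. This is a sketch only: `step_fn` stands in for a trained decoder step, and the function is not part of the course materials.

```python
def decode(step_fn, start, targets, teacher_forcing=True):
    """Toy autoregressive decoding loop.

    With teacher forcing, the ground-truth token is fed back as the
    next input at every step; without it, the model's own (possibly
    wrong) prediction is fed back, so errors can compound.
    """
    outputs, prev = [], start
    for gold in targets:
        pred = step_fn(prev)
        outputs.append(pred)
        prev = gold if teacher_forcing else pred
    return outputs
```

With a deliberately wrong toy step `step_fn(x) = 2 * x`, start `1`, and targets `[3, 5, 7]`, teacher forcing yields `[2, 6, 10]` (each step sees the correct input), while free-running yields `[2, 4, 8]` (errors feed on themselves).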

Section 2

  1. Can you explain what it means for a company to be ML-ready?
  2. What can a company do to become ML-ready / data-driven?
  3. Can you list approaches to structure DS-teams? Discuss their advantages and disadvantages.
  4. Can you list and define typical roles in a DS team?
  5. What do you think about practical aspects of processes and roles in Data Science projects/teams?

Section 3

  1. Can you distinguish between variational approximation of a density and MCMC methods for density estimation?
  2. What is DCGAN? What is its purpose? What are the main features of DCGAN?
  3. What is your opinion about Word Embeddings? What types do you know? Why are they useful?
  4. How would you classify different CNN architectures?
  5. How would you classify different RNN architectures?
  6. Explain the attention mechanism. What is self-attention?
  7. Explain the Transformer architecture. What is BERT?
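The attention questions above can be made concrete with a minimal single-head scaled dot-product self-attention, written in plain Python with no framework (all names are illustrative):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(A, B):
    # (n x d) @ (d x m) with plain nested lists
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the rows of X.

    Each row attends to every row (including itself); the 1/sqrt(d_k)
    scaling keeps the pre-softmax scores in a reasonable range.
    """
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d_k = len(K[0])
    scores = [[sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d_k)
               for kj in K] for qi in Q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, V), weights
```

Each row of the attention-weight matrix is a probability distribution over the input positions, which is what "attending" means operationally; stacking such heads with learned projections is the core of the Transformer.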

The retake exam

Section 1

Section 2

Section 3