Difference between revisions of "IU:TestPage"

Line 1: Line 1:
   
  +
= Practical Machine Learning and Deep Learning =
= IT Product Development =
 
* '''Course name''': IT Product Development
+
* '''Course name''': Practical Machine Learning and Deep Learning
* '''Code discipline''': CSE807
+
* '''Code discipline''':
* '''Subject area''': Software Engineering
+
* '''Subject area''': Practical aspects of deep learning (DL); Practical applications of DL in Natural Language Processing, Computer Vision and generation.
   
 
== Short Description ==
 
== Short Description ==
  +
This course has two parts: 1) building and launching a user-facing software product with the special emphasis on understanding user needs and 2) the application of data-driven product development techniques to iteratively improve the product. Students will learn how to transform an idea into software requirements through user research, prototyping and usability tests, then they will proceed to launch the MVP version of the product. In the second part of the course, the students will apply an iterative data-driven approach to developing a product, integrate event analytics, and run controlled experiments.
 
   
 
== Prerequisites ==
 
== Prerequisites ==
   
 
=== Prerequisite subjects ===
 
=== Prerequisite subjects ===
  +
* CSE202 — Analytical Geometry and Linear Algebra I / []: Manifolds "Linear Alg./Calculus: Manifolds
* CSE101: Introduction to Programming
 
* CSE112: Software Systems Analysis and Design
+
* CSE203 Mathematical Analysis II: Basics of optimisation
  +
* CSE201 — Mathematical Analysis I: integration and differentiation.
* CSE122 OR CSE804 OR CSE809 OR CSE812
 
  +
* CSE103 — Theoretical Computer Science: Graph theory basics, Spectral decomposition.
  +
* CSE206 — Probability And Statistics: Multivariate normal dist.
  +
* CSE504 — Digital Signal Processing: convolution, cross-correlation"
   
 
=== Prerequisite topics ===
 
=== Prerequisite topics ===
  +
* Basic programming skills.
 
* OOP, and software design.
 
* Familiarity with some development framework or technology (web or mobile)
 
   
 
== Course Topics ==
 
== Course Topics ==
Line 26: Line 27:
 
! Section !! Topics within the section
 
! Section !! Topics within the section
 
|-
 
|-
| From idea to MVP ||
+
| Review. CNNs and RNNs ||
  +
# Image processing, FFNs, CNNs
# Introduction to Product Development
 
  +
# Training Deep NNs
# Exploring the domain: User Research and Customer Conversations
 
  +
# RNNs, LSTM, GRU, Embeddings
# Documenting Requirements: MVP and App Features
 
  +
# Bidirectional RNNs
# Prototyping and usability testing
 
  +
# Seq2seq
  +
# Encoder-Decoder Networks
  +
# Attention
  +
# Memory Networks
 
|-
 
|-
  +
| Team Data Science Processes ||
| Development and Launch ||
 
  +
# Team Data Science Processes
# Product backlog and iterative development
 
  +
# Team Data Science Roles
# Estimation Techniques, Acceptance Criteria, and Definition of Done
 
  +
# Team Data Science Tools (MLflow, Kubeflow)
# UX/UI Design
 
  +
# CRISP-DM
# Software Engineering vs Product Management
 
  +
# Productionizing ML systems
 
|-
 
|-
  +
| VAEs, GANs ||
| Hypothesis-driven development ||
 
  +
# Autoencoders
# Hypothesis-driven product development
 
  +
# Variational Autoencoders
# Measuring a product
 
  +
# GANs, DCGAN
# Controlled Experiments and A/B testing
 
 
|}
 
|}
 
== Intended Learning Outcomes (ILOs) ==
 
== Intended Learning Outcomes (ILOs) ==
   
 
=== What is the main purpose of this course? ===
 
=== What is the main purpose of this course? ===
  +
The course is about the practical aspects of deep learning. In addition to frontal lectures, flipped classes and student project presentations will be organized. Python is the working language for lab sessions, and PyTorch is the primary deep learning framework. TensorFlow and Keras may also be used, and use of Docker is highly appreciated.
The main purpose of this course is to enable a student to go from an idea to an MVP with the focus on delivering value to the customer and building the product in a data-driven evidence-based manner.
 
   
 
=== ILOs defined at three levels ===
 
=== ILOs defined at three levels ===
Line 52: Line 58:
 
==== Level 1: What concepts should a student know/remember/explain? ====
 
==== Level 1: What concepts should a student know/remember/explain? ====
 
By the end of the course, the students should be able to ...
 
By the end of the course, the students should be able to ...
  +
* apply deep learning methods to effectively solve practical (real-world) problems;
* Describe the formula for stating a product idea and the importance of delivering value
 
  +
* work in a data science team;
* Remember the definition and main attributes of MVP
 
  +
* understand the principles and lifecycle of data science projects.
* Explain the main principles for building an effective customer conversation
 
* Describe the various classifications of prototypes and where each one is applied
 
* State the characteristics of a DEEP product backlog
 
* Elaborate on the main principles of an effective UI/UX product design (hierarchy, navigation, color, discoverability, understandability)
 
* List the key commonalities and differences between the mentality of a software engineer and a product manager
 
* Explain what hypothesis-driven development is
 
* Describe the important aspects and elements of a controlled experiment
 
   
 
==== Level 2: What basic practical skills should a student be able to perform? ====
 
==== Level 2: What basic practical skills should a student be able to perform? ====
 
By the end of the course, the students should be able to ...
 
By the end of the course, the students should be able to ...
  +
* understand modern deep NN architectures;
* Formulate and assess the product ideas
 
  +
* compare modern deep NN architectures;
* Perform market research for existing products
 
  +
* create a prototype of a data-driven product.
* Design effective customer conversations
 
* Prototype UI, design and conduct usability tests
 
* Prototype user interface
 
* Design and conduct usability testing
 
* Populate and groom a product backlog
 
* Conduct Sprint Planning and Review
 
* Choose product metrics and apply GQM
 
* Integrate third-party analytics tools
 
* Design, run and conclude Controlled experiments
 
   
 
==== Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios? ====
 
==== Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios? ====
 
By the end of the course, the students should be able to ...
 
By the end of the course, the students should be able to ...
  +
* apply techniques for efficient training of deep NNs;
* Conduct user and domain research to identify user needs and possible solutions
 
  +
* apply methods for data science team organisation;
* Elicit and document software requirements
 
  +
* apply deep NNs in NLP and computer vision.
* Organize a software process to swiftly launch an MVP and keep improving it in an iterative manner.
 
* Build a data pipeline to monitor metrics based on business goals and assess product progress with regard to design changes.
 
* Evolve and improve a product in a data-driven evidence-based iterative manner
 
 
== Grading ==
 
== Grading ==
   
Line 97: Line 87:
 
| C. Satisfactory || 60-74 || -
 
| C. Satisfactory || 60-74 || -
 
|-
 
|-
| D. Fail || 0-59 || -
+
| D. Poor || 0-59 || -
 
|}
 
|}
   
Line 106: Line 96:
 
! Activity Type !! Percentage of the overall course grade
 
! Activity Type !! Percentage of the overall course grade
 
|-
 
|-
| Assignment || 50
+
| Labs/seminar classes || 20
 
|-
 
|-
  +
| Interim performance assessment || 30
| Quizzes || 15
 
 
|-
 
|-
| Peer review || 15
+
| Exams || 50
|-
 
| Demo day || 20
 
 
|}
 
|}
   
 
=== Recommendations for students on how to succeed in the course ===
 
=== Recommendations for students on how to succeed in the course ===
  +
Participation is important. Showing up is the key to success in this course.<br>You will work in teams, so coordinating teamwork will be an important factor for success. This is also reflected in the peer review being a graded item.<br>Review lecture materials before classes to do well in quizzes.<br>Reading the recommended literature is optional, and will give you a deeper understanding of the material.
 
   
 
== Resources, literature and reference materials ==
 
== Resources, literature and reference materials ==
   
 
=== Open access resources ===
 
=== Open access resources ===
  +
* Goodfellow et al. Deep Learning, MIT Press. 2017
* Jackson, Michael. "The world and the machine." ICSE '95: Proceedings of the 17th International Conference on Software Engineering, April 1995, pages 283–292.
 
  +
* Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. 2017.
* The Guide to Product Metrics:
 
  +
* Osinga, Douwe. Deep Learning Cookbook: Practical Recipes to Get Started Quickly. O’Reilly Media, 2018.
   
 
=== Closed access resources ===
 
=== Closed access resources ===
  +
* Fitzpatrick, R. (2013). The Mom Test: How to talk to customers & learn if your business is a good idea when everyone is lying to you. Robfitz Ltd.
 
* Ries, E. (2011). The Lean Startup. New York: Crown Business.
 
* Rubin, K. S. (2012). Essential Scrum: A practical guide to the most popular Agile process. Addison-Wesley.
 
   
 
=== Software and tools used within the course ===
 
=== Software and tools used within the course ===
  +
* Firebase Analytics and A/B Testing, https://firebase.google.com/
 
* Amplitude Product Analytics, https://www.amplitude.com/
 
* MixPanel Product Analytics, https://mixpanel.com/
 
 
= Teaching Methodology: Methods, techniques, & activities =
 
= Teaching Methodology: Methods, techniques, & activities =
   
 
== Activities and Teaching Methods ==
 
== Activities and Teaching Methods ==
{| class="wikitable"
 
|+ Teaching and Learning Methods within each section
 
|-
 
! Teaching Techniques !! Section 1 !! Section 2 !! Section 3
 
|-
 
| Problem-based learning (students learn by solving open-ended problems without a strictly-defined solution) || 1 || 1 || 1
 
|-
 
| Project-based learning (students work on a project) || 1 || 1 || 1
 
|-
 
| Differentiated learning (provide tasks and activities at several levels of difficulty to fit students' needs and level) || 1 || 1 || 1
 
|-
 
| Developmental learning (tasks and materials develop students' as-yet-untapped abilities) || 1 || 1 || 1
 
|-
 
| Concentrated learning (classes on one large topic are logically grouped together) || 1 || 1 || 1
 
|-
 
| Inquiry-based learning || 1 || 1 || 1
 
|}
 
 
{| class="wikitable"
 
{| class="wikitable"
 
|+ Activities within each section
 
|+ Activities within each section
 
|-
 
|-
 
! Learning Activities !! Section 1 !! Section 2 !! Section 3
 
! Learning Activities !! Section 1 !! Section 2 !! Section 3
|-
 
| Lectures || 1 || 1 || 1
 
|-
 
| Interactive Lectures || 1 || 1 || 1
 
|-
 
| Lab exercises || 1 || 1 || 1
 
 
|-
 
|-
 
| Development of individual parts of software product code || 1 || 1 || 1
 
| Development of individual parts of software product code || 1 || 1 || 1
 
|-
 
|-
| Group projects || 1 || 1 || 1
+
| Homework and group projects || 1 || 1 || 1
 
|-
 
|-
| Quizzes (written or computer based) || 1 || 1 || 1
+
| Midterm evaluation || 1 || 1 || 1
 
|-
 
|-
| Peer Review || 1 || 1 || 1
+
| Testing (written or computer based) || 1 || 1 || 1
 
|-
 
|-
 
| Discussions || 1 || 1 || 1
 
| Discussions || 1 || 1 || 1
|-
 
| Presentations by students || 1 || 1 || 1
 
|-
 
| Written reports || 1 || 1 || 1
 
|-
 
| Experiments || 0 || 0 || 1
 
 
|}
 
|}
 
== Formative Assessment and Course Activities ==
 
== Formative Assessment and Course Activities ==
Line 190: Line 146:
 
! Activity Type !! Content !! Is Graded?
 
! Activity Type !! Content !! Is Graded?
 
|-
 
|-
  +
| Question || Suppose you use Batch Gradient Descent and you plot the validation error at every epoch. If you notice that the validation error consistently goes up, what is likely going on? How can you fix this? || 1
| Quiz || 1. What is a product? What are the techniques for describing a product idea in a clear, concise manner?<br>2. What user research techniques do you know? In what situations are they applied?<br>3. What are the key customer conversation principles according to the Mom Test technique? Give an example of a bad and a good question to ask.<br>4. What are the 4 phases of the requirements engineering process? <br>5. How do we document requirements? What techniques do you know? || 1
 
 
|-
 
|-
  +
| Question || Is it a good idea to stop Mini-batch Gradient Descent immediately when the validation error goes up? || 1
| Presentation || Prepare a short 2-minute pitch for your project idea (2-5 slides). <br><br>Suggested structure:<br>What problem you are solving:<br>- State the problem clearly in 2-3 short sentences.<br><br>Who are you solving it for:<br>- Who is your user/customer?<br>- Why will they be attracted to it?<br><br>What is your proposed solution to solve that problem:<br>- One sentence description<br>- What main feature(s) will it have? || 0
 
 
|-
 
|-
  +
| Question || List the optimizers that you know (except SGD) and explain one of them || 1
| Individual Assignments || A1: Product Ideation and Market Research<br>Formulate 3 project ideas in the following format:<br>X helps Y to do Z – where X is your product’s name, Y is the target user, and Z is the user activity the product helps with.<br><br>Submit Link to Screenshot board and Feature Analysis Table:<br>- Pick and explore 5 apps similar to your idea<br>- Take screenshots along the way and collect them on a board.<br>- Make a qualitative analysis table for app features.<br><br>Prepare a short 2-minute pitch for your project idea (2-5 slides). <br><br>Suggested structure:<br>What problem you are solving:<br>- State the problem clearly in 2-3 short sentences.<br><br>Who are you solving it for:<br>- Who is your user/customer?<br>- Why will they be attracted to it?<br><br>What is your proposed solution to solve that problem:<br>- One sentence description<br>- What main feature(s) will it have? || 1
 
 
|-
 
|-
  +
| Question || Describe Xavier (or Glorot) initialization. Why do you need it? || 1
| Group Project Work || A2: Forming Teams and Identifying Stakeholders<br>Students are distributed into teams. <br>Meet your team <br>Discuss the idea<br>Agree on the roles<br>Setup task tracker (Trello or similar)<br>Identify 3-5 stakeholders and how to approach them<br>Compose a set of 5 most important questions you would ask from each stakeholder when interviewing them<br><br>Submit<br>A pdf with the idea description, roles distribution among the team, identified stakeholders, ways to approach them, a set of questions for each stakeholder.<br>An invite link to join your task tracker<br><br>A3: Domain Exploration and Requirements<br>User Research Process:<br>Compose the questionnaire for each stakeholder type. <br>Talk to 5-7 stakeholders.<br>Keep updating the questionnaire throughout the process<br>Compose an interview results table<br>Produce personas<br>Summarize most important learning points<br>Describe features your MVP will have (use case diagram + user story mapping)<br><br>Submit a pdf report with:<br>Personas + corresponding questionnaires<br>Interview results table (can provide a link to spreadsheet, make sure to open access)<br>Learning points summary<br>MVP features.<br><br>Optional: <br>Start implementation of the functionality you are certain about.<br><br>Assignment 4. UI design, Prototyping, MVP, and Usability Testing<br>Break down MVP features into phases and cut down the specification to implement MVP V1<br>Produce low and high fidelity designs for your product.<br>Review the phases breakdown.<br>Follow either the Prototyping or MVP path to complete the assignment.<br><br>Prototyping path:<br>Make a clickable prototype with Figma or a similar tool<br>Make 5-10 offline stakeholders use your prototype, observe them and gather feedback<br>Embed your prototype into an online usability testing tool (e.g. 
Maze).<br>Run an online usability test with 5-10 online stakeholders.<br>Summarize key learning points<br><br>MVP path:<br>Review your MVP phases.<br>Build MVP V1 <br>Make 5-10 offline stakeholders use your MVP, observe them and gather feedback<br>Integrate an online usability testing tool to observe user sessions (e.g. Smartlook).<br>Distribute the MVP to 5-10 online stakeholders and run an online usability test.<br>Summarize key learning points<br><br><br>Submit all of the below in one PDF:<br>Link to sketches and designs.<br>Link to your MVP/Clickable prototype.<br>Link to online usability test.<br>Names of people you conducted the tests with and which stakeholder type are they.<br>Key learning points summary.<br><br>Make sure all links are accessible/viewable. || 1
 
  +
|-
  +
| Question || Name advantages of the ELU activation function over ReLU. || 0
  +
|-
  +
| Question || Can you name the main innovations in AlexNet, compared to LeNet-5? What about the main innovations in GoogLeNet and ResNet? || 0
  +
|-
  +
| Question || What is the difference between LSTM and GRU cells? || 0
 
|}
 
|}
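The Section 1 questions on validation error and stopping criteria can be illustrated with a small example. The following is a minimal sketch of early stopping with a patience counter, assuming the validation errors per epoch are already available as a list; the function name `early_stopping_epoch` and the sample error values are hypothetical and not part of the course materials.

```python
def early_stopping_epoch(val_errors, patience):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs.
    """
    best_error = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # New best model: remember it and reset the counter.
            best_error, best_epoch = error, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Ran out of epochs before the patience budget was exhausted.
    return best_epoch, len(val_errors) - 1


# Hypothetical noisy validation curve, as is typical for mini-batch training.
errors = [0.9, 0.7, 0.6, 0.65, 0.58, 0.6, 0.61, 0.62]
print(early_stopping_epoch(errors, patience=1))
print(early_stopping_epoch(errors, patience=3))
```

With a patience of 1 the loop stops at the first uptick and never reaches the lower error at epoch 4, which is why a patience window (together with restoring the best checkpoint) is a common choice for noisy validation curves.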
 
==== Section 2 ====
 
==== Section 2 ====
Line 204: Line 166:
 
! Activity Type !! Content !! Is Graded?
 
! Activity Type !! Content !! Is Graded?
 
|-
 
|-
  +
| Question || What is CRISP-DM? || 1
| Quiz || 1. What does the acronym MVP stand for? What types of MVP do you know of?<br>2. Define roles, activities, and artefacts of Scrum. What differentiates Scrum from other Agile frameworks, e.g. Kanban?<br>3. What does DEEP criteria stand for when discussing Product Backlog? Explain each of the aspects with examples.<br>4. Describe how Scrum activities are performed. Which of them are essential and which of them can vary depending on the product. || 1
 
  +
|-
  +
| Question || What is TDSP? || 1
  +
|-
  +
| Question || How do you use MLflow? || 1
  +
|-
  +
| Question || What is TensorBoard? || 1
  +
|-
  +
| Question || How do you apply Kubeflow in practice? || 1
 
|-
 
|-
  +
| Question || Explain issues in distributed learning of deep NNs. || 0
| Presentation || Prepare a 5-minute presentation describing your: <br>product backlog<br>sprint results<br>MVP-launch plan<br>Each team will present in class. The assessment will be based on the presentation delivery, reasoning for decision making and asking questions and providing suggestions for other teams. || 0
 
 
|-
 
|-
  +
| Question || How do you organize your data science project? || 0
| Group Project Work || Assignment 5. Developing an MVP<br>1. Populate and groom product backlog: <br>Comply with the DEEP criteria. <br>2. Run two one-week sprints:<br>Conduct two Sprint Planning sessions, i.e. pick the tasks for Sprint Backlog.<br>Conduct two Sprint reviews<br>Run one Sprint Retrospective<br>3. Make a launch plan and release:<br>You need to launch in the following two weeks.<br>Decide what functionality will go into the release.<br>Release your first version in Google Play.<br>Hint: Focus on a small set of features solving a specific problem for a specific user, i.e. MVP.<br>4. Prepare a 5-minute presentation describing your: <br>product backlog<br>sprint results<br>MVP-launch plan.<br>Demo for your launched MVP.<br>Each team will present in class. The assessment will be based on the presentation delivery, reasoning for decision making and asking questions and providing suggestions for other teams.<br>5. Submit a PDF with:<br>Backlogs and Launch plan<br>Link to the launched product<br>Assignment 6. Launch your product, AC and DoD.<br>1. Improve the UX: Getting Started with the App.<br>2. Release in Google Play: Work on packaging it nicely<br>3. Design and deploy a landing page<br><br>4. Produce acceptance criteria for the 3-5 most important user stories in your product.<br>5. Produce a definition of done checklist<br>6. Estimate the items in your product backlog<br><br> || 1
 
 
|-
 
|-
  +
| Question || Recall a checklist for organization of a typical data science project. || 0
| Group presentation || Midterm Presentation<br>1. Prepare a 10-minute midterm presentation in which you cover:<br>The problem you are trying to solve<br>Your users and customers (personas)<br>Your solution and its core value proposition<br>Current state of your product<br>Clear plan for the upcoming weeks<br>Your team and distribution of responsibilities<br>Demo<br>Retrospective and learning points<br>Link to your app<br><br>Submit a pdf with:<br>Items 1, 2, 3<br>link to the presentation<br> || 0
 
 
|}
 
|}
 
==== Section 3 ====
 
==== Section 3 ====
Line 218: Line 188:
 
! Activity Type !! Content !! Is Graded?
 
! Activity Type !! Content !! Is Graded?
 
|-
 
|-
  +
| Question || What is an Autoencoder? Can you list the structure and types of Autoencoders? || 1
| Quiz || 1. What common product hypotheses are there? How can we formulate them as questions about our UX?<br>2. Explain what hypothesis-driven development is<br>3. Describe the important aspects and elements of a controlled experiment || 1
 
 
|-
 
|-
  +
| Question || Can you describe ways to train Stacked AEs? || 1
| Presentation || Prepare a short 2-minute pitch for your project idea (2-5 slides). <br><br>Suggested structure:<br>What problem you are solving:<br>- State the problem clearly in 2-3 short sentences.<br><br>Who are you solving it for:<br>- Who is your user/customer?<br>- Why will they be attracted to it?<br><br>What is your proposed solution to solve that problem:<br>- One sentence description<br>- What main feature(s) will it have? || 0
 
 
|-
 
|-
  +
| Question || What is a denoising AE? Can you describe what sparsity loss is and why it can be useful? || 1
| Group project work || Assignment 7: Development, Observation, and Product Events.<br>1. Continue with your development process:<br>- Hold sprint planning and reviews.<br>- Revisit estimations and keep track for velocity calculation.<br>- Host demos and release new versions to your users<br><br>2. Observing users:<br>- Integrate a user sessions recording tool into your product<br>- As a team: watch 100 user sessions and outline common user behavior patterns.<br>- Each team member: give product to 3 new people and observe them use it.<br><br>3. Product events:<br>Create a product events table.<br>Integrate a free analytics tool that supports events reporting (e.g. Amplitude, MixPanel).<br><br>Write and submit a report:<br>- describe user behavior patterns (main ways how people use your product).<br>- learning points from the observations<br>- add the events table.<br>- describe which analytics tool you chose and why<br><br>Assignment 8: GQM, Metrics, and Hypothesis-testing.<br>1. GQM and Metrics Dashboard<br>- Compose a GQM for your product.<br>- Identify your focus and L1 metrics<br>- Setup an Analytics Dashboard with the metrics you chose.<br>- Add the instructors to your Analytics Dashboard.<br><br>Hypothesis-testing:<br>- answer clarity and hypotheses: do users understand your product, is it easy for them to get started, and do they return?<br>- suggest product improvements to increase clarity, ease of starting and retention.<br>- based on the suggestions formulate 3 falsifiable hypotheses<br>- design a simple test to check each of them<br>- pick one test that could be conducted by observing your users<br>- conduct the test<br><br>Submit:<br>- GQM, Focus and L1 Metrics breakdown.<br>- Report on the hypothesis-testing activities<br>- Access link to the dashboard.<br>Assignment 9: Running an A/B test<br>Compose an A/B test:<br>- Design a change in your product<br>- Hypothesis: Clearly state what you expect to improve as the result of the change.<br>- Parameter 
and Variants: Describe both A and B variants (and other if you have more).<br>- Intended sample size.<br>- OEC: Determine the target metric to run the experiment against.<br><br>Then do one of the two options:<br>Option 1: Conduct the A/B test using a remote control and A/B testing tool (Firebase, Optimizely or like)<br><br>Option 2: Do the statistical math yourself<br>Conduct an A/B test and collect data.<br>Do the math manually using the standard Student T-test.<br><br>Submit a PDF with:<br>- the A/B test description <br>- report on how the experiment went.<br>- either screenshots from the tool or math calculations. || 1
 
  +
|-
  +
| Question || Can you make a distinction between AE and VAE? || 1
  +
|-
  +
| Question || If an autoencoder perfectly reconstructs the inputs, is it necessarily a good autoencoder? How can you evaluate the performance of an autoencoder? || 0
  +
|-
  +
| Question || How do you tie weights in a stacked autoencoder? What is the point of doing so? || 0
  +
|-
  +
| Question || What is the main risk of an overcomplete autoencoder? || 0
  +
|-
  +
| Question || How is the loss function for a VAE defined? What is ELBO? || 0
  +
|-
  +
| Question || Can you list the structure and types of a GAN? || 0
  +
|-
  +
| Question || How would you train a GAN? || 0
  +
|-
  +
| Question || How would you estimate the quality of a GAN? || 0
  +
|-
  +
| Question || Can you describe the cost function of a discriminator? || 0
 
|}
 
|}
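Option 2 of Assignment 9 asks for the A/B test math to be done manually with Student's t-test. The following is a minimal sketch of the two-sample, pooled-variance t statistic using only the Python standard library. The function name `student_t_test` and the sample measurements are hypothetical; turning the statistic into a p-value still requires a t-distribution table or a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=True`).

```python
from math import sqrt
from statistics import mean, variance


def student_t_test(variant_a, variant_b):
    """Return (t_statistic, degrees_of_freedom) for two independent samples.

    Uses the classic Student's t-test, which assumes both variants
    have the same underlying variance (pooled variance estimate).
    """
    n_a, n_b = len(variant_a), len(variant_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(variant_a)
                  + (n_b - 1) * variance(variant_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(variant_b) - mean(variant_a)) / std_err
    return t_stat, df


# Hypothetical target-metric values (e.g. actions per session) per variant.
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
t_stat, df = student_t_test(a, b)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting `t` is compared against the critical value for `df` degrees of freedom at the chosen significance level. If the two variants clearly have different variances, Welch's t-test is the usual alternative, but that is a different test from the one named in the assignment.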
 
=== Final assessment ===
 
=== Final assessment ===
 
'''Section 1'''
 
'''Section 1'''
  +
# Explain what teacher forcing is.
# Grading criteria for the final project presentation:
 
  +
# Why do people use encoder–decoder RNNs rather than plain sequence-to-sequence RNNs for automatic translation?
# Problem: short clear statement on what you are solving, and why it’s important.
 
  +
# How could you combine a convolutional neural network with an RNN to classify videos?
# User: should be a specific user, can start from generic and then show how you narrowed it.
 
# Solution: how do you target the problem, what were the initial assumptions/hypotheses
 
# Elicitation process: interviews, how many people, what questions you asked, what you learnt.
 
 
'''Section 2'''
 
'''Section 2'''
  +
# Can you explain what it means for a company to be ML-ready?
# Arriving at MVP: how you chose features, describe prototyping and learning from it, when did you launch, and how it went.
 
  +
# What can a company do to become ML-ready / data-driven?
# Team and development process: how it evolved, what were the challenges, what fixes you made to keep progressing.
 
  +
# Can you list approaches to structure DS-teams? Discuss their advantages and disadvantages.
# Product demo: make it clear what your current product progress is.
 
  +
# Can you list and define typical roles in a DS team?
  +
# What do you think about practical aspects of processes and roles in Data Science projects/teams?
 
'''Section 3'''
 
'''Section 3'''
  +
# Can you make a distinction between Variational approximation of density and MCMC methods for density estimation?
# Hypothesis-driven development: how did you verify value and understandability of your product, what were the main hypotheses you had to check through MVP.
 
  +
# What is DCGAN? What is its purpose? What are the main features of DCGAN?
# Measuring product: what metrics you chose, why, what funnels did you set for yourself, and what was the baseline for your MVP.
 
  +
# What is your opinion about Word Embeddings? What types do you know? Why are they useful?
# Experimentation: What usability tests and experiments you conducted, what did you learn, how did it affect your funnels and metrics.
 
  +
# How would you classify different CNN architectures?
  +
# How would you classify different RNN architectures?
  +
# Explain the attention mechanism. What is self-attention?
  +
# Explain the Transformer architecture. What is BERT?
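The questions on attention and self-attention can be made concrete with a small example. The following is a minimal sketch of scaled dot-product attention in plain Python rather than PyTorch, so that it runs without extra dependencies. The function names `softmax`, `attention`, and `self_attention` are illustrative, and the self-attention case assumes identity query/key/value projections, whereas a real Transformer layer learns separate projection matrices.

```python
from math import exp, sqrt


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def self_attention(x):
    """Self-attention: queries, keys and values all come from the same sequence.

    Assumption: identity projections (no learned W_Q, W_K, W_V).
    """
    return attention(x, x, x)


# Two identical keys receive equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]]))
```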
   
 
=== The retake exam ===
 
=== The retake exam ===
 
'''Section 1'''
 
'''Section 1'''
  +
# Grading criteria for the final project presentation:
 
# Problem: short clear statement on what you are solving, and why it’s important.
 
# User: should be a specific user, can start from generic and then show how you narrowed it.
 
# Solution: how do you target the problem, what were the initial assumptions/hypotheses
 
# Elicitation process: interviews, how many people, what questions you asked, what you learnt.
 
 
'''Section 2'''
 
'''Section 2'''
  +
# Arriving at MVP: how you chose features, describe prototyping and learning from it, when did you launch, and how it went.
 
# Team and development process: how it evolved, what were the challenges, what fixes you made to keep progressing.
 
# Product demo: make it clear what your current product progress is.
 
 
'''Section 3'''
 
'''Section 3'''
# Hypothesis-driven development: how did you verify value and understandability of your product, what were the main hypotheses you had to check through MVP.
 
# Measuring product: what metrics you chose, why, what funnels did you set for yourself, and what was the baseline for your MVP.
 

Revision as of 15:05, 28 April 2022

Practical Machine Learning and Deep Learning

  • Course name: Practical Machine Learning and Deep Learning
  • Code discipline:
  • Subject area: Practical aspects of deep learning (DL); Practical applications of DL in Natural Language Processing, Computer Vision and generation.

Short Description

Prerequisites

Prerequisite subjects

  • CSE202 — Analytical Geometry and Linear Algebra I / []: Manifolds "Linear Alg./Calculus: Manifolds
  • CSE203 — Mathematical Analysis II: Basics of optimisation
  • CSE201 — Mathematical Analysis I: integration and differentiation.
  • CSE103 — Theoretical Computer Science: Graph theory basics, Spectral decomposition.
  • CSE206 — Probability And Statistics: Multivariate normal dist.
  • CSE504 — Digital Signal Processing: convolution, cross-correlation"

Prerequisite topics

Course Topics

Course Sections and Topics
Section Topics within the section
Review. CNNs and RNNs
  1. Image processing, FFNs, CNNs
  2. Training Deep NNs
  3. RNNs, LSTM, GRU, Embeddings
  4. Bidirectional RNNs
  5. Seq2seq
  6. Encoder-Decoder Networks
  7. Attention
  8. Memory Networks
Team Data Science Processes
  1. Team Data Science Processes
  2. Team Data Science Roles
  3. Team Data Science Tools (MLflow, Kubeflow)
  4. CRISP-DM
  5. Productionizing ML systems
VAEs, GANs
  1. Autoencoders
  2. Variational Autoencoders
  3. GANs, DCGAN

Intended Learning Outcomes (ILOs)

What is the main purpose of this course?

The course is about the practical aspects of deep learning. In addition to frontal lectures, flipped classes and student project presentations will be organized. Python is the working language for lab sessions, and PyTorch is the primary deep learning framework. TensorFlow and Keras may also be used, and use of Docker is highly appreciated.

ILOs defined at three levels

Level 1: What concepts should a student know/remember/explain?

By the end of the course, the students should be able to ...

  • apply deep learning methods to effectively solve practical (real-world) problems;
  • work in a data science team;
  • understand the principles and lifecycle of data science projects.

Level 2: What basic practical skills should a student be able to perform?

By the end of the course, the students should be able to ...

  • understand modern deep NN architectures;
  • compare modern deep NN architectures;
  • create a prototype of a data-driven product.

Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?

By the end of the course, the students should be able to ...

  • apply techniques for efficient training of deep NNs;
  • apply methods for data science team organisation;
  • apply deep NNs in NLP and computer vision.

Grading

Course grading range

Grade | Range | Description of performance
A. Excellent | 90-100 | -
B. Good | 75-89 | -
C. Satisfactory | 60-74 | -
D. Poor | 0-59 | -

Course activities and grading breakdown

Activity Type | Percentage of the overall course grade
Labs/seminar classes | 20
Interim performance assessment | 30
Exams | 50

Recommendations for students on how to succeed in the course

Resources, literature and reference materials

Open access resources

  • Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
  • Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, 2017.
  • Osinga, Douwe. Deep Learning Cookbook: Practical Recipes to Get Started Quickly. O’Reilly Media, 2018.

Closed access resources

Software and tools used within the course

Teaching Methodology: Methods, techniques, & activities

Activities and Teaching Methods

Activities within each section
Learning Activities | Section 1 | Section 2 | Section 3
Development of individual parts of software product code | 1 | 1 | 1
Homework and group projects | 1 | 1 | 1
Midterm evaluation | 1 | 1 | 1
Testing (written or computer based) | 1 | 1 | 1
Discussions | 1 | 1 | 1

Formative Assessment and Course Activities

Ongoing performance assessment

Section 1

Activity Type | Content | Is Graded?
Question | Suppose you use Batch Gradient Descent and you plot the validation error at every epoch. If you notice that the validation error consistently goes up, what is likely going on? How can you fix this? | 1
Question | Is it a good idea to stop Mini-batch Gradient Descent immediately when the validation error goes up? | 1
Question | List the optimizers that you know (except SGD) and explain one of them. | 1
Question | Describe Xavier (or Glorot) initialization. Why do you need it? | 1
Question | Name advantages of the ELU activation function over ReLU. | 0
Question | Can you name the main innovations in AlexNet, compared to LeNet-5? What about the main innovations in GoogLeNet and ResNet? | 0
Question | What is the difference between LSTM and GRU cells? | 0

Section 2

Activity Type Content Is Graded?
Activity Type | Content | Is Graded?
Question | What is CRISP-DM? | 1
Question | What is TDSP? | 1
Question | How do you use MLflow? | 1
Question | What is TensorBoard? | 1
Question | How do you apply Kubeflow in practice? | 1
Question | Explain the issues in distributed training of deep NNs. | 0
Question | How do you organize your data science project? | 0
Question | Recall a checklist for organizing a typical data science project. | 0

Section 3

Activity Type | Content | Is Graded?
Question | What is an autoencoder? Can you describe the structure of an autoencoder and list the types of autoencoders? | 1
Question | Can you describe ways to train stacked AEs? | 1
Question | What is a denoising AE? Can you describe what sparsity loss is and why it can be useful? | 1
Question | Can you distinguish between an AE and a VAE? | 1
Question | If an autoencoder perfectly reconstructs the inputs, is it necessarily a good autoencoder? How can you evaluate the performance of an autoencoder? | 0
Question | How do you tie weights in a stacked autoencoder? What is the point of doing so? | 0
Question | What is the main risk of an overcomplete autoencoder? | 0
Question | How is the loss function for a VAE defined? What is the ELBO? | 0
Question | Can you describe the structure of a GAN and list the types of GANs? | 0
Question | How would you train a GAN? | 0
Question | How would you estimate the quality of a GAN? | 0
Question | Can you describe the cost function of a discriminator? | 0

Final assessment

Section 1

  1. Explain what teacher forcing is.
  2. Why do people use encoder–decoder RNNs rather than plain sequence-to-sequence RNNs for automatic translation?
  3. How could you combine a convolutional neural network with an RNN to classify videos?

Section 2

  1. Can you explain what it means for a company to be ML-ready?
  2. What can a company do to become ML-ready / data-driven?
  3. Can you list approaches to structuring DS teams? Discuss their advantages and disadvantages.
  4. Can you list and define typical roles in a DS team?
  5. What do you think about practical aspects of processes and roles in Data Science projects/teams?

Section 3

  1. Can you distinguish between variational approximation of a density and MCMC methods for density estimation?
  2. What is DCGAN? What is its purpose? What are the main features of DCGAN?
  3. What is your opinion about Word Embeddings? What types do you know? Why are they useful?
  4. How would you classify different CNN architectures?
  5. How would you classify different RNN architectures?
  6. Explain the attention mechanism. What is self-attention?
  7. Explain the Transformer architecture. What is BERT?

The retake exam

Section 1

Section 2

Section 3