Difference between revisions of "BSc: Natural Language Processing"
Revision as of 19:56, 29 December 2022
Natural Language Processing
- Course name: Natural Language Processing
- Code discipline:
- Subject area:
Short Description
The course covers classical and modern methods of processing and analyzing natural language texts. It aims to teach fundamental approaches to text analysis and to develop and consolidate skills in working with modern software tools for natural language processing.
Prerequisites
Prerequisite subjects
- CSE302: Introduction to Machine Learning
Course Topics
Section | Topics within the section |
---|---|
Formal foundations of text analysis methods | 1. Fundamentals of the theory of formal languages; 2. Statistical language modeling; 3. Theory of parsing |
Classical models of representation and analysis of text and applications | 1. Models based on entropy maximization (MaxEnt); 2. Decision trees in text processing, Markov models, and Support Vector Machines in text classification problems; 3. Applications: information extraction, question-answering systems, text generation, machine translation; 4. Quality assessment of NLP systems |
Neural network models for text analysis | 1. Vector representations of words: the word2vec model and contextual vector representations; 2. Architectures based on convolutional networks; 3. Architectures based on recurrent networks; 4. Encoder-decoder architecture; 5. Attention mechanism |
Modern models based on the "Transformer" architecture | 1. The "Transformer" architecture; 2. Self-attention mechanism; 3. Pre-trained language models: BERT and GPT |
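As a rough illustration of the "Statistical language modeling" topic, the sketch below trains a bigram model with add-one (Laplace) smoothing on a toy corpus. It is not taken from the course materials: the function names, the `<s>` and `</s>` boundary markers, and the two-sentence corpus are all assumptions made for the example.

```python
from collections import Counter
import math


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    The boundary markers "<s>" and "</s>" are an assumption of this sketch.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(tokens, unigrams, bigrams):
    """Perplexity of one tokenized sentence under the smoothed bigram model."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_sum = sum(
        math.log(bigram_prob(prev, word, unigrams, bigrams))
        for prev, word in zip(padded, padded[1:])
    )
    return math.exp(-log_sum / (len(padded) - 1))


# Toy corpus, purely illustrative.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train_bigram(corpus)

p_cat_given_the = bigram_prob("the", "cat", unigrams, bigrams)  # 0.25
seen = perplexity(["the", "cat", "sat"], unigrams, bigrams)
shuffled = perplexity(["sat", "cat", "the"], unigrams, bigrams)
```

A sentence that matches the training data gets a lower perplexity than the same words in an unseen order, which is the basic intuition behind using perplexity to compare language models.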
Intended Learning Outcomes (ILOs)
What is the main purpose of this course?
The course is about processing and modeling natural languages. In addition to traditional lectures, flipped classes and student project presentations will be organized. The working language during lab sessions is Python, and the primary deep learning framework is PyTorch. TensorFlow and Keras may also be used, and use of Docker is highly appreciated.
ILOs defined at three levels
Level 1: What concepts should a student know/remember/explain?
By the end of the course, students should know:
- Fundamental approaches to text analysis;
- Various natural language processing algorithms;
- Ways to measure the performance of NLP systems;
- Popular software tools for natural language processing.
Level 2: What basic practical skills should a student be able to perform?
By the end of the course, students should be able to:
- describe and explain the difference between formal and natural languages;
- describe and explain classical methods used for text analysis;
- describe and explain neural network architectures used for text analysis;
- describe and explain the differences between neural network architectures for text analysis;
- describe and explain modern architectures based on the Transformer.
Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?
By the end of the course, students should be able to:
- apply machine learning methods to solve text processing problems;
- apply methods for assessing the quality of NLP systems;
- apply deep learning algorithms to solve text processing problems.
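One common way to assess the quality of an NLP classifier is to compute precision, recall, and F1 for a class of interest. The sketch below is a minimal pure-Python version under stated assumptions: the function name `precision_recall_f1`, the `spam`/`ham` labels, and the toy predictions are hypothetical and are not part of the course materials.

```python
def precision_recall_f1(gold, predicted, positive):
    """Return (precision, recall, F1) for one positive class.

    gold and predicted are equal-length sequences of labels.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy labels, purely illustrative.
gold = ["spam", "ham", "spam", "spam", "ham"]
pred = ["spam", "spam", "ham", "spam", "ham"]
precision, recall, f1 = precision_recall_f1(gold, pred, positive="spam")
```

Here two of the three predicted `spam` labels are correct and two of the three true `spam` items are found, so precision, recall, and F1 all equal 2/3.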
Grading
Course grading range
Grade | Range | Description of performance |
---|---|---|
A. Excellent | 90-100 | - |
B. Good | 75-89 | - |
C. Satisfactory | 60-74 | - |
D. Poor | 0-59 | - |
Course activities and grading breakdown
Activity Type | Percentage of the overall course grade |
---|---|
Midterm | 30 |
Final project | 40 |
Assignments | 20 |
Lab Participation / Quizzes | 10 |
Recommendations for students on how to succeed in the course
Students are advised to prepare for classes as follows:
- Review the lecture notes.
- Work through the materials of the seminar (practical) classes.
- If difficulties arise, prepare questions for the teacher.
When preparing for classes, use the listed resources and the additional literature.
Resources, literature and reference materials
Open access resources
- Clark, Alexander, Chris Fox, and Shalom Lappin, eds. The Handbook of Computational Linguistics and Natural Language Processing. John Wiley & Sons, 2013.
- Jurafsky, Dan, and James H. Martin. Speech and Language Processing (3rd ed.).
- Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd ed.). O'Reilly Media, 2019.
Additional literature
- Osinga, Douwe. Deep Learning Cookbook: Practical Recipes to Get Started Quickly. O'Reilly Media, 2018.
- Nikolenko, S., A. Kadurin, and E. Arkhangelskaya. Deep Learning (in Russian; original title: Глубокое обучение). St. Petersburg: Piter, 2018.
- Goldberg, Yoav. A Primer on Neural Network Models for Natural Language Processing.
Closed access resources
Software and tools used within the course
Teaching Methodology: Methods, techniques, & activities
Activities and Teaching Methods
Learning Activities | Section 1 | Section 2 | Section 3 |
---|---|---|---|
Development of individual parts of software product code | 1 | 1 | 1 |
Homework and group projects | 1 | 1 | 1 |
Midterm evaluation | 1 | 1 | 1 |
Testing (written or computer based) | 1 | 1 | 1 |
Discussions | 1 | 1 | 1 |
Formative Assessment and Course Activities
Ongoing performance assessment
Section 1
Activity Type | Content | Is Graded? |
---|---|---|
Question | ? | 1 |
Question | ? | 1 |
Section 2
Activity Type | Content | Is Graded? |
---|---|---|
Question | ? | 1 |
Question | ? | 1 |
Section 3
Activity Type | Content | Is Graded? |
---|---|---|
Question | ? | 1 |
Question | ? | 1 |
Final assessment
Section 1
- ?
Section 2
- ?
Section 3
- ?
Section 4
- ?
The retake exam
Section 1
Section 2
Section 3
Section 4