MSc: Advanced Statistics
Advanced Statistics
- Course name: Advanced Statistics
- Code discipline: DS-03
- Subject area:
Short Description
This is a course in advanced statistics with a view toward applications in data science. It is intended for master's students who want to expand their knowledge of the theoretical methods used in modern data science research. The course presents some of the key probabilistic methods and results that may form an essential mathematical toolbox for a data scientist. It places particular emphasis on random vectors, random matrices, and random projections. It teaches basic theoretical skills for the analysis of these objects, which include concentration inequalities, covering and packing arguments, decoupling and symmetrization tricks, chaining and comparison techniques for stochastic processes, combinatorial reasoning based on the VC dimension, and more. The course integrates theory with applications to covariance estimation, semidefinite programming, networks, elements of statistical learning, error-correcting codes, clustering, matrix completion, dimension reduction, sparse signal recovery, sparse regression, and more.
Prerequisites
Prerequisite subjects
- Excellent knowledge of probability and statistics.
Prerequisite topics
Course Topics
Section | Topics within the section |
---|---|
Concentration of sums of independent random variables | |
Random vectors in high dimensions | |
Random matrices | |
Intended Learning Outcomes (ILOs)
What is the main purpose of this course?
The main purpose of this course is to present the fundamentals of high-dimensional statistics with applications to data science. The course presents some of the key probabilistic methods and results that may form an essential mathematical toolbox for a data scientist. It places particular emphasis on random vectors, random matrices, and random projections. The course integrates theory with applications to covariance estimation, semidefinite programming, networks, elements of statistical learning, error-correcting codes, clustering, matrix completion, dimension reduction, sparse signal recovery, sparse regression, and more.
ILOs defined at three levels
Level 1: What concepts should a student know/remember/explain?
By the end of the course, the students should be able to ...
- Explain the difference between low-dimensional and high-dimensional data
- Explain concentration inequalities and their application
- Remember the main statistical properties of high-dimensional vectors and matrices
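One property of high-dimensional vectors that the outcomes above refer to can be checked numerically: the Euclidean norm of a standard Gaussian vector in dimension n concentrates around sqrt(n), so the relative spread of the norm shrinks as n grows. The following is a minimal sketch using only the Python standard library. The function name `norm_summary`, the sample size, and the seed are illustrative assumptions and are not part of the course materials.

```python
import math
import random
import statistics


def norm_summary(dim, n_samples=200, seed=0):
    """Return (mean norm / sqrt(dim), std of norm / sqrt(dim)) for
    n_samples standard Gaussian vectors in R^dim (illustrative helper)."""
    rng = random.Random(seed)
    norms = [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n_samples)
    ]
    scale = math.sqrt(dim)
    return statistics.fmean(norms) / scale, statistics.pstdev(norms) / scale


# Compare a low-dimensional case with progressively higher dimensions.
for dim in (2, 20, 200, 2000):
    mean_ratio, spread_ratio = norm_summary(dim)
    print(f"dim={dim:5d}  mean/sqrt(dim)={mean_ratio:.3f}  std/sqrt(dim)={spread_ratio:.3f}")
```

In low dimension the norm varies widely relative to its typical size, while in high dimension almost all samples lie close to the sphere of radius sqrt(n).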
Level 2: What basic practical skills should a student be able to perform?
By the end of the course, the students should be able to ...
- Perform basic Monte Carlo computations, such as Monte Carlo integration
- Obtain simple but accurate bounds on complex statistical metrics
- Apply the median of means estimator
- Investigate simple statistics of social networks
- Exploit the thin-shell phenomenon when analysing data
- Apply data clustering and dimension reduction
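Two of the skills listed above, Monte Carlo integration and the median of means estimator, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `mc_integrate` and `median_of_means`, the test integrand, the Pareto sample, and the number of blocks are hypothetical choices, not the course's lab tasks.

```python
import math
import random
import statistics


def mc_integrate(f, a, b, n_samples, rng):
    """Estimate the integral of f over [a, b] by averaging f at
    points drawn uniformly at random from [a, b]."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n_samples))
    return (b - a) * total / n_samples


def median_of_means(samples, n_blocks):
    """Split samples into n_blocks blocks of equal size, average each
    block, and return the median of the block averages.
    Samples left over after the last full block are ignored."""
    block_size = len(samples) // n_blocks
    if block_size == 0:
        raise ValueError("need at least one sample per block")
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.median(block_means)


rng = random.Random(0)

# The integral of sin over [0, pi] equals 2 exactly.
estimate = mc_integrate(math.sin, 0.0, math.pi, 100_000, rng)
print("Monte Carlo estimate of the integral:", estimate)

# Heavy-tailed data: Pareto with shape 2.5 has true mean 2.5 / 1.5.
data = [rng.paretovariate(2.5) for _ in range(10_000)]
print("sample mean:    ", statistics.fmean(data))
print("median of means:", median_of_means(data, 20))
```

The median of means is less sensitive to a few extreme observations than the plain sample mean, which is why it is useful for heavy-tailed data.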
Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?
By the end of the course, the students should be able to ...
- Understand the problems related to the statistical analysis of data.
- Apply theoretical statistics in real life via computer simulations, and thereby confirm or reject theoretical predictions.
- Identify the correct statistical methods that need to be applied to the data in order to solve a given real-life task.
- Generate and run experiments on random data samples.
Grading
Course grading range
Grade | Range | Description of performance |
---|---|---|
A. Excellent | 85-100 | - |
B. Good | 70-84 | - |
C. Satisfactory | 50-69 | - |
D. Poor | 0-49 | - |
Minimum Requirements For Passing The Course
There are two requirements for passing this course:
- You must have at least 50% on the Final Exam.
- You must have at least 50% of the overall grade.
Course activities and grading breakdown
Activity Type | Percentage of the overall course grade |
---|---|
Quiz/Assignment during each lecture (weekly evaluations) | 20 |
Lab classes (weekly evaluations) | 20 |
Midterm | 20 |
Final exam | 40 |
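The arithmetic behind the breakdown and the two passing requirements can be sketched as follows. This assumes that each activity is scored as a percentage from 0 to 100 and that the overall grade is the weighted sum of these percentages; the names `WEIGHTS`, `overall_grade`, and `passes` are illustrative and do not describe an official grading tool.

```python
# Weights taken from the grading breakdown table (percent of overall grade).
WEIGHTS = {"quizzes": 20, "labs": 20, "midterm": 20, "final": 40}


def overall_grade(scores):
    """Weighted overall grade in percent, assuming each entry of
    scores is a percentage between 0 and 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def passes(scores):
    """Both requirements must hold: at least 50% on the final exam
    and at least 50% overall."""
    return scores["final"] >= 50 and overall_grade(scores) >= 50


example = {"quizzes": 100, "labs": 100, "midterm": 100, "final": 40}
print(overall_grade(example), passes(example))
```

Under these assumptions, a student with full marks on everything except the final exam can still fail the course if the final exam score is below 50%.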
Plagiarism Rules
- If a student submits a solution to a weekly assignment/quiz and/or lab that is identical to one submitted by another student, both students will receive the maximum number of points for that task with a negative sign.
Recommendations for students on how to succeed in the course
- Watch the video lecture and read the lecture notes before coming to the onsite lectures and to the labs.
- Attend the onsite lectures and ask questions about the parts of the material that you find unclear.
- Submit solutions to the weekly quizzes.
- Submit the weekly lab reports.
- Prepare seriously for the midterm exam.
- Prepare seriously for the final exam.
Resources, literature and reference materials
Open access resources
- The lecture notes and the video lectures provided via Moodle are sufficient for passing this course with grade A.
Software and tools used within the course
- You can use any software of your choice to perform the lab tasks.
Teaching Methodology: Methods, techniques, & activities
Formative Assessment and Course Activities
Ongoing performance assessment
The performance will be assessed via weekly quizzes and weekly labs.
Final assessment
The final assessment is in written form. You must have at least 50% on the final exam to pass the course.
The retake exam
The retake exam will be in written form.