<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://eduwiki.innopolis.university/index.php?action=history&amp;feed=atom&amp;title=MSc%3ADataMining</id>
	<title>MSc:DataMining - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://eduwiki.innopolis.university/index.php?action=history&amp;feed=atom&amp;title=MSc%3ADataMining"/>
	<link rel="alternate" type="text/html" href="https://eduwiki.innopolis.university/index.php?title=MSc:DataMining&amp;action=history"/>
	<updated>2026-05-07T19:43:11Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.36.1</generator>
	<entry>
		<id>https://eduwiki.innopolis.university/index.php?title=MSc:DataMining&amp;diff=129&amp;oldid=prev</id>
		<title>10.90.136.11: Created page with &quot;= Data Mining =  * &lt;span&gt;'''Course name:'''&lt;/span&gt; Data Mining * &lt;span&gt;'''Course number:'''&lt;/span&gt; 346  == Course Characteristics ==  === Key concepts of the class ===  * The...&quot;</title>
		<link rel="alternate" type="text/html" href="https://eduwiki.innopolis.university/index.php?title=MSc:DataMining&amp;diff=129&amp;oldid=prev"/>
		<updated>2021-07-30T11:26:12Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;= Data Mining =  * &amp;lt;span&amp;gt;&amp;#039;&amp;#039;&amp;#039;Course name:&amp;#039;&amp;#039;&amp;#039;&amp;lt;/span&amp;gt; Data Mining * &amp;lt;span&amp;gt;&amp;#039;&amp;#039;&amp;#039;Course number:&amp;#039;&amp;#039;&amp;#039;&amp;lt;/span&amp;gt; 346  == Course Characteristics ==  === Key concepts of the class ===  * The...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;= Data Mining =&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;span&amp;gt;'''Course name:'''&amp;lt;/span&amp;gt; Data Mining&lt;br /&gt;
* &amp;lt;span&amp;gt;'''Course number:'''&amp;lt;/span&amp;gt; 346&lt;br /&gt;
&lt;br /&gt;
== Course Characteristics ==&lt;br /&gt;
&lt;br /&gt;
=== Key concepts of the class ===&lt;br /&gt;
&lt;br /&gt;
* The role, subject, problems and methods of Data Mining in the context of other Data Science and Data Engineering disciplines and activities.&lt;br /&gt;
* Data Mining in Finance, in particular in Algorithmic Financial Trading: in-depth study of problems and methods.&lt;br /&gt;
&lt;br /&gt;
=== What is the purpose of this course? ===&lt;br /&gt;
&lt;br /&gt;
The main purpose of this course is two-fold:&lt;br /&gt;
&lt;br /&gt;
* to provide the students with a general understanding of the role of Data Mining which is closely connected to, but not identical to, Data Acquisition on one side, and Big Data Analysis / Machine Learning on the other side;&lt;br /&gt;
* to provide the students with hands-on experience in applying Data Mining methods and techniques in a real-life modern area such as the Financial Industry (more specifically, Algorithmic Financial Trading).&lt;br /&gt;
&lt;br /&gt;
=== Course objectives based on Bloom’s taxonomy ===&lt;br /&gt;
&lt;br /&gt;
# Understand the role, subject, problems and methods of Data Mining in relation to other Data Science and Data Engineering disciplines, such as Data Acquisition and Big Data Analysis / Machine Learning.&lt;br /&gt;
# Define and understand the key concepts of Data Science for Finance: Financial Assets, Instruments, Markets and Trading, Exchanges, Prices, Volumes.&lt;br /&gt;
# Define and understand the key concepts of Financial Markets micro-structure: Orders, Order Books, Bids, Asks, Matching Engines, Trades.&lt;br /&gt;
# Understand the principal problems and solution methods of Data Mining with application to Financial Trading:&lt;br /&gt;
#* construction of Order Books and Trades from Order Logs or incremental updates;&lt;br /&gt;
#* data cleaning (e.g. resolving Bid-Ask collisions, wrong trade sizes, stale orders);&lt;br /&gt;
#* construction of descriptive statistics for Order Books and Trades;&lt;br /&gt;
#* feature generation for subsequent Data Analysis (e.g. VWAPs, Volumes, Market Pressure etc.).&lt;br /&gt;
# Develop hands-on experience with complex, industrial-strength Data Mining processes related to Financial Trading data.&lt;br /&gt;
# Understand and apply the classical and modern methods of statistical data analysis, such as Principal Component Analysis (PCA) and Independent Component Analysis (ICA).&lt;br /&gt;
# Understand the methods of feature generation for Machine Learning methods.&lt;br /&gt;
# Understand and apply patterns of Machine Learning methods to analyze financial trading data.&lt;br /&gt;
&lt;br /&gt;
=== What should a student remember at the end of the course? ===&lt;br /&gt;
&lt;br /&gt;
By the end of the course, the students should:&lt;br /&gt;
&lt;br /&gt;
* Know the differences between Data Mining and Data Acquisition (on one hand) and Big Data Analysis (on the other hand).&lt;br /&gt;
* Know the principal definitions and terminology related to Financial Trading data.&lt;br /&gt;
* Know the importance of data cleaning / pre-processing in Data Mining in general and for Financial Trading data in particular.&lt;br /&gt;
* Know that Machine Learning methods of data analysis cannot be used effectively without proper data pre-processing and feature engineering.&lt;br /&gt;
* Know the relationship between Statistical and Machine Learning-based methods of data analysis.&lt;br /&gt;
&lt;br /&gt;
=== What should a student be able to understand at the end of the course? ===&lt;br /&gt;
&lt;br /&gt;
By the end of the course, the students should:&lt;br /&gt;
&lt;br /&gt;
* Understand the mechanisms of exchange-based financial trading.&lt;br /&gt;
* Understand the micro-structure of financial markets (orders, order books, matching, trades etc).&lt;br /&gt;
* Understand the methods of constructing Order Book and Trades data from the raw data in Finance.&lt;br /&gt;
* Understand the typical errors which occur in Financial Trading data, and the methods for their detection and correction.&lt;br /&gt;
* Understand the modern methods of statistical analysis of financial time series data, in particular, the method of ICA and its advantages over PCA.&lt;br /&gt;
* Understand the methods of feature generation for analysis of Financial Trading data using Machine Learning methods.&lt;br /&gt;
* Understand how the statistical and machine learning-based methods of data analysis can efficiently be integrated to achieve the best results.&lt;br /&gt;
&lt;br /&gt;
=== What should a student be able to apply at the end of the course? ===&lt;br /&gt;
&lt;br /&gt;
By the end of the course, the students should be able to:&lt;br /&gt;
&lt;br /&gt;
* Develop sufficiently complex, “industrial-strength” software solutions (e.g. in Python) for processing raw Financial Trading data into Order Books and Trades, providing:&lt;br /&gt;
** adequate time and space efficiency;&lt;br /&gt;
** necessary logic for data clean-up and resolution of data errors / inconsistencies.&lt;br /&gt;
* Judiciously select and apply the appropriate Machine Learning methods for analysis of Financial Trading data and prediction of financial time series.&lt;br /&gt;
&lt;br /&gt;
=== Course evaluation ===&lt;br /&gt;
&lt;br /&gt;
The course is largely project-based, so Labs carry a higher weight, and a Final Project Presentation replaces the Final Exam:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|+ Course grade breakdown&lt;br /&gt;
! '''Component'''&lt;br /&gt;
! '''Default points'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Proposed points'''&lt;br /&gt;
|-&lt;br /&gt;
| Labs&lt;br /&gt;
| 20&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 40&lt;br /&gt;
|-&lt;br /&gt;
| Interim performance assessment&lt;br /&gt;
| 30&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 20&lt;br /&gt;
|-&lt;br /&gt;
| Final Project Presentation&lt;br /&gt;
| 50&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 40&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Grades range ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|+ Course grading range&lt;br /&gt;
! '''Grade'''&lt;br /&gt;
! '''Default range'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Proposed range'''&lt;br /&gt;
|-&lt;br /&gt;
| A. Excellent&lt;br /&gt;
| 90-100&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 85–100&lt;br /&gt;
|-&lt;br /&gt;
| B. Good&lt;br /&gt;
| 75-89&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 70–84&lt;br /&gt;
|-&lt;br /&gt;
| C. Satisfactory&lt;br /&gt;
| 60-74&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 50–69&lt;br /&gt;
|-&lt;br /&gt;
| D. Poor&lt;br /&gt;
| 0-59&lt;br /&gt;
|align=&amp;quot;center&amp;quot;|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Resources and reference material ===&lt;br /&gt;
&lt;br /&gt;
* M. Kolanovic, ''Big Data and AI Strategies'', JP Morgan, 2017.&lt;br /&gt;
* K. Kim, ''Electronic and Algorithmic Trading Technology: The Complete Guide'', Elsevier, 2007.&lt;br /&gt;
* M. Durbin, ''All About High-Frequency Trading'', McGraw-Hill, 2010.&lt;br /&gt;
* I. Aldridge, ''High-Frequency Trading'', Wiley, 2010.&lt;br /&gt;
&lt;br /&gt;
== Course Sections ==&lt;br /&gt;
&lt;br /&gt;
The main sections of the course and the approximate distribution of teaching hours between them are as follows:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|+ Course Sections&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Section'''&lt;br /&gt;
! '''Section Title'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Teaching Hours'''&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
| Introduction to Data Mining&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 2&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 2&lt;br /&gt;
| Data Mining and Data Analysis in Financial Trading&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 4&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 3&lt;br /&gt;
| Micro-Structure of Financial Markets&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 16&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 4&lt;br /&gt;
| Feature Engineering for Machine Learning in Financial Trading&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 4&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 5&lt;br /&gt;
| Descriptive Statistics of Financial Trading Data&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 4&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 6&lt;br /&gt;
| Principal and Independent Component Analysis (PCA and ICA)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 8&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 7&lt;br /&gt;
| Machine Learning Methods for Prediction of Financial Data&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 16&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Section 1 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Introduction to Data Mining&lt;br /&gt;
&lt;br /&gt;
=== Topics covered in this section: ===&lt;br /&gt;
&lt;br /&gt;
* The subject, problems and methods of Data Mining. Data Mining as a Data Engineering discipline.&lt;br /&gt;
* Relationships between Data Mining, Data Acquisition, Big Data Analysis and Machine Learning.&lt;br /&gt;
* A typical data processing workflow.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Is Data Mining a science or an engineering discipline?&lt;br /&gt;
# Explain a typical data processing workflow from Acquisition to Analysis and the role of Data Mining in this process.&lt;br /&gt;
# What are the main problems and methods of Data Mining?&lt;br /&gt;
# How is Data Mining related to Data Analysis and Machine Learning?&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for seminar classes (labs) within this section ===&lt;br /&gt;
&lt;br /&gt;
None&lt;br /&gt;
&lt;br /&gt;
=== Test questions for final assessment in this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain the main methods of detecting data outliers and gaps.&lt;br /&gt;
# Do you concur with the statement that Machine Learning algorithms are in general capable of discerning arbitrarily-complex relationships in data, provided that the training dataset is large enough?&lt;br /&gt;
# Explain how Data Mining can be used to facilitate efficient Machine Learning.&lt;br /&gt;
&lt;br /&gt;
=== Section 2 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Data Mining and Data Analysis in Financial Trading&lt;br /&gt;
&lt;br /&gt;
=== Topics covered in this section: ===&lt;br /&gt;
&lt;br /&gt;
* Financial Asset Classes: Equities, Currencies, Commodities, Interest Rate Products and others.&lt;br /&gt;
* Financial Instruments, Trading and Exchanges.&lt;br /&gt;
* Stochastic dynamics of prices and trading volumes of financial instruments. The notion of stochastic differential equations. Trends and Volatilities.&lt;br /&gt;
* The nomenclature of professional specializations in Quantitative Finance related to data science and data engineering: Quantitative Analysts, Quantitative Researchers, Quantitative Developers, Research Analysts. Their relationship to Data Mining.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# What are Equities as a financial asset class?&lt;br /&gt;
# Explain the difference between the QA and QR professional specializations.&lt;br /&gt;
# Which professional specialization in quantitative finance is most concerned with Data Mining?&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for seminar classes (labs) within this section ===&lt;br /&gt;
&lt;br /&gt;
None&lt;br /&gt;
&lt;br /&gt;
=== Test questions for final assessment in this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain the components of stochastic dynamics of financial assets (HINT: Trends and Volatilities).&lt;br /&gt;
# Provide two definitions of the Trend, and explain how they are related to each other.&lt;br /&gt;
# What are the main differences between stochastic dynamics of Equity and IRP prices? In the price prediction problem, where would the role of data science be more important?&lt;br /&gt;
# Explain the differences between QA and QR specializations regarding the subject of their research.&lt;br /&gt;
&lt;br /&gt;
=== Section 3 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Micro-Structure of Financial Markets&lt;br /&gt;
&lt;br /&gt;
==== Topics covered in this section: ====&lt;br /&gt;
&lt;br /&gt;
* The “mechanics” of Exchange-based trading in financial instruments.&lt;br /&gt;
* Orders, Order Books, Bids and Asks, Price Levels and Order Volumes.&lt;br /&gt;
* Bid-Ask Spread, Passive and Aggressive Orders, Order Matching, Trades.&lt;br /&gt;
* Limit and Market orders, semantics of order execution.&lt;br /&gt;
* Formats of historical Market Data: L1, L2 and L3 (Full Orders Log) data.&lt;br /&gt;
* Example: Market Data for Moscow Exchange (FX and Equities sections).&lt;br /&gt;
* Compiling Order Book data from a stream of Full Order Log data.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain how the Matching Engine of a financial exchange works.&lt;br /&gt;
# Is it correct to say that a Passive order is always a limit one? Is the converse true?&lt;br /&gt;
# What happens to a limit order if it is not completely filled at its limit price?&lt;br /&gt;
&lt;br /&gt;
==== Typical questions for seminar classes (labs) within this section ====&lt;br /&gt;
&lt;br /&gt;
PROJECT WORK: Develop a Data Acquisition and Data Mining software solution in Python which reads historical market data of Moscow Exchange from files (in a Full Orders Log format) and:&lt;br /&gt;
&lt;br /&gt;
# selects the applicable Instruments;&lt;br /&gt;
# for each applicable Instrument composes a sequence of Order Book snapshots;&lt;br /&gt;
# correctly applies New, Cancel and Modify (Trade) records from the Full Orders Log;&lt;br /&gt;
# efficiently detects and corrects data anomalies / errors (e.g. invalid trade prices or sizes, stale orders etc);&lt;br /&gt;
# provides hooks for outputting the Order Book features which can subsequently be used for machine learning purposes.&lt;br /&gt;
&lt;br /&gt;
==== Test questions for final assessment in this section ====&lt;br /&gt;
&lt;br /&gt;
# What are Market Orders and how are they recognized in MOEX Full Orders Log?&lt;br /&gt;
# What are Bid-Ask collisions, why could they occur and how are they resolved in the project software solution?&lt;br /&gt;
# What are the time and space complexities of the algorithm implemented in the project?&lt;br /&gt;
&lt;br /&gt;
=== Section 4 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Feature Engineering for Machine Learning in Financial Trading&lt;br /&gt;
&lt;br /&gt;
==== Topics covered in this section: ====&lt;br /&gt;
&lt;br /&gt;
* From order book snapshots to ML features.&lt;br /&gt;
* The uniformity requirements in feature engineering.&lt;br /&gt;
* Volume-Weighted Average Prices (VWAPs) over uniform Size bands. VWAP-based mid-prices and Bid-Ask spreads.&lt;br /&gt;
* Order Volumes over uniform Price Step bands.&lt;br /&gt;
* Trades and “market pressure” over uniformly-defined time intervals. Using Exponential Moving Averages (EMA filters).&lt;br /&gt;
* Derived features (e.g. logarithms).&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain why we need VWAPs and how they are computed.&lt;br /&gt;
# Explain why we need to use a uniform grid of price levels in Volumes computation.&lt;br /&gt;
# Why may we need logarithms of features in addition to features themselves?&lt;br /&gt;
&lt;br /&gt;
==== Typical questions for seminar classes (labs) within this section ====&lt;br /&gt;
&lt;br /&gt;
PROJECT WORK: Based on the solution implemented in Section 3, provide generation of standard features from Order Book snapshots of financial instruments traded at Moscow Exchange.&lt;br /&gt;
&lt;br /&gt;
==== Test questions for final assessment in this section ====&lt;br /&gt;
&lt;br /&gt;
# What are the potential (adverse) effects of order book data errors on features generation?&lt;br /&gt;
# How do we manage the spatial complexity of storing the features generated from Order Books?&lt;br /&gt;
# Explain the “reciprocity” between VWAPs and Order Volumes, and why we may need both kinds of features.&lt;br /&gt;
# What is “market pressure” and how is it computed?&lt;br /&gt;
&lt;br /&gt;
=== Section 5 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Descriptive Statistics of Financial Trading Data&lt;br /&gt;
&lt;br /&gt;
==== Topics covered in this section: ====&lt;br /&gt;
&lt;br /&gt;
* Distribution of Order Volumes by price depth. Single-humped and two-humped distribution densities, their typical occurrences in different financial instruments.&lt;br /&gt;
* Distribution of the intervals between arrivals of Aggressive orders: exponential-type distribution.&lt;br /&gt;
* Hypothesis testing on distributions: the &amp;lt;math display=&amp;quot;inline&amp;quot;&amp;gt;\chi^2&amp;lt;/math&amp;gt; and Kolmogorov–Smirnov criteria.&lt;br /&gt;
* Correlation analysis of financial time series. Avoiding pitfalls:&lt;br /&gt;
** the centricity and stationarity requirements;&lt;br /&gt;
** using finite differences and fractional-order differentiation.&lt;br /&gt;
* Lead-lag analysis and the Hayashi–Yoshida method.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain the typical differences in Order Volumes distribution between Futures and Spot FX instruments.&lt;br /&gt;
# What are the typical pitfalls in correlation analysis?&lt;br /&gt;
# What are the purpose and methods of Lead-Lag analysis?&lt;br /&gt;
&lt;br /&gt;
==== Typical questions for seminar classes (labs) within this section ====&lt;br /&gt;
&lt;br /&gt;
PROJECT WORK: Based on the solutions implemented in Sections 3 and 4, perform:&lt;br /&gt;
&lt;br /&gt;
# Order Volume distribution analysis for Spot FX instruments;&lt;br /&gt;
# distribution analysis of intervals between order book updates;&lt;br /&gt;
# distribution analysis of intervals between trades;&lt;br /&gt;
# lead-lag analysis between USD/RUB and EUR/RUB instruments.&lt;br /&gt;
&lt;br /&gt;
==== Test questions for final assessment in this section ====&lt;br /&gt;
&lt;br /&gt;
# Explain the hypothesis testing methods for distribution densities of random variables.&lt;br /&gt;
# Explain the purpose and the technique of fractional-order differentiation of time series.&lt;br /&gt;
# Explain the Hayashi–Yoshida method.&lt;br /&gt;
&lt;br /&gt;
=== Section 6 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Principal and Independent Component Analysis (PCA and ICA)&lt;br /&gt;
&lt;br /&gt;
==== Topics covered in this section: ====&lt;br /&gt;
&lt;br /&gt;
* The objectives of component analysis.&lt;br /&gt;
* Risk Factors for prices of financial instruments.&lt;br /&gt;
* PCA: an “empirical” method.&lt;br /&gt;
* PCA: a rigorous method based on stochastic differential equations.&lt;br /&gt;
* Hypotheses testing for residual Brownian motions.&lt;br /&gt;
* From PCA to ICA.&lt;br /&gt;
* ICA methods and interpretation of results.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain the objectives of the component analysis methods.&lt;br /&gt;
# Explain the differences between the “empirical” and the SDE-based PCA methods.&lt;br /&gt;
# How is independence of components achieved in ICA?&lt;br /&gt;
&lt;br /&gt;
==== Typical questions for seminar classes (labs) within this section ====&lt;br /&gt;
&lt;br /&gt;
PROJECT WORK: Based on the solutions implemented in Sections 3–5, perform:&lt;br /&gt;
&lt;br /&gt;
# “empirical” PCA of the time series for major FX and Equity instruments at MOEX;&lt;br /&gt;
# SDE-based PCA of the same instruments (assuming constant covariance matrix but non-constant reversion terms), and test the hypothesis for the residual Brownian motions;&lt;br /&gt;
# ICA under the same conditions as above, and try to interpret the risk factors obtained.&lt;br /&gt;
&lt;br /&gt;
==== Test questions for final assessment in this section ====&lt;br /&gt;
&lt;br /&gt;
# Explain the SDE-based approach to PCA and ICA.&lt;br /&gt;
# Explain how the residual stochastic innovations are constructed and tested for being Brownian motions.&lt;br /&gt;
# What are the advantages of ICA over PCA?&lt;br /&gt;
&lt;br /&gt;
=== Section 7 ===&lt;br /&gt;
&lt;br /&gt;
==== Section title: ====&lt;br /&gt;
&lt;br /&gt;
Machine Learning Methods for Prediction of Financial Data&lt;br /&gt;
&lt;br /&gt;
==== Topics covered in this section: ====&lt;br /&gt;
&lt;br /&gt;
* Recap of Machine Learning concepts relevant to financial time series: Supervised Learning (SL) and Reinforcement Learning (RL).&lt;br /&gt;
* Recap of SL methods: explicit regression models, SVMs, gradient boosting methods, Artificial Neural Nets (ANNs).&lt;br /&gt;
* The problem of predicting financial time series in Algorithmic Trading.&lt;br /&gt;
* The importance of feature engineering over the “best non-linear method” selection.&lt;br /&gt;
* The danger of over-fitting and the methods of controlling it.&lt;br /&gt;
&lt;br /&gt;
=== What forms of evaluation were used to test students’ performance in this section? ===&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! '''Form of evaluation'''&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| '''Used (1 = Yes, 0 = No)'''&lt;br /&gt;
|-&lt;br /&gt;
| Development of individual parts of software product code&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Homework and group projects&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Midterm evaluation&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|-&lt;br /&gt;
| Testing (written or computer based)&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Reports&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Essays&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Oral polls&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 0&lt;br /&gt;
|-&lt;br /&gt;
| Discussions&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| 1&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Typical questions for ongoing performance evaluation within this section ===&lt;br /&gt;
&lt;br /&gt;
# Explain the differences between Reinforcement Learning and Supervised Learning.&lt;br /&gt;
# Explain why careful feature engineering is so important in Machine Learning in general, and for predicting financial time series in particular.&lt;br /&gt;
# Explain the objectives and potential time horizons of price prediction in Algorithmic Trading.&lt;br /&gt;
&lt;br /&gt;
==== Typical questions for seminar classes (labs) within this section ====&lt;br /&gt;
&lt;br /&gt;
PROJECT WORK: Based on the solutions implemented in Sections 3–6, implement price prediction methods for USD/RUB instruments at MOEX:&lt;br /&gt;
&lt;br /&gt;
# apply the features constructed in Section 4;&lt;br /&gt;
# apply an explicit linear regression with Lasso regularization;&lt;br /&gt;
# then construct an ANN and compare the quality of predictions.&lt;br /&gt;
&lt;br /&gt;
==== Test questions for final assessment in this section ====&lt;br /&gt;
&lt;br /&gt;
# What is over-fitting in ML and how can it be controlled?&lt;br /&gt;
# Explain the similarities and differences between SVMs and ANNs.&lt;br /&gt;
# Describe in detail the API of the Machine Learning library you used in your Lab Project.&lt;/div&gt;</summary>
		<author><name>10.90.136.11</name></author>
	</entry>
</feed>