Statistics for Economists 2 – From Time Series Analysis to Machine Learning

Delivery institution

FACULTY OF SOCIAL SCIENCES
APPLIED ECONOMICS

Instructor(s):

Marton Gosztonyi, PhD

Start date

9 September 2025

End date

16 December 2025

Study field

CHARM priority field

Study level

Study load, ECTS

4

Short description

An advanced-level course focusing on modern statistical and machine learning techniques for economic research. Students will apply tools like ARIMA, logistic regression, PCA, random forests, and SHAP to real-world data, developing the skills to analyze time-dependent and high-dimensional datasets using Python.

Full description

This course builds on foundational knowledge of statistics, including key concepts such as probability, distributions, sampling, hypothesis testing, analysis of variance, and regression analysis. It is designed for students who have already gained experience in applying these basic techniques to economic problems. In this second-level course, learners advance to contemporary analytical methods used in modern economic research and data science, focusing on more complex, high-dimensional, and time-dependent data.
The course is designed to equip students with hands-on experience in Python while deepening their understanding of statistical reasoning and machine learning. Students will work with real datasets to build time series models (e.g., ARIMA), perform logistic regression and generalized linear modeling with regularization (lasso, ridge), apply dimensionality reduction techniques like PCA and factor analysis, and experiment with clustering and supervised learning models such as random forests and gradient boosting.
Rather than separating theory from practice, all sessions follow an integrated format in which students immediately apply concepts through coding and group analysis. The aim is to enable students to produce independent, robust, and clearly communicated economic analyses that meet both academic and applied research standards.

Learning outcomes

At the end of the course, the learner will be able to:
• Apply and interpret time series models for forecasting economic data
• Build and evaluate classification models using logistic regression and regularization
• Conduct dimensionality reduction using PCA and factor analysis
• Use clustering and unsupervised learning techniques for pattern discovery
• Implement and compare supervised machine learning algorithms (e.g., random forest, SVM)
• Interpret model results and communicate findings effectively

Course requirements

Students should have completed an introductory statistics course, or possess equivalent prior knowledge. Required competencies include understanding basic statistical principles (e.g., mean, variance), probability theory, distributions, sampling and estimation, hypothesis testing, analysis of variance (ANOVA), and linear regression. Students should also be comfortable interpreting statistical results and reasoning with real-world economic data. No prior programming experience is assumed, but a willingness to learn and use Python is essential.

Places available

20

Course literature (compulsory or recommended):

Compulsory
• James et al. (2023). An Introduction to Statistical Learning with Applications in Python
• Kenett et al. (2022). Modern Statistics: A Computer-Based Approach with Python
• McClave et al. (2017). Statistics for Business and Economics
Recommended
• Müller & Guido (2016). Introduction to Machine Learning with Python
• Hyndman & Athanasopoulos (2021). Forecasting: Principles and Practice
• Xiao (2022). Artificial Intelligence Programming with Python

Planned educational activities and teaching methods:

• Lectures
• Group assignments
• Term paper project
• Python-based exercises
• Weekly consultations

Language

Assessment method

• 6 collaborative homework assignments (35%)
• Group term paper using Python (40%)
• Final online exam with theoretical and interpretive questions (25%)
A minimum of 50% is required in each component to pass the course.

Final certification

Transcript of records

no

Assessment date

19 December 2025

Modality

Learning management System in use

Canvas

Contact hours per week for the student:

3.5 hours

Specific regular weekly teaching day/time

Tuesdays 14:00-17:00 CET

Time zone