Foundations of Machine Learning with Applications in Python
August 18-22, 2025 in Amsterdam, Zuidas
Faculty
Ronald de Vlaming is an Assistant Professor at the Department of Econometrics and Data Science at the School of Business and Economics, Vrije Universiteit Amsterdam.
Janneke van Brummelen is an Assistant Professor at the Department of Econometrics and Data Science at the School of Business and Economics, Vrije Universiteit Amsterdam.
Meet the lecturers.
Course
Research, policymaking, and business rely on ever-larger datasets to answer wide-ranging questions. What are the risk factors for developing a disease? How should the risk profile of a new customer be assessed when determining the appropriate insurance premium? How can unemployment best be forecast? How can online advertisements be targeted optimally? Machine-learning techniques are well suited to answering such data-driven questions.
In this course, we provide a fast-paced and solution-oriented introduction to machine-learning algorithms. Special attention is paid to the theoretical foundations of machine-learning algorithms, as well as real-life applications.
During the lectures, we will introduce you to a wide variety of machine-learning techniques, ranging from linear and nonlinear regression models to dimensionality-reduction techniques and clustering methods, as well as deep learning using neural networks.
During the lab sessions, we will guide you step by step through real-life case studies in economics, business, and medicine. We discuss how to implement machine-learning solutions, from conceptualizing the problem and implementing the appropriate techniques in Python, to evaluating the quality of your solution and ensuring its scalability, as well as overcoming challenges such as overfitting.
Learning Goals
After successfully completing this course, you will have the knowledge required to start solving problems in your own discipline using a wide range of machine-learning techniques. You will be able to communicate the core idea and intuition behind these techniques, you will understand their statistical foundations, and you will be able to reflect critically on their suitability for tackling the problem at hand. In addition, you will be able to implement simple machine-learning algorithms from scratch in Python, and you will be able to leverage existing machine-learning libraries such as scikit-learn and TensorFlow to engineer more complex solutions.
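To give a flavour of what implementing a simple algorithm from scratch in Python can look like, here is a minimal sketch that fits a one-feature linear regression by gradient descent on the mean squared error. It is an illustration only and is not taken from the course materials; the function name `fit_linear_gd`, the toy data, the learning rate, and the number of steps are all assumptions chosen for this example.

```python
def fit_linear_gd(xs, ys, learning_rate=0.05, n_steps=5000):
    """Fit y = intercept + slope * x by gradient descent on the mean squared error.

    Hypothetical helper for illustration; the default learning rate and step
    count are chosen to suit the small toy dataset below.
    """
    n = len(xs)
    intercept, slope = 0.0, 0.0
    for _ in range(n_steps):
        # Residuals of the current fit: prediction minus observed value.
        residuals = [intercept + slope * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_intercept = 2.0 * sum(residuals) / n
        grad_slope = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        # Take a small step against the gradient.
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


# Toy data generated exactly from y = 1 + 2x (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_linear_gd(xs, ys)
print(round(intercept, 3), round(slope, 3))
```

On noise-free data like this, the estimates should approach the true intercept and slope; with a learning rate that is too large for the scale of the features, the same loop would diverge, which is one reason library implementations are usually preferred for larger problems.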
Schedule
The course is a mix of lectures, tutorials, and computer labs, designed to foster an engaging and collaborative learning environment. The course spans five days, Monday to Friday.
Indicative timetable:
- 09:30 – 12:20: Lecture
- 12:20 – 13:00: Lunch
- 13:00 – 15:30: Interactive Lab Session
Program per day:
- Day 1: Introduction to Data Science and Machine Learning: Supervised and unsupervised learning, the machine-learning life-cycle, support-vector machines, data retrieval, data visualization, feature engineering, overfitting, regularization, and cross-validation.
- Day 2: Supervised Learning – Linear and Nonlinear Models: Linear models, generalized linear models, nonlinear models and their estimation using numerical methods, the logit model, nonparametric methods, and using scikit-learn for training and cross-validation of linear and nonlinear models.
- Day 3: Supervised Learning – Regularized Regression, Classification, and Trees: Overfitting and regularization revisited, ridge and lasso regression, high-dimensional feature spaces, kernel ridge regression, nested cross-validation, nonlinear support vector machines, classification and regression trees, ensemble learning, bagging and boosting, and random forests.
- Day 4: Unsupervised Learning: Similarity and distance metrics, K-means clustering, soft clustering using Gaussian mixture models, hierarchical clustering, dimensionality-reduction techniques in general, and principal-component analysis.
- Day 5: Deep Learning and Neural Networks: Applications of deep learning, the perceptron, activation functions, single-layer and multi-layer neural networks, building and training neural networks, stochastic gradient descent and backpropagation, practical hurdles in training neural networks, and recent advances in deep learning.
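As a rough illustration of how several of the topics above fit together (regularization, ridge regression, and cross-validation with scikit-learn), the following sketch selects a ridge penalty by 5-fold cross-validation on simulated data. It is not part of the course materials; the simulated dataset, the grid of `alpha` values, and the choice of R-squared as the score are assumptions made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Simulated regression data (an assumption; any feature matrix X and target y would do).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# 5-fold cross-validation with shuffling, fixed seed for reproducibility.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Candidate penalty strengths for ridge regression (illustrative grid).
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]

mean_scores = {}
for alpha in alphas:
    # Out-of-sample R-squared in each of the five folds.
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=cv, scoring="r2")
    mean_scores[alpha] = scores.mean()

# Pick the penalty with the highest average out-of-sample R-squared.
best_alpha = max(mean_scores, key=mean_scores.get)
print(best_alpha, round(mean_scores[best_alpha], 3))
```

Note that the cross-validated score of the selected penalty is an optimistic estimate of performance on new data, because the same folds were used to choose the penalty; nested cross-validation, listed under Day 3, addresses this.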
Literature
All methods that you learn to apply in this course are discussed in detail in the lectures. Hence, there is no required reading. However, we recommend the following books for background reading during the course:
- Hastie, Tibshirani, and Friedman (2009). The elements of statistical learning: data mining, inference, and prediction. 2nd edition. Springer. ISBN-13: 978-0387848570. Freely available at: https://hastie.su.domains/ElemStatLearn/printings/ESLII_print12_toc.pdf.
- Hastie, Tibshirani, and Wainwright (2015). Statistical learning with sparsity. 1st edition. Taylor & Francis. ISBN-13: 978-1498712163. Freely available at: https://hastie.su.domains/StatLearnSparsity_files/SLS_corrected_1.4.16.pdf.
In addition, the following books may serve as a useful reference:
- Provost and Fawcett (2013). Data science for business. 1st edition. O’Reilly. ISBN-13: 978-1449361327.
- James, Witten, Hastie, and Tibshirani (2013). An introduction to statistical learning. 1st edition. Springer. ISBN-13: 978-1461471370. Available at: https://link.springer.com/content/pdf/10.1007/978-1-4614-7138-7.pdf.
- Müller and Guido (2016). Introduction to machine learning with Python. 1st edition. O’Reilly. ISBN-13: 978-1449369415.
Level
The summer course welcomes (research) master's students, PhD students, and postdocs with a quantitative background who are interested in understanding and applying state-of-the-art machine-learning techniques for classification, prediction, and forecasting. We also welcome professionals from policy institutions such as central banks or international firms and institutions. You do not need prior experience working with machine-learning techniques. However, the course will move at a fast pace. Therefore, prior exposure to implementing statistical models such as linear regression and maximum-likelihood estimation will make it considerably easier to follow the course.
Admission requirements
Basic knowledge of Python and Jupyter Notebooks, and intermediate knowledge of matrix algebra and statistics.
Academic Director: Ronald de Vlaming
Degree Program: Certificate
Credits: Participants who joined at least 80% of all sessions will receive a certificate of participation stating that the summer school is equivalent to a workload of 3 ECTS. Note that it is the student’s own responsibility to get these credits registered at their university.
Mode: Short-term
Language: English
Venue: Tinbergen Institute Amsterdam, Gustav Mahlerplein 117, 1082 MS Amsterdam
Capacity: 30 participants (minimum 15)
Fees:
Application deadline: August 3, 2025