Foundations of Data Analysis and Machine Learning in Python
July 25-29, 2022 in Amsterdam, Zuidas
Faculty
Lukas Hoesch is an Assistant Professor at the Department of Econometrics and Data Science at the School of Business and Economics, Vrije Universiteit Amsterdam.
Ronald de Vlaming is an Assistant Professor at the Department of Economics at the School of Business and Economics, Vrije Universiteit Amsterdam.
Meet the lecturers.
Course
Research, policymaking, and business rely on ever-bigger data to answer wide-ranging questions. What are the risk factors for developing a disease? Which individuals should be charged a higher insurance premium? How can inflation best be forecast? How can online advertisements be targeted optimally? Machine learning techniques are well-suited to answer such data-driven questions.
In this course, you will learn about a wide variety of machine learning techniques, ranging from linear and non-linear regression models to dimensionality-reduction techniques, clustering methods, and deep learning using artificial neural networks. Special attention will be paid both to a theoretical understanding of the various methods and to real-life applications of the techniques using Python.
Schedule
Five days with the following structure:
09:00 – 12:00: Lectures
13:00 – 16:00: Tutorials
Covered topics:
- Day 1: Basic concepts: Data retrieval and processing: databases, web scraping, data lakes, feature engineering, and DataFrames. Estimating simple linear regression models. Data visualization using matplotlib, seaborn, and plotly.
- Day 2: Supervised learning – I: Linear and non-linear models for regression and classification: OLS, logit models, and generalized linear models. Model selection and cross validation.
- Day 3: Supervised learning – II: Regularization techniques: ridge and lasso. Kernel methods and the kernel trick for high-dimensional feature engineering. Decision trees and random forests. Bagging and boosting.
- Day 4: Unsupervised learning: Principal components analysis, K-means clustering, K-nearest neighbours, and Gaussian mixture models.
- Day 5: Deep learning: Neural networks, training and evaluation of neural networks, and convolutional and recurrent neural networks.
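To give a flavour of the Day 1 topic "estimating simple linear regression models", here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `fit_simple_ols` and the data are made up for illustration, and the tutorials may well use different tools.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals.

    Hypothetical helper for illustration only, not part of the course code.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Slope = sum of cross-deviations of (x, y) divided by
    # the sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_ols(x, y)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Because the example data contain no noise, the estimates recover the line exactly; with real data the same formulas give the least-squares fit.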
Literature
An Introduction to Statistical Learning. James G, Witten D, Hastie T, and Tibshirani R. Springer, 2013, ISBN: 978-1-4614-7137-0. Freely available at: https://link.springer.com/content/pdf/10.1007/978-1-4614-7138-7.pdf.
Statistical Learning with Sparsity. Hastie T, Tibshirani R, and Wainwright M. Chapman & Hall/CRC, 2015, ISBN: 978-1-4987-1216-3. Freely available at: https://hastie.su.domains/StatLearnSparsity_files/SLS_corrected_1.4.16.pdf.
Introduction to Machine Learning with Python. Mueller AC and Guido S. O’Reilly, 2016, ISBN: 978-1-4493-6941-5.
Level
The summer course welcomes (research) master students, PhD students, and post-docs who have a quantitative background and are interested in understanding and applying state-of-the-art machine-learning techniques for classification, prediction, and forecasting. We also welcome professionals from policy institutions, such as central banks, and from international firms and institutions.
Admission requirements
Basic knowledge of Python and Jupyter Notebooks, and intermediate knowledge of matrix algebra and statistics.
Academic Director | Ronald de Vlaming |
Degree Program | Certificate |
Credits | Participants who attend at least 80% of all sessions will receive a certificate of participation stating that the summer school is equivalent to a workload of 3 ECTS. Note that it is the student’s own responsibility to get these credits registered at their university. |
Mode | Short-term |
Language | English |
Venue | Tinbergen Institute Amsterdam, Gustav Mahlerplein 117, 1082 MS Amsterdam |
Capacity | 30 participants (minimum 15) |
Fees | Tuition Fees and Payments |
Application deadline | June 27, 2022 |
Apply here | Link to application form |
Contact
Summer School