News | June 10, 2022

Research by Yi He challenges ways of working and thinking within data science

In an interview with the Amsterdam School of Economics (ASE), Tinbergen Institute research fellow Yi He (University of Amsterdam) shares his fascination with the way data science applies mostly outdated mathematical models. He also elaborates on how his research will make clear exactly what kind of solution data scientists should use for a given problem.

Earlier studies by Yi focussed predominantly on statistical issues. Nowadays, he is more concerned with data science, acknowledging that, at the end of the day, he is more of a data scientist than a mathematician. So what does he see as the main difference? 'I base my ideas and theories on real-world situations and events. The research done by mathematicians is at a much higher level of abstraction,' he explains.

'Many data scientists use very clear and understandable mathematical solutions to formulate answers to concrete financial questions. But when you look closely at these mathematical solutions, you notice that they’re actually rather simplistic ‒ to the point of not giving you a very reliable answer. So I cast doubt on such solutions,' Yi says. 'To me, data science is all about using data in the best possible way to answer questions. And I believe it means moving off the beaten track. It really feels great to be doing this kind of research.'

He acknowledges that all predictions can be expected to have margins of error. But in some cases more complex methods work better, while in others 'simplicity is king'. Which method should be applied to which data set? 'That’s the issue we’re now trying to resolve.' And Yi is doing so with a new mathematical theory he is currently developing. 'It’s not, after all, simply a question of how, but just as much one of why. We need to understand why complex data models give us the best predictions in some situations but not others.' He concludes: 'It will save us a lot of time ‒ and spare us many illusions.'

This is an excerpt of the interview by ASE. Read the complete interview on their website.

The paper “Most powerful test against a sequence of high dimensional local alternatives”, authored by Yi He and co-authors Jiti Gao (Monash University, Australia) and Sombut Jaidee (Monash University, Australia), is forthcoming in the Journal of Econometrics: doi.org/10.1016/j.jeconom.2021.10.015.