Mathematics of Machine Learning
Machine learning and data science are vibrant areas of mathematical research that focus primarily on extracting meaningful patterns from large data sets. Their mathematical foundations rest on three fundamental topics:
- Approximation: What problems can be solved by a model (efficiently)?
- Generalization: How well can we do on a complex problem if we only have a finite amount of data to learn from?
- Optimization: Given data and a model, how do we find the best possible parameters?
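As a toy illustration of the optimization question, the following sketch fits a one-parameter linear model to noisy data by gradient descent on the mean squared error. The data, model, and learning rate are all illustrative, not taken from any specific research described here.

```python
import numpy as np

# Toy data from an assumed "true" model y = 3x plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.1 * rng.normal(size=100)

# Model: y ~ w * x. "Best possible parameters" here means the w
# minimizing the mean squared error over the data.
w = 0.0
lr = 0.5  # learning rate (illustrative)
for _ in range(100):
    grad = np.mean(2 * (w * x - y) * x)  # d/dw of the MSE loss
    w -= lr * grad

print(w)  # close to the true slope 3.0
```

Gradient descent is the simplest instance of the optimization question above; the research topics below concern more refined variants and their analysis.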
Some models used productively in machine learning take inspiration from mathematical analysis (partial differential equations on graphs, or geometric data embeddings) or from biology (neural networks). Research in the Department of Mathematics at Pitt focuses on the mathematical foundations of machine learning, as well as on collaborations with other research groups on foundations and applications of machine learning.
Approximation and Optimization in Deep Learning and Data Science
Machine learning research has two facets: Developing new models for broad classes of problems, and applying novel techniques to specific problems. While I have an interest in the latter, the focus of my work is on the former. I particularly enjoy working on problems in approximation theory (what kind of problems can neural networks solve?), optimization (how do I train a neural network to extract a pattern from a given dataset?) and at the interface of the two topics.
Specific topics in my research include: Function spaces for neural networks, momentum-based first order optimization algorithms, applications of partial differential equations in machine learning and applications of neural networks for solving partial differential equations. Most of my research involves mathematical analysis in some form.
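To make the phrase "momentum-based first order optimization algorithms" concrete, here is a minimal sketch of the classical heavy-ball (momentum) iteration on an ill-conditioned quadratic, compared against plain gradient descent. The matrix, step size, and momentum coefficient are illustrative choices, not values from the research itself.

```python
import numpy as np

# Minimize f(x) = 0.5 * x^T A x for an ill-conditioned quadratic,
# comparing plain gradient descent with heavy-ball momentum.
A = np.diag([1.0, 100.0])          # condition number 100 (illustrative)
grad = lambda x: A @ x             # gradient of the quadratic

x_gd = np.array([1.0, 1.0])        # gradient-descent iterate
x_hb = np.array([1.0, 1.0])        # heavy-ball iterate
v = np.zeros(2)                    # momentum "velocity"
lr, beta = 0.01, 0.9               # illustrative hyperparameters

for _ in range(200):
    x_gd = x_gd - lr * grad(x_gd)  # plain gradient descent
    v = beta * v - lr * grad(x_hb) # heavy-ball: accumulate velocity
    x_hb = x_hb + v

# Both converge to the minimizer (the origin), but momentum gets
# there markedly faster on the poorly conditioned direction.
print(np.linalg.norm(x_gd), np.linalg.norm(x_hb))
```

The momentum term can be read as a discretization of a second-order ODE (a ball rolling with friction), which is one reason such algorithms sit naturally at the interface of optimization and mathematical analysis.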
Theoretical Guarantees of Modern Machine Learning
Despite the immense and ubiquitous influence of machine learning, there is a lack of tools for analyzing why its successes work. Without knowing why a method works, users will not be able to fix it when it fails. This is a growing concern as more and more applications incorporate machine learning, even for delicate tasks such as self-driving cars and surgery. Hung-Hsu Chou is dedicated to developing theoretical guarantees for modern machine learning. The main questions he would like to address are:
- What properties do we expect from the results produced by these algorithms?
- What underlying principle can we extract from successful machine learning models?
- How can we design new problem-solving mechanisms based on the answers to the previous two questions?
In particular, Hung-Hsu Chou’s research topics include implicit bias/regularization, neural tangent kernel, edge of stability, conformal prediction, and adversarial attacks.
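Of the topics listed, conformal prediction lends itself to a short illustration. The sketch below shows split conformal prediction for regression, which wraps any point predictor in intervals with a finite-sample coverage guarantee; the predictor, data, and coverage level are all illustrative assumptions.

```python
import numpy as np

# Split conformal prediction for regression: turn any point predictor
# into prediction intervals with guaranteed finite-sample coverage.
rng = np.random.default_rng(1)

def model(x):
    # Stand-in for any fitted predictor (assumed given).
    return 2.0 * x

# Calibration set; the conformity score is the absolute residual.
x_cal = rng.uniform(0, 1, size=500)
y_cal = 2.0 * x_cal + rng.normal(scale=0.3, size=500)
scores = np.abs(y_cal - model(x_cal))

# Quantile level adjusted for finite samples (target: 90% coverage).
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

# Prediction interval for a new point: model(x) plus/minus q.
x_new = 0.5
interval = (model(x_new) - q, model(x_new) + q)
print(interval)
```

The appeal of this construction is that the coverage guarantee holds for any predictor, which is precisely the kind of distribution-free theoretical guarantee described above.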