🦁 LionTalk
Brought to you by LionDevelopers @ Columbia University
Upcoming Seminars
Machine learning and symmetries
Abstract
Symmetries play a significant role in machine learning. In scientific applications, they often arise as constraints imposed by physical laws. More broadly, symmetries emerge whenever an object admits multiple equivalent representations (for example, in graph machine learning). In addition, modern machine learning models are heavily overparameterized, so many distinct sets of parameters can represent the same function, revealing further underlying symmetries. In this talk, we describe methods for incorporating symmetries into machine learning models using classical tools from algebra, including invariant theory and Galois theory. A particularly interesting feature of symmetry-preserving models is that they can be defined independently of the size or dimension of the input. The formalization of this setting, known as any-dimensional machine learning, is inspired by ideas from representation stability. We present a theoretical framework for understanding the assumptions imposed by such models, which allows us to align learning models with data of varying sizes and learning tasks in a principled way. Any-dimensional models use a fixed set of parameters and can be evaluated on data of varying sizes. Hyperparameter transfer considers the complementary setting, in which the data are fixed while the model size varies, and studies how optimal hyperparameters (such as the learning rate) can be transferred from smaller models to larger ones. If time permits, we will also discuss recent connections between any-dimensional machine learning and hyperparameter transfer.
Speaker Bio
Soledad Villar is an Assistant Professor of Applied Mathematics and Statistics at Johns Hopkins. Her research is in the mathematical foundations of deep learning, including geometric deep learning. She has received an NSF CAREER award and a Sloan Fellowship in Mathematics, and her research has been supported by ONR, Apple, Amazon, and the Simons Foundation. She is originally from Uruguay.
Who Marries Whom in China? Education, Sorting, and Marital Surplus
Abstract
This seminar will cover topics related to education, sorting, and marital surplus in China.
Speaker Bio
PhD Student in Economics at Columbia University.
Who should buy? Public Procurement and Bureaucratic Quality
Abstract
This seminar will explore public procurement and bureaucratic quality.
Speaker Bio
PhD Student in Economics at Columbia University.
To be announced.
Abstract
The topic for this seminar is to be announced.
Speaker Bio
PhD Student in Labour and Public Economics at Paris School of Economics.
Finance, Market Power, and Misallocation
Abstract
This seminar will discuss finance, market power, and misallocation.
Speaker Bio
PhD Student in Economics at Columbia University.
Geoeconomic Financial Fragmentation
Abstract
This seminar will cover geoeconomic financial fragmentation.
Speaker Bio
PhD Student in Economics at Columbia University.
Cost of Wildfire Smoke: A Revealed Preferences Approach
Abstract
Although wildfire smoke is a major public health concern, relatively little is known about smoke avoidance behavior and its associated costs. In this paper, I estimate the welfare impacts of wildfire smoke avoidance during the 2018 California wildfire season, one of the deadliest in recent US history. I leverage granular smartphone location data to construct daily ZIP-to-ZIP trip flows for different types of trips. In particular, I develop a methodology to identify outdoor recreation trips by overlaying individual smartphone pings with GIS data on parks and other protected areas in California. I jointly estimate trip elasticities to wildfire smoke and travel time using a gravity framework. I find a strong decrease in outdoor trips to smoky destinations. I build a three-tiered logit to model the agents’ decision to leave their home, choose an activity, and choose a destination. I use this framework to study the impact of wildfire smoke on welfare.
Speaker Bio
PhD Student in Sustainable Development at Columbia University.
To be announced.
Abstract
The topic for this seminar is to be announced.
Speaker Bio
PhD Student in Industrial Engineering and Operations Research at Columbia University.
To be announced.
Abstract
The topic for this seminar is to be announced.
Speaker Bio
Professor of Political Science at Stanford University.
IICD Seminar Series: Andrea Sottoriva, Human Technopole
Abstract
N/A
Speaker Bio
Andrea Sottoriva, Human Technopole.
Signed and Directed Graph Clustering in Financial Time Series: Statistical Arbitrage, Lead-Lag Structure, and the Market Tug-of-War
Abstract
We develop spectral methods for clustering heterogeneous networks, in the setting of signed and directed networks, and demonstrate their benefits on networks arising from stochastic block models and financial multivariate time series data, where one is often interested in clustering assets that exhibit similar contemporaneous behavior. We demonstrate the economic benefits of the proposed graph clustering algorithms in statistical arbitrage and portfolio construction applications. Both signed and directed graph clustering problems share an important common feature: they can be solved by exploiting the spectrum of certain graph Laplacian matrices or derivations thereof, allowing for performance guarantees under suitably defined stochastic block models. We further develop a likelihood-based spectral clustering framework for directed graphs, providing a principled objective function along with theoretical guarantees. A task of major interest in financial applications is that of uncovering lead-lag relationships in high-dimensional multivariate time series. In such settings, certain groups of variables partially lead the evolution of the system, while other variables follow with a time delay, resulting in a lead-lag structure that can be encoded as edges of a directed network. Detecting clusters exhibiting a notion of pairwise flow imbalance amounts to identifying baskets of assets that lead and lag each other. We leverage graph clustering and ranking algorithms for lead-lag detection, and demonstrate that our methodology identifies statistically significant lead-lag clusters in the US equity market. We study the composition of the uncovered clusters, compare performance across time frequencies, and benchmark against established approaches from the lead-lag literature for portfolio construction. 
In addition, we uncover a market-wide "tug-of-war", whereby overnight speculation and daytime price correction propagate across stocks through directed lead-lag relations, giving rise to economically meaningful cross-asset trading opportunities.
Speaker Bio
N/A
IICD Workshop Series: Parameter Estimation Workshop
Abstract
N/A
Speaker Bio
N/A
Learning Multi-Index Models via Harmonic Analysis
Abstract
Modern machine learning heavily relies on the success of large-scale models trained via gradient-type algorithms. A major effort in recent years has been to understand the fundamental limits of these learning algorithms: What governs the complexity of gradient-based training? Which distributions can these methods learn efficiently? What are the underlying computational-statistical trade-offs? In this talk, we focus on a key property of generic gradient-based methods: their equivariance with respect to a large symmetry group. We develop a group-theoretic framework to analyze the complexity of learning a fixed distribution with an equivariant algorithm. This framework reveals a natural factorization of the group-distribution pair and suggests an optimal sequential, adaptive learning process. We illustrate this framework in the classical problem of learning multi-index models and characterize the computational-statistical trade-offs in this setting. We conclude by revisiting several existing algorithms through this lens. This is based on joint work with Hugo Latourelle-Vigeant, Hugo Koubbi, Nirmit Joshi, and Nati Srebro.
Speaker Bio
Theodor Misiakiewicz is an Assistant Professor in the Department of Statistics and Data Science at Yale University. His research focuses on deep learning theory, high-dimensional statistics, and computational learning theory. Prior to Yale, he received his PhD from Stanford University in 2023 and was a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC) in 2023-2024.