This has been another year where I felt like I slacked on my reading, and that probably is genuinely true for the tumultuous last half, but my read folder lists 154 entries, so I can pick out a few that I liked to highlight in this end-of-year tradition. These are in approximate chronological order by date read, and as always I make no claim to depth or representativeness, so even in areas I follow there’s a lot on my should-read list. Topics shifted during the year along with major career changes: from LLMs and RL, to more classic econometrics, to a bit of computation and even some computational game theory, to applied economics. In the past year I’ve gone through full-time work at an AI startup, two academic job searches (one still ongoing: please hire me! I’m very willing to move abroad), Covid and other health issues, homelessness (I was fine, don’t worry about it), a visiting liberal arts teaching load with a new prep on short notice, and, um, a new gender, and still managed to get out two papers I’m proud of, with more ongoing. Since I am in self-promotion mode until someone gives me a job, my papers first:
- Zhou, Huang, Azizzadenesheli, Childers, and Lipton. “Timing as an Action: Learning When to Observe and Act” AISTATS 2024
- We derived reinforcement learning algorithms for what economists will recognize as the sticky information setting! This incorporates periods of non-observation but turns out to be easier than a POMDP, because the learner has the option to observe at any time, for a cost (a toy code sketch of the setup follows this list). Led by a dream team of Helen Zhou (experiments, healthcare applications) and Audrey Huang (badass RL theorist).
- Gupta, Lipton, and Childers. “Online Data Collection for Efficient Semiparametric Inference”
- Extends our previous work on Online Moment Selection, which combines data sources adaptively via a kind of Bandit Generalized Method of Moments, to the semiparametric/double machine learning case with nonparametric nuisance parameters (a schematic of the moment-combination idea also follows this list). Think adaptive experiments, but for a much broader class of possibly-observational settings and models. This also builds on some fun technical work on uniform-in-time confidence sequences.
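Since “observe at any time, for a cost” is easier to see in code than in prose, here is a toy sketch of the timing-as-an-action setup mentioned above. Everything in it — the environment, the dynamics, the fixed observation schedule — is made up for illustration and is not the paper’s model or algorithm.

```python
import random

class CostlyObservationEnv:
    """Toy chain where observing the state costs reward (illustrative only)."""

    def __init__(self, n_states=5, obs_cost=0.1):
        self.n_states = n_states
        self.obs_cost = obs_cost
        self.state = 0

    def step(self, action, observe):
        # Reward for acting on the true current state; observing costs obs_cost.
        reward = (1.0 if action == self.state else 0.0) - (self.obs_cost if observe else 0.0)
        # Random-walk dynamics (purely made up for the sketch).
        self.state = (self.state + random.choice([0, 1])) % self.n_states
        # The learner only sees the new state if it chose (and paid) to observe.
        return (self.state if observe else None), reward


env = CostlyObservationEnv()
last_obs, total = 0, 0.0
for t in range(100):
    observe = (t % 3 == 0)   # naive fixed schedule; the paper is about learning this timing
    obs, reward = env.step(action=last_obs, observe=observe)
    if obs is not None:
        last_obs = obs
    total += reward
print(f"total reward: {total:.2f}")
```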
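And here is the schematic version of the moment-combination idea behind Online Moment Selection, in my own generic shorthand rather than the paper’s notation: several data sources pin down a shared parameter through moment conditions, GMM stacks whatever has been sampled so far, and the sampling shares enter the asymptotic variance that the adaptive layer tries to shrink.

```latex
% Generic moment-combination setup (shorthand, not the paper's notation):
% each data source k identifies the shared parameter theta_0 through
\mathbb{E}\!\left[ g_k(Z_k;\theta_0) \right] = 0, \qquad k = 1,\dots,K,
% GMM stacks the sampled moments and minimizes the weighted quadratic form
\hat{\theta} = \arg\min_{\theta}\; \bar{g}(\theta)^{\top} \hat{W}\, \bar{g}(\theta),
% and the asymptotic variance depends on the share pi_k of the sampling
% budget spent on each source,
\operatorname{Avar}(\hat{\theta}) = V(\pi_1,\dots,\pi_K),
% which is what the adaptive ("bandit") layer tries to shrink online; the new
% paper swaps in Neyman-orthogonal moments with nonparametric nuisance parts.
```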
Papers I liked
- Stevenson. “Cause, Effect, and the Structure of the Social World”, BU Law Review, 2023.
- The Iron Law of Evaluation is that most social interventions do approximately nothing. What does that mean for social science?
- Agarwal, Suo, Chen, Hazan. “Spectral State Space Models”
- What gets referred to in ML as “State Space Models” are neural networks with linear layers that take sequences of arbitrary length as input and pass them through a structured linear filter, allowing sequence modeling without the weird permutation invariance that transformers rely on for parallelism, and without the sequential computation bottleneck of RNNs (a bare-bones version of the linear recursion is sketched at the end of the post). This area opened up, among other things, a chance to bring classical signal processing methods into neural-land; this paper addresses low-frequency behavior by prewhitening and projections onto a 1-sided filter.
- Curth, Jeffares, van der Schaar. “A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning”
- The double-descent phenomenon was always oddly placed in statistical learning theory, since the parameter counting underlying “overparameterized” networks has been known at least since Vapnik–Chervonenkis not to be the right measure of generalization performance. This paper explains that what’s going on in the cases where models enter this regime is that the definition of the model itself changes.
- See also: Curth on fixed vs. random design as another deflationary component of new results in generalization. For much more in this area, see the Simons Institute program this semester on “Modern Paradigms in Generalization”, particularly, as always, the boot camp. I found Nati Srebro’s talks (1, 2) the most enlightening on the broad scope of classical vs. modern generalization theory.
- Kaplan, Nikolakoudis, Violante. “Price Level and Inflation Dynamics in Heterogeneous Agent Economies”
- Recent events have revived interest in the Fiscal Theory of the Price Level, and getting a handle on it has been a big chunk of my macro reading this year. I find an Aiyagari-style heterogeneous agent approach simplifies analysis by giving a classic asset demand and supply characterization (the textbook valuation identity behind all of this is written out at the end of the post).
- See also: John Cochrane has written some good explainer papers (“Fiscal Histories”, “Fiscal Narratives”) and blog posts. I have yet to dive into his book or the voluminous critical literature, so I’m reserving judgement, but at minimum clarifying long-run fiscal assumptions in macro models is a helpful avenue to pursue.
- Koenker and Gu. “Empirical Bayes for the Reluctant Frequentist”
- Compound loss decision theory offers a good frequentist justification for empirical or hierarchical Bayes, and some practical procedures based on nonparametric maximum likelihood (the standard compound setup is written out at the end of the post). At minimum, specifying the decision problem gives you an answer to when and where hierarchy is useful.
- Luedtke. “Simplifying debiased inference via automatic differentiation and probabilistic programming”
- Reverse-mode autodiff for the von Mises semiparametric functional calculus! For those not in the know, the key tricky bit underlying all of “semiparametric/doubly robust/targeted/double machine learning” is the influence function, which is a weird kind of functional derivative, but until now not one amenable to automatic derivation, making manual calculation a huge part of the applied statistics apparatus (the one-step estimator the influence function plugs into is written out at the end of the post). This paper heralds a day, like the advent of automatic differentiation did for probabilistic programming and scientific machine learning, when the benefits of semiparametrics (parametric root-n confidence intervals for important functionals in the front, high-dimensional nonparametric ML in the back) can be exploited automatically in any model a user can think up. I raved about this on social media.
- Liu, Attias, Roy. “The Minimax Regret of Sequential Probability Assignment, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood”
- Characterizing optimal online learning for conditional densities in log loss. Maybe I’m just a sucker for new complexity measures, IDK.
- Benigno and Eggertsson. “Revisiting the Phillips and Beveridge Curves: Insights from the 2020s Inflation Surge”
- I was teaching a policy-oriented undergrad econ class for the first time in years this past semester, and this Jackson Hole paper was a great way to explain “where we are” with graphical analysis that can be followed by students with just a little macro under their belts.
- Xu, Min, Wang, Wang, Jordan, and Yang. “Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning”
- Aiyagari, with all the i’s dotted and t’s crossed. I have a lot of thoughts about where reinforcement learning can and can’t take us in macro that follow from the problems in Moll’s “The Trouble with Rational Expectations in Heterogeneous Agent Models: A Challenge for Macroeconomics”, but I think it’s going to take some serious grappling with strategic issues.
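As promised in the “Spectral State Space Models” entry, here is a bare-bones sketch of the generic linear state-space recursion. This is the vanilla linear-filter idea only, with made-up shapes and dynamics; it is not the spectral construction of Agarwal et al., and the unrolled-convolution trick is only gestured at in the comments.

```python
import numpy as np

def ssm_layer(u, A, B, C):
    """Apply the linear recursion x_t = A x_{t-1} + B u_t, y_t = C x_t."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:               # written sequentially for clarity; in practice the
        x = A @ x + B @ u_t     # same map is unrolled into a long convolution so it
        ys.append(C @ x)        # can be computed in parallel over time
    return np.stack(ys)

rng = np.random.default_rng(0)
d_state, d_in, d_out, T = 8, 3, 2, 16
A = 0.9 * np.eye(d_state)       # toy stable dynamics, nothing like the paper's filters
B = rng.normal(size=(d_state, d_in))
C = rng.normal(size=(d_out, d_state))
y = ssm_layer(rng.normal(size=(T, d_in)), A, B, C)
print(y.shape)  # (16, 2)
```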
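For the fiscal theory entry, here is the textbook valuation identity the whole debate orbits, in the simplest constant-discounting, representative-agent shorthand rather than anything from the Kaplan, Nikolakoudis, and Violante paper:

```latex
% Textbook government-debt valuation identity (constant-discounting,
% representative-agent shorthand, not the paper's heterogeneous-agent setup):
\frac{B_{t-1}}{P_t} \;=\; \mathbb{E}_t \sum_{j=0}^{\infty} \beta^{j}\, s_{t+j},
% i.e. the real value of outstanding nominal debt B_{t-1} equals expected
% discounted real primary surpluses s_{t+j}; with the surplus path and discount
% factor held fixed, the price level P_t is what adjusts, and the heterogeneous-
% agent version reads the same equation as supply of and demand for government
% debt as an asset.
```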
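For the Koenker and Gu entry, this is the standard Gaussian compound decision setup and the Kiefer–Wolfowitz NPMLE plug-in it motivates, in generic textbook notation rather than the paper’s:

```latex
% Standard Gaussian compound decision problem (textbook notation, not the paper's):
X_i \mid \theta_i \sim \mathcal{N}(\theta_i, 1), \qquad \theta_i \sim G \ \text{unknown},
% the Kiefer--Wolfowitz nonparametric MLE of the mixing distribution is
\hat{G} = \arg\max_{G} \sum_{i=1}^{n} \log \int \varphi(X_i - \theta)\, \mathrm{d}G(\theta),
% and the compound-loss-optimal rule under squared error is the plug-in posterior mean
\hat{\theta}_i = \mathbb{E}_{\hat{G}}\!\left[\, \theta \mid X_i \,\right].
```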
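And for the Luedtke entry, the object being automated is the influence function that appears in the usual one-step (debiased) correction; in generic notation, not the paper’s, for a target functional evaluated at a fitted distribution:

```latex
% Generic one-step / debiased estimator (standard notation, not the paper's):
\hat{\psi}_{\text{1-step}} \;=\; \psi(\hat{P}) \;+\; \frac{1}{n}\sum_{i=1}^{n} \varphi(Z_i;\, \hat{P}),
% where \varphi(\cdot\,; P) is the (efficient) influence function of the target
% functional \psi at P -- the derivative of \psi along paths through P -- and
% deriving \varphi by hand is exactly the step the paper automates with
% reverse-mode autodiff inside a probabilistic program.
```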