Barbara Mellers, Psychology, University of Pennsylvania
Abstract: Psychological Science, a leading journal in the field of psychology, recently published a paper by Hauenstein et al. (2024) in which the authors reanalyzed data from the Good Judgment Project (e.g., Mellers et al., 2014) and argued that IARPA's tournaments permitted method variance: forecasters could select their own questions and make forecasts whenever they wished while the forecast window was open. This method variance, they argued, was distinct from forecaster ability. Some forecasters may have appeared more accurate than they actually were by "gaming" the tournament: they could select the easier questions and make their predictions when an event was about to resolve (and when they had the most information).
Hauenstein et al. developed a model that distinguished between latent forecaster ability and two latent strategic factors (question selection and timing). After fitting their model to our data, they claimed that our interventions of probability training and placing forecasters in teams had detrimental effects on forecasting ability. We disagree. The selection of questions and the timing of submissions are not bugs but features: skills that accurate forecasters must have. We believe probability training and working in teams help forecasters become more accurate.
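To make the contested decomposition concrete, here is a minimal simulation sketch in the spirit of their argument (this is not Hauenstein et al.'s model; every variable name and effect size below is an illustrative assumption) showing how naive mean accuracy can conflate latent ability with question selection and timing:

```python
import numpy as np

rng = np.random.default_rng(0)
n_forecasters, n_questions = 200, 100

# All names and effect sizes below are illustrative assumptions,
# not quantities from Hauenstein et al.'s fitted model.
ability = rng.normal(0.0, 1.0, n_forecasters)     # latent forecasting skill
difficulty = rng.normal(0.0, 1.0, n_questions)    # latent question difficulty

# Strategic factor 1 (question selection): "gamers" favor easy questions.
gaming = rng.uniform(0.0, 1.0, n_forecasters)     # propensity to cherry-pick
pick_prob = 1.0 / (1.0 + np.exp(np.outer(gaming, difficulty)))
picked = rng.random((n_forecasters, n_questions)) < pick_prob

# Strategic factor 2 (timing): waiting until just before resolution helps.
procrastination = rng.uniform(0.0, 1.0, n_forecasters)
lateness = np.clip(procrastination[:, None]
                   + rng.normal(0.0, 0.2, (n_forecasters, n_questions)),
                   0.0, 1.0)

# Observed accuracy confounds ability with both strategic factors.
accuracy = (ability[:, None] - difficulty[None, :] + 2.0 * lateness
            + rng.normal(0.0, 0.5, (n_forecasters, n_questions)))
accuracy = np.where(picked, accuracy, np.nan)

# Naive mean accuracy tracks the strategic factors as well as ability,
# even though all three latents were drawn independently by construction.
mean_acc = np.nanmean(accuracy, axis=1)
for name, latent in [("ability", ability), ("gaming", gaming),
                     ("procrastination", procrastination)]:
    r = np.corrcoef(mean_acc, latent)[0, 1]
    print(f"corr(mean accuracy, {name}): {r:+.2f}")
```

In this toy setup, both strategic factors inflate apparent accuracy even though they are statistically independent of ability; separating that pattern from genuine skill is exactly what is at issue between their account and ours.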
We would like to conduct an adversarial collaboration with Hauenstein et al. (2024). In this collaboration, we would first "hold constant" the forecasting questions: all forecasters would make predictions on all events. Second, we would "hold constant" timing by ensuring that all forecasters made their predictions within the same period. By manipulating probability training and teaming in this tournament, we could empirically disentangle Hauenstein et al.'s claims from our own.
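As a sketch of how data from such a tournament could be summarized (the 2 x 2 structure reflects the two manipulations; the cell means are assumptions for illustration, not predictions of the outcome): with questions and timing fixed for everyone, differences in mean Brier scores can only reflect the training and teaming manipulations plus noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_cell, n_questions = 50, 60

# Assumed (illustrative) cell means on the Brier scale (lower is better);
# these numbers are placeholders, not forecasts of the result.
cell_means = {("no training", "solo"): 0.25,
              ("no training", "team"): 0.22,
              ("training", "solo"): 0.21,
              ("training", "team"): 0.18}

# Every forecaster answers every question inside the same window, so
# neither question selection nor timing can contribute method variance.
for (training, teaming), mu in cell_means.items():
    scores = np.clip(rng.normal(mu, 0.05, (n_per_cell, n_questions)), 0.0, 2.0)
    print(f"{training:<11s} x {teaming:<4s}: mean Brier = {scores.mean():.3f}")
```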

