We are excited to announce that Alexandru Marcoci, Senior Research Associate in AI Risk and Foresight at CSER, has been awarded funding from Open Philanthropy’s “Benchmarking LLM agents on consequential real-world tasks” program.
Together with Abel Brodeur (University of Ottawa/Institute for Replication) and Rohan Alexander (University of Toronto), Marcoci will investigate the ability of LLMs to assess whether the primary findings of a research paper in the social and behavioural sciences reproduce successfully. In the coming year the team will organise a series of hackathons in which LLM-only, human-LLM and human-only teams will attempt to detect coding errors and discrepancies between the code and the article, computationally reproduce the results using both the authors' original software and a different programming language, and conduct robustness checks.