Abstract
Governments face a range of policy challenges from AI technologies, which are developed and deployed faster than traditional governance approaches can keep pace with. As a result, private companies can deploy AI systems with substantial potential for harm or misuse into largely unregulated markets, while governments lack the means to scrutinize these systems in the ways needed to govern them. As AI capabilities advance, they will raise increasingly high-stakes policy challenges, making it ever more important that governments have the tools to intervene in ways that promote benefits and mitigate risks.
Jess Whittlestone and Jack Clark present a proposal for addressing this problem by investing in government initiatives to measure and monitor various aspects of AI research, deployment, and impacts, including:
- Continuously analyzing deployed systems for potential harms, and developing better ways to measure their impacts where such measures do not already exist.
- Tracking activity, attention, and progress in AI research using bibliometric analysis, benchmarks, and open-source data.
- Assessing the technical maturity of AI capabilities relevant to specific domains of policy interest.
Governments could use this measurement and monitoring infrastructure for a variety of purposes, including:
- Testing deployed systems to see if they conform to regulation.
- Incentivizing positive applications of AI by measuring and ranking deployed systems.
- More rigorous and coordinated approaches to impact assessment and assurance.
- Comparative analysis of the strength of countries’ AI ecosystems.
- Prioritizing funding and incentivizing research.
- Early warning systems for sources of risk or opportunity.
Building up this infrastructure will likely need to be an iterative process, beginning with small pilot projects. Promising pilot projects might include:
- Assessing the landscape of AI datasets to evaluate who they do and don't represent, then using these findings to fund the creation of datasets that fill the gaps.
- Using geographic bibliometric analysis to understand a country's competitiveness in key areas of AI research and development (an illustrative sketch follows this list).
- Hosting competitions to make it easy to measure progress in a given policy-relevant AI domain, such as competitions to find vulnerabilities in widely deployed vision systems, or to evaluate the advancing capabilities of smart industrial robots.
- Funding projects to improve assessment methods in commercially important areas (e.g., certain types of computer vision), to accelerate progress and commercial application in these areas.
- Tracking the deployment of AI systems for particular economically relevant tasks, in order to better forecast and ultimately prepare for the societal impacts of such systems.
- Monitoring concrete cases of harm caused by AI systems on a national level, to keep policymakers up to date on the current impacts of AI, as well as potential future impacts caused by research advances.
- Monitoring the adoption of or spending on AI technology across sectors, to identify the most important sectors to track and govern, and to surface generalizable insights about how to leverage AI technology in other sectors.
- Monitoring the share of key inputs to AI progress that different actors control (i.e., talent, computational resources and the means to produce them, and the relevant data), to better understand which actors policymakers will need to regulate and where intervention points are.
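As one illustration of how the geographic bibliometric pilot above might begin, the following Python sketch counts recent AI-tagged publications by author-affiliation country using the public OpenAlex API. This is a minimal sketch, not a method specified in the proposal: the concept ID, date cutoff, and use of raw publication counts as a competitiveness signal are all illustrative assumptions.

```python
"""Sketch of a geographic bibliometric pilot: count AI publications by country.

Assumptions (not from the paper): the OpenAlex "artificial intelligence"
concept ID below, a 2020 date cutoff, and raw counts as the metric.
"""
import requests

OPENALEX_WORKS = "https://api.openalex.org/works"
AI_CONCEPT_ID = "C154945302"  # assumed OpenAlex concept ID for "artificial intelligence"


def ai_paper_counts_by_country(since: str = "2020-01-01") -> dict[str, int]:
    """Return {country_code: paper_count} for AI-tagged works published since `since`."""
    params = {
        "filter": f"concepts.id:{AI_CONCEPT_ID},from_publication_date:{since}",
        "group_by": "authorships.institutions.country_code",
    }
    resp = requests.get(OPENALEX_WORKS, params=params, timeout=30)
    resp.raise_for_status()
    # OpenAlex returns grouped results as a list of {"key", "key_display_name", "count"}.
    return {g["key"]: g["count"] for g in resp.json()["group_by"]}


if __name__ == "__main__":
    counts = ai_paper_counts_by_country()
    # Rank countries by raw output as a crude first-pass competitiveness signal.
    for country, n in sorted(counts.items(), key=lambda kv: -kv[1])[:10]:
        print(f"{country}: {n}")
```

Raw publication counts are only a crude proxy; a real pilot along the lines the authors propose would likely want field-normalized, quality-weighted, and subfield-specific measures.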