Abstract
The landscape of AI safety is frequently explored from different angles: by contrasting specialised AI with general AI (or AGI), by analysing the short-term hazards of systems with limited capabilities against the longer-term risks posed by ‘superintelligence’, and by conceptualising sophisticated ways of bounding the control an AI system has over its environment and itself (impact, harm to humans, self-harm, containment, etc.). In this position paper we reconsider these three aspects of AI safety as quantitative factors (generality, capability and control), suggesting that, by defining metrics for these dimensions, AI risks can be characterised and analysed more precisely. As an example, we illustrate how to define these metrics and their values for some simple agents in a toy scenario within a reinforcement learning setting.
Proceedings of the Workshop on Artificial Intelligence Safety (SafeAI 2020), co-located with the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, USA, February 7, 2020.