Making AI safe with independent audits

Artificial intelligence (AI) systems are becoming ubiquitous in fields ranging from financial trading and transportation to cybersecurity and medical diagnostics.

However, high-profile accidents and problems with AI have become increasingly visible, making it critical to ensure the safety of these systems and the public’s trust in them.

A multidisciplinary and international team—composed of experts from government, industry and academia across six countries—has proposed independent auditing methods as a mechanism to enable reliable, safe and trustworthy AI systems.

Their proposal is outlined in “Governing AI safety through independent audits,” which was recently published in the journal Nature Machine Intelligence. The lead author is Gregory Falco, an assistant research professor at Johns Hopkins University. Ben Shneiderman, a Distinguished University Professor Emeritus with an appointment in the University of Maryland Institute for Advanced Computer Studies, is the paper’s second author.

“Since enforceable principles must capture a range of cases and risk considerations, our research team represents interdisciplinary fields of study and practice, including computer science, systems engineering, law, business, public policy and ethics,” says Shneiderman, whose added expertise lies in human-computer interaction.

The team proposes a three-pronged approach: conduct risk assessments to proactively identify and catalogue potential threats to public safety, design audit trails that capture the context of failures with high-fidelity data to support retrospective forensic analyses, and enforce adherence to safety requirements across diverse operating environments and legal jurisdictions.

Their independent audit model is based on financial auditing and accounting models, like the U.S. Securities and Exchange Commission’s. The researchers envision internal assessments and audits of AI systems as an annual process embedded in corporations, with courts and government agencies having the ability to institutionalize audits and expand requirements into law, just as they do in the financial sector.

Audit trails will provide high-fidelity data to identify and address system weaknesses, similar to how a flight data recorder enables aviation analysts to understand system failures and the actions that were taken to address them.

The researchers suggest enlisting insurance companies to enforce requirements, since insurers can pressure large corporations by, for example, setting lower premiums for self-driving cars with documented safety records. They also recommend that the courts enforce safety requirements for AI systems by issuing decisions that clarify the responsibilities of stakeholders.

“Although our framework will require testing and refinement, we anticipate that it will have the benefit of encouraging the ethical use of AI systems that are aligned with users’ values, and promote accountability that clarifies who is responsible for failures,” says Shneiderman.

Shneiderman's upcoming book on Human-Centered AI will expand on these themes, including 15 practical recommendations for implementation in commercial and research systems.

ISR would like to thank UMIACS and Maria Herd, who wrote this story.

Published September 2, 2021