Evaluating artificial intelligence tools in health requires a wider lens than regulators have proposed to this point. That’s according to a pointed warning from researchers published in the data science journal Patterns.

How so? Advanced AI goes mostly unregulated today, but the few existing guidelines focus too narrowly on the tools’ performance alone, argue the researchers from Carnegie Mellon University, the Hospital for Sick Children, the Dalla Lana School of Public Health, Columbia University and the University of Toronto.

They say that evaluating AI must take into account the broader context in which it’s implemented. That means considering when health care workers should use the technology and how they should respond to its guidance.

The researchers lay out a six-part checklist for health care providers looking to adopt an AI tool. Each provider should have:

1. A use case, including how a proposed system is expected to help patients and improve health system efficiency or treatment equity

2. A clear specification of the task the system performs to achieve those goals

3. Benchmarks to evaluate the system’s success or failure

4. An understanding of how the system performs for different groups of people

5. A grasp of a system’s limits and when it shouldn’t be used

6. A protocol for monitoring systems to ensure they’re working under real-world conditions

Distinguishing real-world outcomes from those of the theoretical, best-case scenario is key, said co-author Alex John London, a professor of ethics and computational technologies at Carnegie Mellon: “Tools are not neutral. They reflect our values, so how they work reflects the people, processes, and environments in which they are put to work.”

Even so: The researchers said they’re not arguing that it’s necessary to fully understand how an AI system works before implementing it.
“Many interventions in medicine lack these properties in the sense that their clinical benefits have been demonstrated in well-designed trials but we do not know the precise mechanism by which they bring about that effect,” they write.