Recently, a paper published in the Nature journal Nature Machine Intelligence reported that researchers at the University of Washington examined artificial intelligence (AI) models for detecting the novel coronavirus and found that these models are unstable and can produce diagnostic errors.
As for the cause of the misdiagnoses, the researchers argue that most of the models rely only on superficial features of the data, such as characteristics of a patient's chest X-ray, to judge whether the patient is infected with the novel coronavirus, rather than on genuine medical pathology.
The title of the paper is “AI for radiographic COVID-19 detection selects shortcuts over signal”.
Paper link: https://www.nature.com/articles/s42256-021-00338-7
1. AI improves the speed of diagnosis and treatment, but the models lack transparency
The application of artificial intelligence in medicine has improved the speed and accuracy of diagnosis and bought patients valuable treatment time. From the initial consultation and personalized treatment to predicting the success rate of surgery, artificial intelligence is set to become an indispensable part of the patient's care.
As the University of Washington researchers found, artificial intelligence can relieve pressure on physicians and offer patients a convenient, fast route to care. But if models that rely on "shortcut learning" are deployed in clinical settings, they can produce diagnostic errors.
"Doctors typically analyze X-ray images to identify specific patterns of disease development," said paper co-author Alex DeGrave, a doctoral student in the university's Paul G. Allen School of Computer Science & Engineering. "That is a mode of diagnosis that must be learned. A system that instead relies on shortcut learning can produce false diagnoses. For example, a shortcut-learning system might conclude that an elderly patient has a certain disease simply because that disease is more common in older patients. There's nothing inherently wrong with using such a 'shortcut', but its accuracy is not guaranteed."
The research team pointed out that this kind of shortcut learning is still at an early stage of research and development and is not yet reliable enough to stand in for professional physicians, so it should not be widely deployed. Meanwhile, team member DeGrave said, "A shortcut-learning model may only work in the hospital where it was developed; when it is applied to other hospitals, it will misdiagnose."
Shortcut-learning models lack transparency and are regarded by researchers in medicine and science as AI "black boxes": after a model has been trained on massive amounts of data, no one knows how it arrives at its diagnoses.
2. The AI models are unstable: accuracy halved on a second test
Recently, the research team examined such models for identifying the novel coronavirus. Owing to a lack of COVID-19 training data, these models are prone to a failure mode known as "worst-case confounding", in which extraneous factors confound the model's judgments. This indicates that the models rely on shortcut learning rather than on underlying medical pathology.
Joseph Janizek, another author of the paper and also from the Paul G. Allen School, said: "The AI is only performing pattern recognition and decision analysis on the data; it is not truly learning the pathology of the disease. When all the COVID-19-positive cases come from one dataset and all the negative cases come from another, misdiagnosis is likely. Researchers have proposed some remedies to reduce misdiagnosis, but with insufficient sample data these methods have little effect."
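The failure mode Janizek describes can be reproduced in a few lines. The sketch below is an illustrative toy, not the paper's actual pipeline: it builds a training set in which every positive case carries a source-specific artifact (a stand-in for one hospital's imaging quirks), fits a simple logistic classifier, and then tests it on "external" data where the artifact no longer tracks the label. The `make_data` helper, the artifact column, and all numbers are invented for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, marker_matches_label):
    """Toy 'chest X-ray' features: column 0 is a weak true pathology
    signal, column 1 is a source-specific artifact (e.g. one hospital's
    markings on the film)."""
    y = rng.integers(0, 2, n)
    signal = y + rng.normal(0, 1.5, n)          # weak, noisy real signal
    if marker_matches_label:
        marker = y + rng.normal(0, 0.1, n)      # artifact tracks the label
    else:
        marker = rng.normal(0, 1.0, n)          # artifact uninformative
    return np.column_stack([signal, marker]), y

def train_logreg(X, y, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def accuracy(w, b, X, y):
    return np.mean(((X @ w + b) > 0) == y)

# "Worst-case confounding": positives and negatives come from sources
# whose artifact perfectly separates them, so the model learns the shortcut.
X_tr, y_tr = make_data(2000, marker_matches_label=True)
w, b = train_logreg(X_tr, y_tr)

X_in, y_in = make_data(1000, marker_matches_label=True)     # same-source test
X_ext, y_ext = make_data(1000, marker_matches_label=False)  # external hospital
print(f"internal accuracy: {accuracy(w, b, X_in, y_in):.2f}")
print(f"external accuracy: {accuracy(w, b, X_ext, y_ext):.2f}")
```

Because the artifact separates the classes almost perfectly during training, the classifier leans on it rather than on the weak genuine signal; on external data the same model falls toward chance, the same pattern the team observed when moving models between hospital systems.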
The research team tested this in the published paper. First, they measured the models' initial performance. The researchers then re-tested the models on external data from a new hospital system.
While the models maintained high performance on the first test, their accuracy was halved on the second. The researchers attribute this to a "generalization gap", and take it as strong evidence that confounding factors were responsible for the models' success on the initial data. The research team then used explainable-AI techniques, generative adversarial networks (GANs) and saliency maps, to pinpoint which image features served as the models' key decision information.
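One of those techniques, gradient-based saliency, can be sketched compactly. In the hypothetical example below (not the paper's code), a shortcut, a corner "marker" pixel standing in for hospital-specific text or tokens on the film, drives a simple logistic model, and the saliency map exposes it. The image layout, marker position, and model are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_image(y):
    """Toy 8x8 'radiograph': a source-specific marker sits in the
    top-left corner; the (assumed) true pathology lives in the centre."""
    img = rng.normal(0, 0.3, (8, 8))
    img[0, 0] = y                  # shortcut: corner marker encodes the label
    img[3:5, 3:5] += 0.2 * y       # weak genuine signal
    return img

# Train a logistic model on flattened images with plain gradient descent.
ys = rng.integers(0, 2, 500)
X = np.stack([make_image(y).ravel() for y in ys])
w = np.zeros(64)
b = 0.0
for _ in range(1500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - ys) / len(ys)
    b -= 0.1 * (p - ys).mean()

# Gradient-based saliency: d(logit)/d(pixel). The logit of this model is
# linear in the pixels, so the saliency map is just |w| reshaped to 8x8.
saliency = np.abs(w).reshape(8, 8)
print("most influential pixel:", np.unravel_index(saliency.argmax(), (8, 8)))
```

The saliency map concentrates on the corner marker rather than the central "pathology", mirroring how such maps let the team see that their models were attending to laterality markers and other non-pathological image regions.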
The researchers then trained the models a second time on external data, which included positive and negative COVID-19 cases from similar