Mobile AI rivals specialists in diagnosing skin cancer, outperforms junior doctors
A recent Lancet Digital Health study assesses whether mobile phone-powered artificial intelligence (AI) could support the diagnosis and management of pigmented skin cancer.
Study: Comparison of humans versus mobile phone-powered artificial intelligence for the diagnosis and management of pigmented skin cancer in secondary care: a multicentre, prospective, diagnostic, clinical trial. Image Credit: Gordana Sermek / Shutterstock.com
Background
Advanced machine learning (ML) has been applied in the development of AI-based computer algorithms for the diagnosis of various diseases. One of the key advantages of AI-based diagnostic systems is their unrestricted availability.
The accuracy of AI-based diagnostic systems depends on multiple factors. In most cases, human experts and AI-assisted diagnostic systems have been compared in simulated settings, which might not reflect real clinical practice. However, this limitation does not apply to fields such as radiology, where direct physical contact with the patient is not required.
Most studies have shown that AI-based diagnostic systems can better diagnose skin cancer than human experts. In fact, the International Skin Imaging Collaboration (ISIC) 2018 Challenge indicated that computer algorithms can outperform top experts in the diagnosis of skin cancer.
A critical limitation of the ISIC 2018 Challenge was its design, as clinicians made diagnoses based on images on a computer screen without contextual information. Therefore, it is imperative to investigate whether the diagnostic superiority of AI holds in clinical settings where clinicians can directly examine skin lesions on patients.
About the study
The current multicenter, prospective, diagnostic clinical trial compared the decisions of the AI diagnostic algorithm of the ISIC 2018 Challenge with medical experts’ diagnoses in clinical settings. Subsequently, a new algorithm was developed based on data from current healthcare professionals.
Clinicians from the Sydney Melanoma Diagnostic Centre in Australia and the Department of Dermatology of the Medical University of Vienna in Austria were recruited.
These clinicians were classified as either specialists or novices. Specialists were experts with medical qualifications related to the diagnosis and management of pigmented skin lesions, whereas novices were dermatology junior doctors with unaccredited or accredited trainee positions and experience in examining and managing this disease.
Adult patients between the ages of 18 and 99 years were recruited. Participants selected for this clinical trial underwent routine excision or biopsy of suspicious pigmented skin lesions bigger than three millimeters (mm) in the longest diameter. These patients had a modified Fitzpatrick I-III skin type.
Both specialists and novices examined each patient’s whole-body skin without any verbal communication with the patient. Each clinician recorded their diagnosis, which was compared with the AI assessment. All patients were examined against their total-body photographs from baseline assessments.
Images of lesions were taken from selected patients using a dermoscopic tool attached to a mobile phone equipped with DermEngine software from MetaOptima Technology. This tool enabled taking polarized images without reflective artifacts.
Histopathological examination served as the reference test for lesions. Lesions that remained unchanged on total-body photographs were considered benign, while changed lesions underwent digital dermoscopy monitoring.
Study findings
A total of 172 suspicious pigmented lesions from 124 patients were included in the diagnostic study. For the management study, 5,696 pigmented lesions were included from 66 high-risk patients.
The management cohort was highly unbalanced, with 99.7% of lesions benign and 0.3% malignant. For this reason, the management analysis focused on specificity rather than sensitivity. The seven-class AI was slightly superior to the specialists and highly superior to the novices.
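To illustrate why a cohort this unbalanced calls for care in choosing a metric, the following minimal Python sketch computes sensitivity and specificity for a binary benign/malignant split. The cohort is hypothetical: it reproduces only the reported 99.7% to 0.3% class ratio on 1,000 made-up lesions, and the function name and the "call everything benign" classifier are assumptions for illustration, not anything taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = malignant, 0 = benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical cohort: 3 malignant and 997 benign lesions (0.3% vs. 99.7%).
y_true = [1] * 3 + [0] * 997

# A trivial classifier that labels every lesion as benign.
y_pred_all_benign = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred_all_benign)) / len(y_true)
sens, spec = sensitivity_specificity(y_true, y_pred_all_benign)

print(f"accuracy={accuracy:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this made-up example, the trivial classifier reaches very high raw accuracy and perfect specificity while detecting none of the malignant lesions. With so few malignant cases, a sensitivity estimate rests on only a handful of lesions, whereas specificity is estimated from hundreds.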
Images taken with a simple dermoscopy phone attachment provided the data for creating the newly developed seven-class mobile phone-powered AI algorithm, which diagnosed skin cancer more accurately than novices. For diagnostics, the seven-class AI algorithm was equivalent to the specialists’ decisions, while it remained superior to the novices’ diagnoses.
By comparison, the ISIC AI algorithm was significantly inferior to the specialists’ diagnoses but significantly superior to the novices’ decisions. This difference in performance between the seven-class AI and the ISIC AI could be attributed to the seven-class AI being trained with a much larger and more diverse dataset. The seven-class management AI was significantly inferior to specialists’ and novices’ management.
No change in balanced multiclass accuracy was observed, which was considered a primary endpoint of the online studies. Overfitting of the ISIC AI algorithm was determined based on significantly reduced analytical performance on images collected from sources that were not included during training.
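For readers unfamiliar with the metric, balanced multiclass accuracy is commonly defined as the mean of the per-class recall, so every diagnostic class counts equally regardless of how many lesions it contains. The sketch below assumes that common definition; the class labels and predictions are invented for illustration and are not data from the trial.

```python
from collections import defaultdict


def balanced_multiclass_accuracy(y_true, y_pred):
    """Mean of per-class recall, computed over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels: 8 nevi and 2 melanomas, with one melanoma missed.
y_true = ["nevus"] * 8 + ["melanoma"] * 2
y_pred = ["nevus"] * 8 + ["nevus", "melanoma"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_multiclass_accuracy(y_true, y_pred)

print(f"plain accuracy={plain_accuracy:.2f} balanced accuracy={balanced:.2f}")
```

In this toy case, missing one of two melanomas barely moves plain accuracy but halves the recall for the melanoma class, which the balanced figure reflects directly.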
Conclusions
The current clinical trial provided an AI-based skin cancer diagnosis system using dermoscopy images. In contrast to previous studies, the study findings reveal that a simple mobile phone technology without any expensive hardware could accurately diagnose skin cancer.
Journal reference:
- Menzies, S. W., Sinz, C., Menzies, M., et al. (2023) Comparison of humans versus mobile phone-powered artificial intelligence for the diagnosis and management of pigmented skin cancer in secondary care: a multicentre, prospective, diagnostic, clinical trial. Lancet Digital Health 5. doi:10.1016/S2589-7500(23)00130-9.