Garcia Santa Cruz, Beatriz. In Medical Image Analysis (2021), 74

Computer-aided diagnosis and stratification of COVID-19 based on chest X-ray suffer from weak bias assessment and limited quality control. Undetected bias induced by inappropriate use of datasets and improper consideration of confounders prevents the translation of prediction models into clinical practice. By adapting established tools for model evaluation to the task of evaluating datasets, this study provides a systematic appraisal of publicly available COVID-19 chest X-ray datasets, determining their potential use and evaluating potential sources of bias. Only 9 out of more than a hundred identified datasets met at least the criteria for proper assessment of the risk of bias and could be analysed in detail. Remarkably, most of the datasets utilised in 201 papers published in peer-reviewed journals are not among these 9 datasets, thus leading to models with a high risk of bias. This raises concerns about the suitability of such models for clinical use. This systematic review highlights the limited description of datasets employed for modelling and helps researchers select the most suitable datasets for their task.

Garcia Santa Cruz, Beatriz. Poster (2021, August)

Machine learning and data-driven solutions open exciting opportunities in many disciplines, including healthcare. The recent transition of this technology into real clinical settings brings new challenges.
These challenges derive from several factors, including dataset origin, composition and description, which hamper the fair and secure application of these solutions. Considering the potential impact of incorrect predictions in applied-ML healthcare research is urgent. Undetected bias induced by inappropriate use of datasets and improper consideration of confounders prevents the translation of prediction models into clinical practice. Therefore, this work employs available systematic tools for assessing the risk of bias in models as a first step towards robust solutions for better dataset choice, dataset merging, and design of the training and validation steps in the ML development pipeline.

et al. E-print/Working paper (2021)

Artificial intelligence (AI) methods for the automatic detection and quantification of COVID-19 lesions in chest computed tomography (CT) might play an important role in the monitoring and management of the disease. We organized an international challenge and competition for the development and comparison of AI algorithms for this task, which we supported with public data and state-of-the-art benchmark methods. Board-certified radiologists annotated 295 public images from two sources (A and B) for algorithm training (n=199, source A), validation (n=50, source A) and testing (n=23, source A; n=23, source B). There were 1,096 registered teams, of which 225 and 98 completed the validation and testing phases, respectively. The challenge showed that AI models could be rapidly designed by diverse teams with the potential to measure disease or facilitate timely and patient-specific interventions.
This paper provides an overview and the major outcomes of the COVID-19 Lung CT Lesion Segmentation Challenge - 2020.

Garcia Santa Cruz, Beatriz. E-print/Working paper (2020)

Machine learning based methods for diagnosis and progression prediction of COVID-19 from imaging data have gained significant attention in recent months, in particular through the use of deep learning models. In this context, hundreds of models were proposed, with the majority of them trained on public datasets. Data scarcity, mismatch between training and target population, group imbalance, and lack of documentation are important sources of bias, hindering the applicability of these models to real-world clinical practice. Considering that datasets are an essential part of model building and evaluation, a deeper understanding of the current landscape is needed. This paper presents an overview of the currently publicly available COVID-19 chest X-ray datasets. Each dataset is briefly described, and potential strengths, limitations and interactions between datasets are identified. In particular, the paper points out some key properties of current datasets that could be potential sources of bias, impairing models trained on them. These descriptions are useful when building models on those datasets: to choose the best dataset according to the model goal, to take the specific limitations into account and avoid reporting overconfident benchmark results, and to discuss their impact on generalisation capabilities in a specific clinical setting.