
2024 | Book

Data Augmentation, Labelling, and Imperfections

Third MICCAI Workshop, DALI 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 12, 2023, Proceedings


About this Book

This LNCS volume constitutes the proceedings of the 3rd International Workshop on Data Augmentation, Labeling, and Imperfections (DALI 2023), held on October 12, 2023, in Vancouver, Canada, in conjunction with the 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023). The 16 full papers presented in this volume were carefully reviewed and selected from 23 submissions.

The workshop fosters a collaborative environment for addressing the critical challenges associated with medical data, focusing in particular on data augmentation, labeling, and the handling of data imperfections in medical image analysis.

Table of Contents

Frontmatter
URL: Combating Label Noise for Lung Nodule Malignancy Grading
Abstract
Due to the complexity of annotation and inter-annotator variability, most lung nodule malignancy grading datasets contain label noise, which inevitably degrades the performance and generalizability of models. Although researchers adopt label-noise-robust methods to handle label noise for lung nodule malignancy grading, they do not consider the inherent ordinal relation among the classes of this task. To model the ordinal relation among classes and facilitate tackling label noise in this task, we propose a Unimodal-Regularized Label-noise-tolerant (URL) framework. Our URL contains two stages: the Supervised Contrastive Learning (SCL) stage and the Memory pseudo-labels generation and Unimodal regularization (MU) stage. In the SCL stage, we select reliable samples and adopt supervised contrastive learning to learn better representations. In the MU stage, we split samples with multiple annotations into multiple samples with a single annotation and shuffle them into different batches. To handle label noise, pseudo-labels are generated using the similarity between each sample and the central feature of each class, and temporal ensembling is used to obtain memory pseudo-labels that supervise the model training. To model the ordinal relation, we introduce unimodal regularization to keep the ordinal relation among classes in the predictions. Moreover, each lung nodule is characterized by three orthographic views. Experiments conducted on the LIDC-IDRI dataset indicate the superiority of our URL over other competing methods. Code is available at https://github.com/axz520/URL.
Xianze Ai, Zehui Liao, Yong Xia
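The temporal-ensembling step described in the abstract above (accumulating memory pseudo-labels that supervise training) is, at its core, an exponential moving average of per-sample class probabilities. The following is a minimal numerical sketch of that idea, not the authors' code; the function name update_memory and the decay factor beta are assumptions for illustration.

```python
import numpy as np

def update_memory(memory, probs, beta=0.9):
    # Exponential moving average: blend the stored pseudo-labels with
    # the current softmax predictions (hypothetical decay factor beta).
    return beta * memory + (1.0 - beta) * probs

# Toy example: 2 nodules, 3 ordinal malignancy classes.
memory = np.full((2, 3), 1.0 / 3.0)           # uniform initial memory
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])           # current model outputs
for _ in range(5):                            # five training epochs
    memory = update_memory(memory, probs)
memory = memory / memory.sum(axis=1, keepdims=True)  # renormalise (no-op here)
```

Because each input is a probability distribution, the moving average stays a valid distribution, so the memory pseudo-labels can directly supervise the model with a cross-entropy-style loss.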
Zero-Shot Learning of Individualized Task Contrast Prediction from Resting-State Functional Connectomes
Abstract
Given sufficient pairs of resting-state and task-evoked fMRI scans from subjects, it is possible to train ML models to predict subject-specific task-evoked activity using resting-state functional MRI (rsfMRI) scans. However, while rsfMRI scans are relatively easy to collect, obtaining sufficient task fMRI scans is much harder as it involves more complex experimental designs and procedures. Thus, the reliance on scarce paired data limits the application of current techniques to only tasks seen during training. We show that this reliance can be reduced by leveraging group-average contrasts, enabling zero-shot predictions for novel tasks. Our approach, named OPIC (short for Omni-Task Prediction of Individual Contrasts), takes as input a subject’s rsfMRI-derived connectome and a group-average contrast, to produce a prediction of the subject-specific contrast. Similar to zero-shot learning in large language models using special inputs to obtain answers for novel natural language processing tasks, inputting group-average contrasts guides the OPIC model to generalize to novel tasks unseen in training. Experimental results show that OPIC’s predictions for novel tasks are not only better than simple group-averages, but are also competitive with the in-domain predictions of a state-of-the-art model trained on in-domain tasks’ data.
Minh Nguyen, Gia H. Ngo, Mert R. Sabuncu
Microscopy Image Segmentation via Point and Shape Regularized Data Synthesis
Abstract
Current deep learning-based approaches for the segmentation of microscopy images heavily rely on large amounts of training data with dense annotation, which is highly costly and laborious in practice. Compared to full annotation, where the complete contour of objects is depicted, point annotations, specifically object centroids, are much easier to acquire and still provide crucial information about the objects for subsequent segmentation. In this paper, we assume access to point annotations only during training and develop a unified pipeline for microscopy image segmentation using synthetically generated training data. Our framework includes three stages: (1) it takes point annotations and samples a pseudo dense segmentation mask constrained with shape priors; (2) with an image generative model trained in an unpaired manner, it translates the mask to a realistic microscopy image regularized by object-level consistency; (3) the pseudo masks along with the synthetic images then constitute a pairwise dataset for training an ad-hoc segmentation model. On the public MoNuSeg dataset, our synthesis pipeline produces more diverse and realistic images than baseline models while maintaining high coherence between input masks and generated images. When using identical segmentation backbones, the models trained on our synthetic dataset significantly outperform those trained with pseudo-labels or baseline-generated images. Moreover, our framework achieves comparable results to models trained on authentic microscopy images with dense labels, demonstrating its potential as a reliable and highly efficient alternative to labor-intensive manual pixel-wise annotations in microscopy image segmentation. The code can be accessed through https://github.com/CJLee94/Points2Image.
Shijie Li, Mengwei Ren, Thomas Ach, Guido Gerig
A Unified Approach to Learning with Label Noise and Unsupervised Confidence Approximation
Abstract
Noisy label training is the problem of training a neural network from a dataset with errors in the labels. Selective prediction is the problem of selecting only the predictions of a neural network which have sufficient confidence. These problems are both important in medical deep learning, where they commonly occur simultaneously. Existing methods, however, tackle one problem but not both. We show that they are interdependent and propose the first integrated framework to tackle them both, which we call Unsupervised Confidence Approximation (UCA). UCA trains a neural network simultaneously for its main task (e.g. image segmentation) and for confidence prediction, from noisy label datasets. UCA does not require confidence labels and is thus unsupervised in this respect. UCA is generic as it can be used with any neural architecture. We evaluated its performance on the CIFAR-10N and Gleason-2019 datasets. UCA’s prediction accuracy increases with the required level of confidence. UCA-equipped networks are on par with the state-of-the-art in noisy label training when used in regular, full coverage mode. However, they have a risk-management facility, showing flawless risk-coverage curves with substantial performance gain over existing selective prediction methods.
Navid Rabbani, Adrien Bartoli
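The risk-coverage curves mentioned in the abstract above are a standard selective-prediction diagnostic: predictions are sorted by confidence, and at each coverage level the risk is the error rate among the retained (most confident) predictions. A minimal sketch, not the UCA implementation; the function name risk_coverage is an assumption.

```python
import numpy as np

def risk_coverage(confidence, correct):
    # Sort predictions by descending confidence; at coverage k/n the risk
    # is the error rate over the k most confident predictions.
    order = np.argsort(-confidence)
    errors = 1.0 - correct[order].astype(float)
    n = len(correct)
    coverage = np.arange(1, n + 1) / n
    risk = np.cumsum(errors) / np.arange(1, n + 1)
    return coverage, risk

conf = np.array([0.95, 0.90, 0.80, 0.60, 0.55])   # model confidences
correct = np.array([1, 1, 1, 0, 1])               # 1 = correct prediction
cov, risk = risk_coverage(conf, correct)
```

A well-calibrated selective predictor shows low risk at low coverage; here, risk stays at zero until the single misclassified, low-confidence sample is admitted.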
Transesophageal Echocardiography Generation Using Anatomical Models
Abstract
Through automation, deep learning (DL) can enhance the analysis of transesophageal echocardiography (TEE) images. However, DL methods require large amounts of high-quality data to produce accurate results, which is difficult to satisfy. Data augmentation is commonly used to tackle this issue. In this work, we develop a pipeline to generate synthetic TEE images and corresponding semantic labels. The proposed data generation pipeline expands on an existing pipeline that generates synthetic transthoracic echocardiography images by transforming slices from anatomical models into synthetic images. We also demonstrate that such images can improve DL network performance through a left-ventricle semantic segmentation task. For the pipeline’s unpaired image-to-image (I2I) translation section, we explore two generative methods: CycleGAN and contrastive unpaired translation. Next, we evaluate the synthetic images quantitatively using the Fréchet Inception Distance (FID) Score and qualitatively through a human perception quiz involving expert cardiologists and the average researcher.
In this study, we achieve a Dice score improvement of up to 10% when we augment datasets with our synthetic images. Furthermore, we compare established methods of assessing unpaired I2I translation and observe a disagreement when evaluating the synthetic images. Finally, we assess which metric better predicts the generated data’s efficacy when used for data augmentation.
Emmanuel Oladokun, Musa Abdulkareem, Jurica Šprem, Vicente Grau
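The Dice score used to quantify the segmentation improvement above is the standard overlap metric between a predicted and a reference binary mask. A self-contained sketch of the computation; the function name dice_score and the smoothing term eps are illustrative assumptions.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    # Dice = 2|A ∩ B| / (|A| + |B|); eps guards against empty masks.
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1   # 4-pixel square
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1   # 6-pixel rectangle
```

For these toy masks the overlap is 4 pixels, so the Dice score is 2*4/(4+6) = 0.8; a 10% Dice improvement, as reported, is a substantial gain on this scale.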
Data Augmentation Based on DiscrimDiff for Histopathology Image Classification
Abstract
Histopathological analysis is the present gold standard for cancer diagnosis. Accurate classification of histopathology images has great clinical significance and application value for assisting pathologists in diagnosis. However, the performance of histopathology image classification is greatly affected by data imbalance. To address this problem, we propose a novel data augmentation framework based on the diffusion model, DiscrimDiff, which expands the dataset by synthesizing images of rare classes. To compensate for the lack of discrimination ability of the diffusion model for synthesized images, we design a post-discrimination mechanism to provide image quality assurance for data augmentation. Our method significantly improves classification performance on multiple datasets. Furthermore, the histomorphological features of different classes attended to by the diffusion model may provide guidance for pathologists in clinical diagnosis. Therefore, we visualize histomorphological features related to classification, which can be used to assist pathologist-in-training education and improve the understanding of histomorphology.
Xianchao Guan, Yifeng Wang, Yiyang Lin, Yongbing Zhang
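The post-discrimination mechanism described above amounts to filtering synthesized images with an auxiliary classifier before they enter the training set. The sketch below shows one plausible form of such a filter, keeping only images assigned to the intended rare class with high confidence; the function name post_discriminate and the threshold tau are assumptions, not the paper's interface.

```python
import numpy as np

def post_discriminate(images, confidences, labels, target_class, tau=0.9):
    # Keep only synthetic images that an auxiliary classifier assigns
    # to the intended rare class with confidence >= tau (hypothetical).
    keep = (labels == target_class) & (confidences >= tau)
    return [img for img, k in zip(images, keep) if k]

# Toy example: three synthetic images with classifier outputs.
images = [np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)]
conf_arr = np.array([0.95, 0.50, 0.99])   # classifier confidence
labels_arr = np.array([1, 1, 0])          # predicted class per image
kept = post_discriminate(images, conf_arr, labels_arr, target_class=1)
```

Only the first image survives: the second is too uncertain and the third was assigned to the wrong class, so neither would be trusted for augmentation.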
Clinically Focussed Evaluation of Anomaly Detection and Localisation Methods Using Inpatient CT Head Data
Abstract
Anomaly detection approaches in medical imaging show promise in reducing the need for labelled data. However, the question of how to evaluate anomaly detection algorithms remains challenging, both in terms of the data and the metrics. In this work, we take a cohort of inpatient CT head scans from an elderly stroke patient population containing a variety of anomalies, and treat the associated radiology reports as the reference for clinically relevant findings which should be detected by an anomaly detection algorithm. We apply two state-of-the-art anomaly detection methods to the data, namely denoising autoencoder (DAE) and context-to-local feature matching (CLFM) models. We then extract bounding boxes from the predicted anomaly score heatmaps, which we treat as candidate anomaly detections. A clinical evaluation is then conducted in which 3 radiologists rate the candidate anomalies with respect to their detection and localisation accuracy, by assigning the corresponding report sentence where a clinically relevant anomaly is correctly detected, and rating localisation according to a 3-point scale (good, partial, poor). We find that neither method exhibits sufficiently high recall for clinical use, even at low detection thresholds, although anomaly detection shows promise as a scalable approach for detecting clinically relevant findings. We highlight that selection of the optimal thresholds and extraction of discrete anomaly predictions (e.g. bounding boxes) are underexplored topics in anomaly detection.
Antanas Kascenas, Chaoyang Wang, Patrick Schrempf, Ryan Grech, Hui Lu Goh, Mark Hall, Alison Q. O’Neil
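The bounding-box extraction step in the abstract above (thresholding an anomaly-score heatmap and boxing each connected region) can be sketched with a plain breadth-first search over 4-connected pixels. This is a minimal illustration of the general technique, not the authors' pipeline; the function name heatmap_to_boxes is an assumption.

```python
import numpy as np
from collections import deque

def heatmap_to_boxes(heatmap, threshold):
    # Threshold the heatmap and return one inclusive bounding box
    # (r0, c0, r1, c1) per 4-connected component of anomalous pixels.
    mask = heatmap >= threshold
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for r in range(mask.shape[0]):
        for c in range(mask.shape[1]):
            if mask[r, c] and not seen[r, c]:
                q = deque([(r, c)]); seen[r, c] = True
                r0 = r1 = r; c0 = c1 = c
                while q:                      # BFS over the component
                    y, x = q.popleft()
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True; q.append((ny, nx))
                boxes.append((r0, c0, r1, c1))
    return boxes

h = np.zeros((6, 6))
h[0:2, 0:2] = 0.9      # one anomalous blob
h[4, 4] = 0.8          # a second, single-pixel detection
boxes = heatmap_to_boxes(h, 0.5)
```

As the abstract notes, the choice of threshold directly controls the recall/false-positive trade-off of the resulting candidate detections.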
LesionMix: A Lesion-Level Data Augmentation Method for Medical Image Segmentation
Abstract
Data augmentation has become a de facto component of deep learning-based medical image segmentation methods. Most data augmentation techniques used in medical imaging focus on spatial and intensity transformations to improve the diversity of training images. They are often designed at the image level, augmenting the full image, and do not pay attention to specific abnormalities within the image. Here, we present LesionMix, a novel and simple lesion-aware data augmentation method. It performs augmentation at the lesion level, increasing the diversity of lesion shape, location, intensity and load distribution, and allowing both lesion populating and inpainting. Experiments on different modalities and different lesion datasets, including four brain MR lesion datasets and one liver CT lesion dataset, demonstrate that LesionMix achieves promising performance in lesion image segmentation, outperforming several recent Mix-based data augmentation methods. The code will be released at https://github.com/dogabasaran/lesionmix.
Berke Doga Basaran, Weitong Zhang, Mengyun Qiao, Bernhard Kainz, Paul M. Matthews, Wenjia Bai
Knowledge Graph Embeddings for Multi-lingual Structured Representations of Radiology Reports
Abstract
The way we analyse clinical texts has undergone major changes in recent years. The introduction of language models such as BERT led to adaptations for the (bio)medical domain, such as PubMedBERT and ClinicalBERT. These models rely on large databases of archived medical documents. While performing well in terms of accuracy, both the lack of interpretability and limitations on transfer across languages restrict their use in clinical settings. We introduce a novel lightweight graph-based embedding method specifically catering to radiology reports. It takes into account the structure and composition of the report, while also connecting medical terms in the report through the multi-lingual SNOMED Clinical Terms knowledge base. The resulting graph embedding uncovers the underlying relationships among clinical terms, achieving a representation that is more understandable for clinicians and clinically more accurate, without reliance on large pre-training datasets. We show the use of this embedding on two tasks, namely disease classification of X-ray reports and image classification. For disease classification, our model is competitive with its BERT-based counterparts, while being orders of magnitude smaller in size and training-data requirements. For image classification, we show the effectiveness of the graph embedding in leveraging cross-modal knowledge transfer and show how this method is usable across different languages.
Tom van Sonsbeek, Xiantong Zhen, Marcel Worring
Modular, Label-Efficient Dataset Generation for Instrument Detection for Robotic Scrub Nurses
Abstract
Surgical instrument detection is a fundamental task of a robotic scrub nurse. For this, image-based deep learning techniques are effective but usually demand large amounts of annotated data, whose creation is expensive and time-consuming. In this work, we propose a strategy based on the copy-paste technique for the generation of reliable synthetic image training data with a minimal amount of annotation effort. Our approach enables the efficient in situ creation of datasets for specific surgeries and contexts. We study the effect of the amount of manually annotated data and of training set size on our model’s performance, as well as different blending techniques for improved training data. We achieve 91.9 box mAP and 91.6 mask mAP, training solely on synthetic data, in a real-world scenario. Our evaluation relies on an annotated image dataset of the wisdom teeth extraction surgery set, created in an actual operating room. This dataset, the corresponding code, and further data are made publicly available (https://github.com/Jorebs/Modular-Label-Efficient-Dataset-Generation-for-Instrument-Detection-for-Robotic-Scrub-Nurses).
Jorge Badilla-Solórzano, Nils-Claudius Gellrich, Thomas Seel, Sontje Ihler
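The copy-paste technique with blending described above composites a masked instrument crop onto a background scene, optionally softening the seam with an alpha blend. A minimal 2-D sketch of the idea, assuming grayscale arrays; the function name copy_paste and its parameters are illustrative, not the authors' API.

```python
import numpy as np

def copy_paste(background, patch, mask, top, left, alpha=1.0):
    # Paste `patch` onto `background` where `mask` is set, alpha-blending
    # the pasted pixels with the background (alpha=1 is a hard paste).
    out = background.copy()
    h, w = patch.shape[:2]
    region = out[top:top + h, left:left + w]   # view into the copy
    blend = alpha * patch + (1 - alpha) * region
    region[mask] = blend[mask]                 # writes through the view
    return out

bg = np.zeros((8, 8))
patch = np.full((3, 3), 10.0)                  # bright instrument crop
mask = np.ones((3, 3), dtype=bool)             # full-patch paste mask
img = copy_paste(bg, patch, mask, top=2, left=2, alpha=0.5)
```

The corresponding instance mask for the pasted object is known by construction, which is what makes the approach label-efficient: each paste yields an image and its annotation simultaneously.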
Adaptive Semi-supervised Segmentation of Brain Vessels with Ambiguous Labels
Abstract
Accurate segmentation of brain vessels is crucial for cerebrovascular disease diagnosis and treatment. However, existing methods face challenges in capturing small vessels and handling datasets that are partially or ambiguously annotated. In this paper, we propose an adaptive semi-supervised approach to address these challenges. Our approach incorporates innovative techniques including progressive semi-supervised learning, an adaptive training strategy, and boundary enhancement. Experimental results on 3DRA datasets demonstrate the superiority of our method in terms of mesh-based segmentation metrics. By leveraging the partially and ambiguously labeled data, which only annotates the main vessels, our method achieves impressive segmentation performance on mislabeled fine vessels, showcasing its potential for clinical applications.
Fengming Lin, Yan Xia, Nishant Ravikumar, Qiongyao Liu, Michael MacRaild, Alejandro F. Frangi
Proportion Estimation by Masked Learning from Label Proportion
Abstract
The PD-L1 rate, the number of PD-L1-positive tumor cells over the total number of tumor cells, is an important metric for immunotherapy. This metric is recorded as diagnostic information with pathological images. In this paper, we propose a proportion estimation method with a small amount of cell-level annotation and proportion annotation, which can be easily collected. Since the PD-L1 rate is calculated from only ‘tumor cells’ and not ‘non-tumor cells’, we first detect tumor cells with a detection model. Then, we estimate the PD-L1 proportion by introducing a masking technique to ‘learning from label proportion’. In addition, we propose a weighted focal proportion loss to address data imbalance problems. Experiments using clinical data demonstrate the effectiveness of our method, which achieved the best performance among the compared approaches.
Takumi Okuo, Kazuya Nishimura, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise
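The masking idea described above restricts the proportion estimate to detected tumor cells before comparing it with the bag-level PD-L1 rate. The sketch below illustrates that masked estimate with a simple squared proportion loss as a stand-in; the actual paper uses a weighted focal proportion loss, and the function name masked_proportion is an assumption.

```python
import numpy as np

def masked_proportion(probs, tumor_mask):
    # PD-L1 rate estimate: mean positive probability over tumor cells only,
    # so non-tumor cells cannot bias the proportion.
    return probs[tumor_mask].mean()

# 6 detected cells; cells 0-3 are tumor cells, 4-5 are non-tumor (masked out).
probs = np.array([0.9, 0.8, 0.1, 0.2, 0.99, 0.95])
tumor_mask = np.array([True, True, True, True, False, False])
rate = masked_proportion(probs, tumor_mask)
loss = (rate - 0.5) ** 2    # simplified squared loss against the bag label 0.5
```

Without the mask, the two confidently positive non-tumor cells would inflate the estimated rate well above the recorded bag label.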
Active Learning Strategies on a Real-World Thyroid Ultrasound Dataset
Abstract
Machine learning applications in ultrasound imaging are limited by access to ground-truth expert annotations, especially in specialized applications such as thyroid nodule evaluation. Active learning strategies seek to alleviate this concern by making more effective use of expert annotations; however, many proposed techniques do not adapt well to small-scale (i.e. a few hundred images) datasets. In this work, we test active learning strategies including an uncertainty-weighted selection approach with supervised and semi-supervised learning to evaluate the effectiveness of these tools for the prediction of nodule presence on a clinical ultrasound dataset. The results on this as well as two other medical image datasets suggest that even successful active learning strategies have limited clinical significance in terms of reducing annotation burden.
Hari Sreedhar, Guillaume P. R. Lajoinie, Charles Raffaelli, Hervé Delingette
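Uncertainty-based selection, the core of the strategies evaluated above, typically ranks the unlabeled pool by predictive entropy and sends the most uncertain samples to the annotator. A generic sketch of that acquisition step, not the paper's specific uncertainty-weighted variant; the function names are assumptions.

```python
import numpy as np

def entropy(probs):
    # Predictive entropy per sample from class probabilities.
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_most_uncertain(probs, k):
    # Indices of the k pool samples with highest predictive entropy.
    return np.argsort(-entropy(probs))[:k]

pool = np.array([[0.99, 0.01],    # confident: nodule absent
                 [0.55, 0.45],    # near the decision boundary
                 [0.80, 0.20]])
picked = select_most_uncertain(pool, k=1)
```

The near-boundary sample is selected first, which is precisely the behaviour whose clinical payoff the paper finds to be limited on small ultrasound datasets.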
A Realistic Collimated X-Ray Image Simulation Pipeline
Abstract
Collimator detection remains a challenging task in X-ray systems with unreliable or unavailable information about the detector’s position relative to the source. This paper presents a physically motivated image processing pipeline for simulating the characteristics of collimator shadows in X-ray images. By generating randomized labels for collimator shapes and locations, incorporating scattered radiation simulation, and including Poisson noise, the pipeline enables the expansion of limited datasets for training deep neural networks. We validate the proposed pipeline by a qualitative and quantitative comparison against real collimator shadows. Furthermore, it is demonstrated that utilizing simulated data within our deep learning framework not only serves as a suitable substitute for actual collimators but also enhances the generalization performance when applied to real-world data.
Benjamin El-Zein, Dominik Eckert, Thomas Weber, Maximilian Rohleder, Ludwig Ritschl, Steffen Kappler, Andreas Maier
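The Poisson noise component of the simulation pipeline above models X-ray quantum noise: pixel intensities are treated as expected photon counts and replaced by Poisson samples. A minimal sketch under the assumption of a normalised [0, 1] image; the function name add_poisson_noise and the photon budget are illustrative.

```python
import numpy as np

def add_poisson_noise(image, photons=1000, rng=None):
    # Scale normalised intensities to an expected photon count, sample
    # Poisson-distributed counts, and rescale back to [0, 1].
    if rng is None:
        rng = np.random.default_rng(0)   # fixed seed for reproducibility
    counts = rng.poisson(np.clip(image, 0.0, 1.0) * photons)
    return counts / photons

flat = np.full((64, 64), 0.5)            # uniform half-intensity field
noisy = add_poisson_noise(flat, photons=1000)
```

Because Poisson variance equals the mean, brighter regions receive proportionally more absolute noise but a better signal-to-noise ratio, matching the physics of photon counting.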
Masked Conditional Diffusion Models for Image Analysis with Application to Radiographic Diagnosis of Infant Abuse
Abstract
The classic metaphyseal lesion (CML) is a distinct injury that is highly specific for infant abuse. It commonly occurs in the distal tibia. To aid radiologists in detecting these subtle fractures, we need to develop a model that can flag abnormal distal tibial radiographs (i.e. those with CMLs). Unfortunately, the development of such a model requires a large and diverse training database, which is often not available. To address this limitation, we propose a novel generative model for data augmentation. Unlike previous models that fail to generate data that span the diverse radiographic appearance of the distal tibial CML, our proposed masked conditional diffusion model (MaC-DM) not only generates realistic-appearing and wide-ranging synthetic images of distal tibial radiographs with and without CMLs, it also generates their associated segmentation labels. To achieve these tasks, MaC-DM combines the weighted segmentation masks of the tibias and the CML fracture sites as additional conditions for classifier guidance. The augmented images from our model improved the performance of ResNet-34 in classifying normal radiographs and those with CMLs. Further, the augmented images and their associated segmentation masks enhanced the performance of the U-Net in labeling areas of the CMLs on distal tibial radiographs.
Shaoju Wu, Sila Kurugol, Andy Tsai
Self-supervised Single-Image Deconvolution with Siamese Neural Networks
Abstract
Inverse problems in image reconstruction are fundamentally complicated by unknown noise properties. Classical iterative deconvolution approaches amplify noise and require careful parameter selection for an optimal trade-off between sharpness and grain. Deep learning methods allow for flexible parametrization of the noise and learning its properties directly from the data. Recently, self-supervised blind-spot neural networks were successfully adopted for image deconvolution by including a known point-spread function in the end-to-end training. However, their practical application has been limited to 2D images in the biomedical domain because it implies large kernels, which are poorly optimized. We tackle this problem with Fast Fourier Transform convolutions that provide a training speed-up in 3D microscopy deconvolution tasks. Further, we propose to adopt a Siamese invariance loss for deconvolution and empirically identify its optimal position in the neural network between the blind-spot and full-image branches. The experimental results show that our improved framework outperforms the previous state-of-the-art deconvolution methods with a known point spread function.
Mikhail Papkov, Kaupo Palo, Leopold Parts
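The FFT convolution speed-up described above rests on the convolution theorem: multiplying spectra replaces a direct spatial convolution, which pays off for the large point-spread-function kernels mentioned in the abstract. A 2-D sketch of the general technique (the paper works in 3D); the function name fft_convolve and the PSF centering convention are assumptions.

```python
import numpy as np

def fft_convolve(image, psf):
    # Circular convolution via the FFT: pad the PSF to the image size,
    # roll it so its centre sits at the origin (avoiding an output shift),
    # then multiply spectra. O(N log N) regardless of kernel size.
    psf_padded = np.zeros_like(image)
    psf_padded[:psf.shape[0], :psf.shape[1]] = psf
    psf_padded = np.roll(psf_padded,
                         (-(psf.shape[0] // 2), -(psf.shape[1] // 2)),
                         axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(psf_padded)))

img = np.zeros((16, 16)); img[8, 8] = 1.0   # point source
psf = np.ones((3, 3)) / 9.0                 # normalised box-blur PSF
blurred = fft_convolve(img, psf)
```

Convolving a point source with the PSF reproduces the PSF itself, centred at the source; note that the FFT imposes periodic boundaries, so real pipelines typically pad the image before transforming.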
Backmatter
Metadata
Title
Data Augmentation, Labelling, and Imperfections
Edited by
Yuan Xue
Chen Chen
Chao Chen
Lianrui Zuo
Yihao Liu
Copyright Year
2024
Electronic ISBN
978-3-031-58171-7
Print ISBN
978-3-031-58170-0
DOI
https://doi.org/10.1007/978-3-031-58171-7
