
Is Medical Image Data Training Too Difficult? That’s because you don’t know this weapon yet~
Credit: Phonlamai Photo/Shutterstock.com

Handing medical images to artificial intelligence (AI) for analysis can detect and measure abnormalities faster, and often more accurately, than human experts, pushing image-based medical diagnosis even further. To improve patient outcomes and enable targeted treatments, we need high-quality AI models that generalize across different populations. Building such a model, however, depends on large amounts of data, and that data must be carefully labeled before a machine can learn from it.

Now, we can train such AI through a branch of deep learning (DL) called weakly supervised learning. This machine learning technique relaxes the completeness and accuracy requirements on data labeling, helping doctors obtain deeper insights with less effort. Data used for weakly supervised learning only needs coarse labels that are easy to produce (for example, labeling the entire image rather than delineating key regions within it), and the learning process can take full advantage of pre-trained models and commonly available interpretability methods. In this article, we will examine the important role data labeling plays in weakly supervised learning.

Labeling medical images is not easy

Image annotation in the medical field presents many difficulties. First, medical images and their associated examination results are often stored in separate systems, which makes labeling cumbersome and labeled data hard to obtain.

For example, imaging data from computed tomography (CT) or magnetic resonance imaging (MRI) may be stored in hospital systems, while the associated biopsy or tumor-resection findings are often held by pathology laboratories, which may be located in private clinics or testing facilities outside the hospital. Even when reconciling particular data and labels is feasible, acquiring and aggregating the data can be time-consuming, especially if more than one private practice or testing facility is involved.

Furthermore, finding and marking signs of disease onset and progression (biomarkers) in images is inherently time-consuming and complex, because the data must be labeled pixel by pixel, and the final number of annotations can run into the thousands. This is especially problematic when algorithms need to segment images or locate specific regions such as lesions or surgical margins. The process is also cost-intensive, because medical image labeling usually requires specialized knowledge, and MRI and CT volumes require labeling in 3D. Together, these drawbacks make medical image labeling an expensive endeavor that is difficult to outsource.

Since the labeling process requires professional expertise, labeling quality also varies with how well the labeler has mastered that expertise, which in turn affects the final performance of the deep learning model. Annotation accuracy is a major concern: typically, inexperienced radiologists or residents are trained to label data, but their labeling accuracy is clearly not on par with that of clinicians with decades of experience.

In addition, variability among readers affects the annotations: on the one hand, different readers will interpret the same image slightly differently; on the other, even the same reader labeling the same image twice will produce slightly different results.

Finally, the very fact that the labeling is manual can limit the final result. One of the advantages of machine learning is that a model can discover patterns humans cannot perceive; but because manual annotation ultimately relies on human input, the model's final output is easily constrained by it.

For example, if an AI can only replicate human thinking about certain tasks, it is likely to inadvertently replicate someone’s biases.

In addition, features in seemingly irrelevant regions of the input data may also be predictive, yet they are simply discarded because they fall outside the human-selected region of interest.

For example, prominent signs of disease may well appear in the tissue surrounding the region of interest, or even in other nearby organs.

Use weakly supervised learning for training

In the scenarios above, we would prefer that the AI accept more general annotations (such as whether an image contains cancerous tissue or other disease indications) and then let the model find the most telling features itself (Figure 1). This is where weakly supervised learning comes in.


Figure 1: Automatic annotation using weakly supervised learning. AI found predictive features that pathologists did not. (Image source: Pathology Informatics Team of RIKEN Center for Advanced Intelligence Project)

Weakly supervised learning is a branch of deep learning that aims to produce well-performing deep learning models from fewer, coarser annotations. Such annotations fall roughly into three categories: incomplete, imprecise, and inaccurate. "Roughly" because a single dataset can combine several annotation types, and weakly supervised methods are meant to handle whatever combination of problems arises.

Incomplete annotation usually means that part of the dataset is annotated while the rest is not.

Imprecise annotation labels the overall result of an image without segmenting the specific regions of interest.

Inaccurate annotation stems from gaps in the labelers' expertise and from the ambiguity or uncertainty of certain disease indications.

Interestingly, imprecise annotations may be more useful than incomplete or inaccurate ones, provided good results can be achieved with these coarser, easier-to-produce labels. An imprecise annotation is less error-prone because it need not be as detailed as other annotations, and it is easier to obtain:

For example, it may suffice to extract the cancer stage from the scan report, which indicates that the scan contains cancerous tissue, instead of manually "cutting out" the cancerous region from a three-dimensional image. These annotations are themselves "imprecise", but they give the dataset many more usable labels, which in turn improves accuracy.
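To make this concrete, below is a minimal Python sketch of deriving such a weak, image-level label from report text. The `weak_label_from_report` helper, its keyword patterns, and the report phrasing are all hypothetical; a real system would be tuned to an institution's reporting conventions.

```python
import re

def weak_label_from_report(report_text: str) -> int:
    """Derive a coarse image-level label from free-text report wording:
    1 = scan likely contains cancerous tissue, 0 = no cancer mentioned."""
    # A mention of a tumor stage (e.g. "Stage II") is taken as evidence
    # that the corresponding scan contains cancerous tissue.
    if re.search(r"\bstage\s+(0|IV|I{1,3})\b", report_text, re.IGNORECASE):
        return 1
    # TNM-style staging codes (e.g. "T2N0M0") are treated the same way.
    if re.search(r"\bT[0-4][ab]?N[0-3]M[01]\b", report_text):
        return 1
    return 0

# Weak, image-level labels replace pixel-wise tumor masks:
print(weak_label_from_report("Findings consistent with Stage II NSCLC"))  # -> 1
print(weak_label_from_report("No evidence of malignancy."))               # -> 0
```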

Most importantly, with this style of labeling we no longer have to pay heavily to hire or train highly specialized people to mark every relevant detail. The approach can ultimately improve annotation accuracy; after all, it is easier to pick from a few alternatives than to describe every feature in detail.

To exploit such imprecise annotations in common medical imaging applications, such as detecting and locating critical regions, a two-step process is typically used:

Build a backbone model, e.g., train a deep learning model to predict the classes described by the imprecise annotations.

For a model making predictions on a specific scan, use pixel attribution methods (also known as saliency or interpretability methods) to highlight the regions most relevant to the model's decision.

Figure 2 shows a variety of gradient-based pixel attribution methods.

Figure 2: Two input images (goldfish and bear) and the gradient-based pixel attribution methods used to perform segmentation during weakly supervised learning. (Image source: TF Keras Vis on GitHub)

Use Convolutional Neural Networks as Backbone

Because imaging data is so common in the medical field, convolutional neural networks (CNNs) are a natural choice as the main underlying deep learning framework for weakly supervised learning. A CNN works by progressively reducing the number of pixel positions it must process (typically by downsampling the spatial dimensions of a 2D or 3D image) while learning increasingly abstract features, and then mapping those features to class labels.
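As an illustration of such a backbone, here is a minimal Keras sketch (an assumed setup, not code from the article): a small 3D CNN that halves the spatial resolution at each stage before predicting an image-level class. The input size, layer widths, and two-class head are illustrative choices only.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_backbone(num_classes: int = 2) -> tf.keras.Model:
    """Minimal 3D CNN backbone for volumetric scans, assuming inputs
    resampled to 64x64x64 voxels with a single channel."""
    return models.Sequential([
        layers.Input(shape=(64, 64, 64, 1)),
        layers.Conv3D(16, 3, padding="same", activation="relu"),
        layers.MaxPool3D(2),                      # 64 -> 32
        layers.Conv3D(32, 3, padding="same", activation="relu"),
        layers.MaxPool3D(2),                      # 32 -> 16
        layers.Conv3D(64, 3, padding="same", activation="relu"),
        layers.GlobalAveragePooling3D(),          # collapse to one feature vector
        layers.Dense(num_classes, activation="softmax"),  # image-level label
    ])

model = build_backbone()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```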

In weakly supervised learning, we can also combine approaches: either train a new network on your own dataset (if that dataset is large enough to offer an advantage over other, similar data sources), or take a pre-trained network and apply transfer learning to the new task. For example, ResNet50 and VGG16 are two CNN architectures trained on millions of everyday images. Although they have never been trained on medical images, they remain very useful, because the convolutional filters learned in the early layers of such models tend to capture generic features such as lines, shapes, and textures, which are still relevant to medical imaging.

To use one of these models for transfer learning, simply remove the late-stage class-prediction layers and replace them with newly initialized layers representing the classes required by the new medical imaging task. While the ultimate goal is for the model's output to highlight relevant objects and regions of interest in the image, the first step is simply to predict whether those regions of interest are present in the image at all.
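A minimal sketch of that head swap in Keras follows, using the ImageNet-pretrained ResNet50 mentioned above. The 224×224 input size, frozen base, and binary head are illustrative assumptions, not the article's prescribed configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load ResNet50 trained on ImageNet, dropping its late-stage
# class-prediction layers (include_top=False).
base = tf.keras.applications.ResNet50(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3),  # grayscale slices can be tiled to 3 channels
)
base.trainable = False  # keep the generic early-layer filters frozen at first

# Re-initialize the head with the classes the medical task needs,
# here a binary "contains cancer tissue?" prediction.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```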

AI Interpretability for Weakly Supervised Localization

Once the deep learning backbone can accurately predict whether the classes of interest are present, the next step is to use an AI interpretability method to segment the regions of interest. These interpretability methods (also known as pixel attribution methods) aim to reveal what a deep learning model "sees" in an image when it makes a given prediction. Their output is itself a kind of image, often called a saliency map, which can be computed in a number of different ways depending on the final goal.

Among these methods, gradient-based saliency maps are among the most commonly used. Their core idea is to take the output prediction and trace it back through the neurons that produced it. Depending on the method, this tracing can go all the way back to the input layer (Vanilla Gradient), or stop at some later layer, such as the last convolutional layer of the network architecture (Grad-CAM, Figure 3). Other methods pursue different goals, such as producing smoother regions of interest, overcoming the limitations of simpler methods, or segmenting more tightly around the desired features.
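Below is a minimal Grad-CAM sketch in TensorFlow, assuming a Keras classifier such as the transfer-learning model above. The layer name in the usage comment (`conv5_block3_out`, ResNet50's final convolutional block) and the class index are placeholder assumptions.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, conv_layer_name, class_index):
    """Minimal Grad-CAM: gradient of one class score with respect to a
    convolutional layer's feature maps, used to build a saliency map."""
    # Auxiliary model exposing both the conv feature maps and the prediction.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)
    # Weight each feature map by its average gradient, then sum the maps.
    weights = tf.reduce_mean(grads, axis=(1, 2))              # (1, channels)
    cam = tf.einsum("bijc,bc->bij", conv_out, weights)[0]
    cam = tf.nn.relu(cam)                                     # keep positive evidence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()        # normalized saliency map

# Usage (layer name and class index are placeholders):
# heatmap = grad_cam(model, scan, "conv5_block3_out", class_index=1)
```

Stopping at a late convolutional layer, as here, yields coarser but more class-discriminative maps than tracing all the way back to the input pixels, which is why Grad-CAM is a common choice for localizing regions of interest.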

Figure 3: Grad-CAM is an ML interpretability method that can be used to segment features in weakly supervised learning; it takes the gradient of the output class with respect to the last convolutional layer's feature maps. (Image credit: Zhou et al., MIT Computer Science and Artificial Intelligence Laboratory)

Epilogue

Until recently, identifying biomarkers in medical images required large volumes of image data annotated in complex ways. Techniques such as weakly supervised learning, however, have lowered the requirements for labeling completeness, precision, and accuracy, making it far easier to tackle problems that previously demanded enormous time and deep expertise.

Weakly supervised learning can work with coarser annotations that are easier to produce (e.g., annotating the entire image rather than delineating key regions within it). It can reuse pre-trained CNN models and then apply common interpretability methods to highlight regions of interest based on the predicted classes. With these capabilities, models trained on medical imaging data can serve a variety of applications without extensive pixel-level annotation. This not only saves time and cost, but is also more likely to uncover predictive features previously unknown to clinicians, thereby increasing diagnostic accuracy and improving patient outcomes.

Becks

About the Author

Becks is the head of machine learning at Imagia, a Montreal-based startup that helps clinicians use artificial intelligence to advance medical research. In her spare time, she also works with Whale Seeker, another startup that uses artificial intelligence to detect whales, aiming to let industrial development coexist in harmony with these gentle giants. Her work in deep learning and machine learning spans researching new deep learning methods and applying them directly to real-world problems, building pipelines and platforms to train and deploy AI models, and providing AI and data-strategy consulting for startups.

About Mouser Electronics

Mouser Electronics is an authorized global distributor of semiconductors and electronic components, serving the world's largest community of electronic design engineers. Mouser is authorized to distribute nearly 1,200 well-known brands and offers millions of products for online ordering, providing customers with a one-stop sourcing platform. Follow us for first-hand design and industry information!
