Medical Image Modeling
Transfer learning: arguably the source of the first major "AI Summer," i.e. deep learning becoming dominant in AI in general. Driven by ImageNet. The idea: keep the early layers of a pretrained network and fine-tune for downstream tasks.
The assumption is that low-level features transfer across domains. But does this help? In medicine, this transfer may not work: pretraining on ImageNet was not found to be helpful in the medical domain! How do these pretrained layers process a 'natural' image like a cat versus an X-ray?
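For concreteness, a minimal PyTorch/torchvision sketch of the standard recipe; the ResNet-18 backbone and the two-class head are illustrative choices, not from the lecture:

```python
# Standard ImageNet transfer-learning recipe (sketch).
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pretrained layers: the bet is that their low-level
# features (edges, textures) transfer to the new domain.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head and train only that (optionally
# unfreezing later blocks) on the downstream task.
model.fc = nn.Linear(model.fc.in_features, 2)  # hypothetical 2-class task
```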
CNNs preserve a spatial correspondence across successive layers. Early, high-frequency filters pick up things like edges; 'higher' layers extract more complex features. Be wary of post-hoc interpretation.
See this paper. The TL;DR is that transfer learning from ImageNet → medical domain did not really help (a cultural effect: people saw it working very well in general and thought, "hey, why not try it here?").
There are also domain-specific pretrained models, e.g. CheXzero for X-rays. The consensus is that more domain specificity is better (think of a funnel: every repeated stage of domain adaptation was found to help; not cheap, though!).
Label Quality Problem
This doesn't just affect classification! It affects training and evaluation. Think Image → Radiologist → Report → NLP → Label. Each step adds noise! Radiologists can disagree with each other. Invest in label quality over architecture search (stride length, padding, etc. in CNNs have relatively modest effects overall).
How do you measure noise? The blunt approach: go further upstream and get the 'true' labels. Or just inject noise, randomizing some fraction of the labels, and see what happens to the loss function vs. training epoch.
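A minimal sketch of the second approach; the data and the tiny MLP are synthetic stand-ins just so the sketch runs end to end:

```python
# Label-noise probe: randomize a fraction of labels and watch
# training loss vs. epoch. Clean labels are fit quickly; random
# labels can only be memorized, showing up as slower, higher curves.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1000, 32)          # stand-in features
y = (X[:, 0] > 0).long()           # stand-in "true" labels

def corrupt(labels, noise_frac):
    labels = labels.clone()
    idx = torch.randperm(len(labels))[: int(noise_frac * len(labels))]
    labels[idx] = torch.randint(0, 2, (len(idx),))  # randomize a fraction
    return labels

for noise_frac in [0.0, 0.25, 0.5]:
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    y_noisy = corrupt(y, noise_frac)
    for epoch in range(50):
        loss = nn.functional.cross_entropy(model(X), y_noisy)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"noise={noise_frac:.2f}  final train loss={loss.item():.3f}")
```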
Segmentation and U-Net
Kvasir-SEG polyp dataset.
Assign a label to each pixel in the mask: Image (256x256) → Encode → Compressed (8x8) → Decode → Mask (256x256). You want not just the right shape but a contiguous region.
U-Net (Ronneberger et al., 2015). Each layer of the encoder stack can 'talk' to the corresponding layer of the decoder stack through "skip connections". Why compress at all? You're relying on the inductive biases of successive CNN layers to extract the pertinent information, and on the skip connections to carry fine spatial detail back in, so the decoder can combine coarse semantics with precise localization.
They're a bit like residual connections in ResNet.
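A toy PyTorch sketch of the idea: one encoder stage, a bottleneck, and a decoder whose input concatenates the upsampled features with the encoder's skip. The channel sizes are arbitrary, and this is far shallower than the real U-Net:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)                        # 256 -> 128
        self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # 128 -> 256
        # decoder sees the upsampled features *and* the encoder skip
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, 1, 1)                    # per-pixel logit

    def forward(self, x):
        e = self.enc(x)                          # (B, 16, 256, 256)
        b = self.bottleneck(self.down(e))        # (B, 32, 128, 128)
        u = self.up(b)                           # (B, 16, 256, 256)
        d = self.dec(torch.cat([u, e], dim=1))   # skip connection
        return self.head(d)                      # (B, 1, 256, 256) mask logits
```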
Dice Loss
Used for both training and evaluation. Imagine the polyp masks being very small, maybe < 1% of the pixels. So we normalize by region size, not per pixel.
So if the model drops a pixel just outside the 'true' mask, do you care? Dice isn't correcting a class-imbalance problem so much as an importance problem: with a large polyp you don't care too much about getting every boundary detail right, relative to the overwhelming background.
It weights each pixel against the other pixels sharing its label (the region), not against the whole image.
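A soft Dice loss sketch in PyTorch for binary masks; the eps smoothing term is a common convention, not something specified in the notes:

```python
import torch

def dice_loss(pred_logits, target, eps=1e-6):
    """Overlap normalized by region sizes, so a tiny polyp counts
    as much as a huge one and background pixels don't dominate."""
    pred = torch.sigmoid(pred_logits)          # (B, 1, H, W) probabilities
    intersection = (pred * target).sum(dim=(1, 2, 3))
    denom = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = (2 * intersection + eps) / (denom + eps)
    return 1 - dice.mean()                     # 0 when masks match exactly
```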
Whole Slide Pathology and MIL (Multiple Instance Learning)
Pathology slides are huge; a whole slide can't fit into GPU memory, so break it into small chunks/tiles. Problem: labels only exist at the slide level. The positive label typically propagates down (a slide-level 'cancer' label gets applied to all tiles, not just the ones that actually contain cancer). Think 10,000 tiles but only 50 show cancer. The size of the tile and of the 'bag'/pool is determined by what you can fit onto your GPU.
So most tile labels are wrong. You need a pooling strategy over the bag: max pool, mean pool, or Attention-MIL.
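A sketch of the three options over a bag of precomputed tile embeddings; the attention branch follows the form of Attention-MIL (Ilse et al., 2018), and the dimensions are illustrative:

```python
import torch
import torch.nn as nn

class MILPool(nn.Module):
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        # two-layer scoring net for attention weights (Ilse et al. 2018)
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))

    def forward(self, tiles, mode="attention"):   # tiles: (N, D) embeddings
        if mode == "max":
            return tiles.max(dim=0).values        # the single most suspicious tile
        if mode == "mean":
            return tiles.mean(dim=0)              # diluted when 50/10,000 are positive
        weights = torch.softmax(self.attn(tiles), dim=0)  # (N, 1), learned
        return (weights * tiles).sum(dim=0)       # attention-weighted bag embedding
```

Max pooling bets everything on one tile; mean pooling washes out rare positives; attention learns which tiles should carry the slide-level label.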