U-Net: Convolutional Networks for Biomedical Image Segmentation
Ronneberger, Olaf and Fischer, Philipp and Brox, Thomas
2015
Paper summary by nandinics
1. U-Net learns segmentation in an end-to-end setting.
2. The challenges it addresses are:
* Very few annotated images (approx. 30 per application).
* Touching objects of the same class.
* The input image is fed into the network, the data is propagated through the network along all possible paths, and the segmentation map comes out at the end.
* In the U-Net architecture figure, each blue box corresponds to a multi-channel feature map. The number of channels is denoted on top of the box. The x-y size is provided at the lower left edge of the box. White boxes represent copied feature maps. The arrows denote the different operations.
* The contracting path repeatedly applies two 3x3 convolutions (unpadded), each followed by a rectified linear unit (ReLU), and then a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step they double the number of feature channels.
* The contracting path (left side, top to bottom) increases the number of feature channels while reducing the spatial resolution. The expansive path (right side, bottom to top) consists of a sequence of up-convolutions, each followed by concatenation with the corresponding high-resolution features from the contracting path.
* The network does not have any fully connected layers and only uses the valid part of each convolution, i.e., the segmentation map contains only the pixels for which the full context is available in the input image.
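Because every convolution is unpadded, the output map is smaller than the input tile. The following is a minimal sketch of that size bookkeeping. The depth of 4 pooling steps, the 64 starting channels, and the 572-pixel tile are taken from the architecture figure in the original paper rather than from this summary, and the function names are hypothetical.

```python
def unet_output_size(input_size, depth=4):
    """Trace the side length of a square tile through a U-Net built from
    unpadded 3x3 convolutions, 2x2 max pooling, and 2x2 up-convolutions."""
    size = input_size
    # Contracting path: two valid 3x3 convolutions trim 2 pixels each,
    # then 2x2 max pooling with stride 2 halves the size.
    for _ in range(depth):
        size -= 4
        if size <= 0 or size % 2 != 0:
            raise ValueError("input size does not fit this depth")
        size //= 2
    # Lowest resolution level: two more valid 3x3 convolutions.
    size -= 4
    # Expansive path: up-convolution doubles the size,
    # then two valid 3x3 convolutions trim 4 pixels.
    for _ in range(depth):
        size = size * 2 - 4
    if size <= 0:
        raise ValueError("input size is too small for this depth")
    return size


def channel_schedule(base_channels=64, depth=4):
    """Feature channels per resolution level, doubling at each downsampling step."""
    return [base_channels * 2 ** level for level in range(depth + 1)]


print(unet_output_size(572))  # side length of the output segmentation map
print(channel_schedule())     # channels from the top level to the bottom level
```

Under these assumptions a 572x572 input tile yields a 388x388 output map, which is why border context has to come from somewhere when tiling a large image.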
1. Overlap-tile strategy for seamless segmentation of arbitrarily large images:
* To predict the pixels in the border region of the image, the missing context is extrapolated by mirroring the input image.
* In the figure, predicting the segmentation in the yellow area requires the image data within the blue area as input; where that data is missing, it is extrapolated by mirroring.
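The mirroring step can be sketched in plain Python as below. This is an illustration, not the authors' code: the function names are hypothetical, images are plain nested lists, and the reflection is assumed not to repeat the edge pixel (the summary does not say which variant the authors used).

```python
def mirror_pad_1d(values, pad):
    """Reflect a list about its first and last elements, without repeating them."""
    if not 0 <= pad < len(values):
        raise ValueError("pad must be non-negative and smaller than the length")
    left = values[pad:0:-1]          # elements 1..pad, reversed
    right = values[-2:-pad - 2:-1]   # the pad elements before the last one, reversed
    return left + values + right


def mirror_pad_2d(image, pad):
    """Mirror-pad a 2D image (list of rows) by `pad` pixels on every side."""
    rows = mirror_pad_1d(image, pad)                   # pad top and bottom
    return [mirror_pad_1d(row, pad) for row in rows]   # pad left and right


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = mirror_pad_2d(image, 1)
for row in padded:
    print(row)
```

With the tile sizes from the paper's architecture figure (572 in, 388 out), the border that must be supplied on each side would be (572 - 388) / 2 = 92 pixels.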
2. Augment training data using deformation:
* They use excessive data augmentation by applying elastic deformations to the available training images.
* This allows the network to learn invariance to such deformations without needing to see these transformations in the annotated image corpus.
* Deformation is the most common variation in tissue, and realistic deformations can be simulated efficiently.
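As a rough sketch of how such a deformation could be simulated: the original paper describes random displacement vectors on a coarse 3x3 grid, sampled from a Gaussian with a standard deviation of 10 pixels, and interpolated to per-pixel displacements with bicubic interpolation. The code below keeps the coarse grid and the Gaussian but swaps in bilinear interpolation and nearest-neighbour sampling to stay dependency-free. All function names are hypothetical.

```python
import random


def coarse_displacements(grid=3, sigma=10.0, rng=None):
    """Sample one (dy, dx) displacement vector per node of a coarse grid."""
    rng = rng or random.Random()
    return [[(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(grid)]
            for _ in range(grid)]


def interpolate(field, gy, gx):
    """Bilinearly interpolate the displacement at fractional grid position (gy, gx)."""
    n = len(field)
    y0, x0 = min(int(gy), n - 2), min(int(gx), n - 2)
    fy, fx = gy - y0, gx - x0
    vector = []
    for k in range(2):  # k = 0: dy, k = 1: dx
        top = field[y0][x0][k] * (1 - fx) + field[y0][x0 + 1][k] * fx
        bottom = field[y0 + 1][x0][k] * (1 - fx) + field[y0 + 1][x0 + 1][k] * fx
        vector.append(top * (1 - fy) + bottom * fy)
    return vector


def elastic_deform(image, field):
    """Warp a 2D image (list of rows) with the smooth field spanned by `field`."""
    height, width, n = len(image), len(image[0]), len(field)
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the pixel position onto the coarse grid's coordinate range.
            gy = y * (n - 1) / max(height - 1, 1)
            gx = x * (n - 1) / max(width - 1, 1)
            dy, dx = interpolate(field, gy, gx)
            # Nearest-neighbour lookup, clamped to the image bounds.
            sy = min(max(int(round(y + dy)), 0), height - 1)
            sx = min(max(int(round(x + dx)), 0), width - 1)
            row.append(image[sy][sx])
        warped.append(row)
    return warped


image = [[r * 8 + c for c in range(8)] for r in range(8)]
field = coarse_displacements(rng=random.Random(0))
warped = elastic_deform(image, field)
```

For segmentation training, the same `field` would be applied to the image and to its label mask so that the two stay aligned.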
3. Segmentation of touching objects of the same class:
* They propose the use of a weighted loss, where the separating background labels between touching cells obtain a large weight in the loss function.
* To ensure separation of touching objects, background is inserted between them in the training segmentation mask, and a loss weight is computed for each pixel from that mask.
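For reference, the weight map can be written out as follows. The formula and symbols come from the original paper, not from this summary: $w_c$ is a weight map that balances class frequencies, $d_1(\mathbf{x})$ is the distance from pixel $\mathbf{x}$ to the border of the nearest cell, and $d_2(\mathbf{x})$ is the distance to the border of the second nearest cell. The paper reports $w_0 = 10$ and $\sigma \approx 5$ pixels.

```latex
w(\mathbf{x}) = w_c(\mathbf{x})
  + w_0 \cdot \exp\!\left( -\frac{\bigl(d_1(\mathbf{x}) + d_2(\mathbf{x})\bigr)^2}{2\sigma^2} \right)
```

The exponential term is large only where both distances are small, that is, in the thin background gaps between two touching cells, so errors on those separating pixels are penalized most heavily.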
4. Segmentation of neuronal structures in electron microscopy (EM):
* This has been an ongoing challenge since ISBI 2012. The dataset contains structures with low contrast, fuzzy membranes, and other cell components.
* The training data is a set of 30 images (512x512 pixels) from serial section transmission electron microscopy of the Drosophila first instar larva ventral nerve cord (VNC). Each image comes with a corresponding fully annotated ground truth segmentation map for cells (white) and membranes (black).
* An evaluation can be obtained by sending the predicted membrane probability map to the organizers. The evaluation is done by thresholding the map at 10 different levels and computing the warping error, the Rand error, and the pixel error.
* The U-Net (averaged over 7 rotated versions of the input data) achieves, without any further pre- or postprocessing, a warping error of 0.0003529, a Rand error of 0.0382, and a pixel error of 0.0611.
* In the ISBI cell tracking challenge 2015, one of the datasets consists of phase contrast microscopy of cells with strong shape variations, weak outer borders, strong irrelevant inner borders, and cytoplasm that has the same structure as the background.
* The first dataset, PHC-U373, contains Glioblastoma-astrocytoma U373 cells on a polyacrylamide substrate recorded by phase contrast microscopy. It contains 35 partially annotated training images. U-Net achieves an average IOU ("intersection over union") of 92%, which is significantly better than the second-best algorithm with 83%.
* The second dataset, DIC-HeLa, contains HeLa cells on flat glass recorded by differential interference contrast (DIC) microscopy. It contains 20 partially annotated training images. U-Net achieves an average IOU of 77.5%, which is significantly better than the second-best algorithm with 46%.
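To make the IOU figures concrete, here is a minimal sketch of intersection over union for a pair of binary masks. It illustrates the metric only; the challenge's full evaluation protocol (how predicted and reference objects are matched and how scores are averaged) is not reproduced here, and the function name and the empty-mask convention are assumptions.

```python
def intersection_over_union(predicted, reference):
    """IOU of two binary masks given as equally sized lists of rows."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("masks must have the same row length")
        for p, r in zip(pred_row, ref_row):
            intersection += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
print(intersection_over_union(predicted, reference))  # 2 shared pixels out of 4 in the union
```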