nnU-Net is now considered a standard, state-of-the-art tool for medical image segmentation, but I think it is opinionated in the following ways that affect how I may use it:
- It enforces a specific training and evaluation workflow, including assumptions about data splits. This means using custom train-valid-test splits requires some workarounds.
- It assumes 3D volumetric data, and while 2D data is supported, it is not the primary use case.
- It assumes grayscale images, and does not support RGB images out of the box. This again means that RGB images require some workarounds.
A lot of researchers, however, happily use nnU-Net for 3D volumetric medical images, so these limitations may not be relevant to them.
I, on the other hand, work primarily with 2D RGB images of skin lesions, so these limitations (especially the latter two) are very relevant to me, and to anyone else working with RGB medical images: skin images (dermoscopy and clinical images of skin lesions, clinical photographs of wounds, burns, etc.), retinal fundus images, colonoscopy images, robotic surgery images, etc.
Similarly, several standardized datasets come with predefined train-valid-test partitions (e.g., ISIC challenges often release train.csv, val.csv, test.csv), and nnU-Net’s default behavior of generating its own train-valid-test splits may not be what you want.
In this post, I will briefly go over how I use nnU-Net for 2D RGB images and custom train-valid-test splits. I learnt these things by going over a lot of the official nnU-Net documentation and some other online tutorials, and I have tried to compile all of it here.
At a high level, consider a 2D segmentation task where we already have train.csv, val.csv, and test.csv files, each with columns {img, seg}, indicating the respective filepaths. For my case, the images are 2D RGB and the segmentations are binary masks. I will cover the following steps in this post:
- Convert the CSV files to the nnU-Netv2-compatible directory structure.
- Preprocess the data.
- Train the model using only the train + val data.
- Evaluate the model on the test data separately.
Prerequisites
Install nnU-Netv2
$ pip install nnunetv2
This guide uses nnU-Netv2 v2.6.2 (nnunetv2==2.6.2), which is the latest version at the time of this writing.
Set up environment variables
This is required. Make sure to use absolute paths for the environment variables. In your terminal, set three environment variables:
$ export nnUNet_raw="/path/to/experiment/dir/raw/"
$ export nnUNet_preprocessed="/path/to/experiment/dir/preprocessed/"
$ export nnUNet_results="/path/to/experiment/dir/results/"
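If you prefer to keep everything in Python (e.g., in a notebook or a driver script), the same variables can be set with os.environ before any nnU-Net code runs. This is a minimal sketch; the paths are placeholders to replace with your actual absolute directories.

```python
import os

# Placeholder root -- replace with your actual absolute experiment directory.
base = "/path/to/experiment/dir"

# Must be set before nnU-Net reads them (i.e., before importing/invoking nnunetv2).
os.environ["nnUNet_raw"] = os.path.join(base, "raw")
os.environ["nnUNet_preprocessed"] = os.path.join(base, "preprocessed")
os.environ["nnUNet_results"] = os.path.join(base, "results")
```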
Adapting the 2D RGB dataset to nnU-Netv2
nnU-Netv2’s directory structure
nnU-Netv2 expects the following directory structure. We will only create the top-level directories, then use the data_format_conversion_script.py script to convert the CSV files into this structure, and finally add the dataset.json file.
/path/to/experiment/dir/
├── raw/
│   └── Dataset001_PH2/
│       ├── imagesTr/
│       ├── labelsTr/
│       ├── imagesTs/
│       ├── labelsTs/
│       └── dataset.json
├── preprocessed/
└── results/
nnU-Netv2 expects a specific directory structure for each dataset:
- imagesTr/: {Training + Validation} images.
- labelsTr/: {Training + Validation} segmentations.
- imagesTs/: Test images. Untouched during training.
- labelsTs/: Test segmentations. Untouched during training.
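The directory skeleton above can be created with a few lines of Python. This is a sketch; the helper name is my own, and Dataset001_PH2 matches the dataset name used throughout this post.

```python
from pathlib import Path

def make_nnunet_dirs(root: str, dataset_name: str = "Dataset001_PH2") -> Path:
    """Create the nnU-Netv2 top-level directory skeleton under `root`."""
    root = Path(root)
    dataset = root / "raw" / dataset_name
    # The four per-dataset subdirectories nnU-Netv2 expects.
    for sub in ("imagesTr", "labelsTr", "imagesTs", "labelsTs"):
        (dataset / sub).mkdir(parents=True, exist_ok=True)
    (root / "preprocessed").mkdir(parents=True, exist_ok=True)
    (root / "results").mkdir(parents=True, exist_ok=True)
    return dataset
```

For example, `make_nnunet_dirs("/path/to/experiment/dir")` returns the path to the Dataset001_PH2 directory.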
Handling RGB images
This is important. nnU-Netv2 does not support RGB images out of the box, so a single RGB image must be converted into three separate grayscale files, corresponding to the Red, Green, and Blue channels respectively.
We will use the following file naming rules:
- Images: {case_id}_0000.png, {case_id}_0001.png, {case_id}_0002.png for the RGB channels. nnU-Net expects exactly 4 digits for channel indices. Each channel must be a separate grayscale image.
- Segmentations: {case_id}.png for the binary mask. The segmentation must be a one-channel image, and the values must be in {0, 1} (instead of {0, 255}).
For example, if the case_id is 0001, the images will be saved as 0001_0000.png, 0001_0001.png, 0001_0002.png, corresponding to the Red, Green, and Blue channels respectively, and the segmentation will be saved as 0001.png.
Note: Had the images been grayscale, the image file names would simply have been {case_id}_0000.png for the grayscale image.
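The per-case conversion can be sketched as follows. This is an illustrative helper, not the actual data_format_conversion_script.py; the function names are my own, and the file-saving step (e.g., with Pillow) is only indicated in comments.

```python
import numpy as np

def split_rgb_channels(rgb: np.ndarray) -> list[np.ndarray]:
    """Split an (H, W, 3) RGB array into three (H, W) grayscale arrays,
    one per channel, in R, G, B order."""
    assert rgb.ndim == 3 and rgb.shape[2] == 3
    return [rgb[:, :, c] for c in range(3)]

def binarize_mask(mask: np.ndarray) -> np.ndarray:
    """Map a {0, 255} (or any nonzero-valued) mask to {0, 1},
    as nnU-Net expects for segmentation labels."""
    return (mask > 0).astype(np.uint8)

# Saving is then one PNG per channel, e.g. with Pillow:
#   Image.fromarray(channel).save(f"{case_id}_{c:04d}.png")   # 4-digit channel index
#   Image.fromarray(binarize_mask(mask)).save(f"{case_id}.png")
```

Note the `{c:04d}` format, which produces the exactly-4-digit channel suffix described above.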
Dataset configuration and preprocessing
Dataset configuration
In the dataset.json file, we specify the dataset configuration. The most important part here is that the three channels must be specified explicitly:
{
"channel_names": {
"0": "R",
"1": "G",
"2": "B"
},
"labels": {
"background": 0,
"lesion": 1
},
"numTraining": 160,
"file_ending": ".png"
}
Remember, 160 here is the number of training + validation images. Replace it with the actual number of training + validation images as applicable.
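Generating this file can be sketched in a few lines. This helper is my own; it counts numTraining from labelsTr/ (one mask file per case), so the count always matches the data actually present.

```python
import json
from pathlib import Path

def write_dataset_json(dataset_dir: str) -> dict:
    """Write dataset.json for a 2D RGB binary-segmentation dataset.

    numTraining is derived from labelsTr/ (one mask per case), so it
    cannot drift out of sync with the converted data.
    """
    dataset_dir = Path(dataset_dir)
    num_training = len(list((dataset_dir / "labelsTr").glob("*.png")))
    config = {
        "channel_names": {"0": "R", "1": "G", "2": "B"},
        "labels": {"background": 0, "lesion": 1},
        "numTraining": num_training,
        "file_ending": ".png",
    }
    (dataset_dir / "dataset.json").write_text(json.dumps(config, indent=4))
    return config
```

For example, `write_dataset_json("/path/to/experiment/dir/raw/Dataset001_PH2")` writes the file shown above with the correct count.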
Preprocessing the data
To preprocess the data:
$ nnUNetv2_plan_and_preprocess -d 1 --verify_dataset_integrity
This does the following:
- Detects image shape.
- Normalizes the RGB images.
- Configures a 2D nnU-Netv2 plan.
Training the model
To train the model using the training and validation data:
$ nnUNetv2_train -tr nnUNetTrainer Dataset001_PH2 2d 0
- The -tr flag specifies the trainer class.
- The Dataset001_PH2 argument specifies the dataset name.
- The 2d argument specifies the configuration, forcing the model to be trained in 2D.
- The 0 argument specifies the fold number. This is important: since we combined our Train and Validation sets into imagesTr, nnU-Net will internally split them (randomly) for its own validation monitoring. This satisfies nnU-Net's requirement for a validation set while keeping our Test set (imagesTs) completely unseen.
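Note that the random internal split means your original val.csv partition is not used as-is. If you want nnU-Net to honor your exact train/val partition, it supports a user-supplied splits_final.json in the preprocessed dataset folder (e.g., preprocessed/Dataset001_PH2/splits_final.json): a JSON list of folds, each with "train" and "val" case-id lists. The sketch below is my own helper; check your nnU-Net version's documentation on manual data splits for the exact expected format.

```python
import json
from pathlib import Path

def write_custom_split(preprocessed_dataset_dir: str,
                       train_ids: list[str], val_ids: list[str]) -> Path:
    """Write a single-fold splits_final.json so nnU-Net uses our own
    train/val partition instead of a random one. Then train fold 0."""
    splits = [{"train": sorted(train_ids), "val": sorted(val_ids)}]
    out = Path(preprocessed_dataset_dir) / "splits_final.json"
    out.write_text(json.dumps(splits, indent=4))
    return out
```

For example, `write_custom_split(".../preprocessed/Dataset001_PH2", train_ids, val_ids)` with the case ids read from train.csv and val.csv.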
Testing the trained model
Obtaining predictions on the test data
Once trained, we run inference on the explicit test directory we created earlier:
$ nnUNetv2_predict \
-d Dataset001_PH2 \
-i raw/Dataset001_PH2/imagesTs \
-o test_predictions \
-f 0 \
-tr nnUNetTrainer \
-c 2d
- The -d flag specifies the dataset name; this is the dataset name we used in the training step.
- The -i flag specifies the input directory, where the test images are stored.
- The -o flag specifies the output directory, where the test predictions will be saved.
- The -f flag specifies the fold number; fold 0, since that is the fold we trained.
- The -tr flag specifies the trainer class; we use the default nnUNetTrainer class.
- The -c flag specifies the configuration; we use 2d.
Evaluating the predictions
To evaluate the predictions:
$ nnUNetv2_evaluate_folder \
-djfile test_predictions/dataset.json \
-pfile test_predictions/plans.json \
raw/Dataset001_PH2/labelsTs \
test_predictions/
- The -djfile flag specifies the dataset.json file.
- The -pfile flag specifies the plans.json file.
- Both of these files will be in the test_predictions directory.
- The first positional argument is the directory containing the ground-truth segmentations.
- The second positional argument is the directory containing the predicted segmentations.
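As a sanity check on the built-in evaluation, the Dice score for a single prediction can also be computed directly. A minimal numpy sketch, assuming both masks are same-shaped {0, 1} arrays (as produced by the conversion above):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice coefficient for binary masks: 2|P ∩ G| / (|P| + |G|).

    eps guards against division by zero when both masks are empty.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return float(2.0 * intersection / (pred.sum() + gt.sum() + eps))
```

Loading each predicted/ground-truth PNG pair (e.g., with Pillow) and averaging dice_score over the test set should closely match the summary nnUNetv2_evaluate_folder reports.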