nnU-Net is now considered a standard, state-of-the-art tool for medical image segmentation, but I think it is opinionated in the following ways that affect how I may use it:
- It enforces a specific training and evaluation workflow, including assumptions about data splits. This means using custom train-valid-test splits requires some workarounds.
- It assumes 3D volumetric data, and while 2D data is supported, it is not the primary use case.
- It assumes grayscale images, and does not support RGB images out of the box. This again means that RGB images require some workarounds.
A lot of researchers, however, happily use nnU-Net for 3D volumetric medical images, so these limitations may not be relevant to them.
I, on the other hand, work primarily with 2D RGB images of skin lesions, so these limitations (especially the latter two) are very relevant to me, and to anyone else working with RGB medical images: skin images (dermoscopy and clinical images of skin lesions, clinical photographs of wounds, burns, etc.), retinal fundus images, colonoscopy images, robotic surgery images, etc.
Similarly, several standardized datasets come with predefined train-valid-test partitions (e.g., ISIC challenges often release train.csv, val.csv, test.csv), and nnU-Net’s default behavior of generating its own train-valid-test splits may not be what you want.
In this post, I will briefly go over how I use nnU-Net for 2D RGB images and custom train-valid-test splits. I learnt these things by going over a lot of the official nnU-Net documentation and some other online tutorials, and I have tried to compile all of it here.
At a high level, consider a 2D segmentation task where we already have train.csv, val.csv, and test.csv files, each with columns {img, seg}, indicating the respective filepaths. For my case, the images are 2D RGB and the segmentations are binary masks. I will cover the following steps in this post:
- Convert the CSV files to the nnU-Netv2-compatible directory structure.
- Preprocess the data.
- Train the model using only the train + val data.
- Evaluate the model on the test data separately.
Prerequisites
Install nnU-Netv2
$ pip install nnunetv2
This guide uses nnU-Netv2 v2.6.2 (nnunetv2==2.6.2), which is the latest version at the time of this writing.
Set up environment variables
This is required. Make sure to use absolute paths for the environment variables. In your terminal, set three environment variables:
$ export nnUNet_raw="/path/to/experiment/dir/raw/"
$ export nnUNet_preprocessed="/path/to/experiment/dir/preprocessed/"
$ export nnUNet_results="/path/to/experiment/dir/results/"
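If you prefer to keep everything in Python (e.g., in a notebook or a driver script), the same variables can be set with os.environ before any nnU-Net code runs. This is a minimal sketch; the paths are placeholders to replace with your actual absolute directories.

```python
import os

# Placeholder root -- replace with your actual absolute experiment directory.
base = "/path/to/experiment/dir"

# Must be set before nnU-Net reads them (i.e., before importing/invoking nnunetv2).
os.environ["nnUNet_raw"] = os.path.join(base, "raw")
os.environ["nnUNet_preprocessed"] = os.path.join(base, "preprocessed")
os.environ["nnUNet_results"] = os.path.join(base, "results")
```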
Adapting the 2D RGB dataset to nnU-Netv2
nnU-Netv2’s directory structure
nnU-Netv2 expects the following directory structure. We will only create the top-level directories, then use the data_format_conversion_script.py script to convert the CSV files into this structure, and finally add the dataset.json file.
/path/to/experiment/dir/
├── raw/
│   └── Dataset001_PH2/
│       ├── imagesTr/
│       ├── labelsTr/
│       ├── imagesTs/
│       ├── labelsTs/
│       └── dataset.json
├── preprocessed/
└── results/
nnU-Netv2 expects a specific directory structure for each dataset:
- imagesTr/: {Training + Validation} images.
- labelsTr/: {Training + Validation} segmentations.
- imagesTs/: Test images. Untouched during training.
- labelsTs/: Test segmentations. Untouched during training.
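The directory skeleton above can be created with a few lines of Python. This is a sketch; the helper name is my own, and Dataset001_PH2 matches the dataset name used throughout this post.

```python
from pathlib import Path

def make_nnunet_dirs(root: str, dataset_name: str = "Dataset001_PH2") -> Path:
    """Create the nnU-Netv2 top-level directory skeleton under `root`."""
    root = Path(root)
    dataset = root / "raw" / dataset_name
    # The four per-dataset subdirectories nnU-Netv2 expects.
    for sub in ("imagesTr", "labelsTr", "imagesTs", "labelsTs"):
        (dataset / sub).mkdir(parents=True, exist_ok=True)
    (root / "preprocessed").mkdir(parents=True, exist_ok=True)
    (root / "results").mkdir(parents=True, exist_ok=True)
    return dataset
```

For example, `make_nnunet_dirs("/path/to/experiment/dir")` returns the path to the Dataset001_PH2 directory.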
Handling RGB images
This is important. nnU-Netv2 does not support RGB images out of the box, so a single RGB image must be converted into three separate grayscale files, corresponding to the Red, Green, and Blue channels respectively.
We will use the following file naming rules:
- Images: {case_id}_0000.png, {case_id}_0001.png, {case_id}_0002.png for the RGB channels. nnU-Net expects exactly 4 digits for channel indices. Each channel must be a separate grayscale image.
- Segmentations: {case_id}.png for the binary mask. The segmentation must be a one-channel image, and the values must be in {0, 1} (instead of {0, 255}).
For example, if the case_id is 0001, the images will be saved as 0001_0000.png, 0001_0001.png, 0001_0002.png, corresponding to the Red, Green, and Blue channels respectively, and the segmentation will be saved as 0001.png.
Note: Had the images been grayscale, the image file names would simply have been {case_id}_0000.png for the grayscale image.
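The per-case conversion can be sketched as follows. This is an illustrative helper, not the actual data_format_conversion_script.py; the function names are my own, and the file-saving step (e.g., with Pillow) is only indicated in comments.

```python
import numpy as np

def split_rgb_channels(rgb: np.ndarray) -> list[np.ndarray]:
    """Split an (H, W, 3) RGB array into three (H, W) grayscale arrays,
    one per channel, in R, G, B order."""
    assert rgb.ndim == 3 and rgb.shape[2] == 3
    return [rgb[:, :, c] for c in range(3)]

def binarize_mask(mask: np.ndarray) -> np.ndarray:
    """Map a {0, 255} (or any nonzero-valued) mask to {0, 1},
    as nnU-Net expects for segmentation labels."""
    return (mask > 0).astype(np.uint8)

# Saving is then one PNG per channel, e.g. with Pillow:
#   Image.fromarray(channel).save(f"{case_id}_{c:04d}.png")   # 4-digit channel index
#   Image.fromarray(binarize_mask(mask)).save(f"{case_id}.png")
```

Note the `{c:04d}` format, which produces the exactly-4-digit channel suffix described above.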
Dataset configuration and preprocessing
Dataset configuration
In the dataset.json file, we specify the dataset configuration. The most important part here is that the three channels must be specified explicitly:
{
"channel_names": {
"0": "R",
"1": "G",
"2": "B"
},
"labels": {
"background": 0,
"lesion": 1
},
"numTraining": 160,
"file_ending": ".png"
}
Remember, 160 here is the number of training + validation images. Replace it with the actual number of training + validation images as applicable.
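Generating this file can be sketched in a few lines. This helper is my own; it counts numTraining from labelsTr/ (one mask file per case), so the count always matches the data actually present.

```python
import json
from pathlib import Path

def write_dataset_json(dataset_dir: str) -> dict:
    """Write dataset.json for a 2D RGB binary-segmentation dataset.

    numTraining is derived from labelsTr/ (one mask per case), so it
    cannot drift out of sync with the converted data.
    """
    dataset_dir = Path(dataset_dir)
    num_training = len(list((dataset_dir / "labelsTr").glob("*.png")))
    config = {
        "channel_names": {"0": "R", "1": "G", "2": "B"},
        "labels": {"background": 0, "lesion": 1},
        "numTraining": num_training,
        "file_ending": ".png",
    }
    (dataset_dir / "dataset.json").write_text(json.dumps(config, indent=4))
    return config
```

For example, `write_dataset_json("/path/to/experiment/dir/raw/Dataset001_PH2")` writes the file shown above with the correct count.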
Preprocessing the data
To preprocess the data:
$ nnUNetv2_plan_and_preprocess -d 1 --verify_dataset_integrity
This does the following:
- Detects image shape.
- Normalizes the RGB images.
- Configures a 2D nnU-Netv2 plan.
Training the model
To train the model using the training and validation data:
$ nnUNetv2_train -tr nnUNetTrainer Dataset001_PH2 2d 0
- The -tr flag specifies the trainer class.
- The Dataset001_PH2 argument specifies the dataset name.
- The 2d argument specifies the configuration, forcing the model to be trained in 2D.
- The 0 argument specifies the fold number. This is important: since we combined our Train and Validation sets into imagesTr, nnU-Net will internally split them (randomly) for its own validation monitoring. This satisfies nnU-Net's requirement for a validation set while keeping our Test set (imagesTs) completely unseen.
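Note that the random internal split means your original val.csv partition is not used as-is. If you want nnU-Net to honor your exact train/val partition, it supports a user-supplied splits_final.json in the preprocessed dataset folder (e.g., preprocessed/Dataset001_PH2/splits_final.json): a JSON list of folds, each with "train" and "val" case-id lists. The sketch below is my own helper; check your nnU-Net version's documentation on manual data splits for the exact expected format.

```python
import json
from pathlib import Path

def write_custom_split(preprocessed_dataset_dir: str,
                       train_ids: list[str], val_ids: list[str]) -> Path:
    """Write a single-fold splits_final.json so nnU-Net uses our own
    train/val partition instead of a random one. Then train fold 0."""
    splits = [{"train": sorted(train_ids), "val": sorted(val_ids)}]
    out = Path(preprocessed_dataset_dir) / "splits_final.json"
    out.write_text(json.dumps(splits, indent=4))
    return out
```

For example, `write_custom_split(".../preprocessed/Dataset001_PH2", train_ids, val_ids)` with the case ids read from train.csv and val.csv.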
Testing the trained model
Obtaining predictions on the test data
Once trained, we run inference on the explicit test directory we created earlier:
$ nnUNetv2_predict \
-d Dataset001_PH2 \
-i raw/Dataset001_PH2/imagesTs \
-o test_predictions \
-f 0 \
-tr nnUNetTrainer \
-c 2d
- The -d flag specifies the dataset name; this is the dataset name we used in the training step.
- The -i flag specifies the input directory, where the test images are stored.
- The -o flag specifies the output directory, where the test predictions will be saved.
- The -f flag specifies the fold number; fold 0, since that is the fold we trained.
- The -tr flag specifies the trainer class; we use the default nnUNetTrainer class.
- The -c flag specifies the configuration; we use 2d.
Evaluating the predictions
To evaluate the predictions:
$ nnUNetv2_evaluate_folder \
-djfile test_predictions/dataset.json \
-pfile test_predictions/plans.json \
raw/Dataset001_PH2/labelsTs \
test_predictions/
- The -djfile flag specifies the dataset.json file.
- The -pfile flag specifies the plans.json file.
- Both of these files will be in the test_predictions directory.
- The first positional argument is the directory containing the ground-truth segmentations.
- The second positional argument is the directory containing the predicted segmentations.
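As a sanity check on the built-in evaluation, the Dice score for a single prediction can also be computed directly. A minimal numpy sketch, assuming both masks are same-shaped {0, 1} arrays (as produced by the conversion above):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice coefficient for binary masks: 2|P ∩ G| / (|P| + |G|).

    eps guards against division by zero when both masks are empty.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return float(2.0 * intersection / (pred.sum() + gt.sum() + eps))
```

Loading each predicted/ground-truth PNG pair (e.g., with Pillow) and averaging dice_score over the test set should closely match the summary nnUNetv2_evaluate_folder reports.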