I’m trying to develop a Dataset and DataLoader for the Kaggle DSTL Satellite Imagery Feature Detection challenge dataset.
The original dataset contains 16-band images, 3-band images, 1 P-band image, a training CSV file, a grid-sizes CSV file, and the shapefiles containing the polygons used to generate training masks.
The NN model needs a dataset of 20-channel concatenated image data (X_train) and masks (y_train).
What would be the appropriate way to develop the Dataset and DataLoader classes for the DSTL dataset?
My workflow is:
1) Load the high-res image and get its shape.
2) Up-scale the low-res images.
3) Align the M- and A-band images to the RGB image.
4) Concatenate the RGB, P, M, and A-band images into a single X_train sample.
5) Generate the ground-truth mask from the polygon shapefile, creating a high-res 10-channel mask corresponding to the 10 classes to be segmented.
6) Crop 512x512 regions, generating around 2,500 crops from a single high-resolution image and its mask.
7) Save the cropped images to disk, i.e. cache them.
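Since the crops are cached to disk, the Dataset class itself can stay thin. A minimal sketch of how that could look, assuming the preprocessing above has already written each crop as a pair of `.npy` files — the `x_<idx>.npy`/`y_<idx>.npy` naming and cache layout here are hypothetical, not part of the challenge data:

```python
import glob

import numpy as np
import torch
from torch.utils.data import Dataset


class DSTLDataset(Dataset):
    """Serves pre-cropped 512x512 patches cached on disk as .npy files.

    Assumed (hypothetical) layout: <cache_dir>/x_<idx>.npy holds a
    512x512x20 image stack, <cache_dir>/y_<idx>.npy the 512x512x10 mask.
    """

    def __init__(self, cache_dir):
        self.x_paths = sorted(glob.glob(f"{cache_dir}/x_*.npy"))
        self.y_paths = sorted(glob.glob(f"{cache_dir}/y_*.npy"))

    def __len__(self):
        return len(self.x_paths)

    def __getitem__(self, idx):
        x = np.load(self.x_paths[idx]).astype(np.float32)
        y = np.load(self.y_paths[idx]).astype(np.float32)
        # PyTorch convolutions expect channels-first (C, H, W),
        # so permute from the cached H x W x C layout.
        x = torch.from_numpy(x).permute(2, 0, 1)
        y = torch.from_numpy(y).permute(2, 0, 1)
        return x, y
```

With this split, all the heavy alignment/cropping work runs once offline, and `__getitem__` only does cheap disk reads, which parallelizes well across DataLoader workers.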
I have plain Python scripts that do all of this at the moment.
The DataLoader would have to use the Dataset class and simply yield batches of 512x512x20 X_train and 512x512x10 y_train samples.
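Once samples come out of the Dataset with those shapes, the DataLoader side is standard PyTorch. A sketch using stand-in zero tensors of the target shapes (the batch size is illustrative; `TensorDataset` stands in for the real Dataset class):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with the target shapes, channels-first as PyTorch expects:
# 4 fake samples instead of the real cached crops.
xs = torch.zeros(4, 20, 512, 512)
ys = torch.zeros(4, 10, 512, 512)

# batch_size is illustrative; swap TensorDataset for the real Dataset.
loader = DataLoader(TensorDataset(xs, ys), batch_size=2, shuffle=True)

for x_batch, y_batch in loader:
    print(x_batch.shape, y_batch.shape)
    # torch.Size([2, 20, 512, 512]) torch.Size([2, 10, 512, 512])
    break
```

Adding `num_workers` to the `DataLoader` call would let several worker processes load cached crops in parallel.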
I was wondering, if I were to implement a Dataset and DataLoader, at what point I should use the Steps library, since most of steps 1) to 7) happen in the Dataset class.