Using the Steps library versus transforms.Compose


#1

While writing my dataset class, I found that I need to apply a sequence of transforms to the data that I’m reading from a Kaggle dataset, for example resize, align, normalize, and conversion to a PyTorch tensor.

Should I use the Steps library or transforms.Compose, as described here:
https://pytorch.org/tutorials/beginner/data_loading_tutorial.html

In the above tutorial, each transform class has an __init__() and a __call__() method, so an instance can be used as a callable transformer.

The Steps BaseTransformer, however, has a different API: __init__(), fit() and transform().

If I had a class like the one below, with transform and transform_mask parameters in the constructor, should I use a transform class with the simple __call__ API, or should I try to use the Steps library?

I understand that the Steps library is meant to be used outside the dataset, as part of a pipeline. But this is a special case where it doesn’t make much sense to read the individual items unprocessed from the dataset, so I need to pre-process the images before returning them to my dataloader.

What would you recommend that I do? I don’t want to end up writing two sets of composable transformers.

class DSTLSIFDDataset(Dataset):
    """
    Kaggle DSTL Satellite Imagery Dataset class.

    The download parameters are stored in an experiment/*_model/params.yaml file.

    You can also specify a set of transformers for the image and the mask. For the image, we will
    use transforms.Compose to sequence image resize and image alignment operations.

    Finally, the actual image loader, mask generator, and reflectance index calculator functions
    are assigned in the class constructor. This gives some flexibility for re-assignment using
    different implementations without rewriting large portions of this code.

    """

    def __init__(self,
                 dataset_params,
                 transform=None,
                 transform_mask=None,
                 download=True):
        super(DSTLSIFDDataset, self).__init__()
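
For illustration, here is a minimal sketch of how the transform and transform_mask arguments could be applied per item. The class name PairedDataset, the toy list data, and the two lambdas are assumptions made up for this example; a real implementation would subclass torch.utils.data.Dataset and load actual images. The point is that __getitem__ only needs each transform to be a callable.

```python
class PairedDataset:
    """Hypothetical stand-in for a torch.utils.data.Dataset subclass."""

    def __init__(self, images, masks, transform=None, transform_mask=None):
        self.images = images
        self.masks = masks
        self.transform = transform            # any callable: image -> image
        self.transform_mask = transform_mask  # any callable: mask -> mask

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img, mask = self.images[idx], self.masks[idx]
        # Apply each transform only if one was supplied.
        if self.transform is not None:
            img = self.transform(img)
        if self.transform_mask is not None:
            mask = self.transform_mask(mask)
        return img, mask


# Toy data: plain lists play the role of an image and its mask.
dataset = PairedDataset(
    images=[[0, 128, 255]],
    masks=[[0, 1, 1]],
    transform=lambda img: [p / 255.0 for p in img],  # scale pixels to [0, 1]
    transform_mask=lambda m: [bool(v) for v in m],   # binarize the mask
)
img, mask = dataset[0]
```

Because the dataset only calls transform(img), anything with a __call__ method works here, whether it is a composed chain or a single transformer object.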

#2

As a follow-up, here is an example of a transformer class that supports both the Steps-style API and the PyTorch-style callable API, so it can also be passed to transforms.Compose().

See https://pytorch.org/tutorials/beginner/data_loading_tutorial.html

I would welcome some thoughts on merging the two, or on adding an optional __call__() method to support both modes of operation.

import logging

import cv2
import joblib

logger = logging.getLogger(__name__)

# BaseTransformer comes from the Steps library.


class ImageResizeTransformer(BaseTransformer):
    """
    This class is an Image Resize Transformer. It takes an input image and
    resizes it using the specified target height, width and interpolation
    parameters.

    """
    def __init__(self, height, width, interpolation):
        self.height = height
        self.width = width
        self.interpolation = interpolation

        # assign function
        self.resize = resize

    def __call__(self, img):
        # PyTorch-style entry point: delegate to the Steps-style transform()
        return self.transform(img)

    def fit(self, **kwargs):
        # Nothing to learn for a resize, so fit is a no-op
        return self

    def transform(self, img):
        return self.resize(img=img, height=self.height, width=self.width,
                           interpolation=self.interpolation)

    def save(self, filepath):
        joblib.dump(self.resize, filepath)

    def load(self, filepath):
        self.resize = joblib.load(filepath)
        return self

    def __repr__(self):
        return '{0}(height={1}, width={2}, interpolation={3})'.format(
            self.__class__.__name__, self.height, self.width, self.interpolation)


def resize(img, height, width,
           interpolation=cv2.INTER_CUBIC):
    """
    Resize an image using OpenCV.
    Note that cv2.resize expects dsize as (width, height).

    :param img: Input image
    :param height: height
    :param width: width
    :param interpolation: cv2 interpolation type
    :return: resized image
    """

    logger.debug("resizing image to height=%s, width=%s using interpolation=%s",
                 height, width, interpolation)

    # dsize is (width, height), the reverse of NumPy's (rows, cols) shape order
    return cv2.resize(img, (width, height), interpolation=interpolation)
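
One way to avoid repeating __call__ in every transformer might be a small mixin that forwards __call__ to transform(). The sketch below is only an illustration: CallableTransformerMixin and ScaleTransformer are hypothetical names, not part of the Steps library, and the toy transformer works on plain lists so that it runs without OpenCV or Steps installed. In real code the transformer would also derive from BaseTransformer.

```python
class CallableTransformerMixin:
    """Hypothetical mixin: makes a fit/transform object usable as a callable."""

    def __call__(self, *args, **kwargs):
        # Forward the PyTorch-style call to the Steps-style transform()
        return self.transform(*args, **kwargs)


class ScaleTransformer(CallableTransformerMixin):
    """Toy transformer that multiplies every value by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def fit(self, *args, **kwargs):
        # Nothing to learn, so fit is a no-op
        return self

    def transform(self, values):
        return [v * self.factor for v in values]


scaler = ScaleTransformer(factor=2)
steps_style = scaler.fit().transform([1, 2, 3])  # Steps-style usage
callable_style = scaler([1, 2, 3])               # transforms.Compose-style usage
```

Both calls go through the same transform() body, so there is only one implementation to maintain. If transform() in a given pipeline returns something other than the bare image, __call__ would have to unwrap it before handing the result to the next transform.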

#3

Hi, I would say that you can easily use both.
For instance, in https://github.com/minerva-ml/open-solution-mapping-challenge we use imgaug and transforms.Compose in one transformer. We also have exactly the same use case, where we transform the image and the mask on the fly to generate batches, so I would suggest taking a closer look at loaders.py in that project.
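
To make the "both in one transformer" idea concrete, here is a rough sketch of a Steps-style transformer that holds a composed chain of callables internally. It is not the code from loaders.py in that project. The Compose class below is a minimal stand-in that mimics what torchvision.transforms.Compose does, so the example runs without torchvision, and ComposedTransformer, clip and normalize are hypothetical names.

```python
class Compose:
    """Minimal stand-in for torchvision.transforms.Compose: applies callables in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x


class ComposedTransformer:
    """Hypothetical Steps-style wrapper around a composed chain of callables."""

    def __init__(self, transforms):
        self.pipeline = Compose(transforms)

    def fit(self, *args, **kwargs):
        # The chain has no learned state, so fit is a no-op
        return self

    def transform(self, x):
        return self.pipeline(x)


def clip(values):
    # Clamp every value to the 0..255 range
    return [min(max(v, 0), 255) for v in values]


def normalize(values):
    # Scale 0..255 values to 0..1
    return [v / 255.0 for v in values]


transformer = ComposedTransformer([clip, normalize])
result = transformer.fit().transform([-10, 0, 255, 300])
```

With this arrangement the individual operations are written once as simple callables, and the wrapper is the only piece that knows about fit() and transform().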