Image Augmentation with Keras Preprocessing Layers and tf.image


    Last Updated on August 6, 2022

    When you work on a machine learning problem related to images, not only do you need to collect some images as training data, but you also need to employ augmentation to create variations in the images. This is especially true for more complex object recognition problems.

    There are many ways to do image augmentation. You may use some external libraries or write your own functions for that. There are some modules in TensorFlow and Keras for augmentation too.

    In this post, you’ll discover how you can use the Keras preprocessing layers as well as the tf.image module in TensorFlow for image augmentation.

    After reading this post, you’ll know:

    • What the Keras preprocessing layers are, and how to use them
    • What functions the tf.image module provides for image augmentation
    • How to use augmentation together with the tf.data dataset

    Let’s get started.

    Image augmentation with Keras preprocessing layers and tf.image.
    Photo by Steven Kamenar. Some rights reserved.


    This article is divided into five sections; they are:

    • Getting Images
    • Visualizing the Images
    • Keras Preprocessing Layers
    • Using tf.image API for Augmentation
    • Using Preprocessing Layers in Neural Networks

    Getting Images

    Before you see how you can do augmentation, you need to get the images. Ultimately, you need the images to be represented as arrays, for example, in HxWx3 in 8-bit integers for the RGB pixel values. There are many ways to get the images. Some can be downloaded as a ZIP file. If you’re using TensorFlow, you may get some image datasets from the tensorflow_datasets library.

    In this tutorial, you’ll use the citrus leaves images, which is a small dataset of less than 100MB. It can be downloaded from tensorflow_datasets as follows:
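A minimal sketch of the download step is shown here. The helper name `load_citrus_leaves` is made up for illustration, and the sketch assumes the `tensorflow_datasets` package is installed and that the dataset is registered under the name `citrus_leaves`. The call is wrapped in a function so that nothing is downloaded until you invoke it.

```python
def load_citrus_leaves():
    """Download (on the first call) and return the citrus leaves dataset and its metadata."""
    # Imported lazily so that defining this hypothetical helper does not require the package
    import tensorflow_datasets as tfds

    # with_info=True returns a metadata object alongside the tf.data dataset
    ds, meta = tfds.load("citrus_leaves", with_info=True, split="train", shuffle_files=True)
    return ds, meta

# Usage (triggers the download on the first run):
# ds, meta = load_citrus_leaves()
```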

    Running this code the first time will download the image dataset to your computer with the following output:

    The function above returns the images as a tf.data dataset object and the metadata. This is a classification dataset. You can print the training labels with the following:
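One way to list the labels is sketched below. It assumes the metadata object exposes the class names as `meta.features["label"].names`, which is how tensorflow_datasets describes a class label feature. The helper `label_names` and the stand-in `fake_meta` object are hypothetical and only there so the snippet runs without downloading anything.

```python
from types import SimpleNamespace

def label_names(meta):
    """Return the class names recorded in a tensorflow_datasets metadata object."""
    return list(meta.features["label"].names)

# Assumption: a stand-in object mimics the metadata so the sketch runs without a download
fake_meta = SimpleNamespace(features={"label": SimpleNamespace(names=["class_a", "class_b"])})
for name in label_names(fake_meta):
    print(name)
```

With the real dataset, you would pass the metadata object returned by the loading call instead of `fake_meta`.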

    This prints:

    If you run this code again at a later time, you’ll reuse the downloaded images. The other way to load the downloaded images into a tf.data dataset is to use the image_dataset_from_directory() function.

    As you can see from the screen output above, the dataset is downloaded into the directory ~/tensorflow_datasets. If you look at the directory, you see the directory structure as follows:

    The directories are the labels, and the images are files stored under their corresponding directory. You can let the function read the directory recursively into a dataset:
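The sketch below illustrates the idea on a throwaway directory rather than the real download. The two class names and the random images written to disk are assumptions made so the example is self-contained. With the actual dataset, you would pass the path of the extracted image folder instead of `root`.

```python
import os
import tempfile
import tensorflow as tf

# Assumption: a temporary folder with two made-up classes stands in for the real dataset
root = tempfile.mkdtemp()
for label in ["healthy", "canker"]:
    os.makedirs(os.path.join(root, label))
    for i in range(4):
        pixels = tf.random.uniform((64, 64, 3), maxval=256, dtype=tf.int32)
        png = tf.io.encode_png(tf.cast(pixels, tf.uint8))
        tf.io.write_file(os.path.join(root, label, f"{i}.png"), png)

# Each subdirectory name becomes a class label; images are resized to image_size
ds = tf.keras.utils.image_dataset_from_directory(
    root,
    image_size=(256, 256),
    batch_size=4,
    shuffle=True,
    seed=42,
)
print(ds.class_names)
for images, labels in ds.take(1):
    print(images.shape, labels.shape)
```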

    You may want to set batch_size=None if you do not want the dataset to be batched. Usually, you want the dataset to be batched for training a neural network model.

    Visualizing the Images

    It is important to visualize the augmentation result, so you can verify that it is what you want it to be. You can use matplotlib for this.

    In matplotlib, you have the imshow() function to display an image. However, for the image to be displayed correctly, it should be presented as an array of 8-bit unsigned integers (uint8).

    Given that you have a dataset created using image_dataset_from_directory(), you can get the first batch (of 32 images) and display a few of them using imshow(), as follows:
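A sketch of such a grid follows. To keep it self-contained, random pixels and a hypothetical `class_names` list stand in for the real dataset and for `ds.class_names`; the plotting logic is the part that carries over.

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# Assumption: random pixels and made-up class names stand in for the real dataset
class_names = ["class_a", "class_b"]
fake_images = np.random.randint(0, 256, size=(32, 64, 64, 3)).astype("float32")
fake_labels = np.random.randint(0, 2, size=(32,))
ds = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels)).batch(32)

fig, ax = plt.subplots(nrows=3, ncols=3, figsize=(5, 5))
for images, labels in ds.take(1):
    for i in range(3):
        for j in range(3):
            k = i * 3 + j
            # imshow() needs uint8 pixel values to render the colors correctly
            ax[i][j].imshow(images[k].numpy().astype("uint8"))
            ax[i][j].set_title(class_names[int(labels[k])])
            ax[i][j].axis("off")
# plt.show()  # uncomment in an interactive session
```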

    Here, you see a display of nine images in a grid, labeled with their corresponding classification label, using ds.class_names. The images should be converted to NumPy arrays in uint8 for display. This code displays an image like the following:

    The complete code from loading the image to display is as follows:

    Note that if you’re using tensorflow_datasets to get the image, the samples are presented as a dictionary instead of a tuple of (image, label). You should change your code slightly to the following:
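The difference can be sketched with a small in-memory dataset of dictionaries. The keys `"image"` and `"label"` are assumed here to match the tensorflow_datasets sample format, and the zero-filled images are placeholders.

```python
import tensorflow as tf

# Assumption: an in-memory dataset of dictionaries mimics the tensorflow_datasets sample format
ds = tf.data.Dataset.from_tensor_slices({
    "image": tf.zeros((8, 64, 64, 3), dtype=tf.uint8),
    "label": tf.constant([0, 1, 2, 3, 0, 1, 2, 3], dtype=tf.int64),
})

# Option 1: read the dictionary keys directly when iterating
for sample in ds.batch(4).take(1):
    images, labels = sample["image"], sample["label"]

# Option 2: convert every sample into an (image, label) tuple once
tuple_ds = ds.map(lambda sample: (sample["image"], sample["label"]))
```

The second option is convenient if you want to reuse code written for tuple-based datasets unchanged.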

    For the rest of this post, assume the dataset is created using image_dataset_from_directory(). You may need to tweak the code slightly if your dataset is created differently.

    Keras Preprocessing Layers

    Keras comes with many neural network layers, such as convolution layers, that you need to train. There are also layers with no parameters to train, such as flatten layers to convert an array like an image into a vector.

    The preprocessing layers in Keras are specifically designed for use in the early stages of a neural network. You can use them for image preprocessing, such as to resize or rotate the image or adjust the brightness and contrast. While the preprocessing layers are supposed to be part of a larger neural network, you can also use them as functions. Below is how you can use the resizing layer as a function to transform some images and display them side-by-side with the original:
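A minimal sketch of calling the layer as a function is given here. It uses a random batch in place of real images and prints the shapes rather than displaying the pictures.

```python
import tensorflow as tf

# Assumption: a random batch stands in for real 256x256 images
images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# Resizing takes (height, width); calling the layer returns the transformed batch
resize = tf.keras.layers.Resizing(256, 128)
resized = resize(images)
print(images.shape, "->", resized.shape)
```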

    The images are in 256×256 pixels, and the resizing layer will make them into 256×128 pixels. The output of the above code is as follows:

    Since the resizing layer can be called as a function, you can chain it to the dataset itself. For example,
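The following sketch attaches a resizing step to a dataset with map(). The dataset contents are random placeholders, and the function name `preprocess` is just an illustrative choice.

```python
import tensorflow as tf

# Assumption: random images and zero labels stand in for the real (image, label) dataset
images = tf.random.uniform((8, 256, 256, 3), maxval=255.0)
labels = tf.zeros((8,), dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

resize = tf.keras.layers.Resizing(256, 128)

def preprocess(image, label):
    # Transform the image only; the label passes through untouched
    return resize(image), label

resized_ds = ds.map(preprocess)
for image, label in resized_ds.take(1):
    print(image.shape)
```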

    The dataset ds has samples in the form of (image, label). Hence you created a function that takes in such a tuple and preprocesses the image with the resizing layer. You then assigned this function as an argument for map() in the dataset. When you draw a sample from the new dataset created with the map() function, the image will be a transformed one.

    There are more preprocessing layers available. Some are demonstrated below.

    As you saw above, you can resize the image. You can also randomly enlarge or shrink the height or width of an image. Similarly, you can zoom in or zoom out on an image. Below is an example of manipulating the image size in various ways for a maximum of 30% increase or decrease:
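As a partial illustration, the sketch below covers only the zoom case with the RandomZoom layer, applied to a random placeholder batch. Passing `training=True` makes the intent explicit that the random transformation should be applied on this call.

```python
import tensorflow as tf

# Assumption: a random batch stands in for real images
images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# Zoom in or out by up to 30% along each axis; a new amount is drawn on every call
zoom = tf.keras.layers.RandomZoom(height_factor=(-0.3, 0.3), width_factor=(-0.3, 0.3))
zoomed = zoom(images, training=True)
print(zoomed.shape)
```

Note that zooming keeps the output at the input size; only the content inside the frame is scaled.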

    This code shows images as follows:

    While you specified a fixed dimension in resize, you have a random amount of manipulation in other augmentations.

    You can also do flipping, rotation, cropping, and geometric translation using preprocessing layers:
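A sketch of these four layers on a random placeholder batch might look like the following. The factor values are arbitrary choices for illustration.

```python
import tensorflow as tf

# Assumption: a random batch stands in for real images
images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

augmentations = {
    "flip": tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    # The factor is a fraction of a full turn, applied in either direction
    "rotation": tf.keras.layers.RandomRotation(0.2),
    # RandomCrop takes the output (height, width)
    "crop": tf.keras.layers.RandomCrop(128, 128),
    "translation": tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2),
}
results = {name: layer(images, training=True) for name, layer in augmentations.items()}
for name, out in results.items():
    print(name, out.shape)
```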

    This code shows the following images:

    And finally, you can do augmentations on color adjustments as well:
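A sketch of color augmentation with the RandomBrightness and RandomContrast layers is given below. It assumes a TensorFlow version recent enough to include RandomBrightness, and the factor of 0.8 is an arbitrary value for illustration.

```python
import tensorflow as tf

# Assumption: a random batch with pixel values in [0, 255) stands in for real images
images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# value_range tells the layer which pixel range to scale and clip against
brightness = tf.keras.layers.RandomBrightness(factor=0.8, value_range=(0, 255))
contrast = tf.keras.layers.RandomContrast(factor=0.8)

brighter = brightness(images, training=True)
contrasted = contrast(images, training=True)
print(brighter.shape, contrasted.shape)
```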

    This shows the images as follows:

    For completeness, below is the code to display the result of various augmentations:

    Finally, it is important to point out that most neural network models can work better if the input images are scaled. While we usually use an 8-bit unsigned integer for the pixel values in an image (e.g., for display using imshow() as above), a neural network prefers the pixel values to be between 0 and 1 or between -1 and +1. This can be done with preprocessing layers too. Below is how you can update one of the examples above to add the scaling layer into the augmentation:
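One possible arrangement, sketched here, chains a random flip with a Rescaling layer that maps the range 0 to 255 onto -1 to +1. The choice of RandomFlip as the augmentation is an assumption; any of the layers above could take its place.

```python
import tensorflow as tf

# Assumption: a random batch with pixel values in [0, 255) stands in for real images
images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

pipeline = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    # output = input * scale + offset, so [0, 255] becomes [-1, +1]
    tf.keras.layers.Rescaling(1 / 127.5, offset=-1),
])
scaled = pipeline(images, training=True)
print(float(tf.reduce_min(scaled)), float(tf.reduce_max(scaled)))
```

To scale to the range 0 to 1 instead, `Rescaling(1 / 255)` with no offset would be the equivalent layer.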

    Using tf.image API for Augmentation

    Besides the preprocessing layers, the tf.image module also provides some functions for augmentation. Unlike the preprocessing layers, these functions are intended to be used in a user-defined function and assigned to a dataset using map() as we saw above.

    The functions provided by tf.image are not duplicates of the preprocessing layers, although there is some overlap. Below is an example of using the tf.image functions to resize and crop images:
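A sketch of three such functions on a single random placeholder image follows. The sizes and offsets are arbitrary values chosen for illustration.

```python
import tensorflow as tf

# Assumption: a random image stands in for a real 256x256 picture
image = tf.random.uniform((256, 256, 3), maxval=255.0)

# resize() takes the target size as [height, width]
resized = tf.image.resize(image, [128, 256])
# crop_to_bounding_box() takes pixel coordinates: top, left, height, width
boxed = tf.image.crop_to_bounding_box(image, 64, 32, 128, 192)
# central_crop() takes the fraction of the image to keep around the center
central = tf.image.central_crop(image, 0.5)

print(resized.shape, boxed.shape, central.shape)
```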

    Below is the output of the above code:

    While the display of images matches what you might expect from the code, the use of tf.image functions is quite different from that of the preprocessing layers. Every tf.image function is different. For example, the crop_to_bounding_box() function takes pixel coordinates, but the central_crop() function takes a fraction as its argument.

    These functions also differ in the way randomness is handled. Some of these functions have no random behavior. Therefore, for a random resize, the exact output size has to be generated separately using a random number generator before calling the resize function. Some other functions, such as stateless_random_crop(), can do augmentation randomly, but a pair of random seeds in int32 needs to be specified explicitly.
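Both styles of randomness can be sketched as follows, using a random placeholder image. The size range and seed values are arbitrary.

```python
import tensorflow as tf

# Assumption: a random image stands in for a real 256x256 picture
image = tf.random.uniform((256, 256, 3), maxval=255.0)

# Random resize: draw the target size yourself, then call the deterministic resize()
new_height = tf.random.uniform([], minval=128, maxval=257, dtype=tf.int32)
new_width = tf.random.uniform([], minval=128, maxval=257, dtype=tf.int32)
random_resized = tf.image.resize(image, tf.stack([new_height, new_width]))

# Stateless random crop: the same pair of int32 seeds always produces the same crop
seed = tf.constant([42, 0], dtype=tf.int32)
crop_a = tf.image.stateless_random_crop(image, size=[128, 128, 3], seed=seed)
crop_b = tf.image.stateless_random_crop(image, size=[128, 128, 3], seed=seed)
```

Because the crop is stateless, you would vary the seed from sample to sample to get different crops.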

    To continue the example, there are the functions for flipping an image and extracting the Sobel edges:
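A sketch with a random placeholder image is shown here. Note that sobel_edges() works on a batch, so a leading batch axis is added before the call.

```python
import tensorflow as tf

# Assumption: a random image stands in for a real 256x256 picture
image = tf.random.uniform((256, 256, 3), maxval=255.0)

flipped_lr = tf.image.flip_left_right(image)
flipped_ud = tf.image.flip_up_down(image)

# sobel_edges() expects a 4-D float batch and appends an axis of size 2:
# index 0 holds the vertical gradient (dy), index 1 the horizontal gradient (dx)
edges = tf.image.sobel_edges(image[tf.newaxis, ...])
dy, dx = edges[0, ..., 0], edges[0, ..., 1]
print(edges.shape)
```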

    This shows the following:

    And the following are the functions to manipulate the brightness, contrast, and colors:
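A sketch of a few of these functions follows. It assumes a float image with values between 0 and 1, and the adjustment amounts are arbitrary.

```python
import tensorflow as tf

# Assumption: a random float image in [0, 1) stands in for a real picture
image = tf.random.uniform((256, 256, 3), maxval=1.0)

brighter = tf.image.adjust_brightness(image, delta=0.2)  # adds 0.2 to every pixel
contrast = tf.image.adjust_contrast(image, contrast_factor=1.5)
saturated = tf.image.adjust_saturation(image, saturation_factor=2.0)
hue = tf.image.adjust_hue(image, delta=0.1)

# The stateless random variant needs an explicit pair of int32 seeds
seed = tf.constant([1, 2], dtype=tf.int32)
random_bright = tf.image.stateless_random_brightness(image, max_delta=0.3, seed=seed)
```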

    This code shows the following:

    Below is the complete code to display all of the above:

    These augmentation functions should be enough for most uses. But if you have some specific ideas on augmentation, you will probably need a more capable image processing library. OpenCV and Pillow are common and powerful libraries that give you finer control over image transformations.

    Using Preprocessing Layers in Neural Networks

    You used the Keras preprocessing layers as functions in the examples above. But they can also be used as layers in a neural network, and doing so is straightforward. Below is an example of how you can incorporate a preprocessing layer into a classification network and train it using a dataset:
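The sketch below shows the general shape of such a network. The tiny architecture, the random 32×32 images, and the four classes are all assumptions made to keep the example fast and self-contained; they are not the model or data used in the post. The dataset is built with cache() and prefetch() before being passed to fit().

```python
import tensorflow as tf

# Assumption: random 32x32 images with 4 classes stand in for the real dataset
images = tf.random.uniform((16, 32, 32, 3), maxval=255.0)
labels = tf.random.uniform((16,), maxval=4, dtype=tf.int32)
ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(8)
    .cache()
    .prefetch(tf.data.AUTOTUNE)
)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    # Random layers augment during fit() and pass data through unchanged at inference
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.Rescaling(1 / 127.5, offset=-1),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
history = model.fit(ds, epochs=1, verbose=0)
print(history.history.keys())
```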

    Running this code gives the following output:

    In the code above, you created the dataset with cache() and prefetch(). This is a performance technique to allow the dataset to prepare data asynchronously while the neural network is trained. This would be significant if the dataset has some other augmentation assigned using the map() function.

    You will see some improvement in accuracy if you remove the RandomFlip and RandomRotation layers because you make the problem easier. However, as you want the network to predict well on a wide variation of image quality and properties, using augmentation can help your resulting network become more powerful.

    Further Reading

    Below is some documentation from TensorFlow that’s related to the examples above:


    In this post, you have seen how you can use the tf.data dataset with image augmentation functions from Keras and TensorFlow.

    Specifically, you learned:

    • How to use the preprocessing layers from Keras, both as a function and as part of a neural network
    • How to create your own image augmentation function and apply it to the dataset using the map() function
    • How to use the functions provided by the tf.image module for image augmentation
