
Image Augmentation with Keras Preprocessing Layers and tf.image

Last Updated on July 20, 2022

When we work on a machine learning problem related to images, we not only need to collect some images as training data but also need to use augmentation to create variations in the images. This is especially true for more complex object recognition problems.

There are many ways to do image augmentation. You may use some external libraries or write your own functions for that. There are some modules in TensorFlow and Keras for augmentation, too. In this post, you will discover how to use the Keras preprocessing layers as well as the tf.image module in TensorFlow for image augmentation.

After reading this post, you will know:

  • What the Keras preprocessing layers are and how to use them
  • What functions the tf.image module provides for image augmentation
  • How to use augmentation together with the tf.data dataset

Let’s get began.

Image Augmentation with Keras Preprocessing Layers and tf.image.
Photo by Steven Kamenar. Some rights reserved.


This article is split into five sections; they are:

  • Getting Images
  • Visualizing the Images
  • Keras Preprocessing Layers
  • Using tf.image API for Augmentation
  • Using Preprocessing Layers in Neural Networks

Getting Images

Before we see how we can do augmentation, we need to get the images. Ultimately, we need the images to be represented as arrays, for example, as HxWx3 arrays of 8-bit integers for the RGB pixel values. There are many ways to get the images. Some can be downloaded as a ZIP file. If you're using TensorFlow, you may get some image datasets from the tensorflow_datasets library.

In this tutorial, we are going to use the citrus leaves images, which is a small dataset of less than 100MB. It can be downloaded from tensorflow_datasets as follows:
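A minimal sketch of the download step, assuming the tensorflow_datasets package is installed and that the dataset is registered there under the name citrus_leaves. The helper name load_citrus_leaves is hypothetical; the call is wrapped in a function so that nothing is downloaded until you invoke it:

```python
def load_citrus_leaves():
    """Return (dataset, metadata) for the citrus leaves training split.

    The first call downloads the data; later calls reuse the local copy.
    """
    # Imported lazily so this sketch can be defined even before the
    # package is installed (pip install tensorflow-datasets)
    import tensorflow_datasets as tfds

    ds, meta = tfds.load(
        "citrus_leaves",
        split="train",
        with_info=True,      # also return the dataset metadata
        shuffle_files=True,
    )
    return ds, meta
```

Calling ds, meta = load_citrus_leaves() fetches the data, and meta.features["label"].names should then list the class labels.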

Running this code for the first time will download the image dataset to your computer with the following output:

The function above returns the images as a tf.data dataset object and the metadata. This is a classification dataset. We can print the training labels with the following:

and this prints:

If you run this code again at a later time, you will reuse the downloaded images. But the other way to load the downloaded images into a tf.data dataset is to use the image_dataset_from_directory() function.

As we can see from the screen output above, the dataset is downloaded into the directory ~/tensorflow_datasets. If you look at the directory, you see the directory structure as follows:

The directories are the labels, and the images are files stored under their corresponding directory. We can let the function read the directory recursively into a dataset:
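A minimal sketch of this call. So that it runs on its own, it first builds a small temporary directory tree with one subdirectory per label; the two label names and the random pixels are placeholders, not the real citrus leaves files. With the real data, you would point the function at the extracted image directory instead:

```python
import pathlib
import tempfile

import numpy as np
import tensorflow as tf

# Build a tiny stand-in directory tree: one subdirectory per label,
# each holding a few PNG files (hypothetical labels, random pixels)
root = pathlib.Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
for label in ["healthy", "canker"]:
    (root / label).mkdir()
    for i in range(4):
        pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
        tf.io.write_file(str(root / label / f"{i}.png"), tf.io.encode_png(pixels))

# Subdirectory names become the class labels; images are resized on load
ds = tf.keras.utils.image_dataset_from_directory(
    str(root), image_size=(256, 256), batch_size=4
)
print(ds.class_names)
for images, labels in ds.take(1):
    print(images.shape, images.dtype, labels.shape)
```

The class names are taken from the subdirectory names in sorted order, and each batch is a tuple of float images and integer labels.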

You may want to set batch_size=None if you do not want the dataset to be batched. Usually, we want the dataset to be batched for training a neural network model.

Visualizing the Images

It is important to visualize the augmentation result so we can verify that it is what we want it to be. We can use matplotlib for this.

In matplotlib, we have the imshow() function to display an image. However, for the image to be displayed correctly, it should be presented as an array of 8-bit unsigned integers (uint8).

Given that we have a dataset created using image_dataset_from_directory(), we can get the first batch (of 32 images) and display a few of them using imshow(), as follows:
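A minimal sketch of such a grid. The batch here is random pixels with made-up labels (an assumption so the snippet runs without the dataset); with a real dataset, images and labels would come from a loop such as for images, labels in ds.take(1) and class_names from ds.class_names:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Stand-ins for one batch drawn from the dataset (random pixels, made-up labels)
class_names = ["canker", "healthy"]
images = tf.random.uniform((32, 256, 256, 3), maxval=255.0)
labels = tf.random.uniform((32,), maxval=len(class_names), dtype=tf.int32)

fig, ax = plt.subplots(nrows=3, ncols=3, sharex=True, sharey=True, figsize=(5, 5))
for i in range(3):
    for j in range(3):
        idx = i * 3 + j
        # imshow() needs uint8 pixels to render the colors correctly
        ax[i][j].imshow(images[idx].numpy().astype("uint8"))
        ax[i][j].set_title(class_names[labels[idx].numpy()])
# In an interactive session, plt.show() opens the figure window
```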

Here, we display nine images in a grid and label the images with their corresponding classification label, using ds.class_names. The images should be converted to a NumPy array in uint8 for display. This code displays an image like the following:

The complete code from loading the image to display is as follows.

Note that if you're using tensorflow_datasets to get the image, the samples are presented as a dictionary instead of a tuple of (image, label). You should change your code slightly to the following:
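One way to handle the dictionary form is to convert each sample to a tuple with map(). This is a sketch under the assumption that the dictionary keys are "image" and "label"; the dataset below is a synthetic stand-in and to_tuple is a hypothetical helper name:

```python
import tensorflow as tf

# A stand-in for a dataset whose samples are dictionaries
dict_ds = tf.data.Dataset.from_tensor_slices({
    "image": tf.zeros((6, 256, 256, 3), dtype=tf.uint8),
    "label": tf.constant([0, 1, 2, 3, 0, 1], dtype=tf.int64),
})

def to_tuple(sample):
    """Convert a {"image": ..., "label": ...} sample into an (image, label) tuple."""
    return sample["image"], sample["label"]

tuple_ds = dict_ds.map(to_tuple).batch(3)
for images, labels in tuple_ds.take(1):
    print(images.shape, labels.numpy())
```

Alternatively, the display loop can index sample["image"] and sample["label"] directly without converting the dataset.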

In the rest of this post, we assume the dataset is created using image_dataset_from_directory(). You may need to tweak the code slightly if your dataset is created differently.

Keras Preprocessing Layers

Keras comes with many neural network layers, such as convolution layers, that we need to train. There are also layers with no parameters to train, such as flatten layers to convert an array like an image into a vector.

The preprocessing layers in Keras are specifically designed for use in the early stages of a neural network. We can use them for image preprocessing, such as to resize or rotate the image or to adjust the brightness and contrast. While the preprocessing layers are supposed to be part of a larger neural network, we can also use them as functions. Below is how we can use the resizing layer as a function to transform some images and display them side-by-side with the original:
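A minimal sketch of calling the resizing layer as a function. The batch is random pixels standing in for real images (an assumption), and the plotting part is left out:

```python
import tensorflow as tf

# Random pixels standing in for a batch of 256x256 RGB images
images = tf.random.uniform((9, 256, 256, 3), maxval=255.0)

# The arguments are (height, width)
resize = tf.keras.layers.Resizing(256, 128)
resized = resize(images)
print(images.shape, "->", resized.shape)
```

Each resized image can then be shown next to its original with imshow() after converting both to uint8.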

Our images are in 256×256 pixels, and the resizing layer will make them into 256×128 pixels. The output of the above code is as follows:

Since the resizing layer is a function itself, we can chain it to the dataset. For example,

The dataset ds has samples in the form of (image, label). Hence we created a function that takes in such a tuple and preprocesses the image with the resizing layer. We assigned this function as an argument for map() in the dataset. When we draw a sample from the new dataset created with the map() function, the image will be a transformed one.
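The mapping described above can be sketched as follows. The dataset is built from random tensors as a stand-in for the real one, and the function name augment is a hypothetical choice:

```python
import tensorflow as tf

# Synthetic (image, label) dataset standing in for the real one
images = tf.random.uniform((8, 256, 256, 3), maxval=255.0)
labels = tf.constant([0, 1, 2, 3, 0, 1, 2, 3])
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

resize = tf.keras.layers.Resizing(256, 128)

def augment(image, label):
    # Transform only the image; pass the label through unchanged
    return resize(image), label

resized_ds = ds.map(augment)
for image, label in resized_ds.take(1):
    print(image.shape, label.shape)
```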

There are more preprocessing layers available. Below, we demonstrate some.

As we saw above, we can resize the image. We can also randomly enlarge or shrink the height or width of an image. Similarly, we can zoom in or zoom out on an image. Below is an example of manipulating the image size in various ways for a maximum of 30% increase or decrease:
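A minimal sketch of random size manipulation. As a substitution, it uses only the RandomZoom layer: once with a single factor, and once with independent height and width factors standing in for separate random height and width changes. The input is random pixels, and training=True is passed explicitly so the random behavior is applied:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# Zoom in or out by up to 30%, keeping the aspect ratio
zoom = tf.keras.layers.RandomZoom(height_factor=(-0.3, 0.3))
# Scale height and width independently, each by up to 30%
stretch = tf.keras.layers.RandomZoom(
    height_factor=(-0.3, 0.3), width_factor=(-0.3, 0.3)
)

zoomed = zoom(images, training=True)
stretched = stretch(images, training=True)
print(zoomed.shape, stretched.shape)
```

RandomZoom keeps the output at the input size and fills or crops as needed, so only the image content changes scale.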

This code shows images as follows:

While we specified a fixed dimension in resize, we have a random amount of manipulation in other augmentations.

We can also do flipping, rotation, cropping, and geometric translation using preprocessing layers:
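A minimal sketch of these four layers applied as functions to a batch of random pixels. The factors and crop size are illustrative choices, not values taken from the original listing:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

layers = {
    "flip": tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    # Rotate by up to 20% of a full turn in either direction
    "rotate": tf.keras.layers.RandomRotation(0.2),
    # Cut out a random 128x128 window (height, width)
    "crop": tf.keras.layers.RandomCrop(128, 128),
    # Shift by up to 20% of the height and width
    "translate": tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2),
}

results = {name: layer(images, training=True) for name, layer in layers.items()}
for name, out in results.items():
    print(name, out.shape)
```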

This code shows the following images:

And finally, we can do augmentations on color adjustments as well:
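A minimal sketch of color augmentation with preprocessing layers, assuming a TensorFlow release recent enough to include RandomBrightness. The factors are illustrative, and the input is random pixels in the 0 to 255 range:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# Shift the brightness by up to 30% of the value range
brightness = tf.keras.layers.RandomBrightness(factor=0.3, value_range=(0, 255))
# Scale the contrast by a random factor between 0.7 and 1.3
contrast = tf.keras.layers.RandomContrast(factor=0.3)

brighter = brightness(images, training=True)
contrasted = contrast(images, training=True)
print(brighter.shape, contrasted.shape)
```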

This shows the images as follows:

For completeness, below is the code to display the results of various augmentations:

Finally, it is important to point out that most neural network models work better if the input images are scaled. While we usually use 8-bit unsigned integers for the pixel values in an image (e.g., for display using imshow() as above), a neural network prefers the pixel values to be between 0 and 1 or between -1 and +1. This can be done with a preprocessing layer, too. Below is how we can update one of our examples above to add the scaling layer into the augmentation:
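A minimal sketch of the two scalings mentioned above, using the Rescaling layer on random pixels in the 0 to 255 range. The layer computes inputs * scale + offset:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# [0, 255] -> [0, 1]
to_unit = tf.keras.layers.Rescaling(1 / 255.0)
# [0, 255] -> [-1, +1]
to_signed = tf.keras.layers.Rescaling(1 / 127.5, offset=-1.0)

unit = to_unit(images)
signed = to_signed(images)
print(float(tf.reduce_min(unit)), float(tf.reduce_max(unit)))
print(float(tf.reduce_min(signed)), float(tf.reduce_max(signed)))
```

Inside a map() function, the scaling layer can simply be applied after the other augmentation layers, since scaled images are no longer suitable for direct display as uint8.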

Using tf.image API for Augmentation

Besides the preprocessing layers, the tf.image module also provides some functions for augmentation. Unlike the preprocessing layers, these functions are intended to be used in a user-defined function and assigned to a dataset using map(), as we saw above.

The functions provided by tf.image are not duplicates of the preprocessing layers, although there is some overlap. Below is an example of using the tf.image functions to resize and crop images:
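A minimal sketch of resizing and cropping with tf.image, applied directly to a batch of random pixels. The sizes and offsets are illustrative choices:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# The target size is [new_height, new_width]
resized = tf.image.resize(images, [128, 256])
# offset_height, offset_width, target_height, target_width, all in pixels
boxed = tf.image.crop_to_bounding_box(images, 32, 64, 128, 128)
# Keep the central 50% along each dimension
center = tf.image.central_crop(images, 0.5)

print(resized.shape, boxed.shape, center.shape)
```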

Below is the output of the above code:

While the display of images matches what we might expect from the code, the use of tf.image functions is quite different from that of the preprocessing layers. Every tf.image function is different. For example, the crop_to_bounding_box() function takes pixel coordinates, but the central_crop() function takes a fraction as its argument.

These functions are also different in the way randomness is handled. Some of these functions do not have random behavior. Therefore, a random resize should have the exact output size generated using a random number generator separately before calling the resize function. Some other functions, such as stateless_random_crop(), can do augmentation randomly, but a pair of random seeds in int32 needs to be specified explicitly.
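The two styles of randomness can be sketched as follows, on a batch of random pixels. The size range and seed values are arbitrary choices for illustration:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=255.0)

# tf.image.resize() is deterministic, so draw the random target size first
size = tf.random.uniform((2,), minval=180, maxval=257, dtype=tf.int32)
random_resized = tf.image.resize(images, size)

# stateless_random_crop() is random but needs an explicit pair of seeds;
# the crop size must cover every dimension of the input, batch included
seed = tf.constant([42, 7], dtype=tf.int32)
crop_a = tf.image.stateless_random_crop(images, size=[4, 128, 128, 3], seed=seed)
crop_b = tf.image.stateless_random_crop(images, size=[4, 128, 128, 3], seed=seed)

print(random_resized.shape, crop_a.shape)
```

Because the crop is stateless, the same seed pair gives the same crop, so a new seed should be supplied for each call when variation is wanted.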

To continue the example, there are functions for flipping an image and extracting the Sobel edges:
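A minimal sketch of the flipping and Sobel edge functions, using a batch of random float pixels as a stand-in for real images:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=1.0)

flipped_lr = tf.image.flip_left_right(images)
flipped_ud = tf.image.flip_up_down(images)

# sobel_edges() expects a 4D float batch and stacks the two gradient
# maps on a new last axis: [batch, height, width, channels, 2]
edges = tf.image.sobel_edges(images)
dy, dx = edges[..., 0], edges[..., 1]

print(flipped_lr.shape, edges.shape)
```

To display an edge map with imshow(), one of the two gradients (or a combination of them) needs to be scaled into the displayable range first.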

which shows the following:

And the following are the functions to manipulate the brightness, contrast, and colors:
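A minimal sketch of the color functions in tf.image. The input is random float pixels in the range 0 to 1 (an assumption; for float images the brightness delta is on that scale), and the amounts are illustrative:

```python
import tensorflow as tf

images = tf.random.uniform((4, 256, 256, 3), maxval=1.0)  # floats in [0, 1)

brighter = tf.image.adjust_brightness(images, delta=0.1)
contrast = tf.image.adjust_contrast(images, contrast_factor=1.5)
saturated = tf.image.adjust_saturation(images, saturation_factor=2.0)
hue = tf.image.adjust_hue(images, delta=0.1)
gray = tf.image.rgb_to_grayscale(images)

# The random_* variants draw the amount themselves
random_bright = tf.image.random_brightness(images, max_delta=0.2)

print(brighter.shape, gray.shape)
```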

This code shows the following:

Below is the complete code to display all of the above:

These augmentation functions should be enough for most uses. But if you have some specific ideas on augmentation, you would probably need a better image processing library. OpenCV and Pillow are common but powerful libraries that allow you to transform images better.

Using Preprocessing Layers in Neural Networks

We used the Keras preprocessing layers as functions in the examples above. But they can also be used as layers in a neural network. This is trivial to do. Below is an example of how we can incorporate a preprocessing layer into a classification network and train it using a dataset:
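A minimal sketch of such a network. The data is random pixels with random labels and the architecture is a deliberately tiny placeholder, so it only shows where the preprocessing layers go and is not expected to learn anything:

```python
import tensorflow as tf

num_classes = 4
# Random stand-in data: 16 RGB images of 64x64 pixels with integer labels
images = tf.random.uniform((16, 64, 64, 3), maxval=255.0)
labels = tf.random.uniform((16,), maxval=num_classes, dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(8)
ds = ds.cache().prefetch(tf.data.AUTOTUNE)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    # Augmentation and scaling happen inside the model
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.Rescaling(1 / 127.5, offset=-1.0),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
history = model.fit(ds, epochs=1, verbose=0)
print(history.history["loss"])
```

The random layers are active only during training; at inference time they pass the images through unchanged, while the rescaling layer always applies.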

Running this code gives the following output:

In the code above, we created the dataset with cache() and prefetch(). This is a performance technique to allow the dataset to prepare data asynchronously while the neural network is trained. This would be significant if the dataset has some other augmentation assigned using the map() function.

You will see some improvement in accuracy if you remove the RandomFlip and RandomRotation layers because you make the problem easier. However, as we want the network to predict well on a wide variation of image quality and properties, using augmentation can help make the resulting network more powerful.

Further Reading

Below is some documentation from TensorFlow that is related to the examples above:


In this post, you have seen how we can use the tf.data dataset with image augmentation functions from Keras and TensorFlow.

Specifically, you learned:

  • How to use the preprocessing layers from Keras, both as a function and as part of a neural network
  • How to create your own image augmentation function and apply it to the dataset using the map() function
  • How to use the functions provided by the tf.image module for image augmentation
