How do I get image information in Python?

In this tutorial, you will learn how you can extract some useful metadata within images using the Pillow library in Python.

Devices such as digital cameras, smartphones, and scanners use the EXIF standard when saving image or audio files. The standard defines many tags that can be useful in forensic investigation, such as the make and model of the device, the exact date and time the image was created, and, on some devices, GPS information.

Free tools such as ImageMagick and ExifTool can also extract metadata on Linux, but the goal of this tutorial is to extract metadata with the Python programming language.
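As a rough sketch of what this looks like with Pillow, the snippet below reads the base EXIF tags of an image and maps the numeric tag IDs to readable names. The helper name read_exif and the "ExampleMake"/"ExampleModel" values are made up for illustration, and the tiny JPEG is built in memory only so that the example runs without a photo on disk. GPS data lives in a separate sub-directory of the EXIF block and is not covered here.

```python
import io

from PIL import ExifTags, Image


def read_exif(source):
    """Return a dict mapping readable EXIF tag names to their values."""
    with Image.open(source) as img:
        exif = img.getexif()
        # Fall back to the numeric ID when Pillow has no name for a tag.
        return {ExifTags.TAGS.get(tag_id, tag_id): value
                for tag_id, value in exif.items()}


# Build a small JPEG with two EXIF tags in memory (illustrative values).
exif = Image.Exif()
exif[0x010F] = "ExampleMake"   # tag 0x010F is "Make"
exif[0x0110] = "ExampleModel"  # tag 0x0110 is "Model"
buffer = io.BytesIO()
Image.new("RGB", (8, 8), "white").save(buffer, format="JPEG", exif=exif)
buffer.seek(0)

metadata = read_exif(buffer)
for name, value in metadata.items():
    print(f"{name:25}: {value}")
```

With a real photo you would pass its path to read_exif() instead of the in-memory buffer.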


The square structuring element ‘A’ fits inside the object we want to select, ‘B’ intersects (hits) the object, and ‘C’ lies outside the object.

The zero-one pattern defines the configuration of the structuring element. It is chosen according to the shape of the object we want to select. The center of the structuring element identifies the pixel being processed.

[Figure: Dilation]

[Figure: Erosion]
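To make the fit and hit ideas concrete, here is a minimal NumPy sketch of binary erosion and dilation. It assumes a 0/1 image, an odd-sized structuring element, and zero padding at the border; the function name morph is made up for this example. Erosion keeps a pixel only where the structuring element fits, while dilation sets a pixel wherever the element hits the object (for a symmetric element such as the square used here, no reflection of the element is needed).

```python
import numpy as np


def morph(image, selem, mode):
    """Binary erosion ("erode") or dilation ("dilate") with zero padding."""
    sh, sw = selem.shape
    ph, pw = sh // 2, sw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + sh, c:c + sw]
            covered = window[selem == 1]  # pixels under the 1s of the element
            if mode == "erode":
                out[r, c] = covered.all()  # the element fits
            else:
                out[r, c] = covered.any()  # the element hits
    return out


# A 3x3 square object inside a 7x7 image, and a 3x3 square element.
image = np.zeros((7, 7), dtype=np.uint8)
image[2:5, 2:5] = 1
selem = np.ones((3, 3), dtype=np.uint8)

eroded = morph(image, selem, "erode")
dilated = morph(image, selem, "dilate")
print(eroded)
print(dilated)
```

Erosion shrinks the square and dilation grows it, which is why the two are often combined to remove small specks or fill small holes.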

2. Gaussian Image Processing

Gaussian blur, also known as Gaussian smoothing, is the result of blurring an image with a Gaussian function.

It is used to reduce image noise and detail. The visual effect of this blurring technique is similar to looking at an image through a translucent screen. It is sometimes used in computer vision for image enhancement at different scales or as a data augmentation technique in deep learning.

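The basic Gaussian function in two dimensions looks like:

$$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

Here x and y are the distances from the center of the kernel and σ is the standard deviation of the distribution.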

In practice, it is best to take advantage of the Gaussian blur’s separable property by dividing the process into two passes. In the first pass, a one-dimensional kernel is used to blur the image in only the horizontal or vertical direction. In the second pass, the same one-dimensional kernel is used to blur in the remaining direction. The resulting effect is the same as convolving with a two-dimensional kernel in a single pass. Let’s see an example to understand what gaussian filters do to an image.

When a filter with normally distributed weights is applied to an image, the result looks like this:

[Figure: original image, Gaussian filter, and filtered result]

You can see that some of the edges have a little less detail. The filter gives more weight to pixels near the center of the kernel than to pixels farther away. Gaussian filters are low-pass filters, i.e. they attenuate high frequencies. They are commonly applied as a noise-reduction step before edge detection.
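The separable property mentioned earlier can be checked with a short NumPy sketch. It assumes zero padding at the borders and a kernel truncated at a fixed radius; the function names and the values sigma=1.5 and radius=3 are arbitrary choices for illustration. The two one-dimensional passes should give the same result as a single pass with the two-dimensional kernel.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Sampled 1D Gaussian, normalized so the weights sum to 1."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def convolve_rows(image, kernel):
    """Convolve every row with a 1D kernel (zero padding, same size)."""
    return np.array([np.convolve(row, kernel, mode="same") for row in image])


def blur_separable(image, kernel):
    """Two passes: horizontal first, then vertical."""
    horizontal = convolve_rows(image, kernel)
    return convolve_rows(horizontal.T, kernel).T


def blur_2d(image, kernel_2d):
    """Single pass with the full 2D kernel (zero padding)."""
    r = kernel_2d.shape[0] // 2
    padded = np.pad(image, r)
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + 2 * r + 1, j:j + 2 * r + 1]
            out[i, j] = np.sum(window * kernel_2d)
    return out


rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = gaussian_kernel_1d(sigma=1.5, radius=3)
kernel_2d = np.outer(kernel, kernel)  # 2D Gaussian as an outer product

two_pass = blur_separable(image, kernel)
one_pass = blur_2d(image, kernel_2d)
print(np.allclose(two_pass, one_pass))
```

For a kernel of width k, the two-pass version needs about 2k multiplications per pixel instead of k squared, which is the practical reason to prefer it.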

3. Fourier Transform in image processing

Fourier transform breaks down an image into sine and cosine components. 

It has multiple applications like image reconstruction, image compression, or image filtering. 

Since we are talking about images, which are discrete signals, we will consider the discrete Fourier transform.

Consider a sinusoid. It is characterized by three things:

  • Magnitude – related to contrast
  • Spatial frequency – related to how rapidly brightness varies across the image
  • Phase – related to where structures are positioned in the image

The image in the frequency domain looks like this:

[Figure: an image in the frequency domain]

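The formula for the 2D discrete Fourier transform of an M × N image is:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$

In the above formula, f(x,y) denotes the image and F(u,v) denotes its representation in the frequency domain.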

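The inverse Fourier transform converts the transform back to the image. The formula for the 2D inverse discrete Fourier transform is:

$$f(x, y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$

Here F(u,v) is the transform of an M × N image f(x,y). Placing the 1/(MN) factor on the inverse is one common convention.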
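Here is a small NumPy sketch of the forward and inverse transforms. It assumes the common convention in which the forward transform has no scale factor and the inverse carries the 1/(MN) factor, which is also what np.fft.fft2 and np.fft.ifft2 use. The function dft2_naive is written only to mirror the double sum directly; in practice you would call the FFT routines.

```python
import numpy as np


def dft2_naive(f):
    """2D DFT written as two matrix products, mirroring the double sum."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    w_m = np.exp(-2j * np.pi * np.outer(x, x) / M)  # e^{-j 2 pi u x / M}
    w_n = np.exp(-2j * np.pi * np.outer(y, y) / N)  # e^{-j 2 pi v y / N}
    return w_m @ f @ w_n


rng = np.random.default_rng(0)
image = rng.random((4, 6))

spectrum = np.fft.fft2(image)               # forward transform
naive = dft2_naive(image)                   # same result, computed directly
restored = np.fft.ifft2(spectrum).real      # inverse transform
magnitude = np.abs(np.fft.fftshift(spectrum))  # zero frequency at the center

print(np.allclose(spectrum, naive))
print(np.allclose(restored, image))
```

The magnitude array is what is usually displayed (often on a log scale) when people show an image "in the frequency domain".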

4. Edge Detection in image processing

Edge detection is an image processing technique for finding the boundaries of objects within images. It works by detecting discontinuities in brightness.

This is very useful for extracting information from an image because most of the shape information is contained in the edges.

Because classic edge detection methods respond to any variation in grey levels, they also react strongly to noise in the image. Edges are defined as the local maxima of the gradient.

The most common edge detection algorithm is the Sobel edge detection algorithm. The Sobel operator is made up of two 3×3 convolution kernels, one of which is the other rotated by 90 degrees. Applying the two kernels separately to the image A gives two gradient measurements, Gx and Gy:

$$G_x = \begin{bmatrix} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{bmatrix} * A \quad \text{and} \quad G_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * A$$

Here * denotes the 2D signal processing convolution operation.

The resulting gradient magnitude can be calculated as:

$$G = \sqrt{G_x^2 + G_y^2}$$
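A minimal NumPy sketch of the Sobel operator follows. It uses one common sign convention for the two 3×3 kernels, replicates border pixels when padding, and computes the gradient magnitude as the square root of the sum of the squared responses. The helper names are made up for this example, and the explicit loops are for readability, not speed.

```python
import numpy as np

# One common sign convention for the Sobel kernels.
KERNEL_X = np.array([[1, 0, -1],
                     [2, 0, -2],
                     [1, 0, -1]], dtype=float)
KERNEL_Y = KERNEL_X.T  # the same kernel rotated by 90 degrees


def convolve2d(image, kernel):
    """True 2D convolution (kernel flipped), with replicated borders."""
    flipped = np.flipud(np.fliplr(kernel))
    kh, kw = flipped.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out


def sobel_magnitude(image):
    """Gradient magnitude from the horizontal and vertical responses."""
    gx = convolve2d(image, KERNEL_X)
    gy = convolve2d(image, KERNEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2)


# Test image: dark left half, bright right half (a vertical edge).
image = np.zeros((8, 8))
image[:, 4:] = 1.0

edges = sobel_magnitude(image)
print(edges[0])
```

The response is non-zero only in the two columns next to the brightness jump, which is exactly where the edge is.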

5. Wavelet Image Processing

We saw that the Fourier transform describes a signal only in terms of frequency. Wavelets take both time (for images, spatial position) and frequency into consideration, which makes the wavelet transform well suited to non-stationary signals.

Edges are among the most important parts of an image, yet traditional filters that remove noise also tend to blur the image. The wavelet transform is designed to give good frequency resolution for low-frequency components and good spatial resolution for high-frequency components. Below is a 2D wavelet transform example:

[Figure: 2D wavelet transform example]
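As an illustration, here is a one-level 2D Haar wavelet transform in plain NumPy. Haar is just the simplest wavelet and is my choice for this sketch, not something the text above prescribes; the function names are made up, and the image is assumed to have even height and width. The transform splits the image into one approximation band and three detail bands, each half the size, and the inverse recovers the image exactly.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_dwt2(image):
    """One level of the 2D Haar transform (even height and width assumed)."""
    a = image.astype(float)
    # Pass over columns pairs: scaled sums (low) and differences (high).
    low = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    high = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    # Same operation over row pairs of each intermediate result.
    ll = (low[0::2] + low[1::2]) / SQRT2    # approximation band
    lh = (low[0::2] - low[1::2]) / SQRT2    # detail band
    hl = (high[0::2] + high[1::2]) / SQRT2  # detail band
    hh = (high[0::2] - high[1::2]) / SQRT2  # detail band
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    rows, cols = ll.shape
    low = np.empty((rows * 2, cols))
    high = np.empty((rows * 2, cols))
    low[0::2], low[1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    high[0::2], high[1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    out = np.empty((rows * 2, cols * 2))
    out[:, 0::2], out[:, 1::2] = (low + high) / SQRT2, (low - high) / SQRT2
    return out


rng = np.random.default_rng(0)
image = rng.random((8, 8))

bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
print([band.shape for band in bands])
print(np.allclose(restored, image))
```

Denoising schemes typically shrink small coefficients in the detail bands and leave the approximation band alone, which is how they avoid blurring strong edges.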

Image processing using Neural Networks

Neural networks are multi-layered networks consisting of neurons or nodes. These neurons are the core processing units of the neural network. Their design is loosely inspired by the human brain. They take in data, train themselves to recognize the patterns in the data, and then predict the output.

A basic neural network has three layers:

  1. Input layer
  2. Hidden layer
  3. Output layer

[Figure: Basic neural network]

The input layer receives the input, the output layer predicts the output, and the hidden layers do most of the calculations. The number of hidden layers can be changed according to the requirements. There should be at least one hidden layer in a neural network.

The basic working of the neural network is as follows:

  1. Consider an image: each pixel is fed as input to a neuron of the first layer, and the neurons of one layer are connected to the neurons of the next layer through channels.
  2. Each of these channels is assigned a numerical value known as a weight.
  3. The inputs are multiplied by the corresponding weights, and this weighted sum is fed as input to the hidden layers.
  4. The output from the hidden layers is passed through an activation function, which determines whether a particular neuron is activated.
  5. The activated neurons transmit data to the next hidden layers. In this manner, data is propagated through the network; this is known as forward propagation.
  6. In the output layer, the neuron with the highest value determines the predicted output. These output values are probabilities.
  7. The predicted output is compared with the actual output to obtain the error. This information is then passed back through the network, a process known as backpropagation.
  8. Based on this information, the weights are adjusted. This cycle of forward and backward propagation is repeated many times on multiple inputs until the network predicts the output correctly in most cases.
  9. This ends the training process of the neural network. Training can take a long time in some cases.
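The steps above can be sketched in a few lines of NumPy. This is a toy setup under my own assumptions, not a recipe from the article: a network with one hidden layer of four sigmoid neurons, the XOR problem as data, mean squared error as the loss, bias terms added to the weighted sums, and a learning rate of 0.5 with plain gradient descent.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)

# Toy data: the XOR problem (4 samples, 2 inputs, 1 output).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights (the "channels") and biases for one hidden and one output layer.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
learning_rate = 0.5
losses = []

for step in range(5000):
    # Forward propagation: weighted sums followed by activations.
    hidden = sigmoid(X @ W1 + b1)
    pred = sigmoid(hidden @ W2 + b2)

    # Compare prediction and target to obtain the error.
    losses.append(np.mean((pred - y) ** 2))

    # Backpropagation: push the error back through the layers.
    d_z2 = 2 * (pred - y) / len(X) * pred * (1 - pred)
    d_W2, d_b2 = hidden.T @ d_z2, d_z2.sum(axis=0)
    d_z1 = (d_z2 @ W2.T) * hidden * (1 - hidden)
    d_W1, d_b1 = X.T @ d_z1, d_z1.sum(axis=0)

    # Adjust the weights in the direction that reduces the error.
    W1 -= learning_rate * d_W1
    b1 -= learning_rate * d_b1
    W2 -= learning_rate * d_W2
    b2 -= learning_rate * d_b2

print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

Real image models follow the same loop, only with far more weights and with the gradients computed automatically by a deep learning framework.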

In the image below, the ai are the inputs, the wi are the weights, z is the output, and g is any activation function.

[Figure: Operations in a single neuron]
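In code, the operation of a single neuron is just a weighted sum passed through an activation function. The sketch below uses the same symbols (a for inputs, w for weights, g for the activation, z for the output). The sigmoid activation, the optional bias b, and the input values are assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def neuron(a, w, b=0.0, g=sigmoid):
    """z = g(sum_i a_i * w_i + b) for a single neuron."""
    return g(np.dot(a, w) + b)


a = np.array([0.5, -1.0, 2.0])  # inputs a_i
w = np.array([0.4, 0.3, 0.1])   # weights w_i

z = neuron(a, w)
print(z)
```

Swapping g for another function, such as ReLU, changes how the neuron responds without changing the structure of the computation.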

Here are some guidelines to prepare data for image processing. 

  • Feeding more data to the model generally gives better results.
  • The image dataset should be of high quality so that it carries clear information, although processing high-quality images may require deeper neural networks.
  • In many cases RGB images are converted to grayscale before feeding them into a neural network.

Types of Neural Network

Convolutional Neural Network

A convolutional neural network (ConvNet or CNN for short) has three main types of layers:

  • Convolutional Layer (CONV): This is the core building block of a CNN and is responsible for performing the convolution operation. The element that carries out the convolution operation in this layer is called the kernel or filter (a matrix). The kernel makes horizontal and vertical shifts based on the stride rate until the full image is traversed.

[Figure: Movement of the kernel]

  • Pooling Layer (POOL): This layer is responsible for dimensionality reduction. It helps to decrease the computational power required to process the data. There are two types of pooling: max pooling and average pooling. Max pooling returns the maximum value from the area of the image covered by the kernel. Average pooling returns the average of all the values in the area of the image covered by the kernel.

[Figure: Pooling operation]

  • Fully Connected Layer (FC): The fully connected layer operates on a flattened input where each input is connected to all neurons. If present, FC layers are usually found towards the end of CNN architectures.

[Figure: Fully connected layers]

CNNs are mainly used to extract features from an image with the help of their layers. They are widely used in image classification, where each input image is passed through a series of layers to obtain a probabilistic value between 0 and 1.
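The convolution and pooling layers can be sketched with NumPy. This is an illustration under simplifying assumptions: a single-channel image, no padding, one kernel, and non-overlapping pooling windows. As in most deep learning libraries, the "convolution" here slides the kernel without flipping it. The function names are made up for this example.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image, moving by `stride` each step."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping max or average pooling."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    blocks = feature_map[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))


image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0  # simple averaging kernel

feature_map = conv2d(image, kernel, stride=1)    # shape (4, 4)
max_pooled = pool2d(feature_map, 2, mode="max")  # shape (2, 2)
avg_pooled = pool2d(feature_map, 2, mode="mean")
flattened = max_pooled.ravel()  # what a fully connected layer would receive

print(feature_map.shape, max_pooled.shape, flattened.shape)
```

A 6×6 input shrinks to 4×4 after the convolution and to 2×2 after pooling, which shows how these layers reduce the amount of data the later layers have to process.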


Generative Adversarial Networks

Generative models use an unsupervised learning approach (there are images but there are no labels provided). 

GANs are composed of two models: a Generator and a Discriminator. The Generator learns to make fake images that look realistic so as to fool the Discriminator, and the Discriminator learns to distinguish fake images from real ones (it tries not to get fooled).

The Generator is not allowed to see the real images, so it may produce poor results in the starting phase. The Discriminator is allowed to look at real images, but they are mixed with the fake ones produced by the Generator, and it has to classify each image as real or fake.

Some noise is fed as input to the Generator so that it produces a different example every time rather than the same type of image. Based on the scores predicted by the Discriminator, the Generator tries to improve its results. After a certain point, the Generator produces images that are hard to distinguish from real ones, and training can stop once the user is satisfied with the results. The Discriminator also improves, as it receives more and more realistic images from the Generator in each round.

Popular types of GANs are Deep Convolutional GANs (DCGANs), Conditional GANs (cGANs), StyleGANs, CycleGAN, DiscoGAN, GauGAN and so on.

GANs are great for image generation and manipulation. Some applications of GANs include face aging, photo blending, super resolution, photo inpainting, and clothing translation.


Image processing tools

1. OpenCV

It stands for Open Source Computer Vision Library. This library consists of more than 2,000 optimised algorithms that are useful for computer vision and machine learning. There are several ways you can use OpenCV in image processing; a few are listed below:

  • Converting images from one color space to another, e.g. between BGR and HSV or between BGR and gray.
  • Performing thresholding on images, such as simple thresholding and adaptive thresholding.
  • Smoothing images, such as applying custom filters and blurring.
  • Performing morphological operations on images.
  • Building image pyramids.
  • Extracting the foreground from images using the GrabCut algorithm.
  • Image segmentation using the watershed algorithm.

Refer to the OpenCV documentation for more details.

2. Scikit-image

It is an open-source library used for image preprocessing. It provides many algorithms as built-in functions and can perform complex operations on images with just a few function calls.

It works with NumPy arrays and is a fairly simple library even for those who are new to Python. Some operations that can be done using scikit-image are:

  • To implement thresholding operations, use the try_all_threshold() function from the filters module. It applies seven global thresholding algorithms to the image.
  • To implement edge detection, use the sobel() function in the filters module. It requires a 2D grayscale image as input, so a color image must be converted to grayscale first.
  • To implement Gaussian smoothing, use the gaussian() function in the filters module.
  • To apply histogram equalization, use the exposure module: equalize_hist() applies normal histogram equalization to the original image, and equalize_adapthist() applies adaptive equalization.
  • To rotate the image, use the rotate() function under the transform module.
  • To rescale the image, use the rescale() function from the transform module.
  • To apply morphological operations, use the binary_erosion() and binary_dilation() functions under the morphology module.

3. PIL/pillow

PIL stands for Python Imaging Library, and Pillow is the friendly PIL fork by Alex Clark and Contributors. It is a powerful library that supports a wide range of image formats such as PPM, JPEG, TIFF, GIF, PNG, and BMP.

It can help you perform several operations on images such as rotating, resizing, cropping, and grayscaling. Let’s go through some of those operations.

To carry out manipulation operations there is a module in this library called Image.

  • To load an image, use the open() function.
  • To display an image, use the show() method.
  • To know the file format, use the format attribute.
  • To know the size of the image, use the size attribute.
  • To know the pixel format, use the mode attribute.
  • To save the image file after the desired processing, use the save() method. Pillow infers the output format from the file name extension unless a format is given explicitly.
  • To resize the image, use the resize() method, which takes the new size as a (width, height) tuple.
  • To crop the image, use the crop() method, which takes one argument: a box tuple (left, upper, right, lower) that defines the cropped region.
  • To rotate the image, use the rotate() method, which takes the rotation angle in degrees (counter-clockwise) as an integer or float.
  • To flip the image, use the transpose() method, which takes one of the following arguments: Image.FLIP_LEFT_RIGHT, Image.FLIP_TOP_BOTTOM, Image.ROTATE_90, Image.ROTATE_180, Image.ROTATE_270.
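The operations listed above can be tried out with a short Pillow sketch. To keep it runnable without a file on disk, the image is created in memory with Image.new() as a stand-in for Image.open(); the sizes and colors are arbitrary. Flipping is done with the transpose() method, and the output format is passed explicitly because an in-memory buffer has no file extension.

```python
import io

from PIL import Image

# Stand-in for Image.open("photo.jpg"): a 120x80 image with one red pixel.
img = Image.new("RGB", (120, 80), "navy")
img.putpixel((0, 0), (255, 0, 0))

print(img.size, img.mode)  # (width, height) and pixel format

resized = img.resize((60, 40))                  # (width, height) tuple
cropped = img.crop((10, 10, 70, 50))            # (left, upper, right, lower)
rotated = img.rotate(90, expand=True)           # degrees, counter-clockwise
flipped = img.transpose(Image.FLIP_LEFT_RIGHT)  # mirror horizontally
gray = img.convert("L")                         # grayscale

# Save to an in-memory buffer, naming the format explicitly.
buffer = io.BytesIO()
resized.save(buffer, format="PNG")
buffer.seek(0)
reloaded = Image.open(buffer)
print(reloaded.format, reloaded.size)
```

When saving to a path such as "result.png" instead, the format argument can be omitted and Pillow picks the format from the extension.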


4. NumPy

With this library you can also perform simple image techniques, such as flipping images, extracting features, and analyzing them. 

Images can be represented by NumPy multi-dimensional arrays, so their type is ndarray. A color image is a NumPy array with 3 dimensions. By slicing the multi-dimensional array, the RGB channels can be separated.

Below are some of the operations that can be performed using NumPy on the image (image is loaded in a variable named test_img using imread).

  • To flip the image in a vertical direction, use np.flipud(test_img).
  • To flip the image in a horizontal direction, use np.fliplr(test_img).
  • To reverse the image, use test_img[::-1], which reverses the order of the rows.
  • To apply a simple threshold filter to the image, use np.where(). For example, np.where(test_img > 150, 255, 0) replaces every value greater than 150 with 255 and every other value with 0.
  • You can also obtain the RGB channels separately. Assuming the array is stored in RGB order, test_img[:,:,0] gives the red channel, test_img[:,:,1] gives the green channel, and test_img[:,:,2] gives the blue channel.
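Here is a small runnable version of these operations. Instead of loading a file with imread, it builds a tiny 2×3 RGB array by hand so that the results are easy to inspect; the pixel values are arbitrary, and the channel order is assumed to be RGB.

```python
import numpy as np

# A tiny 2x3 RGB "image" with values 0, 10, 20, ..., 170.
test_img = (np.arange(18) * 10).astype(np.uint8).reshape(2, 3, 3)

flipped_vertical = np.flipud(test_img)    # top and bottom rows swapped
flipped_horizontal = np.fliplr(test_img)  # left and right columns swapped
reversed_rows = test_img[::-1]            # same effect as np.flipud

# Threshold: values above 150 become 255, everything else becomes 0.
binary = np.where(test_img > 150, 255, 0)

# Separate the channels (assuming RGB order).
red = test_img[:, :, 0]
green = test_img[:, :, 1]
blue = test_img[:, :, 2]

print(binary)
print(red.shape, green.shape, blue.shape)
```

Note that images loaded with OpenCV are in BGR order, so the channel indices would be reversed in that case.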

5. Mahotas

It is a computer vision and image processing library with more than 100 functions. Many of its algorithms are implemented in C++. Mahotas has minimal dependencies.

It operates on NumPy arrays, so it depends on NumPy, and building it from source requires a C++ compiler.

Here are names of some of the remarkable algorithms available in Mahotas:

  • Watershed (https://mahotas.readthedocs.io/en/latest/distance.html)
  • Morphological Operations (https://mahotas.readthedocs.io/en/latest/morphology.html)
  • Hit & miss, thinning.
  • Colorspace Conversions (https://mahotas.readthedocs.io/en/latest/color.html)
  • Speeded-Up Robust Features (SURF), a form of local features.
    (https://mahotas.readthedocs.io/en/latest/surf.html)
  • Thresholding. (https://mahotas.readthedocs.io/en/latest/thresholding.html)
  • Convolution. (https://mahotas.readthedocs.io/en/latest/api.html)
  • Spline interpolation (https://mahotas.readthedocs.io/en/latest/api.html)
  • SLIC superpixels. (https://www.pyimagesearch.com/2014/07/28/a-slic-superpixel-tutorial-using-python/)

Let’s look at some of the operations that could be done using Mahotas:

  • To read an image, use the imread() function.
  • To calculate the mean of the image, use the mean() method of the resulting array.
  • Eccentricity measures how elongated a shape is. To find the eccentricity of an image, use the eccentricity() function under the features module.
  • For dilation and erosion on the image, use the dilate() and erode() functions under the morph module.
  • To find the local maxima of the image, use the locmax() function.

Summary

In this article, I briefly explained classical image processing techniques: morphological filtering, the Gaussian filter, the Fourier transform, and the wavelet transform.

All of these can be performed using image processing libraries such as OpenCV, Mahotas, PIL, and scikit-image.

I also discussed popular neural networks such as CNNs and GANs that are used for computer vision.

Deep learning is changing the world with its broad range of techniques and its advances in the field of image processing. Researchers keep coming up with better techniques to refine the whole image processing field, so the learning does not stop here. Keep advancing.


How to read data from image in Python?

The Python library Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and “read” the text embedded in images. Python-tesseract is a wrapper for Google's Tesseract-OCR Engine.

How do I get image properties in Python?

A Pillow Image object exposes the following properties:

  • filename: The name of the image file.
  • format: The file format, such as JPEG or GIF.
  • mode: The image mode, such as RGB, RGBA, CMYK, or YCbCr.
  • size: The image size in pixels, as a (width, height) tuple.
  • width: The width of the image in pixels.
  • height: The height of the image in pixels.
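A short sketch of reading these properties with Pillow follows. So that it runs anywhere, it first writes a small PNG to a temporary directory; the file name example.png and the 64×48 RGBA image are arbitrary stand-ins for a real file.

```python
import os
import tempfile

from PIL import Image

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small file to inspect (stand-in for a real image).
    path = os.path.join(tmp_dir, "example.png")
    Image.new("RGBA", (64, 48)).save(path)

    with Image.open(path) as img:
        properties = {
            "filename": img.filename,
            "format": img.format,
            "mode": img.mode,
            "size": img.size,
            "width": img.width,
            "height": img.height,
        }

for name, value in properties.items():
    print(f"{name}: {value}")
```

Opening a file this way only reads the header, so checking the properties of a large image is cheap.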

How do you view image information?

How to access and view photo metadata (on Windows):

  1. Locate and right-click the intended digital image file.
  2. Select 'Properties'.
  3. Click the 'Details' tab at the top of the popup window.
  4. Scroll down through the resulting window until you find the metadata section you require.

How do I find the data type of an image in Python?

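Python provides libraries to determine the type of an image; one such library is imghdr. The imghdr module determines the type of image contained in a file or byte stream. Note that imghdr has been deprecated since Python 3.11 and was removed from the standard library in Python 3.13.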
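Since imghdr is no longer part of the standard library from Python 3.13 onwards, a small stand-in can be useful. The sketch below is a hypothetical helper (the name sniff_image_type is made up) that checks the leading "magic" bytes of a few common formats. It covers only PNG, JPEG, GIF, and BMP, and is not a drop-in replacement for imghdr.

```python
def sniff_image_type(data):
    """Hypothetical helper: guess an image type from its leading bytes."""
    signatures = [
        (b"\x89PNG\r\n\x1a\n", "png"),
        (b"\xff\xd8\xff", "jpeg"),
        (b"GIF87a", "gif"),
        (b"GIF89a", "gif"),
        (b"BM", "bmp"),
    ]
    for magic, name in signatures:
        if data.startswith(magic):
            return name
    return None  # unknown or unsupported format


png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
jpeg_header = b"\xff\xd8\xff\xe0" + b"\x00" * 8

print(sniff_image_type(png_header))
print(sniff_image_type(jpeg_header))
print(sniff_image_type(b"not an image"))
```

For a file on disk, you would read the first few bytes with open(path, "rb").read(16) and pass them to the helper. If Pillow is installed, the format attribute of an opened image gives the same information for many more formats.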