In this tutorial, you will learn how to extract useful metadata from images using the Pillow library in Python.
Devices such as digital cameras, smartphones, and scanners use the EXIF standard to save metadata in image and audio files. This standard contains many useful tags that can be valuable for forensic investigation, such as the make and model of the device, the exact date and time of image creation, and, on some devices, GPS information. Note that while there are free tools such as ImageMagick or ExifTool on Linux that extract metadata, the goal of this tutorial is to extract metadata with the Python programming language.

1. Morphological Image Processing

The square structuring element 'A' fits in the object we want to select, 'B' intersects the object, and 'C' lies outside the object. The zero-one pattern defines the configuration of the structuring element, which is chosen according to the shape of the object we want to select. The center of the structuring element identifies the pixel being processed.

[Figures: dilation and erosion examples]

2. Gaussian Image Processing

Gaussian blur, also known as Gaussian smoothing, is the result of blurring an image with a Gaussian function. It is used to reduce image noise and detail. The visual effect of this blurring technique is similar to looking at an image through a translucent screen. It is sometimes used in computer vision for image enhancement at different scales, or as a data augmentation technique in deep learning. The basic Gaussian function is:

G(x, y) = (1 / (2πσ²)) · e^(−(x² + y²) / (2σ²))

In practice, it is best to take advantage of the Gaussian blur's separable property by dividing the process into two passes. In the first pass, a one-dimensional kernel blurs the image in only the horizontal or vertical direction. In the second pass, the same one-dimensional kernel blurs the image in the remaining direction. The resulting effect is the same as convolving with a two-dimensional kernel in a single pass. Let's see an example to understand what Gaussian filters do to an image.
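The two-pass separable blur described here can be sketched with NumPy. This is an illustrative sketch, not a production implementation; the kernel size and sigma are arbitrary choices:

```python
import numpy as np

def gaussian_kernel_1d(size=5, sigma=1.0):
    """Sampled 1-D Gaussian, normalised to sum to 1."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur_separable(img, size=5, sigma=1.0):
    """Blur rows, then columns, with the same 1-D kernel."""
    k = gaussian_kernel_1d(size, sigma)
    # First pass: horizontal direction only
    h = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    # Second pass: vertical direction with the same kernel
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, h)

img = np.zeros((9, 9))
img[4, 4] = 1.0            # a single bright pixel
out = blur_separable(img)
# 'out' equals a single convolution with the 2-D kernel k[:, None] * k[None, :]
```

Because the kernel is normalised and the bright pixel sits away from the border, the total intensity is preserved and the result is symmetric around the center.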
If we apply a filter that is normally distributed to an image, the result looks like this:

[Figure: original image, Gaussian filter, and result]

You can see that some of the edges have slightly less detail. The filter gives more weight to the pixels at the center than to the pixels away from the center. Gaussian filters are low-pass filters, i.e. they weaken the high frequencies, and they are commonly used in edge detection.

3. Fourier Transform in image processing

The Fourier transform breaks down an image into its sine and cosine components. It has multiple applications, such as image reconstruction, image compression, and image filtering. Since we are talking about images, we will consider the discrete Fourier transform. Let's consider a sinusoid; it comprises three things:
The image in the frequency domain looks like this:

[Figure: image in the frequency domain]

The formula for the 2D discrete Fourier transform is:

F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) · e^(−j2π(ux/M + vy/N))

In the above formula, f(x, y) denotes the image. The inverse Fourier transform converts the transform back into an image. The formula for the 2D inverse discrete Fourier transform is:

f(x, y) = (1 / (M·N)) · Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} F(u, v) · e^(j2π(ux/M + vy/N))

4. Edge Detection in image processing

Edge detection is an image processing technique for finding the boundaries of objects within images. It works by detecting discontinuities in brightness. This can be very useful for extracting information from an image, because most of the shape information is enclosed in the edges. Since classic edge detection methods work by detecting discontinuities in brightness, they can react strongly to noise while detecting variations in grey levels. Edges are defined as the local maxima of the gradient. The most common edge detection algorithm is the Sobel algorithm. The Sobel operator consists of two 3×3 convolution kernels: a kernel Gx = [[−1, 0, +1], [−2, 0, +2], [−1, 0, +1]] and its 90-degree rotation Gy = [[+1, +2, +1], [0, 0, 0], [−1, −2, −1]]. Separate measurements are made by applying each kernel to the image, where * denotes the 2D convolution operation. The resulting gradient magnitude can be calculated as:

G = √(Gx² + Gy²)

5. Wavelet Image Processing

The Fourier transform is limited to frequency; wavelets take both time and frequency into consideration, which makes the wavelet transform suitable for non-stationary signals. We know that edges are one of the most important parts of an image, and it has been observed that traditional filters remove noise but also blur the image. The wavelet transform is designed so that we get good frequency resolution for low-frequency components. Below is a 2D wavelet transform example:

[Figure: 2D wavelet transform example]

Image processing using Neural Networks

Neural networks are multi-layered networks consisting of neurons or nodes. These neurons are the core processing units of the neural network.
They are designed to act like human brains. They take in data, train themselves to recognize the patterns in the data and then predict the output. A basic neural network has three layers:
The input layer receives the input, the output layer predicts the output, and the hidden layers do most of the calculations. The number of hidden layers can be modified according to the requirements, but there should be at least one hidden layer in a neural network. The basic working of the neural network is as follows:
In the image below, a_i is the set of inputs, w_i are the weights, z is the output, and g is any activation function. Here are some guidelines for preparing data for image processing.
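The weighted-sum-plus-activation step of a single neuron can be sketched in a few lines of NumPy. The input, weight, and bias values here are made-up illustrative numbers, and sigmoid is used as an example activation g:

```python
import numpy as np

def neuron(a, w, b):
    """Single neuron: z = g(w . a + b), with g the sigmoid function."""
    z = np.dot(w, a) + b             # weighted sum of inputs plus bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid activation g

# Illustrative values only
a = np.array([0.5, -1.0, 2.0])   # inputs a_i
w = np.array([0.4, 0.3, 0.1])    # weights w_i
b = 0.0                          # bias

print(neuron(a, w, b))           # sigmoid(0.1) ≈ 0.525
```

Stacking many such neurons, layer after layer, is all a basic feed-forward network does.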
Types of Neural Network

Convolutional Neural Network

A convolutional neural network (ConvNet for short) has three types of layers: a convolutional layer, a pooling layer, and a fully connected layer.
A CNN is mainly used to extract features from an image with the help of its layers. CNNs are widely used in image classification, where each input image is passed through a series of layers to get a probabilistic value between 0 and 1.

Generative Adversarial Networks

Generative models use an unsupervised learning approach (there are images, but no labels are provided). GANs are composed of two models, a generator and a discriminator. The generator learns to make fake images that look realistic, so as to fool the discriminator, while the discriminator learns to distinguish fake images from real ones (it tries not to get fooled). The generator is not allowed to see the real images, so it may produce poor results in the starting phase. The discriminator is allowed to look at the real images, but they are jumbled with the fakes produced by the generator, which it has to classify as real or fake. Some noise is fed as input to the generator so that it produces different examples every time, and not the same type of image. Based on the scores predicted by the discriminator, the generator tries to improve its results; after a certain point, the generator produces images that are harder to distinguish, and at that point the user is satisfied with the results. The discriminator also improves itself, as it receives more and more realistic images from the generator at each round. Popular types of GANs are Deep Convolutional GANs (DCGANs), Conditional GANs (cGANs), StyleGANs, CycleGAN, DiscoGAN, GauGAN, and so on. GANs are great for image generation and manipulation. Some applications of GANs include face aging, photo blending, super resolution, photo inpainting, and clothing translation.

Image processing tools

1. OpenCV

OpenCV stands for Open Source Computer Vision Library. This library consists of around 2000 optimised algorithms that are useful for computer vision and machine learning.
There are several ways you can use OpenCV in image processing; a few are listed below:
2. Scikit-image

Scikit-image is an open-source library used for image preprocessing. It ships with built-in functions and can perform complex operations on images with just a few function calls. It works with NumPy arrays and is a fairly simple library, even for those who are new to Python. Some operations that can be done using scikit-image are:
3. PIL/Pillow

PIL stands for Python Imaging Library, and Pillow is the friendly PIL fork by Alex Clark and contributors. It is a powerful library that supports a wide range of image formats, such as PPM, JPEG, TIFF, GIF, PNG, and BMP. It can help you perform several operations on images, like rotating, resizing, cropping, and grayscaling. Let's go through some of those operations. To carry out manipulation operations, this library has a module called Image.
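A short sketch of those operations with the Image module; an in-memory image is created here so the snippet needs no file on disk:

```python
from PIL import Image

# Create a small in-memory RGB image instead of loading one from disk
img = Image.new("RGB", (80, 40), color=(255, 0, 0))

rotated = img.rotate(90, expand=True)   # rotate 90 degrees, growing the canvas
resized = img.resize((40, 20))          # downscale to half size
cropped = img.crop((0, 0, 20, 20))      # (left, upper, right, lower) box
gray = img.convert("L")                 # convert to grayscale

print(rotated.size, resized.size, cropped.size, gray.mode)
```

With a real photo you would replace Image.new with Image.open("photo.jpg") and call save on the results.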
4. NumPy

With this library you can also perform simple image techniques, such as flipping images, extracting features, and analyzing them. Images can be represented as NumPy multi-dimensional arrays, so their type is ndarray. A color image is a NumPy array with three dimensions, and by slicing that array the RGB channels can be separated. Below are some of the operations that can be performed on an image using NumPy (the image is loaded into a variable named test_img using imread).
Example: np.where(test_img > 150, 255, 0). This says: wherever a pixel value in this picture is greater than 150, replace it with 255, otherwise replace it with 0.
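A self-contained version of this thresholding, using a tiny synthetic array in place of test_img:

```python
import numpy as np

# Tiny synthetic grayscale "image" standing in for test_img
test_img = np.array([[10, 160],
                     [200, 90]], dtype=np.uint8)

# Pixels above 150 become white (255); everything else becomes black (0)
binary = np.where(test_img > 150, 255, 0)
print(binary)            # [[0, 255], [255, 0]]

# Flipping, as mentioned above, is just as short:
flipped = np.flipud(test_img)   # flip vertically (upside down)
```

The same one-liners work unchanged on a full-size image array loaded with imread.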
To obtain the red channel, use test_img[:, :, 0]; for the green channel, test_img[:, :, 1]; and for the blue channel, test_img[:, :, 2].

5. Mahotas

Mahotas is a computer vision and image processing library with more than 100 functions, many of which are implemented in C++ for speed. Mahotas is an independent module with minimal dependencies. Here are the names of some of the remarkable algorithms available in Mahotas:
Let’s look at some of the operations that could be done using Mahotas:
Summary

In this article, I briefly explained the classical image processing that can be done with morphological filtering, Gaussian filters, the Fourier transform, and the wavelet transform. All of these can be performed using image processing libraries like OpenCV, Mahotas, PIL, and scikit-image. I also discussed popular neural networks like CNNs and GANs that are used for computer vision. Deep learning is changing the world with its advances in the field of image processing, and researchers are coming up with better techniques to fine-tune the whole field, so the learning does not stop here. Keep advancing.