Otsu’s Binarization

With global thresholding we pick the threshold value by trial and error. Now suppose the image is bimodal (a bimodal image has two peaks in its histogram). For such an image a good threshold is a value in the valley between the two peaks, and that is what Otsu’s binarization finds automatically. To use it, pass the extra flag cv2.THRESH_OTSU to the cv2.threshold function and pass 0 as the threshold value. The function then computes the threshold itself and returns it as the first return value (ret2 and ret3 in the code below).
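To see what the flag does under the hood, here is a rough pure-Python sketch of the idea behind Otsu's method: try every threshold and keep the one that maximizes the between-class variance of the two groups of pixels. The function name otsu_threshold and the sample pixel list are made up for illustration, and OpenCV's own implementation is more optimized and may differ in details.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count0, sum0 = 0, 0
    for t in range(256):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so this is not a valid split
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        between_var = (count0 / total) * (count1 / total) * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical bimodal data: a dark cluster near 55 and a bright cluster near 205.
pixels = [50] * 40 + [60] * 60 + [200] * 70 + [210] * 30
threshold = otsu_threshold(pixels)
print(threshold)
```

With these sample values any threshold in the empty gap between the two clusters separates them equally well, and the sketch keeps the first such value it finds.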

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg',0)

# global thresholding
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

# Otsu's thresholding
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(5,5),0)
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# plot all the images and their histograms
images = [img, 0, th1,
img, 0, th2,
blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (v=127)',
'Original Noisy Image','Histogram',"Otsu's Thresholding",
'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]

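# assumed layout: a 3x3 grid with image, histogram and result in each row
for i in range(3):
    plt.subplot(3,3,i*3+1), plt.imshow(images[i*3],'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+2), plt.hist(images[i*3].ravel(),256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+3), plt.imshow(images[i*3+2],'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()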

The result:

OpenCV+Python:Part3–Image Thresholding

Simple Thresholding

This is as simple as it sounds. You take a threshold pixel value. Any pixel above (or below) that value is assigned a predefined pixel value of your choice. The function cv2.threshold is used for this.
The function has four parameters. The first is the source image, which should be grayscale. The second is the threshold value. The third is maxVal, the pixel value assigned if the current value is above (or below) the threshold. The fourth is the style of thresholding to perform. The function returns two values: the threshold that was used and the thresholded image.
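As a rough illustration of the five styles used in the code below, here is a pure-Python sketch of what each one does to a single pixel value. The helper threshold_pixel and the string names are hypothetical stand-ins for the cv2.THRESH_* flags, not OpenCV calls.

```python
def threshold_pixel(value, thresh, max_val, kind):
    """Apply one thresholding style to a single grayscale value."""
    above = value > thresh
    if kind == "BINARY":
        return max_val if above else 0
    if kind == "BINARY_INV":
        return 0 if above else max_val
    if kind == "TRUNC":
        return thresh if above else value
    if kind == "TOZERO":
        return value if above else 0
    if kind == "TOZERO_INV":
        return 0 if above else value
    raise ValueError("unknown threshold type: %s" % kind)


kinds = ["BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"]
row = [0, 100, 127, 128, 255]  # sample pixel values around the threshold 127
results = {k: [threshold_pixel(v, 127, 255, k) for v in row] for k in kinds}
for k in kinds:
    print(k, results[k])
```

Note that a pixel equal to the threshold counts as "not above" in this sketch, so 127 is treated like the darker values.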

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg',0)
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO_INV)

titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]

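# assumed layout: a 2x3 grid, one panel per image
for i in range(6):
    plt.subplot(2,3,i+1), plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])
plt.show()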


The result:

Adaptive Thresholding

Adaptive thresholding is similar to simple thresholding, except that the threshold is not one global value: for each pixel it is calculated from a small neighbourhood around that pixel, so regions with different illumination get different thresholds. Besides the source image, maxVal and the threshold type, three parameters are needed.
1.) Adaptive method: decides how the threshold value is calculated.
a.) cv2.ADAPTIVE_THRESH_MEAN_C : the threshold value is the mean of the neighbourhood area.
b.) cv2.ADAPTIVE_THRESH_GAUSSIAN_C : the threshold value is the weighted sum of the neighbourhood values, where the weights are a Gaussian window.

2.) Block Size: defines the size of the neighbourhood area. It must be odd (3, 5, 7, ...).

3.) C: a constant which is subtracted from the mean or weighted mean.
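The three parameters above can be sketched in pure Python for the mean method. This is only an illustration under assumptions: adaptive_mean_threshold is a made-up helper, the image is a plain list of rows, and pixels outside the image are taken from the nearest edge pixel.

```python
def adaptive_mean_threshold(img, max_val, block_size, c):
    """Threshold each pixel against the mean of its block_size x block_size
    neighbourhood minus c. img is a list of rows of grayscale ints."""
    h, w = len(img), len(img[0])
    r = block_size // 2
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # clamp coordinates so the border repeats the edge pixel
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            local_thresh = total / (block_size * block_size) - c
            out_row.append(max_val if img[y][x] > local_thresh else 0)
        out.append(out_row)
    return out


# Hypothetical 9x9 image: flat background of 100 with one dark pixel in the middle.
img = [[100] * 9 for _ in range(9)]
img[4][4] = 20
mask = adaptive_mean_threshold(img, 255, 5, 2)
for row in mask:
    print(row)
```

Only the dark pixel falls below its local threshold here, so it is the only one set to 0.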

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg',0)
img = cv2.medianBlur(img,5)

ret,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
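# the threshold type, blockSize = 11 and C = 2 are assumed example values
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
            cv2.THRESH_BINARY,11,2)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
            cv2.THRESH_BINARY,11,2)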

titles = ['Original Image', 'Global Thresholding (v = 127)',
'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

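# assumed layout: a 2x2 grid, one panel per image
for i in range(4):
    plt.subplot(2,2,i+1), plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])
plt.show()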

The result:

OpenCV+Python:Part3–Geometric Transformations

In this post I will explain how to go about scaling, translating and rotating images, followed by affine and perspective transformations.


Scaling can be done using the cv2.resize() function. The size can be provided manually or a scaling factor can be given.

import cv2
import numpy as np
img = cv2.imread('image.jpg')
height, width = img.shape[:2]
res = cv2.resize(img,(2*width, 2*height), interpolation = cv2.INTER_CUBIC)

The interpolation method used here is cv2.INTER_CUBIC.
The default interpolation method is cv2.INTER_LINEAR.


Shifting an object’s location can be done using the cv2.warpAffine function.
To shift an image by (Tx,Ty), create the transformation matrix M = [(1,0,Tx),(0,1,Ty)] as a numpy array of type np.float32. The following example code shifts the image by (100,50).

import cv2
import numpy as np
img = cv2.imread('image.jpg',0)
rows,cols = img.shape
M = np.float32([[1,0,100],[0,1,50]])
dst = cv2.warpAffine(img,M,(cols,rows))

This results in:


The cv2.warpAffine function takes three arguments. The first is the image. The second is the transformation matrix. The third is the size of the output image, given as (width, height), that is (number of columns, number of rows).
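To make the role of the matrix concrete, here is a small pure-Python sketch of how a 2×3 matrix moves a single point. The helper apply_affine is hypothetical and only does the arithmetic for one coordinate pair, while cv2.warpAffine does the equivalent for every pixel and also handles interpolation.

```python
def apply_affine(M, x, y):
    """Map the point (x, y) through a 2x3 affine matrix M."""
    new_x = M[0][0] * x + M[0][1] * y + M[0][2]
    new_y = M[1][0] * x + M[1][1] * y + M[1][2]
    return new_x, new_y


tx, ty = 100, 50                 # shift used in the example above
M = [[1, 0, tx],
     [0, 1, ty]]
moved = apply_affine(M, 10, 20)  # the pixel at column 10, row 20
print(moved)
```

The point ends up 100 columns to the right and 50 rows further down, which is the shift the matrix encodes.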


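OpenCV provides rotation with an adjustable center of rotation and a scaling factor. For a rotation by an angle θ about the center (cx,cy) with scale factor s, the transformation matrix M is:
M = [(α, β, (1-α)*cx - β*cy), (-β, α, β*cx + (1-α)*cy)]
where α = s*cos(θ) and β = s*sin(θ).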
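The matrix that the OpenCV documentation gives for cv2.getRotationMatrix2D can be written out by hand. The sketch below builds it in pure Python and checks that the center of rotation stays where it is. The helper names rotation_matrix_2d and apply_affine and the sample center are assumptions made for this example.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 matrix for a rotation about `center` with scaling."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(M, x, y):
    """Map the point (x, y) through a 2x3 affine matrix M."""
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


center = (100.0, 50.0)                # hypothetical center of rotation
M = rotation_matrix_2d(center, 90, 1.0)
fixed = apply_affine(M, *center)      # the center should not move
moved = apply_affine(M, 110.0, 50.0)  # a point 10 pixels right of the center
print(fixed, moved)
```

The point to the right of the center ends up above it (smaller row value), which is an anticlockwise rotation on screen because image rows grow downwards.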




import numpy as np
import cv2
img = cv2.imread('image.jpg',0)
rows,cols = img.shape

M = cv2.getRotationMatrix2D((cols/2,rows/2),90,1)
dst = cv2.warpAffine(img,M,(cols,rows))

cv2.imshow('img',dst)
cv2.waitKey(0) & 0xFF
cv2.destroyAllWindows()

To obtain this transformation matrix we use the OpenCV function cv2.getRotationMatrix2D. Here we rotate the image by 90 degrees anticlockwise about its center without any scaling (the scale factor is 1).
The result is:

Affine Transformation

In this transformation all the parallel lines are kept parallel in the final image.

import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('image.jpg')
rows,cols,ch = img.shape

pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[10,100],[200,50],[100,250]])

M = cv2.getAffineTransform(pts1,pts2)

dst = cv2.warpAffine(img,M,(cols,rows))

cv2.imshow('img',dst)
cv2.waitKey(0) & 0xFF
cv2.destroyAllWindows()

To obtain the transformation matrix we need three points from the source image and three points of the destination image to define the planes of transformation. Then using the function cv2.getAffineTransform we get a 2×3 matrix which we pass into the cv2.warpAffine function.
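For the curious, the 2×3 matrix can also be worked out by hand from the three point pairs. The sketch below does that in pure Python for the points used above. affine_from_points is a hypothetical helper that solves the same small linear system and is not how OpenCV is implemented internally.

```python
def affine_from_points(src, dst):
    """Return the 2x3 matrix that maps three source points onto three
    destination points. Each argument is a list of three (x, y) pairs."""
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, v0), (u1, v1), (u2, v2) = dst
    # edge vectors from the first point, in the source and in the destination
    d1x, d1y = x1 - x0, y1 - y0
    d2x, d2y = x2 - x0, y2 - y0
    e1x, e1y = u1 - u0, v1 - v0
    e2x, e2y = u2 - u0, v2 - v0
    det = d1x * d2y - d2x * d1y
    if det == 0:
        raise ValueError("source points are collinear")
    # linear part: sends d1 to e1 and d2 to e2
    a = (e1x * d2y - e2x * d1y) / det
    b = (e2x * d1x - e1x * d2x) / det
    c = (e1y * d2y - e2y * d1y) / det
    d = (e2y * d1x - e1y * d2x) / det
    # translation part: makes the first source point land on its target
    tx = u0 - (a * x0 + b * y0)
    ty = v0 - (c * x0 + d * y0)
    return [[a, b, tx], [c, d, ty]]


pts1 = [(50, 50), (200, 50), (50, 200)]
pts2 = [(10, 100), (200, 50), (100, 250)]
M = affine_from_points(pts1, pts2)
mapped = [(M[0][0] * x + M[0][1] * y + M[0][2],
           M[1][0] * x + M[1][1] * y + M[1][2]) for x, y in pts1]
print(M)
```

Three point pairs are exactly enough because the matrix has six unknowns and each pair contributes two equations.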

The result looks like this:

Perspective Transform

This transformation changes the point of view of the image. Straight lines remain straight, but parallel lines need not stay parallel. For this transformation we need 4 points in the source image and the corresponding 4 points in the output image, and no 3 of these 4 points should be collinear. From these points we compute a 3×3 matrix using cv2.getPerspectiveTransform and pass the resulting matrix into cv2.warpPerspective.
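What sets a 3×3 perspective matrix apart from the 2×3 affine one is the division by a third coordinate. The pure-Python sketch below shows that step on a made-up matrix. Both apply_homography and the matrix values are assumptions chosen for illustration, not the output of cv2.getPerspectiveTransform.

```python
def apply_homography(M, x, y):
    """Map (x, y) through a 3x3 perspective matrix, dividing by the
    third (homogeneous) coordinate."""
    xh = M[0][0] * x + M[0][1] * y + M[0][2]
    yh = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    return xh / w, yh / w


# Hypothetical matrix: identity plus a small perspective term in the last row.
M = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.001, 0.0, 1.0]]

# Three points on one straight line in the source image.
line = [(0, 0), (100, 50), (200, 100)]
p0, p1, p2 = [apply_homography(M, x, y) for x, y in line]

# Cross product of (p1 - p0) and (p2 - p0): zero means still collinear.
cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
print(p0, p1, p2, cross)
```

The mapped points are no longer evenly spaced, but they still lie on one line, which is the "straight lines remain straight" property.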

import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('image.jpg')
rows,cols,ch = img.shape

pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])

M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(300,300))

cv2.imshow('img',dst)
cv2.waitKey(0) & 0xFF
cv2.destroyAllWindows()

The result looks like this :

That’s all!

OpenCV+Python:Part 3–Tracking Object using ColorSpaces

In this post I will explain how to extract a ROI by color using the OpenCV functions cv2.cvtColor() and cv2.inRange().
The following code snippet tracks any object of blue color in the video.

import cv2
import numpy as np

cap = cv2.VideoCapture(0)

while True:

    # Take each frame
    _, frame = cap.read()

    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # define range of blue color in HSV
    lower_blue = np.array([110,50,50])
    upper_blue = np.array([130,255,255])

    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)

    # Bitwise-AND mask and original image
    res = cv2.bitwise_and(frame,frame, mask= mask)

    # Show the frame, the mask and the masked result
    cv2.imshow('frame',frame)
    cv2.imshow('mask',mask)
    cv2.imshow('res',res)

    # Exit when Esc is pressed
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

First of all we start a normal video capture object. Then, using cv2.cvtColor(), we change the color space from BGR to HSV. OpenCV offers more than 150 color-space conversions, but this code only needs BGR to HSV. To know more about color spaces go to LINK.
To know more about the HSV colorspace go to LINK.
Then we set the threshold range for the color blue using the lower_blue and upper_blue variables.
Then we mask every other color so that only the color blue is visible.

How to find the HSV values to Track

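This is a very frequent question. You can use the same function, cv2.cvtColor(): instead of passing an image, pass just the BGR value you want. For example, to find the HSV value of green: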

>>> green = np.uint8([[[0,255,0 ]]])
>>> hsv_green = cv2.cvtColor(green,cv2.COLOR_BGR2HSV)
>>> print(hsv_green)
[[[ 60 255 255]]]

Now take [H-10, 100, 100] and [H+10, 255, 255] as the lower and upper bounds, where H is the hue value in the output. If the result is not clear, widen the range.
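If you do not have OpenCV at hand, the same bounds can be approximated with Python's standard colorsys module. This is a sketch under assumptions: bgr_to_opencv_hsv and hue_bounds are made-up helpers, the scaling follows OpenCV's 8-bit convention (H from 0 to 179, S and V from 0 to 255), and rounding for arbitrary colors may differ slightly from cv2.cvtColor.

```python
import colorsys


def bgr_to_opencv_hsv(b, g, r):
    """Approximate the 8-bit HSV triple OpenCV would give for a BGR color."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def hue_bounds(h, margin=10):
    """Lower and upper HSV bounds around hue h, clamped to the valid hue range.
    Clamping is a simplification: hues near red wrap around 0/179."""
    lower = [max(h - margin, 0), 100, 100]
    upper = [min(h + margin, 179), 255, 255]
    return lower, upper


hsv_green = bgr_to_opencv_hsv(0, 255, 0)
lower_green, upper_green = hue_bounds(hsv_green[0])
print(hsv_green, lower_green, upper_green)
```

The two lists can then be wrapped in np.array() and passed to cv2.inRange() in the same way as lower_blue and upper_blue in the tracking code.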

The output of the above code looks something like this.

OpenCV+Python:Part 2–Image Arithmetics


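You can add two images either with the OpenCV function cv2.add() or with the Numpy operation result = img1 + img2.
(Both images should be of the same depth and type.) There is a major difference between the two: OpenCV addition is a saturated operation, while Numpy addition is a modulo operation.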

>>> x = np.uint8([250])
>>> y = np.uint8([10])

>>> print(cv2.add(x,y)) # 250+10 = 260 => 255

>>> print(x+y) # 250+10 = 260 % 256 = 4

*cv2.add() gives the better result here: it saturates at 255, whereas the Numpy sum wraps around to 4.
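The difference comes down to what happens when a sum no longer fits into 8 bits. Here is a tiny pure-Python sketch of the two behaviours. The function names are made up, and they only mimic the arithmetic for one pair of values.

```python
def saturating_add(a, b):
    """Clip the sum at 255, like a saturated 8-bit addition."""
    return min(a + b, 255)


def wraparound_add(a, b):
    """Keep only the low 8 bits of the sum, like modulo-256 arithmetic."""
    return (a + b) % 256


saturated = saturating_add(250, 10)
wrapped = wraparound_add(250, 10)
print(saturated, wrapped)
```

For images the saturated version is usually what you want: a very bright pixel stays bright instead of suddenly turning almost black.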


Adding images using the previous method is very blunt. Using blending you can get a smooth transition between two images.
Blending is done with the OpenCV function cv2.addWeighted(), which applies the formula:
dst = a*img1 + b*img2 + z
where a and b are the weights of the two images and z is a constant added to every pixel. Choosing b = (1-a) with a between 0 and 1 gives a cross-fade from one image to the other.
What we basically do is give weights to the two images so that they mix with different intensities.
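For a single pixel the blending arithmetic can be sketched in pure Python. The helper add_weighted_pixel is hypothetical: it rounds and clips the result to the 0-255 range, which is the assumed behaviour for 8-bit images.

```python
def add_weighted_pixel(p1, alpha, p2, beta, gamma):
    """Blend two 8-bit pixel values: alpha*p1 + beta*p2 + gamma, clipped to 0..255."""
    value = alpha * p1 + beta * p2 + gamma
    return int(min(max(round(value), 0), 255))


blended = add_weighted_pixel(200, 0.7, 100, 0.3, 0)    # 140 + 30 + 0
clipped = add_weighted_pixel(255, 0.7, 255, 0.3, 50)   # 305 before clipping
print(blended, clipped)
```

The first image dominates the result because it carries the larger weight.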

The following code adds two images with weights 0.7 and 0.3.
(Both images should be of the same size, depth and type.)

import cv2

img1 = cv2.imread('img1.png')
img2 = cv2.imread('img2.jpg')

result = cv2.addWeighted(img1,0.7,img2,0.3,0) # z is taken as 0


The final result looks somewhat like this:

OpenCV+Python:Part 2–Working with Images

–Access and Edit Pixel Values

All of the following steps can be performed using the Python terminal.
First of all load the image:

>>>import cv2
>>>import numpy as np
>>>img = cv2.imread('image.jpg')

To get the pixel value at a particular position:

>>>pix = img[y,x] #y is the row and x is the column of the pixel
>>>print(pix)

To modify the pixel value at a particular point (x,y):

>>>img[y,x]=[B,G,R] #where B,G,R are integer values

A much faster method for single-pixel access is to use the Numpy functions array.item() and array.itemset(). However, each call handles only a scalar value, so to access the B, G and R values you need to call array.item() separately for each channel.

–Image Properties
1.)>>>print(img.shape)
It returns a tuple with the number of rows, columns and channels. For a grayscale image the tuple holds only rows and columns.
2.)>>>print(img.size)
Returns the total number of values in the image (rows × columns × channels).
3.)>>>print(img.dtype)
Returns the image datatype.

To select a particular region of the image (rows y1 to y2, columns x1 to x2):
>>>part = img[y1:y2,x1:x2]

To paste the selected ROI at some other location of the same size:
>>>img[q1:q2,p1:p2] = part
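Here is a small sketch that ties pixel access, the image properties and ROI copying together. It assumes numpy is installed and uses a tiny zero array as a stand-in for a loaded image, so the sizes and coordinates are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: 4 rows, 6 columns, 3 channels (B, G, R).
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Index order is [row, column]: set the pixel at row 1, column 5 to pure blue.
img[1, 5] = [255, 0, 0]

rows, cols, channels = img.shape
total_values = img.size            # rows * cols * channels

# Copy the 2x2 region covering rows 0-1 and columns 4-5 ...
part = img[0:2, 4:6].copy()
# ... and paste it at rows 2-3, columns 0-1 (the target must have the same shape).
img[2:4, 0:2] = part

print(img.shape, total_values, img.dtype)
print(img[3, 1])
```

The .copy() is a cautious choice: a plain slice is only a view, so later edits to the image would otherwise show up in part as well.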

–Splitting and Merging Channels

If you want to split B,G,R channels or merge them back use:

>>>b,g,r = cv2.split(img)
>>>img = cv2.merge((b,g,r))

However, if you want to edit a particular channel, a faster method is to use numpy indexing.
E.g. to set the red channel of every pixel to zero:

>>> img[:,:,2] = 0

That’s all in this post.

OpenCV+Python Part 1–Working with Videos

Learn how to load, display and save videos. I’ll explain this using code snippets. The following program captures video from the camera (I am using the built-in webcam of my laptop) and displays it.

import numpy as np
import cv2

x = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = x.read()

    # Our operations on the frame come here
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display the resulting frame
    cv2.imshow('frame',gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
x.release()
cv2.destroyAllWindows()
The first thing that we need to do is to create a VideoCapture object ‘x’. The argument passed to it is either the device index (a number that specifies which camera) or the name of a video file. Normally only one camera is connected to the system, so 0 is passed. To select a second camera you can pass 1, and so on.

x.read() returns two values: a boolean that tells whether the frame was read correctly, and the frame itself.

Next let’s play a video from a file.

First of all:
Go to: OpenCV\3rdparty\ffmpeg\
Copy the dll file opencv_ffmpeg.dll or opencv_ffmpeg_64.dll (depending on your system architecture) and paste it into C:\Python27\

Now rename it to opencv_ffmpeg24x.dll or opencv_ffmpeg24x_64.dll, where 24x is the version of OpenCV you are using. For example, I am using OpenCV 2.4.6, so I renamed the file to
opencv_ffmpeg246.dll or opencv_ffmpeg246_64.dll.

Then using the following code snippet you can play any video from the current directory.

import numpy as np
import cv2

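cap = cv2.VideoCapture('video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break  # end of the video or a read error
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame',gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()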


Now the final step is to save a video from a cam.
The code captures from a Camera, flips every frame vertically and saves the video.

import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# Define the codec and create VideoWriter object
# (OpenCV 2.4 API; in OpenCV 3 and later use cv2.VideoWriter_fourcc(*'XVID'))
fourcc = cv2.cv.CV_FOURCC(*'XVID')
out = cv2.VideoWriter('output.avi',fourcc, 20.0, (640,480))

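while cap.isOpened():
    ret, frame = cap.read()
    if ret==True:
        frame = cv2.flip(frame,0)

        # write the flipped frame
        out.write(frame)

        cv2.imshow('frame',frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything if job is finished
cap.release()
out.release()
cv2.destroyAllWindows()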

For images we can simply use the function cv2.imwrite(). With videos it gets a bit tougher.
Along with a VideoCapture object we now create a VideoWriter object, whose arguments are the following:
1.)The name of the output file.
2.)The FourCC code. FourCC is a 4-byte code used to specify the video codec.
Download the codec file for Windows from FourCC Codec.
3.)The number of frames per second (fps).
4.)The frame size, given as (width, height).
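To show what "4-byte code" means, here is a pure-Python sketch that packs four characters into one integer, with the first character in the lowest byte. The helper name fourcc is made up for this example, and it mirrors the usual FourCC packing rather than calling OpenCV.

```python
def fourcc(code):
    """Pack a four-character codec code into a 32-bit integer,
    first character in the lowest byte."""
    if len(code) != 4:
        raise ValueError("a FourCC code has exactly four characters")
    c1, c2, c3, c4 = (ord(ch) for ch in code)
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)


xvid = fourcc("XVID")
print(hex(xvid))
```

Each pair of hex digits in the printed value is the character code of one letter, read from right to left.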

That is all in this post..! All the best.


Matplotlib is a plotting library for Python which gives you a wide variety of plotting methods. You can zoom into images, save them, etc. using Matplotlib.
First of all to use matplotlib you need to have certain other libraries too:

All of these can be found at — LINK

After setting up everything you will now be able to peacefully use matplotlib.

Here is an example code:

import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg',0)
plt.imshow(img, cmap = 'gray', interpolation = 'bicubic')
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()

For further info on matplotlib visit — LINK

OpenCV+Python:Part1–Working with Images

This post is about opening, displaying and saving images.


cv2.imread() is used to read an image. The image should be in the current working directory, or its full path should be given as the first argument.

The second argument specifies the way the image is read.

1.)cv2.IMREAD_COLOR : x = cv2.imread('image.jpg',1)— Loads a color image. Any transparency of image will be neglected. It is the default flag.

2.)cv2.IMREAD_GRAYSCALE : x = cv2.imread('image.jpg',0)— Loads image in gray-scale mode.
3.)cv2.IMREAD_UNCHANGED : x = cv2.imread('image.jpg',-1)— Loads image as such including alpha channel.

The full code to load an image would look something like this:

import numpy as np
import cv2
x = cv2.imread('image.jpg',1)



cv2.imshow() is used to display an image in a window. The window automatically fits the image size.
It takes two arguments. The first is the name of the window, given as a string. The second is the variable in which the image is stored.


The full code to create a window that stays until a key is pressed is as follows:

cv2.imshow('image',x)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.waitKey() takes a delay in milliseconds as its argument. Passing 0 makes the function wait indefinitely until any key is pressed. On a 64-bit machine the call has to be of this form: cv2.waitKey(0) & 0xFF
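The & 0xFF part simply keeps the lowest 8 bits of the returned key code. The sketch below shows why that matters, using a made-up return value with an extra flag set in the higher bits. Whether and which extra bits appear depends on the platform, so treat the raw value as an assumption.

```python
def low_byte(key_code):
    """Keep only the lowest 8 bits of a key code."""
    return key_code & 0xFF


# Hypothetical raw value: the code for 'q' plus an extra flag in the high bits.
raw = 0x100000 | ord('q')

matches_without_mask = (raw == ord('q'))
matches_with_mask = (low_byte(raw) == ord('q'))
print(matches_without_mask, matches_with_mask)
```

Without the mask the comparison with ord('q') can fail even though q was pressed, which is why the loops in these posts always apply it.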

cv2.destroyAllWindows() does as the name suggests.



cv2.imwrite() is used to save an image after processing. It takes two arguments. The first is the name of the file to be written. The second is the variable in which the image is stored.



The whole post can be summarized by the following code. It loads an image from the current working directory in grayscale mode and saves it as a png image.

import numpy as np
import cv2
x = cv2.imread('image.jpg',0) #load a jpg image
cv2.imshow('image',x) #display image
cv2.waitKey(0) & 0xFF #wait for key press
cv2.imwrite('image.png',x) #save the image as png
cv2.destroyAllWindows() #destroy all windows

That’s all!


Assuming you know a bit of Python (and if you don’t, take a crash course at Codecademy or Learn Python the Hard Way), the next few posts will be about how to work with OpenCV in Python.
You also need to know a bit of numpy for this, so if you are not familiar with it visit: Numpy Tutorials
The following steps are for Windows.

So first of all let’s set up your machine:
Download the following and install:

4.)Download OpenCV-2.4.x and extract.

After installing the above go to opencv/build/python/2.7
Copy cv2.pyd to C:/Python27/lib/site-packages

To test the installations:
1.) Open IDLE and type import numpy. If this returns no error then numpy has been installed correctly.
2.) Now type import cv2 and then print(cv2.__version__)

If the results are printed without any errors, the installation has been successful.