Document Scanner using Python

Programming language used: Python. This project is a scanner for document images: it crops away the background around the document and helps reduce noise in the picture.

Hello everyone! Want to get started with Python projects but not sure where to begin?

If so, you are not alone. Everyone starts out confused, and everyone wants to learn more and grow more.

But you are already a step ahead of the crowd. Why? Because by reading this article you have started investing in your career.

To learn everything from scratch, stay with me and keep reading.

Great, so let's first look at the project.

Project Details

This project is a scanner for document images: it crops away the background around the document and helps reduce noise in the picture.
It produces four output windows: the grayscale image, the blurred image, the Canny edge image, and the scanned image.
I have used two of the most common Python libraries, OpenCV and NumPy.
Things covered in the project:
1) Basics of OpenCV
2) A simple explanation of the code
3) Blurring the image to suppress noise
4) A function to detect the corner points of the document
5) Using that function to scan the image

Now let's get started

Step 1: Importing the libraries

import cv2
import numpy as np

Here we import cv2 (OpenCV), which is used throughout the project for reading, processing, and editing images.

Then we import NumPy, a Python library that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on them.
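
The two libraries work together because OpenCV represents every image as a NumPy array. Here is a minimal sketch of that idea; the tiny 4x6 image is a made-up example, not part of the project.

```python
import numpy as np

# A hypothetical 4x6 colour image: rows (height), columns (width), 3 channels.
# OpenCV stores the channels in blue, green, red (BGR) order as 8-bit integers.
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint the top-left pixel pure red: B=0, G=0, R=255.
image[0, 0] = (0, 0, 255)

height, width, channels = image.shape
print(height, width, channels)  # 4 6 3
print(image.dtype)              # uint8
```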

Step 2: Start processing the image

In this step we read the image, resize it, and convert it to grayscale. The following code does this:

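pic = cv2.imread('imagedocscanner.jpeg')         # read the image (BGR channel order)
pic = cv2.resize(pic, (1300, 800))               # resize to width 1300, height 800
temp = pic.copy()                                # keep a colour copy for the final warp
cnvgray = cv2.cvtColor(pic, cv2.COLOR_BGR2GRAY)  # convert to grayscale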

Next, blur the grayscale image to suppress noise:

blur = cv2.GaussianBlur(cnvgray, (5, 5), 0)

Then run Canny edge detection on the blurred image:

edge = cv2.Canny(blur, 30, 50)

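As you can see, the Canny image is quite noisy, so the next step is to find the contours, i.e. the boundaries of the shapes in the image, and pick out the one that outlines the document.

We sort the contours from largest to smallest area and loop over them until we find one that can be approximated by four points.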

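contours, hierarchy = cv2.findContours(edge, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)  # largest area first

target = None  # stays None if no four-cornered contour is found
for c in contours:
    s = cv2.arcLength(c, True)                    # perimeter of the closed contour
    approx = cv2.approxPolyDP(c, 0.02 * s, True)  # simplify the contour to a polygon

    if len(approx) == 4:  # four corners: assume this is the document
        target = approx
        break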

I have attached the fully commented code to this article, so open it to see how the loop works.

Step 3: Finding endpoints

Next we call the mapp function, which is defined at the start of the code. Again, I recommend keeping the code open in a side panel to make this easier to follow.

The mapp function takes the four corner points found above and puts them in a consistent order (top-left, top-right, bottom-right, bottom-left). We store the result in endp.

Take a look at the code:

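def mapp(h):
    h = h.reshape((4, 2))                      # (4, 1, 2) contour -> four (x, y) rows
    hnew = np.zeros((4, 2), dtype=np.float32)

    add = h.sum(1)                             # x + y for each point
    hnew[0] = h[np.argmin(add)]                # top-left: smallest x + y
    hnew[2] = h[np.argmax(add)]                # bottom-right: largest x + y

    diff = np.diff(h, axis=1)                  # y - x for each point
    hnew[1] = h[np.argmin(diff)]               # top-right: smallest y - x
    hnew[3] = h[np.argmax(diff)]               # bottom-left: largest y - x

    return hnew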
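
To check the ordering logic, here is a quick sketch that feeds mapp four corners in a scrambled order. The corner values are made up for the demo, and the input uses the (4, 1, 2) shape that approxPolyDP returns.

```python
import numpy as np

def mapp(h):
    h = h.reshape((4, 2))
    hnew = np.zeros((4, 2), dtype=np.float32)

    add = h.sum(1)                # x + y for each point
    hnew[0] = h[np.argmin(add)]   # top-left
    hnew[2] = h[np.argmax(add)]   # bottom-right

    diff = np.diff(h, axis=1)     # y - x for each point
    hnew[1] = h[np.argmin(diff)]  # top-right
    hnew[3] = h[np.argmax(diff)]  # bottom-left

    return hnew

# Hypothetical corners in scrambled order:
# bottom-right, top-left, bottom-left, top-right.
scrambled = np.array([[[420, 320]], [[100, 80]], [[80, 300]], [[400, 60]]])

ordered = mapp(scrambled)
print(ordered)  # rows: top-left, top-right, bottom-right, bottom-left
```

Keep in mind that this sum and difference trick assumes the document is roughly upright. If the page is rotated by about 45 degrees, two corners can tie and the order may come out wrong.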

Next we use getPerspectiveTransform to compute the top-down (bird's-eye) view. It returns a 3x3 transformation matrix.

After this we call warpPerspective with the transformation matrix and our image; the output is a warped image that shows the document from the top down.

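endp = mapp(target)  # ordered corners of the document
# Destination corners are not shown in this article; assumed here to match the 800x800 output and the order returned by mapp
pts = np.float32([[0, 0], [800, 0], [800, 800], [0, 800]])
op = cv2.getPerspectiveTransform(endp, pts)
dst = cv2.warpPerspective(temp, op, (800, 800))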

Read through the code and everything will become clear.

Finally, you get your scanned image. Cool, right?
Congratulations, you have now completed this project! I encourage you to try more projects like this one.

Coders Packet
