By Rohan J

A Python app that solves a system of linear equations from an image of the problem.

The app recognizes a set of simultaneous equations in an image uploaded by the user and presents the solution in a student-friendly manner. The technologies used include image preprocessing and object localization (using OpenCV libraries), Optical Character Recognition (OCR) for the characters that can occur in simultaneous linear equations, and mathematical tools for solving the equations. The app would be of great use to school students, who often struggle with textbook problems for which no solutions are available. They could simply take a picture of the problem they are stuck on and upload it to the app, which could then provide a detailed solution.

In the first step, the user uploads an image. Since the image may be in color, it is first converted to grayscale. A histogram of the grayscale image is then constructed, and a threshold value is chosen from this histogram using Otsu's thresholding technique. The threshold is used to binarize the image so that the text characters appear in black and the background in white. Other image preprocessing techniques, such as image dilation, are used to make it easier to localize and detect objects in the image. The result is a preprocessed image that is ready for the object localization step.
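The thresholding step can be illustrated with a minimal pure-Python sketch of Otsu's method. It is not the app's actual code: the function names `otsu_threshold` and `binarize` are hypothetical, and the image is assumed to be a small 2D list of 8-bit grayscale values. In practice OpenCV offers the same computation through `cv2.threshold` with the `cv2.THRESH_OTSU` flag.

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for a 2D list of 8-bit grayscale values."""
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        weight_bg += hist[t]            # pixels with value <= t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg   # pixels with value > t
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Otsu picks the threshold that maximizes the between-class variance.
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray, threshold):
    """Map dark pixels (<= threshold) to 0 (black text), the rest to 255 (white)."""
    return [[0 if v <= threshold else 255 for v in row] for row in gray]


# Toy image: dark "ink" pixels around 20-30, bright "paper" pixels around 200-220.
image = [[20, 30, 200],
         [25, 210, 220]]
t = otsu_threshold(image)
binary = binarize(image, t)
```

Because the threshold is derived from the histogram of each image, no fixed cutoff has to be tuned by hand for different lighting conditions.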

Object localization was done by contour detection. A contour is a curve joining neighboring points of similar intensity along the boundary of a shape. Thus, a separate contour is created for each of the characters in the image. For each contour, a bounding rectangle was created from the extreme pixel locations on the contour, and a separate image was then cropped out for each character. Care was also needed to ensure that only the characters belonging to the equations were recognized and that decorative elements (such as a box drawn around the equations) were ignored. Each of the character images was then fed into an OCR model to recognize the character.
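As a rough sketch of the bounding-rectangle step, the snippet below computes a rectangle from the extreme coordinates of a contour and then discards rectangles that enclose several others, which is one possible way to ignore a decorative box. The helper names and the `min_children` rule are assumptions made for illustration; the original write-up does not say how decorative elements were filtered, and the contours themselves are assumed to come from a contour detector such as OpenCV's.

```python
def bounding_rect(points):
    """Return (x, y, w, h) for a contour given as a list of (x, y) pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min + 1, max(ys) - y_min + 1)


def contains(outer, inner):
    """True if rectangle `outer` fully encloses a different rectangle `inner`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (outer != inner
            and ox <= ix and oy <= iy
            and ox + ow >= ix + iw and oy + oh >= iy + ih)


def drop_enclosing_boxes(rects, min_children=2):
    """Discard rectangles that enclose at least `min_children` other rectangles.

    Assumed heuristic: a frame drawn around the equations encloses many
    character boxes, while a genuine character encloses few or none.
    """
    kept = []
    for r in rects:
        children = sum(1 for other in rects if contains(r, other))
        if children < min_children:
            kept.append(r)
    return kept


# A square contour and its bounding rectangle.
rect = bounding_rect([(5, 10), (9, 10), (9, 14), (5, 14)])

# One large frame around three character-sized boxes.
boxes = [(0, 0, 100, 40), (5, 10, 5, 5), (15, 10, 5, 5), (25, 12, 5, 1)]
characters = drop_enclosing_boxes(boxes)
```

The threshold `min_children` would need tuning if inner contours of characters such as `8` or `0` are also detected, since their outer boxes enclose the boxes of their holes.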

For character recognition, instead of a standard deep learning approach, specific landmark points in each character image were used along with the size of the image to distinguish the characters. For instance, a minus sign would normally be longer than an '=' sign. Further, the intensity values of pixels at particular locations differ from one digit to another. Using these two kinds of differences, it was possible to recognize all the characters that can occur in the image. The location of each bounding rectangle was also stored, which made it possible to reconstruct the two equations exactly as they appear in the image. At this stage, we have two strings, one for each equation in the image.
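One way the stored rectangle locations could be turned into two equation strings is sketched below. It assumes each recognized symbol is available as a `(char, x, y, w, h)` tuple and that the two equations sit on separate lines of the image; `build_equations` is a hypothetical helper, not a function from the original code.

```python
def build_equations(symbols):
    """Group recognized symbols into two rows and read each row left to right.

    symbols: list of (char, x, y, w, h) tuples, in any order.
    Returns [top_equation, bottom_equation] as strings.
    """
    def centre_y(s):
        return s[2] + s[4] / 2

    # Sort by vertical centre and split at the largest vertical gap,
    # which is assumed to separate the first equation from the second.
    ordered = sorted(symbols, key=centre_y)
    centres = [centre_y(s) for s in ordered]
    gaps = [centres[i + 1] - centres[i] for i in range(len(centres) - 1)]
    split = gaps.index(max(gaps)) + 1
    rows = [ordered[:split], ordered[split:]]

    # Within each row, the x coordinate gives the reading order.
    return ["".join(s[0] for s in sorted(row, key=lambda s: s[1]))
            for row in rows]


# Symbols listed out of order, as a contour detector might return them.
symbols = [
    ("y", 20, 54, 8, 16), ("7", 60, 10, 8, 16), ("x", 10, 14, 8, 12),
    ("=", 30, 56, 8, 6), ("2", 0, 10, 8, 16), ("-", 10, 58, 8, 2),
    ("+", 20, 14, 8, 8), ("1", 40, 50, 8, 16), ("3", 30, 10, 8, 16),
    ("x", 0, 54, 8, 12), ("y", 40, 14, 8, 16), ("=", 50, 16, 8, 6),
]
equations = build_equations(symbols)
```

Splitting at the largest gap avoids hard-coding a pixel position for the line break, but it relies on the equations being written on clearly separated lines.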

String processing operations were then used to extract the coefficients of x and y and the constant term from each equation, and the equations were solved using the method of cross-multiplication. The final result was then displayed to the user.
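The parsing and solving steps might look like the following sketch. It assumes each equation string has the form `ax+by=c` with integer coefficients, the variables on the left, and the constant on the right; `parse_equation` and `solve_by_cross_multiplication` are illustrative names. For equations written as a1·x + b1·y + c1 = 0 and a2·x + b2·y + c2 = 0, cross-multiplication gives x / (b1·c2 − b2·c1) = y / (c1·a2 − c2·a1) = 1 / (a1·b2 − a2·b1).

```python
import re
from fractions import Fraction


def parse_equation(eq):
    """Parse 'ax+by=c' (integer coefficients) into the tuple (a, b, c)."""
    left, right = eq.replace(" ", "").split("=")
    coeffs = {"x": 0, "y": 0}
    # Each match is (sign, digits, variable); missing digits mean a coefficient of 1.
    for sign, digits, var in re.findall(r"([+-]?)(\d*)([xy])", left):
        value = int(digits) if digits else 1
        coeffs[var] += -value if sign == "-" else value
    return coeffs["x"], coeffs["y"], int(right)


def solve_by_cross_multiplication(eq1, eq2):
    """Solve two linear equations in x and y; returns (x, y) as Fractions."""
    a1, b1, k1 = parse_equation(eq1)
    a2, b2, k2 = parse_equation(eq2)
    # Rewrite a*x + b*y = k as a*x + b*y + c = 0 with c = -k.
    c1, c2 = -k1, -k2
    denom = a1 * b2 - a2 * b1
    if denom == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(b1 * c2 - b2 * c1, denom)
    y = Fraction(c1 * a2 - c2 * a1, denom)
    return x, y


x, y = solve_by_cross_multiplication("2x+3y=7", "x-y=1")
print(f"x = {x}, y = {y}")
```

Using `Fraction` keeps the answers exact, so a solution such as 1/3 is shown as a fraction rather than a rounded decimal, which suits textbook-style problems.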

One possible improvement would be to build a web or mobile app that uses the uploaded code as its backend and returns the solution. The app currently returns only the final answer and could be modified to provide more detailed, step-by-step solutions. Further, the app could be extended to solve general mathematical problems and not just simultaneous equations.

Submitted by Rohan J (Rohanj999)
