How to get the website URL of any company even when its name is misspelled

Join me in writing a Python program that fetches the URL of any company even when the company’s name is entered with spelling mistakes. The name may be misspelled, but it should be a close match for the program to return accurate results. In this tutorial, we shall use two important libraries: BeautifulSoup (imported from bs4) and requests.

Prerequisites: fundamentals of Python and sheer curiosity to implement the code all by yourself.

Let’s get started.

 

 

Importing the libraries

 

Beautiful Soup is a popular Python library for web scraping. It parses HTML so that the relevant data can be pulled out of a page.

requests sends HTTP requests, which lets our program interact with web servers.

Code to import both the libraries:

from bs4 import BeautifulSoup
import requests
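
To get a feel for what Beautiful Soup does before any request is sent, here is a minimal sketch that parses an inline HTML string. The snippet and the example.com addresses are made up for illustration; only the find() and get_text() calls are what the tutorial relies on later.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet standing in for a downloaded page.
sample_html = """
<html><body>
  <div id="search">
    <a href="https://www.example.com/">Example Company</a>
    <a href="https://www.example.com/about">About</a>
  </div>
</body></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find() returns the first matching tag, or None when nothing matches.
results = soup.find(id="search")
first_link = results.find("a")

print(first_link["href"])     # https://www.example.com/
print(first_link.get_text())  # Example Company
```

Because find() returns None when nothing matches, it is worth checking the result before reading attributes from it.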

Creating the function

 

  • We shall now create a function that takes the string entered by the user and sends it to Google as the query of a GET request.
  • The function then parses the returned HTML to check whether Google suggests a corrected spelling for a misspelled name.
  • If a suggestion is found, it returns the URL of the corrected search; otherwise it returns the first link in the search results. The URLs are read from the href attributes of the <a> tags.

Code to implement the above-mentioned steps:

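def search_company(search_query):
    url = 'https://www.google.com/search'
    # Browser-like headers, so the request looks less like an automated client
    headers = {
        'Accept' : '*/*',
        'Accept-Language': 'en-US,en;q=0.5',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0',
    }
    parameters = {'q': search_query}

    # Send the GET request; the timeout keeps the call from hanging indefinitely
    content = requests.get(url, headers=headers, params=parameters, timeout=10).text
    soup = BeautifulSoup(content, 'html.parser')

    # Case 1: Google offers a corrected spelling ("Showing results for ...").
    # Google's markup changes often, so this lookup may need adjusting.
    correction_tag = soup.find('a', href=True, string=lambda text: text and "Showing results for" in text)
    if correction_tag:
        corrected_url = correction_tag['href']
        print(f"Google suggests: {correction_tag.get_text()}")
        return "https://www.google.com" + corrected_url

    # Case 2: no correction, so return the first link inside the results container
    search_results = soup.find(id='search')
    if search_results:
        first_link = search_results.find('a', href=True)
        if first_link:
            return first_link['href']

    # Nothing usable was found in the page
    return None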

We include the dictionary named “headers” so that the request resembles one sent by a regular browser, which lowers the chance of being identified as a bot.
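
To see what the headers and the query parameters turn into, the request can be built without being sent. This is a minimal sketch using requests.Request and its prepare() method; the name search_url and the sample query "Yaho" are placeholders chosen for this illustration.

```python
import requests

# Same values as in search_company(); "Yaho" is only a sample query.
search_url = "https://www.google.com/search"
headers = {
    "Accept": "*/*",
    "Accept-Language": "en-US,en;q=0.5",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0",
}
parameters = {"q": "Yaho"}

# Build the request without sending it, so nothing goes over the network.
prepared = requests.Request(
    "GET", search_url, headers=headers, params=parameters
).prepare()

print(prepared.url)                    # https://www.google.com/search?q=Yaho
print(prepared.headers["User-Agent"])  # the browser-like string set above
```

The params dictionary ends up as the query string of the URL, while the headers travel alongside it and describe the client to the server.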

 

Obtaining the input from the user

 

Store the input entered by the user as company_name.

Code:

company_name = input("Enter the company's name: ")

 

Calling the function

 

Store the value returned by the function created earlier in a variable named “url”.

Code:

url = search_company(company_name)

Final call

 

Use a basic if-else statement to print the URL of the company if one was found.

Code:

if url:
    print("Company URL:", url)
else:
    print("No results found.")

 

Example:
Company name entered: Yaho

Output:

Enter the company's name: Yaho
Company URL: https://in.yahoo.com/
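
The value returned by search_company is whatever href appeared in the page, so it may be a relative path or a wrapped link rather than a clean address like the one in the example above. A minimal sketch of a clean-up step follows, using only the standard library. The helper name normalize_href is hypothetical, and the /url?q=... redirect form is an assumption about how some Google result pages wrap their links; adjust it to the markup you actually receive.

```python
from urllib.parse import urljoin, urlparse, parse_qs

GOOGLE_BASE = "https://www.google.com"

def normalize_href(href):
    """Turn an href taken from a results page into an absolute URL (hypothetical helper)."""
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    absolute = urljoin(GOOGLE_BASE, href)
    parsed = urlparse(absolute)
    # Assumed redirect form: /url?q=<target>&...
    if parsed.netloc == "www.google.com" and parsed.path == "/url":
        target = parse_qs(parsed.query).get("q")
        if target:
            return target[0]
    return absolute

print(normalize_href("https://in.yahoo.com/"))               # https://in.yahoo.com/
print(normalize_href("/url?q=https://in.yahoo.com/&sa=U"))   # https://in.yahoo.com/
print(normalize_href("/search?q=yahoo"))                     # https://www.google.com/search?q=yahoo
```

Calling normalize_href(url) before printing would give the same kind of output in all three cases.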

 

 

 

 

 
