Get company official website URL from company name in Python

In today’s digital landscape, finding the official website of a company can be tricky, especially with the rise of phishing websites that mimic legitimate businesses. Hackers can easily create clones of well-known sites, attempting to steal sensitive information such as login credentials. As a result, distinguishing an official website from a fake one can be a challenge. However, with the power of Python, it becomes much easier to automate the process of finding the correct URLs for companies, reducing the risks associated with phishing.

If you’ve ever needed to gather official websites for a list of companies, whether for a database or another project, you know how time-consuming manual searches can be. Luckily, Python offers an efficient solution to this problem. By leveraging libraries and tools, Python can automate the task of retrieving the correct website for any given company name, saving time and ensuring accuracy. In this post, we’ll explore how you can harness Python to quickly and reliably find the official website URL for any business, streamlining the process and improving security.

 

TO GET URL OF COMPANY WEBSITE WITH PYTHON:

To get a company's URL with Python, you first need a Python package called Beautiful Soup. It is a web scraping library, and it is one of the best options for pulling URLs out of a web page once the page has been downloaded.

BEAUTIFUL SOUP:

Beautiful Soup is a Python library for programmers who need to scrape data from websites. It does not download pages itself; it parses HTML you have already fetched and lets you pull out elements such as links and text. To install this package use this command: pip install beautifulsoup4.

pip install beautifulsoup4
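
To see what Beautiful Soup does in isolation, here is a minimal sketch that parses a small, made-up HTML snippet (no network request involved) and collects the href of every link. The HTML and the variable names are illustrative assumptions.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
html = """
<html><body>
  <a href="https://www.example.com/">Example</a>
  <a href="/about">About</a>
  <a>No href here</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# link.get("href", "") returns an empty string when the attribute is missing.
hrefs = [link.get("href", "") for link in soup.find_all("a")]
print(hrefs)
```

Note that the second link is relative and the third has no address at all, so any code that consumes these values has to handle both cases.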

Python code:

import requests  # HTTP client used to fetch the search results page
from bs4 import BeautifulSoup  # HTML parser used to extract the links
from urllib.parse import urljoin  # turns relative hrefs into absolute URLs

# Read the company name and convert it to lowercase for matching
company_name = input("enter the name to search : ").strip().lower()
search_url = "https://www.google.com/search"
try:
    # Passing the query via params lets requests URL-encode spaces and symbols
    response = requests.get(search_url, params={"q": company_name}, timeout=10)
    response.raise_for_status()  # raise an exception for HTTP error status codes
    soup = BeautifulSoup(response.content, "html.parser")  # parse the returned HTML
    for link in soup.find_all("a"):
        href = link.get("href", "")
        # Take the first link whose address contains the company name
        if company_name in href.lower():
            website_url = urljoin(search_url, href)  # resolve relative links against Google
            print(f"the official website for {company_name} is {website_url}")
            break
    else:
        # The loop finished without finding a matching link
        print(f"could not find the official website for {company_name}")
except requests.exceptions.RequestException as e:  # network and HTTP errors
    print(f"An error occurred: {e}")
    print(f"could not find the official website for {company_name}")
Output:
enter the name to search : amazon
the official website for amazon is https://www.google.com/search?q=amazon&sca_esv=6cc63e0e79611188&sca_upv=1&ie=UTF-8&gbv=1&sei=sqLZZunJMIHTp84PrJiB0QE

When you run this code, it prompts you to enter the name of a company (e.g., “amazon”). The script fetches the Google results page for that name and prints the first link whose address contains it. As the output shows, that first match can be one of Google's own links rather than the company's site, so treat the result as a starting point and check it before relying on it. If there are any errors, it will notify you that the website could not be found.
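
One way to get closer to the company's own address is to filter the hrefs before printing. The sketch below assumes result links are wrapped in the form /url?q=<target>, which is how Google's basic HTML results have commonly looked; the markup can change, so this is not a guaranteed format. The function names and sample hrefs are hypothetical, and the hrefs would normally come from link.get("href") in the loop above.

```python
from urllib.parse import urlparse, parse_qs

def unwrap_google_href(href):
    """Return the target from a '/url?q=<target>' href, or the href unchanged."""
    parsed = urlparse(href)
    if parsed.path == "/url":
        targets = parse_qs(parsed.query).get("q")
        if targets:
            return targets[0]
    return href

def pick_candidate_url(hrefs, company_name):
    """Return the first external link whose host name contains the company name."""
    needle = company_name.lower().replace(" ", "")
    for href in hrefs:
        target = unwrap_google_href(href)
        host = (urlparse(target).hostname or "").lower()
        if not host or host == "google.com" or host.endswith(".google.com"):
            continue  # skip relative links and Google's own pages
        if needle in host:
            return target
    return None

# Hypothetical hrefs of the kind a results page might contain.
sample_hrefs = [
    "/search?q=amazon&ie=UTF-8&gbv=1",
    "https://maps.google.com/maps?q=amazon",
    "/url?q=https://www.amazon.com/&sa=U",
]
print(pick_candidate_url(sample_hrefs, "amazon"))
```

Matching on the host name rather than the whole address avoids picking up links that only mention the company in their query string. It is still a heuristic: a look-alike domain that contains the company name would pass this check.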

Requests:

The Requests module lets you send HTTP requests very easily. Each request returns a Response object that holds the status code, headers, and body of the reply. You can install it by typing pip install requests.
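
Requests also URL-encodes query parameters for you, which matters for company names that contain spaces or symbols. As a small offline illustration, the sketch below builds a search request without sending it. The company name and the choice to use Request(...).prepare() are illustrative assumptions.

```python
import requests
from urllib.parse import urlparse, parse_qs

# Illustrative company name with characters that need URL-encoding.
company_name = "Johnson & Johnson"

# Build the request without sending it, so no network access is needed.
prepared = requests.Request(
    "GET",
    "https://www.google.com/search",
    params={"q": company_name},
).prepare()

print(prepared.url)

# Decoding the query string gives back the original name unchanged.
decoded = parse_qs(urlparse(prepared.url).query)["q"][0]
print(decoded)
```

Building the URL by hand with an f-string would leave the space and the ampersand unescaped, and the ampersand would then be read as the start of a second parameter.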

Methods to Get Company URLs From Company Name

We’ll explore two main methods to retrieve company website URLs:

  1. Using Search Engines like Google or Bing.
  2. Leveraging APIs like Clearbit or OpenCorporates.
Method 1: Using Search Engines (Google, Bing, etc.):

A simple and effective way to find a company’s official website is by using a search engine. With Python, you can automate this process using libraries like googlesearch-python for Google searches or requests combined with BeautifulSoup for Bing.

Example with Google Search
The googlesearch-python library simplifies performing Google searches programmatically, allowing you to quickly retrieve the official website for a given company name.

pip install googlesearch-python
Python Code:
from googlesearch import search

def get_company_url(company_name):
    # Perform a Google search with the company name and 'official website'
    query = f"{company_name} official website"
    # search() yields result URLs; return the first one
    for url in search(query, num_results=1):
        return url
    # No results came back
    return None

company_name = "Apple"
official_url = get_company_url(company_name)
print(f"Official URL for {company_name}: {official_url}")
Output:
Official URL for Apple: https://www.apple.com/

Important Notes:

  • Limitations: This method relies on Google’s search results, so the official website may not always appear as the top result.
  • Respect Search Engine Policies: Be cautious of search engine usage policies, and avoid sending excessive automated requests to prevent being blocked.
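
Both notes can be handled in code. The sketch below is one possible approach, not part of googlesearch-python: it skips a few domains that often outrank a company's own site, looks up each distinct name only once, and pauses between searches. The skip list, the function names, the two-second default delay, and the fake_search stand-in are all assumptions chosen so the example runs offline.

```python
import time
from urllib.parse import urlparse

# Hypothetical skip list: sites that often outrank a company's own domain.
SKIP_DOMAINS = ("wikipedia.org", "linkedin.com", "facebook.com")

def choose_result(urls, skip_domains=SKIP_DOMAINS):
    """Return the first URL whose host is not on the skip list, else None."""
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host and not any(host == d or host.endswith("." + d) for d in skip_domains):
            return url
    return None

def lookup_all(company_names, search_fn, delay_seconds=2.0, sleep=time.sleep):
    """Look up each distinct name once, pausing between calls to search_fn."""
    found = {}
    for name in company_names:
        key = name.strip().lower()
        if key in found:
            continue  # already looked up; no extra request
        if found:
            sleep(delay_seconds)  # pause before every search after the first
        found[key] = choose_result(search_fn(f"{name} official website"))
    return found

# Stand-in for a real search call, so the sketch runs without network access.
def fake_search(query):
    return ["https://en.wikipedia.org/wiki/Apple_Inc.", "https://www.apple.com/"]

print(lookup_all(["Apple", "apple "], fake_search, delay_seconds=0))
```

With googlesearch-python installed, search_fn could be something like lambda q: search(q, num_results=5), so that several results are available to choose from instead of only the top one.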
Method 2: Using APIs

For more reliable and structured data, APIs are a great option for fetching company information, including their official website. Below are two popular options:

Option 1: Clearbit API
Clearbit provides an API for looking up company information, including the company's domain. Check its current plans and terms before building on it.

  1. Sign up for a Clearbit API key: Clearbit
  2. Install the requests library:
pip install requests
Python Code:
import requests

def get_company_url_clearbit(company_name):
    # Replace 'YOUR_API_KEY' with your actual Clearbit API key
    api_key = 'YOUR_API_KEY'
    headers = {'Authorization': f'Bearer {api_key}'}
    # Pass the name via params so requests URL-encodes spaces and symbols
    response = requests.get(
        'https://company.clearbit.com/v1/domains/find',
        params={'name': company_name},
        headers=headers,
        timeout=10,
    )
    if response.status_code == 200:
        # The response carries a bare domain (e.g. "apple.com"), not a full URL
        return response.json().get('domain')
    return None

company_name = "Apple"
official_url = get_company_url_clearbit(company_name)
print(f"Official URL for {company_name}: {official_url}")
Output:
Official URL for Apple: apple.com
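
The Clearbit example prints a bare domain (apple.com) rather than a full URL. If a later step needs a URL, one simple option is to prepend a scheme. The helper name domain_to_url and the default of https are assumptions for illustration; the helper does not check that the site actually responds.

```python
def domain_to_url(domain, scheme="https"):
    """Turn a bare domain such as 'apple.com' into a URL; pass full URLs through."""
    if not domain:
        return None  # nothing to convert (e.g. the lookup returned None)
    domain = domain.strip()
    if "://" in domain:
        return domain  # already has a scheme
    return f"{scheme}://{domain}/"

print(domain_to_url("apple.com"))
print(domain_to_url(None))
```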

Option 2: OpenCorporates API

OpenCorporates offers company information, which may include the official website URL.

  1. Sign up for an OpenCorporates API key: OpenCorporates
  2. Install the requests library if needed:
pip install requests
Python Code:
import requests

def get_company_url_opencorporates(company_name):
    # Replace 'YOUR_API_KEY' with your actual OpenCorporates API key
    api_key = 'YOUR_API_KEY'
    # Pass the query via params so requests URL-encodes the company name
    response = requests.get(
        'https://api.opencorporates.com/v0.4/companies/search',
        params={'q': company_name, 'api_token': api_key},
        timeout=10,
    )
    if response.status_code == 200:
        companies = response.json().get('results', {}).get('companies', [])
        if companies:
            # .get() avoids a KeyError when the record has no homepage_url field
            return companies[0].get('company', {}).get('homepage_url')
    return None

company_name = "Apple"
official_url = get_company_url_opencorporates(company_name)
print(f"Official URL for {company_name}: {official_url}")
Output:
Official URL for Apple: https://www.apple.com
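
Because a search can return several companies with similar names, and a given record may have no website at all, it can help to separate the parsing step from the request and test it on its own. The sketch below assumes the response shape used in the code above (results, companies, company, homepage_url); the sample payload is made up for illustration and is not real API output.

```python
def extract_homepage(payload):
    """Return the first non-empty homepage_url among the companies, else None."""
    companies = (payload.get("results") or {}).get("companies") or []
    for entry in companies:
        homepage = (entry.get("company") or {}).get("homepage_url")
        if homepage:
            return homepage
    return None

# Made-up payload: the first record has no homepage, the second does.
sample_payload = {
    "results": {
        "companies": [
            {"company": {"name": "EXAMPLE HOLDINGS LTD", "homepage_url": None}},
            {"company": {"name": "EXAMPLE INC", "homepage_url": "https://www.example.com"}},
        ]
    }
}

print(extract_homepage(sample_payload))
print(extract_homepage({}))
```

Unlike the function above, this version looks past the first record, which is a design choice: the first match by name is not always the entity you are looking for, so the returned URL should still be checked against the company you had in mind.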

 

Conclusion:

Automating the process of finding official company websites using Python can be done through multiple approaches. While search engines provide a quick solution, APIs like Clearbit and OpenCorporates offer more accurate and structured data. The choice of method depends on the project’s specific needs in terms of reliability, speed, and adherence to API usage policies.

 

 
