This script is written in Python 3. It scrapes proxy lists from various websites and sorts them by response time. These proxies can be used directly for scraping or other tasks.
Libraries used:
requests
pandas
base64
_thread
random
Scrapes the IP table from a web page. Below is the Python code:
index = requests.get(urls[0], headers=get_header())
# using pandas to extract ip table
tables = pandas.read_html(index.content)[0].dropna()
# create ip:port format
tables["Port"] = tables["Port"].astype(int)
lst = list(tables.aggregate(lambda row: ":".join([row[0], str(row[1])]), axis=1))
all_ip.extend(lst)
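To make the table-to-"ip:port" step easier to follow, here is a minimal standalone sketch. It does not use pandas: it swaps in the standard library's html.parser so it runs without third-party packages. The names TableRows, extract_proxies and SAMPLE are hypothetical, and the sample table is made up for illustration. Rows without a numeric port are skipped, which roughly mirrors the dropna() call above.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_proxies(html):
    """Return "ip:port" strings, assuming IP and port are the first two columns."""
    parser = TableRows()
    parser.feed(html)
    proxies = []
    for row in parser.rows:
        # Skip the header row and any row with a missing or non-numeric port.
        if len(row) >= 2 and row[0] and row[1].isdigit():
            proxies.append(f"{row[0]}:{int(row[1])}")
    return proxies


# Made-up sample table, for illustration only.
SAMPLE = """
<table>
<tr><th>IP Address</th><th>Port</th></tr>
<tr><td>10.0.0.1</td><td>8080</td></tr>
<tr><td>10.0.0.2</td><td></td></tr>
<tr><td>10.0.0.3</td><td>3128</td></tr>
</table>
"""

print(extract_proxies(SAMPLE))  # ['10.0.0.1:8080', '10.0.0.3:3128']
```

The pandas version is shorter and copes better with messy real-world pages. This sketch is only meant to show what the extraction produces.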
Checks whether a proxy responds within the given time limit:
def check_ip(n):
    global working_ip
    try:
        x = requests.get("http://google.com", proxies={'http': 'http://' + all_ip[n]}, headers=get_header(), timeout=2)
        # x.elapsed can be used to sort proxies by time taken;
        # total_seconds() gives the full duration, while .microseconds is only the sub-second part
        working_ip.append((all_ip[n], x.elapsed.total_seconds()))
        return True
    except Exception:
        return False
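The sorting step is not shown above, so here is a rough sketch of how the checks could be run in parallel and the working proxies ranked by response time. It uses concurrent.futures from the standard library rather than _thread, which is a swapped-in choice and not the script's own method. The names rank_proxies and timed_probe are hypothetical, and timed_probe assumes the requests package is installed.

```python
from concurrent.futures import ThreadPoolExecutor


def timed_probe(proxy, url="http://google.com", timeout=2):
    """Return the response time in seconds through `proxy`, or None on failure."""
    import requests  # third-party; imported here so rank_proxies works without it

    try:
        response = requests.get(url, proxies={"http": "http://" + proxy}, timeout=timeout)
        return response.elapsed.total_seconds()
    except requests.RequestException:
        return None


def rank_proxies(proxies, probe=timed_probe, max_workers=20):
    """Probe every proxy concurrently and return (proxy, seconds) pairs, fastest first."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        timings = list(pool.map(probe, proxies))  # results keep the input order
    working = [(p, t) for p, t in zip(proxies, timings) if t is not None]
    return sorted(working, key=lambda pair: pair[1])


# Offline demonstration with made-up timings instead of real network calls.
fake_timings = {"10.0.0.1:8080": 0.5, "10.0.0.2:80": None, "10.0.0.3:3128": 0.1}
print(rank_proxies(list(fake_timings), probe=fake_timings.get))
# [('10.0.0.3:3128', 0.1), ('10.0.0.1:8080', 0.5)]
```

Returning the timing from each probe and collecting the results in one place avoids appending to a shared global list from several threads. Calling rank_proxies(all_ip) with the default probe would perform real requests.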
Submitted by Narender Bhadu (nandubhadu001)