[Custom Module] Fetching an IP List from Xici (西刺) Free Proxy
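Many people have already written code like this. It is posted here mainly as a reference for another blog post.
Here I have consolidated it into a single class for convenience.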
import time

import requests
from bs4 import BeautifulSoup


class IP():
    def __init__(self, headers="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0"):
        self.mainURL = "http://www.xicidaili.com/"  # Xici proxy home page
        self.nnURL = self.mainURL + "nn/"  # domestic high-anonymity proxies
        self.ntURL = self.mainURL + "nt/"  # domestic ordinary proxies
        self.wnURL = self.mainURL + "wn/"  # domestic HTTPS proxies
        self.wtURL = self.mainURL + "wt/"  # domestic HTTP proxies
        # IPv4 pattern; the dot is escaped so that it matches only a literal "."
        self.ipRegulation = r"(([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])\.){3}([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])"
        self.headers = {"User-Agent": headers}
        self.session = requests.Session()
        # Merge the header dict into the session (not the bare User-Agent string)
        self.session.headers.update(self.headers)
        while True:
            try:
                self.session.get(self.mainURL)
                break
            except requests.RequestException:
                print("Failed to reach Xici proxy!")
                time.sleep(300)  # wait five minutes before retrying

    def _get_IP(self, url):  # shared parser used by the four public methods
        address = []
        interface = []
        html = self.session.get(url).text
        soup = BeautifulSoup(html, "lxml")
        tds = soup.find_all("td")
        count = 0
        for td in tds:
            count += 1
            # The page is assumed to have 10 cells per row:
            # the 2nd cell holds the IP address, the 3rd holds the port.
            if count % 10 == 2:
                address.append(str(td.string))
            elif count % 10 == 3:
                interface.append(str(td.string))
        return ["{}:{}".format(address[i], interface[i]) for i in range(min(len(address), len(interface)))]

    def get_nn_IP(self):  # fetch domestic high-anonymity proxies
        return self._get_IP(self.nnURL)

    def get_nt_IP(self):  # fetch domestic ordinary proxies
        return self._get_IP(self.ntURL)

    def get_wn_IP(self):  # fetch domestic HTTPS proxies
        return self._get_IP(self.wnURL)

    def get_wt_IP(self):  # fetch domestic HTTP proxies
        return self._get_IP(self.wtURL)

Share what you learn, and let's improve together!
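Each get_*_IP method returns a list of "ip:port" strings. The class stores an IPv4 pattern in ipRegulation, but nothing in the listing applies it, and str(td.string) can yield the text "None" when a cell contains nested tags. The following is a minimal sketch of one way to filter the scraped entries and turn one of them into the mapping that requests accepts through its proxies argument. The names is_valid_entry, to_proxies and pick_proxy are hypothetical helpers, not part of the original module.

```python
import random
import re

# One IPv4 octet (0-255), then the full "a.b.c.d:port" pattern.
# The dot is escaped so that it matches only a literal ".".
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)"
ENTRY_PATTERN = re.compile("(?:" + _OCTET + r"\.){3}" + _OCTET + r":(\d{1,5})")


def is_valid_entry(entry):
    """Return True if entry looks like 'a.b.c.d:port' with a port in 1..65535."""
    match = ENTRY_PATTERN.fullmatch(entry)
    return match is not None and 1 <= int(match.group(1)) <= 65535


def to_proxies(entry):
    """Build a scheme -> proxy URL mapping for the `proxies` argument of requests."""
    # Assumption: the proxy itself is reached over plain HTTP for both schemes.
    return {"http": "http://" + entry, "https": "http://" + entry}


def pick_proxy(entries, rng=random):
    """Pick one well-formed entry at random; return None if there is none."""
    valid = [e for e in entries if is_valid_entry(e)]
    if not valid:
        return None
    return to_proxies(rng.choice(valid))
```

With the IP class in this post, a call might look like requests.get(url, proxies=pick_proxy(IP().get_nn_IP()), timeout=10). Free proxies are often dead, so a caller would probably want to catch requests.RequestException and try another entry.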