Python keyword scraping: an updated script for the 2017 version of the Aizhan keyword tool

Aizhan (aizhan) went through a major redesign in 2017, which broke the previous scraping script. Below is an updated keyword-scraping script targeting the new 2017 Aizhan layout.

Python environment: Python 3.5
'''
@ Scraper for the 2017 version of Aizhan
@ laoding
'''
import requests
from bs4 import BeautifulSoup
import csv

def getHtml(url):
    try:
        # Replace with your own User-Agent string
        headers = {
            "User-Agent": ""
        }
        r = requests.get(url, headers=headers)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""

def writeToCsv(filepath, sj):
    # Append one row per keyword record
    with open(filepath, "a+", newline="") as f:
        f_csv = csv.writer(f)
        f_csv.writerow(tuple(sj))

def getSJ(url, filepath):
    html = getHtml(url)
    soup = BeautifulSoup(html, "html.parser")
    # The keyword table lives under div.baidurank-list in the new layout
    ls = soup.select("body > div.baidurank-wrap > div.tabs-content > div.baidurank-list > table > tbody")[0].find_all("tr")
    for m, tr in enumerate(ls):
        keyword = tr.find_all(class_="title")[0].get_text().strip()
        sj = [ele.get_text().strip() for ele in tr.find_all(class_="center")]
        sj.insert(0, keyword)
        writeToCsv(filepath, sj)
        print("%s done" % m)

def main():
    filepath = "F:/test.csv"  # replace with your own output file path
    for n in range(1, 51):
        url = "http://baidurank.aizhan.com/baidu/xxx.com/-1/0/{}/".format(n)  # replace xxx.com with the domain to query
        getSJ(url, filepath)
        print("%s finish" % n)

if __name__ == '__main__':
    main()
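As a quick sanity check on the export format, the CSV round trip used above (one row per keyword, keyword first, then the "center" columns) can be sketched in isolation. The sample rows and the `test.csv` path here are hypothetical placeholders, not actual Aizhan data:

```python
import csv

# Hypothetical rows in the same shape the script writes:
# keyword first, followed by the text of the "center" cells.
rows = [
    ["example keyword", "1", "1200", "http://xxx.com/"],
    ["another keyword", "5", "300", "http://xxx.com/page"],
]

filepath = "test.csv"

# Write the rows the same way writeToCsv does (csv.writer + newline="").
with open(filepath, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read them back and confirm the round trip preserves every field.
with open(filepath, newline="", encoding="utf-8") as f:
    loaded = list(csv.reader(f))

print(loaded == rows)  # → True
```

Note that `newline=""` matters on Windows: without it, `csv.writer` produces blank lines between records.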
The results look like this: