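Oldboy (老男孩) Web Scraper Hands-On Intensive Course, Season 1, 2018.6: Introductory Scraper Training, Practice 2: Automatic Login to Chouti (抽屉网)
New in this installment compared with the previous one: automatic login to Chouti, automatic upvoting, and automatic paging.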
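1. Logging in to the site automatically
Deliberately enter wrong credentials when logging in through the browser so that the login request shows up in the developer tools; inspect that request to read off its form data (here the fields are phone, password, and oneMonth).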
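2. Using cookies
The site scraped here uses a cookie-based authorization scheme: you must first visit the main site, which assigns you an unauthorized cookie, and then log in while sending that cookie so that it becomes authorized.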
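The cookie handshake described above can also be delegated to a `requests.Session`, which stores cookies from each response and sends them on later requests. The sketch below is only an illustration, not the course's code: `build_login_request` is a hypothetical helper, the cookie name `gpsd` and its value are made-up stand-ins for whatever the site really issues, and the request is prepared but never sent, so the example runs offline.

```python
import requests

LOGIN_URL = "https://dig.chouti.com/login"


def build_login_request(session, phone, password):
    """Prepare (but do not send) the login POST using the session's cookies."""
    request = requests.Request(
        "POST",
        LOGIN_URL,
        data={"phone": phone, "password": password, "oneMonth": "1"},
    )
    # prepare_request merges the session's cookie jar into the Cookie header.
    return session.prepare_request(request)


session = requests.Session()
# Stand-in for the unauthorized cookie that a first GET would normally store
# in session.cookies; the name and value are assumptions for illustration.
session.cookies.set("gpsd", "abc123", domain="dig.chouti.com", path="/")

prepared = build_login_request(session, "PHONE", "PASSWORD")
print(prepared.headers["Cookie"])  # cookie that would accompany the login
print(prepared.body)               # form-encoded login fields
```

In a live run, calling `session.get(...)` on the listing page and then `session.post(...)` on the login URL would make the manual `cookies=` arguments unnecessary, since the session carries the cookie between requests.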
Code:
import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent shared by every request
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36'}

# 1. Visit Chouti first to obtain a cookie (not yet authorized);
#    a browser would always load this page before upvoting
r1 = requests.get(url='https://dig.chouti.com/all/hot/recent/1', headers=HEADERS)
r1_cookie_dict = r1.cookies.get_dict()

# 2. Send the username and password together with the unauthorized cookie
#    so that the cookie becomes authorized
response_login = requests.post(
    url='https://dig.chouti.com/login',
    # Fill in your own account; the phone number carries the 86 country code
    data={'phone': '86XXXXXXXXXXX', 'password': 'YOUR_PASSWORD', 'oneMonth': '1'},
    headers=HEADERS,
    cookies=r1_cookie_dict,
)

# 3. Page through the listing, collect each link id, and upvote it
for page_num in range(1, 3):
    response_index = requests.get(
        url='https://dig.chouti.com/all/hot/recent/%s' % page_num,
        headers=HEADERS,
    )
    # print(response_index.text)
    soup = BeautifulSoup(response_index.text, "html.parser")
    div = soup.find(attrs={'id': 'content-list'})
    items = div.find_all(attrs={'class': 'item'})
    for item in items:
        tag = item.find(attrs={'class': 'part2'})
        if not tag:
            continue
        nid = tag.get('share-linkid')
        print(nid)
        # Upvote, sending the cookie that the login step authorized
        response_vote = requests.post(
            url='https://dig.chouti.com/link/vote?linksId=%s' % nid,
            headers=HEADERS,
            cookies=r1_cookie_dict,
        )
        print(response_vote.text)

Other points:
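1. Common requests parameters:
url, params, headers, cookies: the target URL, query-string parameters, request headers, and cookies
data, json: data takes a dict and sends it as a form-encoded body; json takes a Python object and sends it serialized as a JSON string
files: file uploads (streamed, chunked uploads are not covered here)
auth: the credentials for the browser's built-in login pop-up
proxies: proxies
cert, verify: certificate-related; rarely needed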
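To make the difference between `params`, `data`, and `json` concrete, the following sketch prepares three requests without sending them and prints what each would put on the wire. The URL is a placeholder and the payloads are made up for illustration; only the parameter names come from the list above.

```python
import json

import requests

URL = "https://example.com/api"  # placeholder URL, not from the course

# params -> encoded into the query string of the URL
with_params = requests.Request("GET", URL, params={"page": 2}).prepare()

# data (a dict) -> form-encoded request body
with_data = requests.Request("POST", URL, data={"k": "v"}).prepare()

# json -> body serialized as JSON, with a matching Content-Type header
with_json = requests.Request("POST", URL, json={"k": "v"}).prepare()

print(with_params.url)
print(with_data.headers["Content-Type"], with_data.body)
print(with_json.headers["Content-Type"], json.loads(with_json.body))
```

Preparing a request instead of sending it is a convenient way to check how requests will encode a payload before pointing the code at a real site.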
2. CAPTCHAs (an AI-related problem):
- The PIL module can handle simple CAPTCHAs, with a pass rate of roughly 70-80% on the simple ones
- Buy a third-party recognition service
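As a rough idea of what handling a simple CAPTCHA with PIL might start with, the sketch below shows one common preprocessing step: converting the image to grayscale and thresholding it to pure black and white. It assumes Pillow (the maintained PIL fork) is installed; `binarize` and the threshold of 128 are illustrative choices, not the course's method, and the actual character recognition step is not shown.

```python
from PIL import Image


def binarize(image, threshold=128):
    """Convert to grayscale, then map each pixel to pure black or white."""
    gray = image.convert("L")
    return gray.point(lambda value: 255 if value >= threshold else 0)


# Tiny synthetic image standing in for a CAPTCHA: one dark, one light pixel.
sample = Image.new("RGB", (2, 1))
sample.putpixel((0, 0), (30, 30, 30))
sample.putpixel((1, 0), (220, 220, 220))

cleaned = binarize(sample)
print(cleaned.getpixel((0, 0)), cleaned.getpixel((1, 0)))
```

With a real CAPTCHA, `Image.open(path)` would replace the synthetic image, and the threshold would need tuning for each site's background noise.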
Reposted from: https://www.cnblogs.com/yhstcxx/p/10952107.html