
Python crawler: simulating a login to Zhihu

Published: 2024/9/18 | python | 21 | 豆豆


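1. Handling the login form

Handling a login form takes two steps:

First, inspect the site's login form and build the dictionary of parameters for the POST request.

Second, submit the POST request.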
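The two steps can be sketched without touching the network. This is a minimal illustration, not the site's actual protocol: the helper name build_login_payload and the sample credentials are hypothetical, and the field names account and password are the ones found in the form later in this post. urlencode from the standard library shows the kind of form-encoded body that a dictionary passed as data= ends up as.

```python
from urllib.parse import urlencode


def build_login_payload(account, password):
    """Step 1: collect the form fields into a dict keyed by the form's field names."""
    # "account" and "password" are the field names used by the login form in this post.
    return {"account": account, "password": password}


# Hypothetical credentials, for illustration only.
payload = build_login_payload("user@example.com", "secret")

# Step 2 would POST this payload; here we only show its form-encoded shape.
body = urlencode(payload)
print(body)
```

With requests, the same dictionary would be passed directly as the data argument of session.post, as the code later in this post does.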

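Open the Zhihu login page, https://www.zhihu.com/#signin, and press F12 to open the developer tools.

Find the headers information there.

Next, inspect the username and password fields. The username field is named account and holds the username; likewise, the password field holds the password.

Some keys in a login form are marked hidden in the browser and are not displayed, so they have to be found by inspecting the page elements.

Doing so turns up a _xsrf attribute (it also appears in the cookie) that acts like a token. Because it exists, a simulated login must send it as a parameter together with the rest of the request.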
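To make the idea of hidden form fields concrete, here is a small sketch that lists every hidden input in a piece of HTML. It deliberately uses the standard library's html.parser rather than BeautifulSoup, so it runs without extra dependencies. The class name HiddenInputCollector, the sample markup, and the token value are all made up for illustration; the real Zhihu page may be structured differently.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        # Only keep inputs that are hidden and have a name to submit under.
        if attributes.get("type") == "hidden" and "name" in attributes:
            self.hidden[attributes["name"]] = attributes.get("value") or ""


# Made-up sample form; the token value is a placeholder.
SAMPLE_HTML = """
<form method="post" action="/login/email">
  <input type="hidden" name="_xsrf" value="0123456789abcdef">
  <input type="text" name="account">
  <input type="password" name="password">
</form>
"""

collector = HiddenInputCollector()
collector.feed(SAMPLE_HTML)
hidden_fields = collector.hidden
print(hidden_fields)
```

Any field collected this way has to be added to the POST parameters, since the browser would submit it even though the user never sees it.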

The _xsrf value can be retrieved with BeautifulSoup, as used previously:

import requests
from bs4 import BeautifulSoup as bs

session = requests.session()

post_url = 'https://www.zhihu.com/#signin'
agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
headers = {
    "Host": "www.zhihu.com",
    "Referer": "http://www.zhihu.com/",
    'User-Agent': agent
}

postdata = {
    'password': '*****',
    'account': '******',
}

# Fetch the sign-in page and read the hidden _xsrf field from it
response = bs(requests.get('http://www.zhihu.com/#signin', headers=headers).content, 'html.parser')
xsrf = response.find('input', attrs={'name': '_xsrf'})['value']
postdata['_xsrf'] = xsrf

# Submit the login form
responed = session.post('http://www.zhihu.com/login/email', headers=headers, data=postdata)
print(responed)

Result: [screenshot missing]

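The code was then modified. The page holding the token is now fetched through the same session, so its cookies carry over to the POST, and an Origin header is added: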

import requests
from bs4 import BeautifulSoup

# One session for every request, so cookies persist between the GET and the POST
session = requests.session()

agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
headers = {
    "Host": "www.zhihu.com",
    "Origin": "https://www.zhihu.com/",
    "Referer": "http://www.zhihu.com/",
    'User-Agent': agent
}

postdata = {
    'password': '*****',
    'account': '******',
}

# Fetch the home page through the session and read the hidden _xsrf field
response = session.get("https://www.zhihu.com", headers=headers)
soup = BeautifulSoup(response.content, "html.parser")
xsrf = soup.find('input', attrs={"name": "_xsrf"}).get("value")
postdata['_xsrf'] = xsrf

# Submit the login form with the token included
login_page = session.post('http://www.zhihu.com/login/email', data=postdata, headers=headers)
print(login_page.status_code)

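Output: 200

A status code of 200 means the request succeeded, so the login form was submitted successfully. The status code alone does not show whether the credentials were accepted; the response body has to be inspected to confirm that.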

Original article: http://www.cnblogs.com/leon507/p/7633012.html
