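Python requests usage summary
requests is a practical HTTP client library for Python that is often used when writing crawlers and when testing server responses. It covers most everyday HTTP needs: query parameters, form and JSON bodies, file uploads, cookies, sessions, SSL verification, proxies and timeouts.
Everything in this article comes from the official documentation: http://docs.python-requests.org/en/master/
The usual way to install it is $ pip install requests. For other installation methods, see the official documentation.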

HTTP - requests

import requests

GET requests

r = requests.get('http://httpbin.org/get')

Passing parameters
>>> payload = {'key1': 'value1', 'key2': 'value2', 'key3': None}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2

Note that any dictionary key whose value is None will not be added to the URL's query string.

Parameters can also be passed as a list

>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2&key2=value3
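To see how a params dict becomes a query string without touching the network, one option is to build the request and prepare it locally. The sketch below assumes Python 3.7+ (where dicts keep insertion order); the helper name build_url is made up for illustration.

```python
import requests


def build_url(base_url, params):
    """Return the final URL Requests would use, without sending anything."""
    # Request(...).prepare() runs the same parameter-encoding step as requests.get().
    prepared = requests.Request('GET', base_url, params=params).prepare()
    return prepared.url


# None values are dropped; list values repeat the key.
print(build_url('http://httpbin.org/get', {'key1': 'value1', 'key3': None}))
print(build_url('http://httpbin.org/get', {'key1': 'value1', 'key2': ['value2', 'value3']}))
```

This is handy in unit tests, where you may want to check the generated URL but not depend on httpbin.org being reachable.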
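r.text returns the body decoded with the encoding taken from the response headers; set r.encoding = 'gbk' to change how it is decoded
r.content returns the body as bytes
r.json() returns the body parsed as JSON, and may raise an exception if the body is not valid JSON
r.status_code returns the HTTP status code
r.raw returns the raw socket response; it requires stream=True on the request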
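Since r.json() can raise, a small wrapper keeps call sites tidy. This is only a sketch: parse_json_or_none and fetch_json are hypothetical helper names, and the code relies on the decoding error being a ValueError subclass, which holds for the standard json module.

```python
import requests


def parse_json_or_none(response):
    """Return the decoded JSON body, or None when the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        # JSON decoding errors derive from ValueError.
        return None


def fetch_json(url, encoding=None):
    """GET a URL and decode it as JSON; optionally override the text encoding."""
    r = requests.get(url, timeout=10)
    if encoding is not None:
        r.encoding = encoding  # changes how r.text decodes the body
    return parse_json_or_none(r)
```

A call such as fetch_json('http://httpbin.org/get') would return a dict, while an HTML error page would yield None instead of an exception.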

>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
To save the result to a file, use r.iter_content()

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)
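The loop above leaves filename, chunk_size and r undefined. A fuller sketch might look like the following, where save_response and download are illustrative names and the 8192-byte chunk size and 30-second timeout are arbitrary assumptions.

```python
import requests


def save_response(response, path, chunk_size=8192):
    """Write a streamed response body to `path`; return the number of bytes written."""
    written = 0
    with open(path, 'wb') as fd:
        for chunk in response.iter_content(chunk_size):
            fd.write(chunk)
            written += len(chunk)
    return written


def download(url, path):
    """Stream `url` to `path` without holding the whole body in memory."""
    r = requests.get(url, stream=True, timeout=30)
    try:
        r.raise_for_status()  # stop on 4xx/5xx instead of saving an error page
        return save_response(r, path)
    finally:
        r.close()  # release the connection even if writing fails
```

Splitting the file-writing step from the network call makes the former easy to test with any object that offers iter_content().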

Passing headers

>>> url = 'https://api.github.com/some/endpoint'
>>> headers = {'user-agent': 'my-app/0.0.1'}
>>> r = requests.get(url, headers=headers)

Passing cookies

>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=dict(cookies_are='working'))
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
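If you want to check what will actually go on the wire, a request can be prepared locally and inspected. The following sketch reuses the header and cookie values from above and sends nothing.

```python
import requests

req = requests.Request(
    'GET',
    'http://httpbin.org/cookies',
    headers={'user-agent': 'my-app/0.0.1'},
    cookies={'cookies_are': 'working'},
)
prepared = req.prepare()  # nothing is sent; we only build the request

# Header lookups are case-insensitive on the prepared request.
print(prepared.headers['User-Agent'])
# The cookies dict is turned into a Cookie header.
print(prepared.headers['Cookie'])
```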

POST requests

Passing form data
r = requests.post('http://httpbin.org/post', data={'key': 'value'})

Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}
Often the data you want to send is not form-encoded. If you pass a string instead of a dict, that data is posted as-is.

>>> import json
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}

>>> r = requests.post(url, data=json.dumps(payload))
or, letting Requests encode the JSON for you:
>>> r = requests.post(url, json=payload)
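The three ways of supplying a body differ in what ends up in the body and in the Content-Type header. The sketch below prepares each variant locally (no request is sent) so the difference can be inspected; the variable names are illustrative.

```python
import json
import requests

url = 'http://httpbin.org/post'
payload = {'key1': 'value1', 'key2': 'value2'}

form = requests.Request('POST', url, data=payload).prepare()
raw = requests.Request('POST', url, data=json.dumps(payload)).prepare()
as_json = requests.Request('POST', url, json=payload).prepare()

# data=<dict>: form-encoded body with a matching Content-Type.
print(form.headers['Content-Type'], form.body)
# data=<str>: body sent as-is, and no Content-Type is added for you.
print(raw.headers.get('Content-Type'), raw.body)
# json=<dict>: body serialised to JSON and Content-Type set to application/json.
print(as_json.headers['Content-Type'], as_json.body)
```

One practical consequence: when posting data=json.dumps(payload) to a server that checks the content type, you would need to pass headers={'Content-Type': 'application/json'} yourself, whereas json=payload does that for you.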

Uploading files

>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
You can set the filename, content_type and headers explicitly:
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

You can also send a string to be received as a file:
>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
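To get a feel for what a files upload looks like on the wire, the multipart body can be built locally. This sketch uses the in-memory report.csv tuple from above, so no file on disk and no network access is needed.

```python
import requests

files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
prepared = requests.Request('POST', 'http://httpbin.org/post', files=files).prepare()

# Requests generates a boundary and puts it in the Content-Type header.
print(prepared.headers['Content-Type'])
# The body is bytes: one part per file, each with its own Content-Disposition line.
print(prepared.body.decode('utf-8'))
```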

Response

r.status_code
r.headers
r.cookies

Redirection

By default Requests will perform location redirection for all verbs except HEAD.

>>> r = requests.get('http://httpbin.org/cookies/set?k2=v2&k1=v1')
>>> r.url
'http://httpbin.org/cookies'
>>> r.status_code
200
>>> r.history
[<Response [302]>]

If you're using HEAD, you can enable redirection as well:

r = requests.head('http://httpbin.org/cookies/set?k2=v2&k1=v1', allow_redirects=True)

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter:

requests.get('http://github.com', timeout=0.001)

Advanced features

Source: <http://docs.python-requests.org/en/master/user/advanced/#advanced>

A Session persists cookies automatically. You can also set request parameters on it, and later requests will carry them automatically.

s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('http://httpbin.org/cookies')
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
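A Session can supply default data. Values passed to a request method are merged with the session-level values; when a key appears in both, the method-level value overrides the session-level one. To drop a session-level value for one request, pass a dict with the same key and a value of None.

s = requests.Session()
s.auth = ('user', 'pass')  # authentication credentials
s.headers.update({'x-test': 'true'})
# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
Data passed to a request method is used only once and is not saved on the session.

For example, these cookies apply to this request only:
r = s.get('http://httpbin.org/cookies', cookies={'from-my': 'browser'})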
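The merge rules can be observed without sending anything by asking the session to prepare a request. A minimal sketch, with merged and dropped as illustrative variable names:

```python
import requests

s = requests.Session()
s.headers.update({'x-test': 'true'})

url = 'http://httpbin.org/headers'
# prepare_request() applies the session defaults without sending anything.
merged = s.prepare_request(requests.Request('GET', url, headers={'x-test2': 'true'}))
dropped = s.prepare_request(requests.Request('GET', url, headers={'x-test': None}))

print('x-test' in merged.headers, 'x-test2' in merged.headers)
print('x-test' in dropped.headers)   # removed for this request only
print(s.headers['x-test'])           # the session default is untouched
```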
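
A Session can also be closed automatically by using it as a context manager

with requests.Session() as s:
    s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')

The response object holds not only everything about the response but also the request that produced it

r = requests.get('http://en.wikipedia.org/wiki/Monty_Python')
r.headers          # headers the server sent back
r.request.headers  # headers we sent to the server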

SSL certificate verification

Requests can verify SSL certificates for HTTPS requests, just like a web browser. To check a host's SSL certificate, use the verify argument:

>>> requests.get('https://kennethreitz.com', verify=True)
requests.exceptions.SSLError: hostname 'kennethreitz.com' doesn't match either of '*.herokuapp.com', 'herokuapp.com'
I don't have SSL set up on this domain, so it fails. But GitHub does:
>>> requests.get('https://github.com', verify=True)
<Response [200]>
For private certificates, you can also pass the path to a CA_BUNDLE file to verify, or set the REQUESTS_CA_BUNDLE environment variable.

>>> requests.get('https://github.com', verify='/path/to/certfile')

Requests can also skip SSL certificate verification if you set verify to False.

>>> requests.get('https://kennethreitz.com', verify=False)
<Response [200]>
By default, verify is set to True. The verify option applies only to host certificates.
You can also specify a local certificate to use as a client-side certificate, either as a single file (containing the private key and the certificate) or as a tuple of both files' paths:

>>> requests.get('https://kennethreitz.com', cert=('/path/server.crt', '/path/key'))
<Response [200]>
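
Body content workflow

By default, when you make a request, the body of the response is downloaded immediately. You can override this behaviour with the stream parameter and defer downloading the response body until you access the Response.content attribute:

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)
At this point only the response headers have been downloaded and the connection remains open, which lets us make content retrieval conditional:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...
With stream set to True, the connection is not released until you read all the data or call Response.close.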

You can use contextlib.closing to close the connection automatically:

import requests
from contextlib import closing

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
file = r'D:\Documents\WorkSpace\Python\Test\Python34Test\test.tar.gz'

with closing(requests.get(tarball_url, stream=True)) as r:
    with open(file, 'wb') as f:
        for data in r.iter_content(1024):
            f.write(data)
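Recent Requests releases also let the Response itself act as a context manager, which avoids the contextlib import. A sketch under that assumption follows; fetch_if_small and max_bytes are illustrative names, and treating a missing Content-Length header as "too large" is a design choice made here, not library behaviour.

```python
import requests


def fetch_if_small(url, max_bytes):
    """Return the body as bytes if Content-Length is below max_bytes, else None."""
    # Leaving the with-block closes the response and releases the connection.
    with requests.get(url, stream=True, timeout=30) as r:
        length = r.headers.get('content-length')
        if length is None or int(length) >= max_bytes:
            return None  # the body is never downloaded
        return r.content
```

Because stream=True defers the body, the size check costs only the headers when the file turns out to be too large.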

Keep-Alive
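
Any requests you make within the same session automatically reuse the appropriate connection!
Note: a connection is released back to the pool only once all of the body data has been read, so be sure to either set stream to False or read the content property of the Response object.

Streaming uploads
Requests supports streaming uploads, which let you send large streams or files without reading them into memory first. To stream an upload, simply provide a file-like object for your body:
Open the file in binary mode so that Requests can work out the correct Content-Length
with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)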

Chunked transfer encoding

Requests also supports chunked transfer encoding for outgoing and incoming requests. To send a chunk-encoded request, simply provide a generator (or any iterator without a length) for your body.
Note that the generator should yield bytes
def gen():
    yield b'hi'
    yield b'there'
requests.post('http://some.url/chunked', data=gen())
For chunked encoded responses, it's best to iterate over the data using Response.iter_content(). In an ideal situation you'll have set stream=True on the request, in which case you can iterate chunk-by-chunk by calling iter_content with a chunk size parameter of None. If you want to set a maximum size of the chunk, you can set a chunk size parameter to any integer.
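On the request side, the difference between a streamed upload and a chunked upload shows up in the headers Requests chooses. The sketch below prepares both kinds of body locally, with an in-memory io.BytesIO standing in for a file opened in binary mode; nothing is sent.

```python
import io
import requests


def gen():
    yield b'hi'
    yield b'there'


url = 'http://some.url/streamed'
# A generator has no known length, so Requests falls back to chunked encoding.
chunked = requests.Request('POST', url, data=gen()).prepare()
# A binary file-like object has a size Requests can measure.
sized = requests.Request('POST', url, data=io.BytesIO(b'0123456789')).prepare()

print(chunked.headers.get('Transfer-Encoding'), chunked.headers.get('Content-Length'))
print(sized.headers.get('Transfer-Encoding'), sized.headers.get('Content-Length'))
```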
POST Multiple Multipart-Encoded Files

You can send multiple files in one request. For example, suppose you want to upload image files to an HTML form with a multiple file field 'images':

<input type="file" name="images" multiple="true" required="true"/>

To do that, just set files to a list of tuples of (form_field_name, file_info):

>>> url = 'http://httpbin.org/post'
>>> multiple_files = [
        ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
        ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': ' ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc7aa0b3a',
  ...
}
Custom Authentication
Requests allows you to specify your own authentication mechanism.
Any callable which is passed as the auth argument to a request method will have the opportunity to modify the request before it is dispatched.
Authentication implementations are subclasses of requests.auth.AuthBase, and are easy to define. Requests provides two common authentication scheme implementations in requests.auth: HTTPBasicAuth and HTTPDigestAuth.
Let's pretend that we have a web service that will only respond if the X-Pizza header is set to a password value. Unlikely, but just go with it.
from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""
    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r
Then, we can make a request using our Pizza Auth:
>>> requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
<Response [200]>
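An auth class can be checked without a live server by preparing a request locally, since the auth callable runs during preparation. The TokenAuth class and its X-Token header below are invented for illustration and are not part of Requests.

```python
import requests
from requests.auth import AuthBase


class TokenAuth(AuthBase):
    """Hypothetical scheme: send a fixed token in an X-Token header."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # r is the request being prepared; modify it and hand it back.
        r.headers['X-Token'] = self.token
        return r


prepared = requests.Request(
    'GET', 'http://pizzabin.org/admin', auth=TokenAuth('s3cret')
).prepare()
print(prepared.headers['X-Token'])
```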

Streaming requests

r = requests.get('http://httpbin.org/stream/20', stream=True)
for line in r.iter_lines():
    if line:  # filter out keep-alive new lines
        print(line)
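Streaming endpoints often emit one JSON document per line. Assuming that format and UTF-8 text, the lines could be decoded as they arrive; iter_json_lines and stream_events are hypothetical helper names.

```python
import json
import requests


def iter_json_lines(response):
    """Yield one decoded JSON object per non-empty line of a streamed response."""
    for line in response.iter_lines():
        if line:  # skip keep-alive blank lines
            yield json.loads(line.decode('utf-8'))


def stream_events(url):
    """Open a streaming GET and yield parsed objects as they arrive."""
    with requests.get(url, stream=True, timeout=30) as r:
        yield from iter_json_lines(r)
```

For example, iterating over stream_events('http://httpbin.org/stream/20') would be expected to produce one dict per line the server sends.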

Proxies

If you need to use a proxy, you can configure individual requests with the proxies argument to any request method:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)

To use HTTP Basic Auth with your proxy, use the http://user:password@host/ syntax:
proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}

Timeouts

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)
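When a timeout expires, Requests raises an exception rather than returning a response, so callers usually wrap the call. A minimal sketch; get_status is an illustrative name and the default values simply mirror the tuple used above.

```python
import requests


def get_status(url, connect_timeout=3.05, read_timeout=27):
    """Return the HTTP status code, or None if the request timed out."""
    try:
        r = requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        # Base class of both ConnectTimeout and ReadTimeout.
        return None
    return r.status_code
```

Catching requests.exceptions.ConnectTimeout and requests.exceptions.ReadTimeout separately would let you retry only the connect phase, if that distinction matters for your service.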
Reposted from: https://www.cnblogs.com/linkxu1989/p/9197406.html