11 Single-threaded, multi-task asynchronous coroutine crawler
    from lxml import etree
    import asyncio
    import aiohttp
    import time


    def callback(task):
        # Done callback: task.result() is the value returned by get_page()
        page = task.result()
        tree = etree.HTML(page)
        name = tree.xpath('/html/body/div[3]/div[4]/ul/li/a/span[2]/p[1]/text()')
        print(name)
        # print('I am callback', task.result())


    async def get_page(url):
        async with aiohttp.ClientSession() as session:
            async with session.get(url=url) as response:
                # text() returns the body as str; read() returns the raw bytes;
                # json() parses a JSON body
                page_text = await response.text()
                # print('response data:', page_text)
                # print('ok %s' % url)
                return page_text


    start = time.time()
    urls = [
        'http://ly6080.com.cn/vod/type/id/1.html',
        'http://ly6080.com.cn/vod/type/id/2.html',
        'http://ly6080.com.cn/vod/type/id/3.html',
    ]

    tasks = []  # task list holding the task objects
    # Create the loop explicitly: calling asyncio.get_event_loop() with no
    # running loop is deprecated in recent Python versions.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    for url in urls:
        c = get_page(url)
        task = asyncio.ensure_future(c, loop=loop)
        tasks.append(task)
        task.add_done_callback(callback)  # register the callback to run on completion

    loop.run_until_complete(asyncio.wait(tasks))
    print('total time', time.time() - start)
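The task-plus-callback pattern can also be tried without network access. The following is a minimal sketch, assuming a hypothetical `fake_fetch` coroutine that calls `asyncio.sleep` in place of a real HTTP request. `fake_fetch`, `on_done`, `results`, and the placeholder URL names are illustrative and not part of the original script. It also uses `asyncio.run` and `asyncio.gather` instead of managing the event loop by hand.

```python
import asyncio
import time

results = []  # filled by the callback as each task finishes


async def fake_fetch(url, delay):
    # Hypothetical stand-in for an HTTP request: sleeping yields control
    # to the event loop, much as waiting on a socket would.
    await asyncio.sleep(delay)
    return f"page for {url}"


def on_done(task):
    # Done callback: receives the finished task; result() is the
    # coroutine's return value.
    results.append(task.result())


async def main():
    urls = ["example/1", "example/2", "example/3"]  # placeholder names
    tasks = []
    for url in urls:
        task = asyncio.ensure_future(fake_fetch(url, 0.2))
        task.add_done_callback(on_done)
        tasks.append(task)
    # gather waits for every task and returns results in input order
    return await asyncio.gather(*tasks)


start = time.time()
pages = asyncio.run(main())
elapsed = time.time() - start
print(pages)
print("elapsed: %.2f s" % elapsed)
```

Because the three simulated requests overlap on a single thread, the elapsed time should be close to one delay rather than the sum of all three.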
Reposted from: https://www.cnblogs.com/zhangchen-sx/p/11093805.html