Scraping Doutula images with Python
Doutula
requests
BeautifulSoup4
Code
# -*- coding: utf-8 -*-
# pip install requests
# pip install beautifulsoup4
# pip install lxml  (parser used by BeautifulSoup)
import os

import requests
from bs4 import BeautifulSoup


class doutuSpider(object):
    headers = {
        "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
    }
    # Root directory for downloaded images (Windows path from the original post).
    root = 'E:\\img'

    def get_url(self, url):
        """Visit one list page and download every article linked from it."""
        data = requests.get(url, headers=self.headers)
        soup = BeautifulSoup(data.content, 'lxml')
        totals = soup.find_all("a", {"class": "list-group-item"})
        for one in totals:
            sub_url = one.get('href')
            if not sub_url:
                continue
            # One folder per article, named after the last URL segment.
            path = os.path.join(self.root, sub_url.split('/')[-1])
            os.makedirs(path, exist_ok=True)
            try:
                self.get_img_url(sub_url, path)
            except Exception as e:
                # Report the failing article and continue with the next one.
                print('skip {}: {}'.format(sub_url, e))

    def get_img_url(self, url, path):
        """Find the image on each description block of an article page and save it."""
        data = requests.get(url, headers=self.headers)
        soup = BeautifulSoup(data.content, 'lxml')
        totals = soup.find_all('div', {'class': 'artile_des'})
        for one in totals:
            img = one.find('img')
            # Skip blocks without an <img> tag or without a src attribute.
            if img is None or not img.get('src'):
                continue
            img_url = img.get('src')
            try:
                self.get_img(img_url, path)
            except Exception:
                # Print the URL that failed and keep going.
                print(img_url)

    def get_img(self, url, path):
        """Download one image into the given folder."""
        filename = url.split('/')[-1]
        img_path = os.path.join(path, filename)
        img = requests.get(url, headers=self.headers)
        with open(img_path, 'wb') as f:
            f.write(img.content)

    def create(self):
        """Crawl list pages 1 through 9."""
        for count in range(1, 10):
            url = 'https://www.doutula.com/article/list/?page={}'.format(count)
            print('download {} page'.format(count))
            self.get_url(url)


if __name__ == '__main__':
    doutu = doutuSpider()
    doutu.create()
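One fragile spot in the code above is the filename step in get_img, which takes url.split('/')[-1]. If an image URL carries a query string or ends with a slash, that expression gives an awkward or empty filename. The sketch below shows one possible way to handle this with the standard library only. The helper name filename_from_url and the fallback name "image.jpg" are assumptions made for illustration; neither is part of the original script.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="image.jpg"):
    """Return a local filename for an image URL (hypothetical helper)."""
    # urlsplit separates the path from any ?query or #fragment part.
    path = urlsplit(url).path
    # Decode percent-escapes, then keep only the last path segment.
    name = posixpath.basename(unquote(path))
    # Fall back to a placeholder when the URL ends with a slash.
    return name or default


if __name__ == "__main__":
    # example.com is a placeholder host, not a real image source.
    print(filename_from_url("https://example.com/img/a.jpg?x=1"))
```

If this helper were adopted, get_img could call it in place of url.split('/')[-1]; the rest of the download logic would stay the same.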