Scraping Wang Yibo (王一博) Movie Posters from Douban
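The code is largely generic: change the name and it will download movie posters for whichever celebrity you like.
Here is the code. Note that you need to download chromedriver first and then supply the correct path to it.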
import os

import requests
from lxml import etree
from selenium import webdriver

# Name to search for; change this to download posters for someone else.
name = '王一博'


def download(src, id):
    """Download one image from src and save it as <id>.webp."""
    if not os.path.isdir("Xpath的翻頁圖片包"):
        os.mkdir("Xpath的翻頁圖片包")
    dir = os.path.join("Xpath的翻頁圖片包/", str(id) + '.webp')
    try:
        pic = requests.get(src, timeout=10)
        with open(dir, 'wb') as d:
            d.write(pic.content)
    except requests.exceptions.ConnectionError:
        print("圖片無法下載")  # "image could not be downloaded"


def down_load(request_url):
    """Scrape one page of search results; return True if any poster was found."""
    # Uses the module-level driver created in the __main__ block below.
    driver.get(request_url)
    html = etree.HTML(driver.page_source)
    src_xpath = "//div[@class='item-root']/a[@class='cover-link']/img[@class='cover']/@src"
    title_xpath = "//div[@class='item-root']/div[@class='detail']/div[@class='title']/a[@class='title-text']"
    srcs = html.xpath(src_xpath)
    titles = html.xpath(title_xpath)
    num = len(srcs)
    # A page normally holds 15 results. When more are found, drop the first
    # one (presumably an extra entry shown above the movie list).
    if num > 15:
        srcs = srcs[1:]
        titles = titles[1:]
    for src, title in zip(srcs, titles):
        if title is None:
            continue
        print(src)
        download(src, title.text)
    print('OK')
    print(num)
    return num >= 1


if __name__ == '__main__':
    requests_url = "https://movie.douban.com/subject_search?search_text=" + name
    # executable_path is the Selenium 3 style argument; newer Selenium
    # versions take a Service object instead.
    driver = webdriver.Chrome(executable_path=r'C:\Users\×××\AppData\Local\Google\Chrome\Application\chromedriver.exe')
    driver.get(requests_url)
    html = etree.HTML(driver.page_source)
    print(html)
    base_url = 'https://movie.douban.com/subject_search?search_text=' + name + '&cat=1002&start='
    # Page through the results 15 at a time and stop at the first empty page.
    start = 0
    while start < 70:
        request_url = base_url + str(start)
        flag = down_load(request_url)
        if flag:
            start += 15
        else:
            break
    print("結束")  # "finished"
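One fragile spot in download is that the movie title goes straight into the file name, so a title containing a character such as / or : can make open fail. Below is a minimal sketch of one possible workaround. safe_filename is a hypothetical helper that is not part of the original script, and the set of characters it replaces is an assumption based on what Windows forbids in file names.

```python
import re

# Hypothetical helper (not in the original script): make a movie title
# safe to use as a file name.
# Assumption: these are the characters to avoid, taken from the set that
# Windows rejects in file names, plus ASCII control characters.
_FORBIDDEN = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback="untitled"):
    """Replace forbidden characters with '_' and trim spaces and dots."""
    if title is None:
        return fallback
    cleaned = _FORBIDDEN.sub("_", title).strip(" .")
    # An empty result (for example a title made only of dots) gets the fallback.
    return cleaned or fallback
```

With this in place, the call in down_load could become download(src, safe_filename(title.text)), which would also cover the case where title.text is None.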
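The code is fairly portable: change the name and it will generally work. The main approach is to page through the search results and use XPath queries to locate and download each poster. In my own testing it did not work for a few celebrities. The downloaded images are also saved in WebP format, which is another shortcoming that I plan to improve over time.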
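On the WebP point: the script hard-codes the .webp extension, so every file gets that suffix whatever the server actually returns. As a rough sketch of one way to soften this, the extension can be chosen from the first bytes of the downloaded data. guess_extension is a hypothetical helper, not part of the original script. It only recognises the four formats listed and does not convert WebP images to another format.

```python
def guess_extension(data, default=".webp"):
    """Guess an image file extension from the leading bytes of data.

    Hypothetical helper: it checks the well-known file signatures of
    JPEG, PNG, GIF and WebP, and returns default for anything else.
    """
    if data[:3] == b"\xff\xd8\xff":
        return ".jpg"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return ".png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return ".gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return ".webp"
    return default
```

Inside download, this would mean fetching the image first and then building the path from guess_extension(pic.content) instead of the fixed '.webp' suffix.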