Web Scraping with Selenium

Published: 2025/5/22


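Introduction

Selenium was originally a tool for automated browser testing. Scrapers use it mainly because requests cannot execute JavaScript. Selenium drives a real browser and simulates user actions such as navigating, typing, clicking, and scrolling, so it can return the page as it looks after rendering. It supports multiple browsers.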

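Purpose: it lets a browser carry out automated actions.

How it relates to scraping:

- Simulating login
- Fetching dynamically loaded page data

Coding workflow:

- Import the package
- Instantiate a browser object (the driver)
- Define the automated actions to perform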

Environment setup

Install Selenium: pip install selenium
Download the browser driver:
- http://chromedriver.storage.googleapis.com/index.html
Check the mapping between driver versions and browser versions:
- http://blog.csdn.net/huilan_same/article/details/51896672
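
The version check can also be roughly automated. The sketch below assumes the rule used by recent ChromeDriver releases, where the driver and the browser are expected to share a major version number; older 2.x drivers follow the mapping table linked above instead. The function names and the sample version strings are hypothetical and only for illustration.

```python
import re


def major_version(version_text: str) -> int:
    """Return the leading number of the first dotted version in version_text."""
    match = re.search(r"(\d+)(?:\.\d+)+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_compatible(browser_text: str, driver_text: str) -> bool:
    """Assumption: browser and driver are compatible when major versions match."""
    return major_version(browser_text) == major_version(driver_text)


# Sample strings in the style printed by `chrome --version` and
# `chromedriver --version`; the values are illustrative only.
print(versions_compatible("Google Chrome 114.0.5735.199", "ChromeDriver 114.0.5735.90"))  # True
print(versions_compatible("Google Chrome 115.0.5790.102", "ChromeDriver 114.0.5735.90"))  # False
```

If the check returns False, the safest fix is usually to download the driver build that matches the installed browser.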

Basic usage / demo

01:

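    from time import sleep

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    # Written for Selenium 4: the driver path is passed through Service and
    # elements are located with find_element(By..., ...). The Selenium 3 forms
    # executable_path= and find_element_by_id() were deprecated in 4.0 and
    # removed in later 4.x releases.
    bro = webdriver.Chrome(service=Service('./chromedriver.exe'))
    bro.get('https://www.baidu.com')  # open the page
    sleep(2)

    # Locate the search box and type a query into it
    tag_input = bro.find_element(By.ID, 'kw')
    tag_input.send_keys('人民幣')
    sleep(2)

    # Locate the search button and click it
    btn = bro.find_element(By.ID, 'su')
    btn.click()
    sleep(2)

    bro.quit()  # close the browser and end the session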

02:

    from time import sleep

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    # Selenium 4 style: Service for the driver path, By for locators
    bro = webdriver.Chrome(service=Service('./chromedriver.exe'))
    bro.get('https://xueqiu.com/')
    sleep(5)

    # Run JavaScript to scroll down. scrollTo takes (x, y): x stays at 0 and
    # y is the full page height, so the page scrolls to the bottom.
    js = 'window.scrollTo(0,document.body.scrollHeight)'
    for _ in range(4):
        bro.execute_script(js)
        sleep(2)

    a_tag = bro.find_element(By.XPATH, '//*[@id="app"]/div[3]/div/div[1]/div[2]/div[2]/a')
    a_tag.click()
    sleep(5)

    # Print the current page source, including dynamically loaded content
    print(bro.page_source)
    bro.quit()
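
The example above scrolls a fixed four times. A possible variation, sketched below, keeps scrolling until the page height stops growing. The helper name scroll_to_bottom and its parameters are hypothetical; the only driver method it relies on is execute_script, which the example above already uses.

```python
from time import sleep


def scroll_to_bottom(driver, max_rounds=10, pause=2.0):
    """Scroll down until the page height stops growing or max_rounds is hit.

    Returns the number of scrolls that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        rounds += 1
        sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so assume the bottom was reached
        last_height = new_height
    return rounds
```

With the driver from the example, this would be called as scroll_to_bottom(bro). The max_rounds cap is there because some pages load more content indefinitely.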

PhantomJS and headless Chrome (browsers with no visible window):

    # PhantomJS is a browser with no visible interface (no installation needed).
    # webdriver.PhantomJS was deprecated in Selenium 3.8 and removed in
    # Selenium 4, so this example only runs on an older Selenium 3 install.
    from time import sleep

    from selenium import webdriver

    bro = webdriver.PhantomJS(executable_path=r'C:\Users\Administrator\Desktop\爬蟲+數據\爬蟲day03\phantomjs-2.1.1-windows\bin\phantomjs.exe')
    bro.get('https://xueqiu.com/')
    sleep(2)
    bro.save_screenshot('./1.png')

    # Run JavaScript to scroll to the bottom of the page
    js = 'window.scrollTo(0,document.body.scrollHeight)'
    for _ in range(4):
        bro.execute_script(js)
        sleep(2)
    bro.save_screenshot('./2.png')

    # a_tag = bro.find_element_by_xpath('//*[@id="app"]/div[3]/div/div[1]/div[2]/div[2]/a')
    # bro.save_screenshot('./2.png')
    # a_tag.click()
    sleep(2)

    # Print the current page source, including dynamically loaded content
    print(bro.page_source)
    bro.quit()

PhantomJS is rarely used now; it is enough to know that it exists.

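Headless Chrome:

    from time import sleep

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    # Create an options object that makes Chrome start without a window
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    # Selenium 4 style: Service for the driver path, By for locators
    bro = webdriver.Chrome(service=Service('./chromedriver.exe'), options=chrome_options)
    bro.get('https://www.baidu.com')
    sleep(2)
    bro.save_screenshot('1.png')

    # Locate the search box and type a query into it
    tag_input = bro.find_element(By.ID, 'kw')
    tag_input.send_keys('人民幣')
    sleep(2)

    # Locate the search button and click it
    btn = bro.find_element(By.ID, 'su')
    btn.click()
    sleep(2)

    print(bro.page_source)
    bro.quit()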
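
All the examples here pause with a fixed sleep, which either wastes time or is too short on a slow page. One alternative is to poll for a condition, sketched below in plain Python. The helper wait_until and its parameters are hypothetical and not part of Selenium; Selenium ships its own WebDriverWait class in selenium.webdriver.support.ui for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError after timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # short pause before checking again
```

With a Selenium 4 driver named bro, a call such as wait_until(lambda: bro.find_elements(By.ID, 'su')) would return as soon as the button exists, because find_elements returns an empty list while nothing matches.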

Reposted from: https://www.cnblogs.com/pythonz/p/10933858.html
