基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库
生活随笔
收集整理的這篇文章主要介紹了
基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库
小編覺得挺不錯的,現在分享給大家,幫大家做個參考.
QueryList4教程 地址:
https://doc.querylist.cc/site/index/doc/45在ThinkPHP5代碼根目錄執行composer命令安裝QueryList:
composer require jaeger/querylist如果出現 以下錯誤
Loading composer repositories with package information Updating dependencies (including require-dev)Authentication required (packagist.phpcomposer.com):Username:出現這樣的 情況
使用
composer config -g repo.packagist composer https://packagist.laravel-china.org1-下面演示在Index控制器中使用QueryList:
use QL\QueryList;public function qulist(){$data = QueryList::get('http://maoyan.com/board/4')// 設置采集規則->rules([// 爬取圖片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取電影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取電影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映時間"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}Db::name("article")->insertAll($city);}如果沒有錯的 則插入到數據庫的
截圖
2-如果想繼續抓取下一頁 數據 則要根據規律來取數據
public function qulist(){for($i=0;$i<20;$i++){$page=$i*10;$data = QueryList::get('http://maoyan.com/board/4?offset='.$page)// 設置采集規則->rules([// 爬取圖片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取電影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取電影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映時間"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}Db::name("article")->insertAll($city);}}3- 繼續抓取下一頁 數據 并判斷數據庫是否存在 存在不爬蟲 不存在繼續填滿 ,并且休眠幾秒再爬取
public function qulist(){set_time_limit(0); //防止程序響應30秒后 報錯for($i=0;$i<20;$i++){$page=$i*10;$data = QueryList::get('http://maoyan.com/board/4?offset='.$page)// 設置采集規則->rules([// 爬取圖片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取電影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取電影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映時間"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$find = Db::name("article")->where("src", "=",$excel_array[$k]["src"])->find();if (empty($find)) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}}if (!empty($city)){Db::name("article")->insertAll($city);}/** 暫停時間 為2秒執行*/sleep(2);}}總結
以上是生活随笔為你收集整理的基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 翟婉明院士:中国高铁发展面临的科技挑战与
- 下一篇: 如何解决连接共享打印机时需要输入密码的问