Displaying WeChat Official Account Articles Across Domains
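I built a complete website for a friend. When it came to posting news, he never got used to the admin backend I wrote and kept publishing to his WeChat Official Account instead, then asking me to copy the posts onto the site. That got tedious, so I wrote a way to pull the articles in directly.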
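WeChat Official Account pages are protected and cannot be requested cross-origin from the browser.
The workaround is to go through the CORS proxy at http://cors-anywhere.herokuapp.com/.
Requesting http://cors-anywhere.herokuapp.com/ + the article URL returns the article's source, and regular expressions can then pull out the title and the body.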
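The proxy step could be sketched roughly as follows. The function names are my own, the built-in fetch is assumed to be available (browsers, Node 18+), and the public cors-anywhere instance may not be reachable or may require opt-in, so treat this as an illustration rather than a guaranteed recipe.

```javascript
// Public cors-anywhere instance named in the post; availability is not guaranteed.
const CORS_PROXY = "http://cors-anywhere.herokuapp.com/";

// Prefix the article address with the proxy address (plain concatenation).
function proxiedUrl(articleUrl) {
  return CORS_PROXY + articleUrl;
}

// Fetch the raw HTML source of an article through the proxy.
async function fetchArticleHtml(articleUrl) {
  const response = await fetch(proxiedUrl(articleUrl));
  if (!response.ok) {
    throw new Error("Proxy request failed with status " + response.status);
  }
  return response.text();
}
```

A call would look like fetchArticleHtml(articleUrl).then(html => ...), where articleUrl is the address of the Official Account article.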
Match the title: /<h2 class="rich_media_title" id="activity-name">([\s\S]*?)<\/h2>/ig
Match the body: /js_content">([\s\S]*?)<\/div>/ig
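As a rough sketch, the two patterns could be applied like this. The helper name extractArticle is hypothetical, and the title pattern here uses a lazy quantifier so it stops at the first closing h2 tag. The body pattern also stops at the first closing div, so an article body containing nested div elements would be cut short. WeChat's markup changes over time, so both patterns may need adjusting.

```javascript
// Patterns adapted from the post, without the g flag because one match is enough.
const TITLE_PATTERN = /<h2 class="rich_media_title" id="activity-name">([\s\S]*?)<\/h2>/i;
const BODY_PATTERN = /js_content">([\s\S]*?)<\/div>/i;

// Return the title and body HTML, or null for a part that was not found.
function extractArticle(html) {
  const titleMatch = html.match(TITLE_PATTERN);
  const bodyMatch = html.match(BODY_PATTERN);
  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    body: bodyMatch ? bodyMatch[1].trim() : null,
  };
}
```

Returning null instead of throwing lets the caller decide what to show when the page layout no longer matches.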
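The last step is to rewrite the image paths with a regex. Official Account pages lazy-load their images, so an image only loads once it scrolls into view and its address sits in data-src rather than src: result.replace(/data-src="/g,'src="http://img01.store.sogou.com/net/a/04/link?appid=100520029&url=').
The sogou part of the replacement URL is an image-loading API. Without it, the images are blocked with the same kind of notice that loading from another site is not allowed.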
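Wrapped in a function, the replacement might look like the sketch below. The function name rewriteLazyImages is made up for illustration, and the sogou endpoint is the one quoted in the post, which may stop working at any time.

```javascript
// Image-loading API quoted in the post; it may be rate-limited or retired.
const IMAGE_PROXY = "http://img01.store.sogou.com/net/a/04/link?appid=100520029&url=";

// Turn every lazy-load data-src attribute into a real src that goes through the image API.
function rewriteLazyImages(html) {
  return html.replace(/data-src="/g, 'src="' + IMAGE_PROXY);
}
```

The original image address ends up appended after url=, because only the attribute name and opening quote are replaced.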
Result:
https://codepen.io/2bt/full/joBKJg
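(2021 update)
The front-end approach kept breaking because those APIs would stop working, so I now fetch the images server-side from a local PHP script (cURL-style, though the code below uses file_get_contents) and hand the packaged content to the front end.
The source is below; you will need to adjust the regular expressions and the output styling yourself: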
<?php
class WxCrawler
{
    // Regex for the WeChat content div
    private $wxContentDiv = '/<div class="rich_media_content " id="js_content" style="visibility: hidden;">(.*?)<\/div>/s';
    // Inline style applied to WeChat images
    private $imageStyle = 'style="width: 100% !important;height: auto !important;visibility: visible !important;"';

    /**
     * Fetch the page content
     * @param $url
     * @return false|string
     * @author bignerd
     * @since 2016-08-16T10:13:58+0800
     */
    private function _get($url)
    {
        $context = stream_context_create(array('http' => array('header' => 'Connection:close')));
        return file_get_contents($url, false, $context);
    }

    public function crawByUrl($url)
    {
        $content = $this->_get($url);
        $basicInfo = $this->articleBasicInfo($content);
        list($content_html, $content_text) = $this->contentHandle($content);
        return array_merge($basicInfo, ['content_html' => $content_html, 'content_text' => $content_text]);
    }

    /**
     * Process the WeChat article source: extract the article body and handle image links
     * @author bignerd
     * @since 2016-08-16T15:59:27+0800
     * @param $content fetched WeChat article source
     * @return [HTML with images, HTML without images]
     */
    private function contentHandle($content)
    {
        $content_html_pattern = $this->wxContentDiv;
        preg_match_all($content_html_pattern, $content, $html_matchs);
        if (empty(array_filter($html_matchs))) {
            // Nothing matched: answer with 404 and stop
            http_response_code(404);
            exit();
        }
        $content_html = $html_matchs[0][0];
        // Remove the hidden visibility
        $content_html = str_replace('style="visibility: hidden;"', '', $content_html);
        // Strip iframes
        $content_html = preg_replace('/<iframe(.*?)<\/iframe>/', '', $content_html);
        /** @var HTML with images: each image is inlined as a base64 data URI (MIME type assumed) */
        $content_html = preg_replace_callback('/data-src="(.*?)"/', function ($matches) {
            return 'src="data:image/png;base64,' . $this->getImg($matches[1]) . '" ' . $this->imageStyle;
        }, $content_html);
        // Wrap in a WeChat-like layout
        $content_html = '<div style="max-width: 50%;margin-left: auto;margin-right: auto;transform: scale(1.5);transform-origin: top;">' . $content_html . '</div>';
        /** @var HTML without images */
        $content_text = preg_replace('/<img.*?>/s', '', $content_html);
        return [$content_html, $content_text];
    }

    /**
     * Get the article's basic information
     * @author bignerd
     * @since 2016-08-16T17:16:32+0800
     * @param $content article page source
     * @return $basicInfo
     */
    private function articleBasicInfo($content)
    {
        // Fields to extract: JS variable name in the page => output key
        $item = [
            'ct' => 'date',               // publish time
            'msg_title' => 'title',       // title
            'msg_desc' => 'digest',       // description
            'msg_link' => 'content_url',  // article link
            'msg_cdn_url' => 'cover',     // cover image link
            'nickname' => 'wechatname',   // Official Account name
        ];
        $basicInfo = ['author' => '', 'copyright_stat' => ''];
        foreach ($item as $k => $v) {
            if ($k == 'msg_title') {
                $pattern = '/ var ' . $k . ' = (.*?)\.html\(false\);/s';
            } else {
                $pattern = '/ var ' . $k . ' = "(.*?)";/s';
            }
            preg_match_all($pattern, $content, $matches);
            if (array_key_exists(1, $matches) && !empty($matches[1][0])) {
                $basicInfo[$v] = $this->htmlTransform($matches[1][0]);
            } else {
                $basicInfo[$v] = '';
            }
        }
        return $basicInfo;
    }

    /**
     * Convert special characters
     * @author bignerd
     * @since 2016-08-16T17:30:52+0800
     * @param $string
     * @return $string
     */
    private function htmlTransform($string)
    {
        $string = str_replace('&quot;', '"', $string);
        $string = str_replace('&amp;', '&', $string);
        $string = str_replace('amp;', '', $string);
        $string = str_replace('&lt;', '<', $string);
        $string = str_replace('&gt;', '>', $string);
        $string = str_replace('&nbsp;', ' ', $string);
        $string = str_replace("\\", '', $string);
        return $string;
    }

    /**
     * @param $url
     * @return string base64-encoded image data
     */
    private function getImg($url)
    {
        $refer = "http://www.qq.com/";
        // Both headers go into the single 'header' option
        $opt = ['http' => ['header' => "Referer: " . $refer . "\r\nConnection: close\r\n"]];
        $context = stream_context_create($opt);
        // Read the image bytes
        $file_contents = file_get_contents($url, false, $context);
        // Alternative: save a new image file to disk instead of inlining it
        // $imageStream = imagecreatefromstring($file_contents);
        // $path = 'article/';
        // if (!file_exists($path)) mkdir($path, 0777, true);
        // $fileName = time() . rand(0, 99999) . '.jpg';
        // imagejpeg($imageStream, $path . $fileName);
        return base64_encode($file_contents);
    }
}

ini_set('default_socket_timeout', 1);
$url = $_GET['url'];
$crawler = new WxCrawler();
$content = $crawler->crawByUrl($url);
echo json_encode($content);
?>
Call it from JS:
axios.get('crawler.php?url=<?=urlencode($rs['url'])?>')
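On the front end, consuming the JSON from crawler.php might look like the sketch below. The post calls the endpoint with axios; this version uses the built-in fetch to stay self-contained, and the helper names crawlerUrl and loadArticle are hypothetical. The title and content_html fields are the keys the PHP class above puts into its result array.

```javascript
// Build the request URL; encodeURIComponent keeps the article's own query string intact.
function crawlerUrl(articleUrl, endpoint = "crawler.php") {
  return endpoint + "?url=" + encodeURIComponent(articleUrl);
}

// Fetch the JSON produced by crawler.php and return the parts a page would render.
async function loadArticle(articleUrl) {
  const response = await fetch(crawlerUrl(articleUrl));
  if (!response.ok) {
    throw new Error("crawler.php responded with status " + response.status);
  }
  const data = await response.json();
  return { title: data.title, html: data.content_html };
}
```

Encoding the article address matters because WeChat article links often carry their own query parameters, which would otherwise be read as parameters of crawler.php.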