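[Crawler in Practice] 9. Applying Python Web Crawlers: Targeted Crawling and Downloading of MOOC Videos with POST
Targeted POST Crawler for MOOC Videos
- Preface
- Downloading China University MOOC videos: the approach
- Downloading China University MOOC videos: the code
- Summary
The preface analyzes why a direct crawl fails and why POST is needed; if that does not interest you, skip straight to the approach.
Preface
This is original content. Comments and corrections are welcome, and you may reuse it as long as you credit the source.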
This article sets out to crawl the course "Analysis of Typical Problems in College Physics: Mechanics and Thermodynamics", taught by Liu Zhaolong, Feng Yanquan, and Shi Hongting of Beijing Institute of Technology. The URL is: https://www.icourse163.org/learn/BIT-1001605006?tid=1460672441#/learn/content?type=detail&id=1236923009&cid=1256673028
Press F12 on the page to open the developer tools. The block we need sits under:
<div id="g-container">
and, further down:
<div id="g-body"> <div class="g-wrap f-cb"> <div class="g-mn1"> ……
What we ultimately need is:
<video autoplay="" id="auto-id-1610848163145"><source src="https://mooc1vod.stu.126.net/nos/mp4/2016/11/24/1005374032_f9e9a7ba99504a9aa6e121572965d7a0_sd.mp4?ak=7909bff134372bffca53cdc2c17adc27a4c38c6336120510aea1ae1790819de8930cd88db9da48c02b04a7eb0a8f0d1e6fa059c082ad9815b8bd231838a240543059f726dc7bb86b92adbc3d5b34b132acf7000706a439679744c1c7146151614426afeac364f76a817da3b2623cd41e" type="video/mp4" id="auto-id-1610848163147"></video>
The URL in the src attribute is the video we want. It can also be written without the query string: https://mooc1vod.stu.126.net/nos/mp4/2016/11/24/1005374032_f9e9a7ba99504a9aa6e121572965d7a0_sd.mp4
The simplest crawling code is:
import requests


def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        print(r.status_code)
        r.raise_for_status()  # raise HTTPError for 4xx/5xx status codes
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        print('Crawl failed')
        return ''  # return an empty string so the caller can still print the result


if __name__ == "__main__":
    url = 'https://www.icourse163.org/learn/BIT-1001605006?tid=1460672441#/learn/content?type=detail&id=1236923009&cid=1256673028'
    print(getHTMLText(url))

The result:
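200
<!DOCTYPE html>
<script>
…… # too long; earlier part omitted
<div class="m-learnbox" id="courseLearn-inner-box"></div>
<div class="m-recommend-content" id="j-recommend-content"></div>
…… # too long; later part omitted
</body>
</html>
Note that what we need is:
<div class="m-learnbox" id="courseLearn-inner-box"></div> <div class="m-recommend-content" id="j-recommend-content"></div>
that is, the content the browser shows between these two elements. It was not fetched. Is the crawl not deep enough?
If you only need this one video, you can simply download it from the URL above.
Analysis: the page content is dynamic and loaded by JavaScript. Solutions suggested online include xpath, selenium, and pyspider. Because the content MOOC sends by POST has also changed compared with earlier write-ups, I adapted the approach of a GitHub project, MOOC-Download (a China University MOOC crawler). The idea is to send the URL parameters with POST ourselves, in place of the page's JS.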
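One detail that contributes to the empty result can be checked offline: everything after `#` in the course URL is a fragment, and HTTP clients do not send fragments to the server. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original code.

```python
from urllib.parse import urlsplit

url = ('https://www.icourse163.org/learn/BIT-1001605006?tid=1460672441'
       '#/learn/content?type=detail&id=1236923009&cid=1256673028')

parts = urlsplit(url)

# What the server actually receives: scheme, host, path and query only.
sent_to_server = parts._replace(fragment='').geturl()

# The fragment stays on the client; the page's JavaScript interprets it.
client_only = parts.fragment

print(sent_to_server)
print(client_only)
```

So the GET request above only ever asks for the course shell page; the lesson selected by `id` and `cid` is resolved later by scripts running in the browser.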
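Downloading China University MOOC videos: the approach
The approach is as follows; you can follow along step by step:
First open the course page (Analysis of Typical Problems in College Physics: Mechanics and Thermodynamics), press F12 to open the developer tools, press Ctrl+R to refresh, and click Network at the top. Only the following resources are loading, and then the list stalls.
My guess is that the page has received some data and now needs to send a new request?
At this point, press F12 to close the developer tools and immediately press F12 again to reopen them, and the requests appear. (Press twice in quick succession: if the gap is too long, some of the JS requests will already have been sent or received and will be gone; the failure case is shown in figure 1 below.) Then click XHR at the top.
Done correctly, the XHR list looks like this:
Click the first entry to take a look: its request method is POST.
Click Response to see what came back.
There is nothing useful in it, so keep going down the list. The next one is more useful: look closely and you will see that the term ID, termId, is the tid parameter in the page URL: https://www.icourse163.org/learn/BIT-1001605006?tid=1460672441#/learn/content?type=detail&id=1236923004&cid=1256673007
So it is also the direct, unique identifier we use to crawl this MOOC's videos.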
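Since the term ID is just the `tid` query parameter, it can be read from any course URL instead of being typed by hand. A small sketch, assuming the URL always carries a `tid` parameter as in the example above (`extract_tid` is a hypothetical helper name):

```python
from urllib.parse import urlsplit, parse_qs


def extract_tid(course_url):
    """Return the tid (term ID) query parameter of a course URL as an int."""
    query = parse_qs(urlsplit(course_url).query)
    if 'tid' not in query:
        raise ValueError('no tid parameter in URL: {}'.format(course_url))
    return int(query['tid'][0])


url = ('https://www.icourse163.org/learn/BIT-1001605006?tid=1460672441'
       '#/learn/content?type=detail&id=1236923004&cid=1256673007')
print(extract_tid(url))  # 1460672441
```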
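Scroll further down:
The sxx.contentId and sxx.id values here are the data we will need to POST later. Why? Keep reading.
This URL is where the video actually lives. The same response also lets you choose the quality and the format (FLV, MP4).
The part after .mp4 can in fact be left off. Opened in Google Chrome, the URL looks as shown below. From here, downloading the video is as simple as downloading an image; see "Example 4: crawling and saving web images".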
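To make "as simple as downloading an image" concrete, here is a minimal sketch that drops the query string and streams the file to disk. It uses only the standard library so that it is self-contained; `strip_query`, `filename_from_url`, and `download` are hypothetical helper names, and whether the server accepts the bare URL at any given time is taken from the text above rather than guaranteed.

```python
import os
import shutil
import urllib.request
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def strip_query(video_url):
    """Drop the '?ak=...' part, leaving the bare .mp4 URL."""
    return urlsplit(video_url)._replace(query='', fragment='').geturl()


def filename_from_url(video_url):
    """Use the last path component of the URL as the local file name."""
    return PurePosixPath(urlsplit(video_url).path).name


def download(video_url, target_dir='.'):
    """Copy the response to disk in chunks instead of loading it into memory."""
    path = os.path.join(target_dir, filename_from_url(video_url))
    with urllib.request.urlopen(video_url, timeout=30) as resp, open(path, 'wb') as f:
        shutil.copyfileobj(resp, f)
    return path
```

A call would look like `download(strip_query(video_url))`. With the requests library the equivalent is `requests.get(url, stream=True)` combined with `iter_content`.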
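Now back to the point raised earlier:
The sxx.contentId and sxx.id values here are the data we will need to POST later. Why? Keep reading.
First, a word on the payload. The payload is the part of a transmission that carries the actual information. When data is sent, it is usually split into batches for reliability, and auxiliary information is attached to the head and tail of each batch; the payload is what remains once that auxiliary information is set aside. It is often JSON, but here it is plain key=value text. You can think of it simply as what POST sends to the URL.
Continuing: go back to the first useful response, open its Headers, scroll to the very bottom, and look at its Request Payload. This is the key part for the code, because it is exactly what POST sends.
Likewise, look at the Request Payload of the request whose response contains the video URL:
Once it is clear what the page sends and what it gets back, the next step is to implement it in code.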
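Copying a payload from the developer tools into a Python dict by hand is error-prone. The sketch below converts the copied text directly, assuming the Request Payload panel shows one `key=value` pair per line; `parse_payload` is a hypothetical helper, and the sample values are the ones captured for `getLessonUnitLearnVo` in this article (with `httpSessionId` left out).

```python
def parse_payload(text):
    """Turn 'key=value' lines copied from the Request Payload panel into a dict."""
    payload = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition('=')
        if not sep:
            raise ValueError('not a key=value line: {!r}'.format(line))
        payload[key] = value
    return payload


copied = """
callCount=1
scriptSessionId=${scriptSessionId}190
c0-scriptName=CourseBean
c0-methodName=getLessonUnitLearnVo
c0-id=0
c0-param0=number:1004450070
c0-param1=number:3
c0-param2=number:0
c0-param3=number:1256673006
batchId=1611107366738
"""

payload = parse_payload(copied)
print(payload['c0-methodName'])
```

The resulting dict can be passed as the `data` argument of `requests.post`, which is how the code later in this article sends its payloads.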
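There is one pitfall in the steps above (it later turned out not to be a pitfall: I simply was not logged in to MOOC when crawling, and adding Headers fixed it). See the post "[Original source code] A crawler for free China University MOOC courses and courseware":
Comparing against the request payload seen via F12, these three key parameters were not in the screenshot above:
'c0-methodName': 'getMocTermDto',
'c0-param1': 'number:1',  # or 0
'c0-param2': 'boolean:true',
Following that post, I first downloaded Fiddler to try it out. This process is called packet capture.
Installing and using Fiddler (it later turned out that this step is not needed: it does the same job as the Network tab behind F12 in the browser, and is less convenient than the browser)
Please skip ahead to the code section; you do not need to read the Fiddler material.
After downloading from the official site and installing, the interface looks like this:
Configure it by following "Installing and using Fiddler (detailed tutorial)" and "Fiddler packet capture tutorial".
I downloaded the wrong one: the one above is something called Fiddler Everywhere…… After downloading the right one, the icon should be the Fiddler 4 icon below.
Then see "A summary of the Fiddler packet capture tool"……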
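Downloading China University MOOC videos: the code
After a few detours above, it is time to write the code.
Quality selection, download location, and tidy output formatting are not the focus here, so I leave them out and concentrate on the basic code for crawling and downloading MOOC videos. Thanks to the post "Python crawler (4): tracing a crawler to the source (crawling a MOOC course series)" for showing me the correct way to use POST: you can add headers, so no separate login step is needed. I had been wondering how getLastLearnedMocTermDto could return the last-learned content without a login.
To start, the simple code is: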
import requests
import re
import time

# Request payload for
# https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLastLearnedMocTermDto.dwr
data_get_ID = {
    'callCount': '1',
    'scriptSessionId': '${scriptSessionId}190',
    'httpSessionId': 'e41c678f2cb044c0a12c4778412c7344',
    'c0-scriptName': 'CourseBean',
    'c0-methodName': 'getLastLearnedMocTermDto',
    # 'c0-methodName': 'getMocTermDto',
    # The commented-out lines are the form used before MOOC was updated. It
    # still works, but since I do not know what the pre-update POST contained,
    # it is of limited reference value.
    'c0-id': '0',
    'c0-param0': 'number:1460672441',
    # 'c0-param1': 'number:1',
    # 'c0-param2': 'boolean:true',
    'batchId': '1611048630796',
}

# Request payload for
# https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLessonUnitLearnVo.dwr
data_get_VideoUrl = {
    'callCount': '1',
    'scriptSessionId': '${scriptSessionId}190',
    'httpSessionId': 'e41c678f2cb044c0a12c4778412c7344',
    'c0-scriptName': 'CourseBean',
    'c0-methodName': 'getLessonUnitLearnVo',
    'c0-id': '0',
    'c0-param0': 'number:1005926247',
    'c0-param1': 'number:1',
    'c0-param2': 'number:0',
    'c0-param3': 'number:1256673007',
    'batchId': '1611048630815',
}


def batch_id():
    # Current time in milliseconds, used as the batchId.
    return round(time.time() * 1000)


def postHTMLText(url, headers=None, data=None):
    try:
        r = requests.post(url, headers=headers, data=data, timeout=30)  # note: POST, not GET
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        print('Get HTML Error!')
        return ''


def get_ID_func(url, tid):
    data_get_ID['c0-param0'] = 'number:{}'.format(tid)
    # The reference code generates batchId from the current time; personally I
    # do not think this is necessary.
    data_get_ID['batchId'] = batch_id()
    kv = {
        # Paste the cookie string from your own logged-in browser session here
        # (copy it from the request headers shown under F12).
        'cookie': '<your cookie string>',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    }
    HTML = postHTMLText(url, headers=kv, data=data_get_ID)
    print(HTML)
    return HTML


def main():
    tid = 1460672441  # change this to the tid of the course you want to download
    get_ID_url = 'https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLastLearnedMocTermDto.dwr'
    get_ID_func(get_ID_url, int(tid))


if __name__ == '__main__':
    main()

The code is easy to follow for anyone who has used the requests library: it simply POSTs a payload to the given URL. The result:
//#DWR-INSERT //#DWR-REPLY var s0={};var s1=[];var s3={};var s13=[];var s16={};var s17={};var s14=[];var s18={};var s22=[];var s23={};var s24={};var s25={};var s26={};var s27={};var s28={};var s19={};var s29=[];var s30={};var s31={};var s32={};var s33={};var s34={};var s35={};var s20={};var s36=[];var s37={};var s38={};var s39={};var s40={};var s21={};var s41=[];var s42={};var s43={};var s44={};var s45={};var s46={};var s15=[];var s4={};var s47=[];var s50={};var s51={};var s48=[];var s52={};var s56=[];var s57={};var s58={};var s59={};var s60={};var s61={};var s62={};var s53={};var s63=[];var s64={};var s65={};var s66={};var s67={};var s68={};var s54={};var s69=[];var s70={};var s71={};var s72={};var s73={};var s55={};var s74=[];var s75={};var s76={};var s77={};var s78={};var s49=[];var s5={};var s79=[];var s82={};var s83={};var s80=[];var s84={};var s88=[];var s89={};var s90={};var s91={};var s92={};var s93={};var s94={};var s85={};var s95=[];var s96={};var s97={};var s98={};var s99={};var s100={};var s86={};var s101=[];var s102={};var s103={};var s104={};var s105={};var s106={};var s107={};var s87={};var s108=[];var s109={};var s110={};var s111={};var s112={};var s113={};var s114={};var s115={};var s81=[];var s6={};var s116=[];var s119={};var s120={};var s117=[];var s121={};var s125=[];var s126={};var s127={};var s128={};var s129={};var s130={};var s131={};var s122={};var s132=[];var s133={};var s134={};var s135={};var s136={};var s123={};var s137=[];var s138={};var s139={};var s140={};var s141={};var s142={};var s124={};var s143=[];var s144={};var s145={};var s146={};var s147={};var s118=[];var s7={};var s148=[];var s151={};var s152={};var s149=[];var s153={};var s157=[];var s158={};var s159={};var s160={};var s161={};var s162={};var s163={};var s154={};var s164=[];var s165={};var s166={};var s167={};var s168={};var s169={};var s155={};var s170=[];var s171={};var s172={};var s173={};var s174={};var s175={};var s156={};var s176=[];var s177={};var s178={};var 
s179={};var s180={};var s181={};var s150=[];var s8={};var s182=[];var s185={};var s186={};var s183=[];var s187={};var s191=[];var s192={};var s193={};var s194={};var s195={};var s196={};var s197={};var s198={};var s188={};var s199=[];var s200={};var s201={};var s202={};var s203={};var s189={};var s204=[];var s205={};var s206={};var s207={};var s208={};var s209={};var s210={};var s211={};var s190={};var s212=[];var s213={};var s214={};var s215={};var s216={};var s217={};var s218={};var s219={};var s220={};var s221={};var s222={};var s184=[];var s9={};var s223=[];var s226={};var s227={};var s224=[];var s228={};var s232=[];var s233={};var s234={};var s235={};var s236={};var s237={};var s238={};var s239={};var s240={};var s229={};var s241=[];var s242={};var s243={};var s244={};var s245={};var s246={};var s230={};var s247=[];var s248={};var s249={};var s250={};var s251={};var s252={};var s231={};var s253=[];var s254={};var s255={};var s256={};var s257={};var s258={};var s259={};var s225=[];var s10={};var s260=[];var s263={};var s264={};var s261=[];var s265={};var s269=[];var s270={};var s271={};var s272={};var s273={};var s274={};var s275={};var s266={};var s276=[];var s277={};var s278={};var s279={};var s280={};var s281={};var s282={};var s283={};var s267={};var s284=[];var s285={};var s286={};var s287={};var s288={};var s289={};var s290={};var s291={};var s268={};var s292=[];var s293={};var s294={};var s295={};var s296={};var s297={};var s298={};var s299={};var s262=[];var s11={};var s300=[];var s303={};var s304={};var s301=[];var s305={};var s309=[];var s310={};var s311={};var s312={};var s313={};var s314={};var s315={};var s316={};var s306={};var s317=[];var s318={};var s319={};var s320={};var s321={};var s322={};var s323={};var s307={};var s324=[];var s325={};var s326={};var s327={};var s328={};var s329={};var s330={};var s331={};var s332={};var s308={};var s333=[];var s334={};var s335={};var s336={};var s337={};var s338={};var s339={};var s302=[];var s12={};var 
s340=[];var s343={};var s344={};var s341=[];var s345={};var s349=[];var s350={};var s351={};var s352={};var s353={};var s354={};var s355={};var s356={};var s346={};var s357=[];var s358={};var s359={};var s360={};var s361={};var s362={};var s363={};var s347={};var s364=[];var s365={};var s366={};var s367={};var s368={};var s369={};var s370={};var s348={};var s371=[];var s372={};var s373={};var s374={};var s375={};var s376={};var s377={};var s378={};var s379={};var s380={};var s342=[];var s2=[];var s381={};var s382={};var s384=[];var s383={};s0.achievementStatus=null;s0.announcementDtos=null;s0.applyMoocStatus=null;s0.bgKnowledge=null;s0.bigPhoto=null;s0.channel=1;s0.chapters=s1;s0.chargeableCert=null;s0.chiefLector=null;s0.chiefLectorDto=null;s0.closeVisableStatus=1;s0.copied=null;s0.copyTime=null;s0.courseId=1001605006;s0.courseLoad=null;s0.courseName="\u5927\u5B66\u7269\u7406\u5178\u578B\u95EE\u9898\u89E3\u6790\u2014\u529B\u5B66\u4E0E\u70ED\u5B66";s0.courseStyle=null;s0.coverPhoto="";s0.description=null;s0.descriptionForCert=null;s0.detailDraftStatus=0;s0.duration=null;s0.end=false;s0.endTime=1610897400000;s0.enrollCount=null;s0.enrolled=null;s0.exams=s2;s0.extraInfo=null;s0.faq=null;s0.firstPublishTime=null;s0.fromTermId=null;s0.fromTermMode=null;s0.gmtCreate=null;s0.gmtModified=null;s0.hasFreePreviewVideo=false;s0.hasPaid=null;s0.id=1460672441;s0.jsonContent=null;s0.lessonsCount=null;s0.mobDescription=null;s0.mode=0;s0.needPassword=false;s0.originCopyRightTermId=null;s0.originalPrice=null;s0.outLineStructureDtos=null;s0.outline=null;s0.outlineStructure=null;s0.previousCourseDtos=null;s0.price=null;s0.productType=null;s0.publishStatus=null;s0.reommendRead=null;s0.requirements=null;s0.requirementsForCert=null;s0.schoolId=8007;s0.smallPhoto=null;s0.specialChargeableTerm=false;s0.staffAssistDtos=null;s0.staffAssists=null;s0.staffLectorDtos=null;s0.staffLectors=null;s0.start=false;s0.startTime=1600048800000;s0.target=null;s0.timeToFreeze=null;s0.times=1;s0.videoId=nul
l; s1[0]=s3;s1[1]=s4;s1[2]=s5;s1[3]=s6;s1[4]=s7;s1[5]=s8;s1[6]=s9;s1[7]=s10;s1[8]=s11;s1[9]=s12; s3.contentId=null;s3.contentType=1;s3.draftStatus=0;s3.gmtCreate=1599924675803;s3.gmtModified=1599924675803;s3.hasFreePreviewVideo=false;s3.homeworks=s13;s3.id=1222604001;s3.lessons=s14;s3.name="\u7B2C1\u5468 \u8FD0\u52A8\u5B66";s3.position=-1;s3.published=true;s3.quizs=s15;s3.releaseTime=1600048800000;s3.termId=1460672441; s13[0]=s16; s16.chapterId=1222604001;s16.contentId=1236227401;s16.contentType=3;s16.gmtCreate=1599924694187;s16.gmtModified=1599924694187;s16.id=1236923008;s16.isTestChecked=true;s16.name="\u7B2C1\u5468\u4F5C\u4E1A";s16.position=0;s16.releaseTime=1600050600000;s16.termId=1460672441;s16.test=s17;s16.testDraftStatus=0;s16.units=null;s16.viewStatus=0; s17.bonusScore=null;s17.deadline=1602428400000;s17.enableEvaluation=false;s17.evaluateEnd=1603035000000;s17.evaluateJudgeType=1;s17.evaluateNeedTrain=2;s17.evaluateScoreReleaseTime=1603679400000;s17.evaluateStart=1602430200000;s17.examId=-1;s17.id=1236227401;s17.name="\u7B2C1\u5468\u4F5C\u4E1A";s17.releaseTime=1600050600000;s17.scorePubStatus=2;s17.testTime=null;s17.totalScore=45.00;s17.trytime=null;s17.type=3;s17.usedTryCount=0;s17.userScore=null; s14[0]=s18;s14[1]=s19;s14[2]=s20;s14[3]=s21; s18.chapterId=1222604001;s18.contentId=null;s18.contentType=1;s18.gmtCreate=1599924675805;s18.gmtModified=1599924675805;s18.id=1236923004;s18.isTestChecked=false;s18.name="\u7B2C1\u8BB2 \u8FD0\u52A8\u5B66-1";s18.position=0;s18.releaseTime=1600048800000;s18.termId=1460672441;s18.test=null;s18.testDraftStatus=0;s18.units=s22;s18.viewStatus=3; s22[0]=s23;s22[1]=s24;s22[2]=s25;s22[3]=s26;s22[4]=s27;s22[5]=s28; 
s23.anchorQuestions=null;s23.attachments=null;s23.chapterId=1222604001;s23.contentId=1004450070;s23.contentType=3;s23.durationInSeconds=null;s23.freePreview=0;s23.gmtCreate=1599924675813;s23.gmtModified=1611068927112;s23.id=1256673006;s23.jsonContent=null;s23.learnCount=null;s23.lessonId=1236923004;s23.live=null;s23.liveInfoDto=null;s23.name="\u7B2C1\u8BB2\u9898\u76EE";s23.position=0;s23.resourceInfo=null;s23.termId=1460672441;s23.unitId=null;s23.viewStatus=5; s24.anchorQuestions=null;s24.attachments=null;s24.chapterId=1222604001;s24.contentId=1005926247;s24.contentType=1;s24.durationInSeconds=null;s24.freePreview=0;s24.gmtCreate=1599924675816;s24.gmtModified=1611107391359;s24.id=1256673007;s24.jsonContent=null;s24.learnCount=null;s24.lessonId=1236923004;s24.live=null;s24.liveInfoDto=null;s24.name="\u5E8F\u4E0E\u4E3B\u8981\u77E5\u8BC6\u56DE\u987E";s24.position=1;s24.resourceInfo=null;s24.termId=1460672441;s24.unitId=null;s24.viewStatus=5; s25.anchorQuestions=null;s25.attachments=null;s25.chapterId=1222604001;s25.contentId=1005318028;s25.contentType=1;s25.durationInSeconds=null;s25.freePreview=0;s25.gmtCreate=1599924675818;s25.gmtModified=1611061788344;s25.id=1256673008;s25.jsonContent=null;s25.learnCount=null;s25.lessonId=1236923004;s25.live=null;s25.liveInfoDto=null;s25.name="\u4F4D\u7F6E\u77E2\u91CF\u4E0E\u4F4D\u79FB";s25.position=2;s25.resourceInfo=null;s25.termId=1460672441;s25.unitId=null;s25.viewStatus=5; s26.anchorQuestions=null;s26.attachments=null;s26.chapterId=1222604001;s26.contentId=1005316207;s26.contentType=1;s26.durationInSeconds=null;s26.freePreview=0;s26.gmtCreate=1599924675820;s26.gmtModified=1611061793811;s26.id=1256673009;s26.jsonContent=null;s26.learnCount=null;s26.lessonId=1236923004;s26.live=null;s26.liveInfoDto=null;s26.name="\u629B\u4F53\u8FD0\u52A8\u7684\u52A0\u901F\u5EA6";s26.position=3;s26.resourceInfo=null;s26.termId=1460672441;s26.unitId=null;s26.viewStatus=5; 
s27.anchorQuestions=null;s27.attachments=null;s27.chapterId=1222604001;s27.contentId=1005315192;s27.contentType=1;s27.durationInSeconds=null;s27.freePreview=0;s27.gmtCreate=1599924675823;s27.gmtModified=1599924675823;s27.id=1256673010;s27.jsonContent=null;s27.learnCount=null;s27.lessonId=1236923004;s27.live=null;s27.liveInfoDto=null;s27.name="\u8F68\u9053\u7684\u66F2\u7387\u534A\u5F84";s27.position=4;s27.resourceInfo=null;s27.termId=1460672441;s27.unitId=null;s27.viewStatus=0; s28.anchorQuestions=null;s28.attachments=null;s28.chapterId=1222604001;s28.contentId=1004450071;s28.contentType=3;s28.durationInSeconds=null;s28.freePreview=0;s28.gmtCreate=1599924675825;s28.gmtModified=1599924675825;s28.id=1256673011;s28.jsonContent=null;s28.learnCount=null;s28.lessonId=1236923004;s28.live=null;s28.liveInfoDto=null;s28.name="\u7B2C1\u8BB2\u9898\u76EE\u89E3\u7B54";s28.position=5;s28.resourceInfo=null;s28.termId=1460672441;s28.unitId=null;s28.viewStatus=0; s19.chapterId=1222604001;s19.contentId=null;s19.contentType=1;s19.gmtCreate=1599924675826;s19.gmtModified=1599924675826;s19.id=1236923005;s19.isTestChecked=false;s19.name="\u7B2C2\u8BB2 \u8FD0\u52A8\u5B66-2";s19.position=1;s19.releaseTime=1600048800000;s19.termId=1460672441;s19.test=null;s19.testDraftStatus=0;s19.units=s29;s19.viewStatus=0; s29[0]=s30;s29[1]=s31;s29[2]=s32;s29[3]=s33;s29[4]=s34;s29[5]=s35; s30.anchorQuestions=null;s30.attachments=null;s30.chapterId=1222604001;s30.contentId=1004449063;s30.contentType=3;s30.durationInSeconds=null;s30.freePreview=0;s30.gmtCreate=1599924675829;s30.gmtModified=1599924675829;s30.id=1256673012;s30.jsonContent=null;s30.learnCount=null;s30.lessonId=1236923005;s30.live=null;s30.liveInfoDto=null;s30.name="\u7B2C2\u8BB2\u9898\u76EE";s30.position=6;s30.resourceInfo=null;s30.termId=1460672441;s30.unitId=null;s30.viewStatus=0;…… # 
too long; the rest is omitted
s382.allowUpload=null;s382.analyseSetting=2;s382.avgScore=0.0;s382.chapterId=-1;s382.deadline=1610119800000;s382.description=null;s382.draftStatus=0;s382.evaluateEnd=null;s382.evaluateJudgeType=2;s382.evaluateNeedTrain=null;s382.evaluateScoreReleaseTime=1610244000000;s382.evaluateStart=null;s382.examId=1219222001;s382.gmtCreate=1609210143379;s382.gmtModified=1610325870732;s382.id=1236227411;s382.isRandom=null;s382.mutualEvaluated=false;s382.name="\u300A\u5927\u5B66\u7269\u7406\u5178\u578B\u95EE\u9898\u89E3\u6790\u2014\u529B\u5B66\u4E0E\u70ED\u5B66\u300B\u671F\u672B\u8003\u8BD5";s382.objTotalScore=0.0;s382.objectiveQList=s384;s382.objectiveScoreType=2;s382.ojQuestionTrytime=-1;s382.positionInExam=null;s382.randomSetting=null;s382.releaseTime=1606701600000;s382.sbjTotalScore=100.0;s382.scorePubStatus=2;s382.showAnalysis=false;s382.subjectiveQList=null;s382.submitTestCount=1;s382.taskStatus="published";s382.termId=1460672441;s382.testRandomSetting=null;s382.testTime=14400;s382.totalScore=null;s382.trytime=1;s382.type=3;s382.userEffectStatus=null;s382.userScore=null;s382.userSubmitStatus=null;s383.bonusScore=null;s383.deadline=1610119800000;s383.enableEvaluation=false;s383.evaluateEnd=null;s383.evaluateJudgeType=2;s383.evaluateNeedTrain=null;s383.evaluateScoreReleaseTime=1610244000000;s383.evaluateStart=null;s383.examId=1219222001;s383.id=1236227411;s383.name="\u300A\u5927\u5B66\u7269\u7406\u5178\u578B\u95EE\u9898\u89E3\u6790\u2014\u529B\u5B66\u4E0E\u70ED\u5B66\u300B\u671F\u672B\u8003\u8BD5";s383.releaseTime=1606701600000;s383.scorePubStatus=2;s383.testTime=14400;s383.totalScore=100.0;s383.trytime=1;s383.type=3;s383.usedTryCount=0;s383.userScore=null; dwr.engine._remoteHandleCallback('1611107696147','0',{mocTermDto:s0,lastLearnUnitId:1256673007});
From the screenshots in the walkthrough above, .contentId and .id are the values we need. Next, extract these two parameters with the re library, along with the .name field.
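One possible way to pull these fields out is to treat every `sN.field=value;` statement as an assignment and group the statements by variable. This is a sketch under stated assumptions, not the method used in this article: it assumes no value contains a semicolon, and `collect_objects` and `units` are hypothetical helper names. The sample string is the s24 object from the response above, shortened.

```python
import json
import re

# Matches statements such as  s24.contentId=1005926247;
ASSIGNMENT = re.compile(r'\b(s\d+)\.(\w+)=([^;]*);')


def collect_objects(dwr_text):
    """Group 'sN.field=value;' statements into {'sN': {field: raw_value}}."""
    objects = {}
    for var, field, raw in ASSIGNMENT.findall(dwr_text):
        objects.setdefault(var, {})[field] = raw
    return objects


def units(objects):
    """Yield (name, contentId, id, contentType) for objects that carry a lessonId."""
    for obj in objects.values():
        if 'lessonId' in obj and 'contentId' in obj:
            # The name is a quoted string with \uXXXX escapes, which json can decode.
            yield (json.loads(obj['name']), obj['contentId'],
                   obj['id'], obj['contentType'])


sample = ('s24.chapterId=1222604001;s24.contentId=1005926247;s24.contentType=1;'
          's24.id=1256673007;s24.lessonId=1236923004;'
          's24.name="\\u5E8F\\u4E0E\\u4E3B\\u8981\\u77E5\\u8BC6\\u56DE\\u987E";')

rows = list(units(collect_objects(sample)))
print(rows)
```

Grouping by variable keeps each unit's fields together, so the result does not depend on the order in which the server happens to emit the fields.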
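Problem:
Python prints \u7B2C1\u8BB2\u9898\u76EE instead of the Chinese text. See the posts "Converting \u-encoded Python output to Chinese" and "python: how to convert strings that start with '\u' to Chinese".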
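The conversion can be tried on the example string itself. A minimal sketch showing two standard-library options; both assume the input contains only ASCII characters plus `\uXXXX` escapes, as the DWR response does.

```python
import json

# The literal backslash-u text, exactly as it appears in the response.
raw = r'\u7B2C1\u8BB2\u9898\u76EE'

# Option 1: interpret the escapes with the unicode_escape codec.
decoded_codec = raw.encode('ascii').decode('unicode_escape')

# Option 2: treat the text as the body of a JSON string literal.
decoded_json = json.loads('"{}"'.format(raw))

print(decoded_codec, decoded_json)
```

Both variables end up holding the same five-character lesson title.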
Use the parse_ID function to collect the crawled .name, .contentId, and .id values, and take a look at the result:
def parse_ID(html):
    ID_list = []

    # Unit names
    tmp = []
    re_name = r'liveInfoDto=null;s\d*\.name=\".*\"'
    name_list = re.findall(re_name, html)
    for item in name_list:
        item = re.sub(r'liveInfoDto=null;s\d*\.name=\"', '', item)
        item = item.replace('\"', '')
        item = item.encode('utf-8').decode("unicode-escape")  # \uXXXX -> Chinese
        tmp.append(item)
    ID_list.append(tmp)

    # contentId
    tmp = []
    re_contentId = r'attachments=.*;s\d*.chapterId=\d*;s\d*.contentId=\d*'
    contentId = re.findall(re_contentId, html)
    for item in contentId:
        item = re.sub(r'attachments=.*;s\d*.chapterId=\d*;s\d*.contentId=', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # id
    tmp = []
    re_ID = r's\d*\.id=\d*;s\d*.jsonContent=.*;s\d*\.learnCount'
    ID = re.findall(re_ID, html)
    for item in ID:
        item = re.sub(r's\d*\.id=', '', item)
        item = re.sub(r';s\d*.jsonContent=.*;s\d*\.learnCount', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # contentType: tells whether the item to download is a PDF or a video
    tmp = []
    re_type = r's\d*\.contentId=\d*;s\d*\.contentType=\d;s\d*\.durationInSeconds'
    type_list = re.findall(re_type, html)  # renamed from `type` to avoid shadowing the builtin
    for item in type_list:
        item = re.sub(r's\d*\.contentId=\d*;s\d*\.contentType=', '', item)
        item = re.sub(r';s\d*\.durationInSeconds', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # Transpose into rows of [name, contentId, id, contentType]
    transpose_ID_list = []
    if len(ID_list[0]) != len(ID_list[1]) or len(ID_list[0]) != len(ID_list[2]) or len(ID_list[0]) != len(ID_list[3]):
        print('Length Error!')
        return []
    for i in range(len(ID_list[0])):
        transpose_ID_list.append([ID_list[0][i], ID_list[1][i], ID_list[2][i], ID_list[3][i]])
    return transpose_ID_list

Result:
[['第1講題目', '1004450070', '1256673006', '3'], ['序與主要知識回顧', '1005926247', '1256673007', '1'], ['位置矢量與位移', '1005318028', '1256673008', '1'], ['拋體運動的加速度', '1005316207', '1256673009', '1'], ['軌道的曲率半徑', '1005315192', '1256673010', '1'], ['第1講題目解答', '1004450071', '1256673011', '3'], ['第2講題目', '1004449063', '1256673012', '3'], ['最基本的運動學問題', '1005314212', '1256673013', '1'], ['光點在河岸上的移動速率', '1005317230', '1256673014', '1'], ['旋輪線1', '1005317214', '1256673015', '1'], ['旋輪線2', '1007283343', '1256673016', '1'], ['第2講題目解答', '1005688183', '1256673017', '3'], ['第3講題目',
'1004446077', '1256673018', '3'], ['跳水運動員入水后的運動——一類基本的運動學問題', '1005319219', '1256673019', '1'], ['猴子與獵人', '1005314353', '1256673020', '1'], ['第3講題目解答', '1004449064', '1256673021', '3'], ['第4講題目', '1004450072', '1256673022', '3'], ['滑輪邊緣處的速度與加速度', '1005315323', '1256673023', '1'], ['質點的圓周運動', '1005314356', '1256673024', '1'], ['相對運動', '1005391028', '1256673025', '1'], ['第4講題目解答', '1004448080', '1256673026', '3'], ['第5講題目', '1004446082', '1256673027', '3'], ['主要知識點回顧', '1005374032', '1256673028', '1'], ['轉動圓環上的小珠子', '1005373031', '1256673029', '1'], ['靜止在轉臺上的物塊受到的摩擦力', '1005376035', '1256673030', '1'], ['自由下落的單擺', '1005376036', '1256673031', '1'], ['第5講題目解答', '1004446083', '1256673032', '3'], ['第6講題目解答', '1004448083', '1256673033', '3'], ['受打擊后繩中張力', '1005376038', '1256673034', '1'], ['盤繞的繩索', '1005407239', '1256673035', '1'], ['基本的動力學問題', '1005398021', '1256673036', '1'], ['第6講題目解答', '1004449066', '1256673037', '3'], ['第7講題目', '1004448084', '1256673038', '3'], ['星系中的引力', '1005412218', '1256673039', '1'], ['滑輪組', '1005411240', '1256673040', '1'], ['第7講題目解答', '1006737813', '1256673041', '3'], ['第8講題目', '1004448086', '1256673042', '3'], ['當拋體受空氣阻力', '1005410273', '1256673043', '1'], ['拖拉與彈簧相連的物塊', '1005408278', '1256673044', '1'], ['第8講題目解答', '1004451071', '1256673045', '3'], ['第9講題目', '1004456045', '1256673046', '3'], ['主要知識點回顧', '1005408249', '1256673047', '1'], ['一道圖線題', '1005412230', '1256673048', '1'], ['三角形軌道上的質點', '1005319035', '1256673049', '1'], ['系 統 的 質 心', '1005314045', '1256673050', '1'], ['第9講題目解答', '1004650142', '1256673051', '3'], ['第10講題目', '1004457049', '1256673052', '3'], ['質心的計算', '1005412235', '1256673053', '1'], ['圓錐擺擺繩拉力的沖量', '1216163541', '1256673054', '1'], ['臺子的位移', '1005412240', '1256673055', '1'], ['第10講題目解答', '1220859328', '1256673056', '3'], ['第11講題目', '1004453050', '1256673057', '3'], ['提起繩子', '1005410288', '1256673058', '1'], ['變質量問題的一般動力學方程', '1005315040', '1256673059', '1'], ['探索火星', '1007018265', '1256673060', '1'], ['松開的繩子', 
'1005408266', '1256673061', '1'], ['第11講題目解答', '1004453051', '1256673062', '3'], ['題目', '1004453052', '1256673063', '3'], ['角動量部分主要知識點回顧', '1005317049', '1256673064', '1'], ['猴子與獵人(2)——角動量', '1005314053', '1256673065', '1'], ['圓 錐 擺 ——力矩', '1005316052', '1256673066', '1'], ['在豎直圓軌道上的滑動', '1005319041', '1256673067', '1'], ['人造地球衛星的運動——有心力', '1005410307', '1256673068', '1'], ['第12講題目解答', '1004457050', '1256673069', '3'], ['第13講題目', '1004457074', '1256673070', '3'], ['主要知識點回顧', '1005317082', '1256673071', '1'], ['摩擦力的功', '1005318082', '1256673072', '1'], ['保守力的判定', '1005318087', '1256673073', '1'], ['下 落 的 鏈 條', '1005458032', '1256673074', '1'], ['第13講題目解答', '1004455066', '1256673075', '3'], ['第14講題目', '1004454075', '1256673076', '3'], ['系在彈簧上的滑塊', '1005407264', '1256673077', '1'], ['楔塊與滑塊的滑動', '1005363030', '1256673078', '1'], ['第14講題目解答', '1004455067', '1256673079', '3'], ['第15講題目', '1004457075', '1256673080', '3'], ['在豎直彈簧上的碰撞', '1005318096', '1256673081', '1'], ['圓軌道上的碰撞', '1005409179', '1256673082', '1'], ['粒子的質量', '1005314103', '1256673083', '1'], ['第15講解答', '1004456067', '1256673084', '3'], ['第16講題目', '1004455068', '1256673085', '3'], ['衛星到地面的最遠與最近距離', '1005488093', '1256673086', '1'], ['兩個質點', '1005407279', '1256673087', '1'], ['第16講解答', '1004456068', '1256673088', '3'], ['第17講題目', '1004454090', '1256673089', '3'], ['主要知識點回顧', '1005316114', '1256673090', '1'], ['一道基本的運動學題', '1005314109', '1256673091', '1'], ['轉動慣量的計算(1)', '1005318113', '1256673092', '1'], ['轉動慣量的計算(2)', '1005314115', '1256673093', '1'], ['第17講題目解答', '1004453096', '1256673094', '3'], ['第18講題目', '1004458088', '1256673095', '3'], ['一道基本題', '1005319117', '1256673096', '1'], ['滑輪問題', '1005318151', '1256673097', '1'], ['細桿的轉動', '1005317154', '1256673098', '1'], ['第18講題目解答', '1004455087', '1256673099', '3'], ['第19講題目', '1004457087', '1256673100', '3'], ['轉動的圓盤(1)', '1005318163', '1256673101', '1'], ['轉動的圓盤(2)', '1005314160', '1256673102', '1'], ['兩 輪 磨 合', '1005315151', '1256673103', '1'], 
['第19講題目解答', '1004453097', '1256673104', '3'], ['第20講題目', '1004453098', '1256673105', '3'], ['誰更快呢?', '1005317164', '1256673106', '1'], ['子彈打細桿', '1005391034', '1256673107', '1'], ['細桿與小蟲', '1005410309', '1256673108', '1'], ['第20講題目解答', '1004455088', '1256673109', '3'], ['第21講題目', '1004455089', '1256673110', '3'], ['轉動的細管與滑動的小球 (1)', '1005408130', '1256673111', '1'], ['轉動的細管與滑動的小球 (2)', '1005409178', '1256673112', '1'], ['轉軸的作用力(1)', '1005489102', '1256673113', '1'], ['轉軸的作用力(2)', '1005492094', '1256673114', '1'], ['轉軸的作用力(3)', '1005411115', '1256673115', '1'], ['第21講題目解答', '1004455090', '1256673116', '3'], ['第22講題目', '1004455091', '1256673117', '3'], ['細桿的傾倒', '1005407137', '1256673118', '1'], ['小泥團與細桿的碰撞', '1005642170', '1256673119', '1'], ['第22講題目解答', '1004455092', '1256673120', '3'], ['題目', '1004751301', '1256673121', '3'], ['內容復習', '1006072406', '1256673122', '1'], ['Q1. 氧氣瓶中的氧氣', '1006068456', '1256673123', '1'], ['Q2. 水銀氣壓計中混入氣泡', '1006067434', '1256673124', '1'], ['Q3. 容器內壁吸附的氣體分子影響真空度', '1006070416', '1256673125', '1'], ['Q4. 混合氣體的宏觀狀態參量', '1006068457', '1256673126', '1'], ['解答', '1004751302', '1256673127', '3'], ['題目', '1004798039', '1256673128', '3'], ['內容復習', '1006070419', '1256673129', '1'], ['Q1. 氣體壓強的產生', '1006070420', '1256673130', '1'], ['Q2. 理想氣體的宏觀特征', '1006072413', '1256673131', '1'], ['Q3. 氣體宏觀量與微觀量的關聯', '1006070421', '1256673132', '1'], ['Q4. 分子速度分量的平均值', '1006196069', '1256673133', '1'], ['Q5. 特高溫下質子的動能和速率', '1006192084', '1256673134', '1'], ['Q6. 分子方均根速率與氣體宏觀量的關系', '1006194087', '1256673135', '1'], ['Q7. 溫度的微觀意義', '1006196070', '1256673136', '1'], ['解答', '1004851092', '1256673137', '3'], ['題目', '1004801011', '1256673138', '3'], ['內容復習', '1006131017', '1256673139', '1'], ['Q1. 分子的轉動對壓強沒有貢獻', '1006133019', '1256673140', '1'], ['Q2. 分子的平均動能與宏觀量的關系', '1006132019', '1256673141', '1'], ['Q3. 混合氣體的內能和溫度', '1006133016', '1256673142', '1'], ['Q4. “風”與“寒”', '1006134037', '1256673143', '1'], ['Q5. 
高鐵車廂內的分子', '1006134012', '1256673144', '1'], ['解答', '1004801012', '1256673145', '3'], ['題目', '1004802009', '1256673146', '3'], ['內容復習', '1006130014', '1256673147', '1'], ['Q1. 含有速率分布函數的統計表達式', '1006134013', '1256673148', '1'], ['Q2. 方均根速率與平均速率的大小關系', '1006133020', '1256673149', '1'], ['解答', '1004798012', '1256673150', '3'], ['題目', '1004799008', '1256673151', '3'], ['內容復習', '1006131018', '1256673152', '1'], ['Q1. 求一個平均值', '1006132015', '1256673153', '1'], ['Q2. 簡化的麥克斯韋速率分布函數', '1006131019', '1256673154', '1'], ['解答', '1004798013', '1256673155', '3'], ['題目', '1004798014', '1256673156', '3'], ['Q1. 不同氣體分子的特征速率的比較', '1006133022', '1256673157', '1'], ['Q2. 兩個速率區間內兩個溫度下的統計', '1006134014', '1256673158', '1'], ['Q3. 分子按速率大小排序,排列一半時的速率是多少?', '1006192085', '1256673159', '1'], ['Q4. 按平動動能對分子進行統計', '1006129018', '1256673160', '1'], ['解答', '1004852073', '1256673161', '3'], ['題目', '1004803016', '1256673162', '3'], ['內容復習', '1006131020', '1256673163', '1'], ['Q1. 分子直徑及氣體壓強和溫度對平均自由程的影響', '1006132050', '1256673164', '1'], ['Q2. 分子擴散的平均速度', '1006134015', '1256673165', '1'], ['Q3. 氧氣在肺泡壁和毛細血管壁中的擴散', '1006130012', '1256673166', '1'], ['解答', '1004800031', '1256673167', '3'], ['題目', '1004799033', '1256673168', '3'], ['內容復習', '1006129021', '1256673169', '1'], ['Q1. 車胎爆裂的熱力學', '1006133015', '1256673170', '1'], ['Q2. 受熱膨脹的橡皮球', '1006130023', '1256673171', '1'], ['Q3. 用p-V圖中的面積來表示', '1006133017', '1256673172', '1'], ['Q4. p-V圖中負斜率直線過程的性質', '1006134019', '1256673173', '1'], ['解答', '1228999319', '1256673174', '3'], ['題目', '1004803023', '1256673175', '3'], ['內容復習', '1006134020', '1256673176', '1'], ['Q1. 理想氣體摩爾定壓熱容大于摩爾定體熱容的原因', '1006132028', '1256673177', '1'], ['Q2. 比熱容比與等體過程和等壓過程', '1006130011', '1256673178', '1'], ['Q3. 固體的熱容', '1006132030', '1256673179', '1'], ['Q4. p-V圖中負斜率直線過程的熱容', '1006129042', '1256673180', '1'], ['解答', '1229002307', '1256673181', '3'], ['題目', '1004802024', '1256673182', '3'], ['內容復習', '1006131037', '1256673183', '1'], ['Q1. 
等體過程與等壓過程', '1006133018', '1256673184', '1'], ['Q2. 初末態分別相同的兩個過程(一)', '1006130035', '1256673185', '1'], ['Ans. 初末態分別相同的兩個過程(一)思考題解答', '1006133012', '1256673186', '1'], ['Q3. 初末態分別相同的兩個過程(二)', '1006129052', '1256673187', '1'], ['解答', '1004803024', '1256673188', '3'], ['題目', '1004801034', '1256673189', '3'], ['內容復習', '1006129055', '1256673190', '1'], ['Q1.用絕熱過程討論其他準靜態過程', '1006133039', '1256673191', '1'], ['Q2.壓縮氦氣出現過熱現象', '1006132018', '1256673192', '1'], ['Q3.準靜態絕熱過程與絕熱自由膨脹過程', '1006130013', '1256673193', '1'], ['Q4.單原子和雙原子分子氣體的絕熱過程', '1006133040', '1256673194', '1'], ['解答', '1004799029', '1256673195', '3'], ['題目', '1004801039', '1256673196', '3'], ['內容復習', '1006129572', '1256673197', '1'], ['Q1. 判斷一個過程吸熱還是放熱', '1006129571', '1256673198', '1'], ['Q2. 兩個卡諾循環比較', '1006132582', '1256673199', '1'], ['Q3. 兩個可逆熱機比較', '1006134539', '1256673200', '1'], ['解答', '1004801040', '1256673201', '3'], ['題目', '1004799034', '1256673202', '3'], ['內容復習', '1006131549', '1256673203', '1'], ['Q1. 由等體、等壓、絕熱過程構成的循環', '1006131550', '1256673204', '1'], ['Q2. 由等體、等壓、等溫、絕熱過程構成的循環', '1006129574', '1256673205', '1'], ['Q3. 狄塞爾熱機的效率', '1006130543', '1256673206', '1'], ['Q4. 三角形循環的效率', '1006133517', '1256673207', '1'], ['Q5. 用負斜率直線過程構成循環', '1006133516', '1256673208', '1'], ['解答', '1004801041', '1256673209', '3'], ['題目', '1004803355', '1256673210', '3'], ['內容復習', '1006133664', '1256673211', '1'], ['Q1. 熱循環與制冷循環的關系', '1006134546', '1256673212', '1'], ['Q2. 由等體過程和等溫過程構成的循環', '1006131551', '1256673213', '1'], ['Q3. 用空調器波保持室內恒溫', '1006131552', '1256673214', '1'], ['解答', '1004801371', '1256673215', '3'], ['題目', '1004847083', '1256673216', '3'], ['內容復習', '1006195074', '1256673217', '1'], ['Q1. 熱力學第一定律和第二定律', '1006193085', '1256673218', '1'], ['Q2. 熱力學第二定律的內涵', '1006195073', '1256673219', '1'], ['Q3. 熱力學過程方向性的討論', '1006192079', '1256673220', '1'], ['Q4. 
證明電流生熱過程是不可逆過程', '1006192078', '1256673221', '1'], ['解答', '1004851078', '1256673222', '3'], ['題目', '1004847084', '1256673223', '3'], ['內容復習', '1006193089', '1256673224', '1'], ['Q1. 判斷可逆過程', '1006193086', '1256673225', '1'], ['Q2. 微觀機制與玻耳茲曼熵', '1006196066', '1256673226', '1'], ['Q3. 玻耳茲曼熵的一個示例', '1006193088', '1256673227', '1'], ['解答', '1004848071', '1256673228', '3'], ['題目', '1004847085', '1256673229', '3'], ['內容復習', '1006197079', '1256673230', '1'], ['Q1. 熵變的判斷', '1006195075', '1256673231', '1'], ['Q2. 熱機效率與熱力學第二定律', '1006193090', '1256673232', '1'], ['Q3. 溫熵圖', '1006195080', '1256673233', '1'], ['解答', '1004849079', '1256673234', '3'], ['題目', '1004847086', '1256673235', '3'], ['內容復習', '1006192128', '1256673236', '1'], ['Q1. p-V 圖中的可逆過程', '1006192129', '1256673237', '1'], ['Q2. 初態相同,末態相同的三個可逆過程', '1006195125', '1256673238', '1'], ['Q3. 可逆與不可逆過程的熵變', '1006192130', '1256673239', '1'], ['Q4. 物體在恒溫環境中熱傳導的熵變', '1006196105', '1256673240', '1'], ['Q5. 熱機引起的熵變', '1006192131', '1256673241', '1'], ['解答', '1004851079', '1256673242', '3'], ['結語', '1004852151', '1256673243', '3']]之后再Post一下contentID和ID就可以得到response 里的下載地址了
Problem:
//#DWR-REPLY
if (window.dwr) dwr.engine._remoteHandleBatchException({ name:'java.lang.NumberFormatException', message:'For input string: \"0=number\"' });
else if (window.parent.dwr) window.parent.dwr.engine._remoteHandleBatchException({ name:'java.lang.NumberFormatException', message:'For input string: \"0=number\"' });
Analysis: the original request payload in the browser was:
callCount=1
scriptSessionId=${scriptSessionId}190
httpSessionId=12f6d9e2101a4fcf8f9cce372aa5a138
c0-scriptName=CourseBean
c0-methodName=getLessonUnitLearnVo
c0-id=0
c0-param0=number:1004450070
c0-param1=number:3
c0-param2=number:0
c0-param3=number:1256673006
batchId=1611107366738
I had written it as:
'callCount': '1',
'scriptSessionId': '${scriptSessionId}190',
'httpSessionId': 'e41c678f2cb044c0a12c4778412c7344',
'c0-scriptName': 'CourseBean',
'c0-methodName': 'getLessonUnitLearnVo',
'c0-id': '0',
'c0-param0': 'number:1005926247',
'c0-param1': 'number:1', # 1 for video, 3 for PDF
'c0-param2': 'number:0',
'c0-param3': 'number:1256673007',
'batchId': '1611048630815'
and then assigned to it directly like this:
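data_get_VideoUrl['c0-param0=number'] = ID_list[i][1]
data_get_VideoUrl['c0-param3=number'] = ID_list[i][2]
data_get_VideoUrl['c0-param1=number'] = ID_list[i][3]
These assignments create new keys such as 'c0-param0=number' instead of updating 'c0-param0', and the values lose their number: prefix, which is presumably what triggers the NumberFormatException. It should be: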
data_get_VideoUrl['c0-param0'] = 'number:{}'.format(ID_list[i][1])
data_get_VideoUrl['c0-param3'] = 'number:{}'.format(ID_list[i][2])
data_get_VideoUrl['c0-param1'] = 'number:{}'.format(ID_list[i][3])
The reply containing the video URLs looks like this:
//#DWR-INSERT //#DWR-REPLY dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:0,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"http://nos.netease.com/edu-lesson-pdfsrc/A6EE5575B44A392358FE74FF27BDCC2D-1483014849819?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2.pdf&Signature=%2BJkA76Kzjp3Dm8r50kjt1%2FGn49gLy1TTTRabCcrey0M%3D&Expires=1611115768&NOSAccessKeyId=7db2f370ff9a412987155d36d55a6ead",textPageWhRatio:1.41,textPages:4,textUrl:"http://nos.netease.com/edu-lesson-pdfsrc/A6EE5575B44A392358FE74FF27BDCC2D-1483014849819?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2.pdf&Signature=%2BJkA76Kzjp3Dm8r50kjt1%2FGn49gLy1TTTRabCcrey0M%3D&Expires=1611115768&NOSAccessKeyId=7db2f370ff9a412987155d36d55a6ead",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:0,videoSHDUrl:null,videoUrl:null,videoVo:null});//#DWR-INSERT //#DWR-REPLY var 
s0={};s0.clientEncryptKeyVersion=null;s0.duration=435;s0.encrypt=false;s0.flvCaption=null;s0.flvHdUrl="http://v.stu.126.net/mooc-video/nos/flv/2017/03/10/1005926247_fb0d3f4509ec4c1d87a395d2aa5b5fb7_hd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvSdUrl="http://v.stu.126.net/mooc-video/nos/flv/2017/03/10/1005926247_76e732106bed48a688d76ee2665596e8_sd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvShdUrl="http://v.stu.126.net/mooc-video/nos/flv/2017/03/10/1005926247_ded27c2408f641deabf65f849bf7cd97_shd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.isEncrypt=false;s0.key="a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8c
b0cb4b5fcb8e77c";s0.m3u8HdSize=null;s0.m3u8HdUrl=null;s0.m3u8SdSize=null;s0.m3u8SdUrl=null;s0.m3u8ShdSize=null;s0.m3u8ShdUrl=null;s0.mp4Caption=null;s0.mp4HdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_d60d29305cd041d18f4f43afb90f8041_hd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4SdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_561be13927f9499f89818520ed65507b_sd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4ShdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_57a9189bfa2b4151b985836d9a039661_shd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554148dc25eb0604e9c5001f9b114ddf69aeff6a55d15491982836e42383e13363ebf8e6969298709bada9999184ddbb0211e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.needKeyTimeValidate=false;s0.playerCollection=3;s0.signature=null;s0.srtKeys=null;s0.start=9;s0.status=null;s0.videoDecryptData=null;s0.videoId=1005926247;s0.videoImgUrl="http://nos.netease.com/mooc-video/nos/mp4/2017/03/10/1005926247_big.jpg";s0.videoProtectedDataDto=null; 
dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:0,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"",textPageWhRatio:null,textPages:0,textUrl:"",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:9,videoSHDUrl:null,videoUrl:null,videoVo:s0});//#DWR-INSERT //#DWR-REPLY var s0={};s0.clientEncryptKeyVersion=null;s0.duration=355;s0.encrypt=false;s0.flvCaption=null;s0.flvHdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/16/1005318028_bd20f1f5131c4bf3a139599731ea1fb2_hd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvSdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/16/1005318028_cab56664be6f47a7867ec46d9799fdf0_sd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvShdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/16/1005318028_e3fd3e20ce3649d0863207cbb3baae2c_shd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da
1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.isEncrypt=false;s0.key="a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.m3u8HdSize=null;s0.m3u8HdUrl=null;s0.m3u8SdSize=null;s0.m3u8SdUrl=null;s0.m3u8ShdSize=null;s0.m3u8ShdUrl=null;s0.mp4Caption=null;s0.mp4HdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_454463d885e341f9868c7b6098b6661f_hd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4SdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_8de73c893c8646ebb7515071b52477e9_sd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4ShdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_688a167061fd4dfcabb9fec729eaa23d_shd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff5546477e1296b8c0337b8338cff408493b7eff6a55d15491982836e42383e13363e82d4a91158d63a5c646d19f0320e737a1e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44
e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.needKeyTimeValidate=false;s0.playerCollection=3;s0.signature=null;s0.srtKeys=null;s0.start=354;s0.status=null;s0.videoDecryptData=null;s0.videoId=1005318028;s0.videoImgUrl="http://nos.netease.com/mooc-video/nos/mp4/2016/11/16/1005318028_big.jpg";s0.videoProtectedDataDto=null; dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:0,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"",textPageWhRatio:null,textPages:0,textUrl:"",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:354,videoSHDUrl:null,videoUrl:null,videoVo:s0});//#DWR-INSERT //#DWR-REPLY var s0={};s0.clientEncryptKeyVersion=null;s0.duration=631;s0.encrypt=false;s0.flvCaption=null;s0.flvHdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005316207_1d842ca493464ad6a4b03e45ac8b7fdd_hd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvSdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005316207_aa308ae709a8437bb52217607a4a42e5_sd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d
8cb0cb4b5fcb8e77c";s0.flvShdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005316207_91c0ab8d4ad34c64a866a0c5d5f16d6d_shd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.isEncrypt=false;s0.key="a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.m3u8HdSize=null;s0.m3u8HdUrl=null;s0.m3u8SdSize=null;s0.m3u8SdUrl=null;s0.m3u8ShdSize=null;s0.m3u8ShdUrl=null;s0.mp4Caption=null;s0.mp4HdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_61b2b85a784e4e61a5bb5ac969beef37_hd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4SdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_71e392eea48e407785e8235303d96acb_sd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754e
a6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4ShdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_65c062fc1b2c4dc1a5a7cd87981c0c46_shd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff55467be39b7e9697f97dc297a31554ab895eff6a55d15491982836e42383e13363e3e37f4b2cce194b9104b3e081ca457631e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.needKeyTimeValidate=false;s0.playerCollection=3;s0.signature=null;s0.srtKeys=null;s0.start=0;s0.status=null;s0.videoDecryptData=null;s0.videoId=1005316207;s0.videoImgUrl="http://nos.netease.com/mooc-video/nos/mp4/2016/11/20/1005316207_big.jpg";s0.videoProtectedDataDto=null; dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:0,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"",textPageWhRatio:null,textPages:0,textUrl:"",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:0,videoSHDUrl:null,videoUrl:null,videoVo:s0});//#DWR-INSERT //#DWR-REPLY var 
s0={};s0.clientEncryptKeyVersion=null;s0.duration=409;s0.encrypt=false;s0.flvCaption=null;s0.flvHdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005315192_3772bb4206c64137ab6aa7bd45604e81_hd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvSdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005315192_0e6331176fff4a2dbafbbcbd6aff6d13_sd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.flvShdUrl="http://v.stu.126.net/mooc-video/nos/flv/2016/11/20/1005315192_95449440fe9e4cc7af9ba9f624aeb38b_shd.flv?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.isEncrypt=false;s0.key="a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8c
b0cb4b5fcb8e77c";s0.m3u8HdSize=null;s0.m3u8HdUrl=null;s0.m3u8SdSize=null;s0.m3u8SdUrl=null;s0.m3u8ShdSize=null;s0.m3u8ShdUrl=null;s0.mp4Caption=null;s0.mp4HdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_492d3b65668d4ad9b45f89f5c4ac9220_hd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4SdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_24fbaee5af9d401d9c02ffff8f48dc52_sd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.mp4ShdUrl="http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_81dd9c14e18c49e39021ae6de73b0a65_shd.mp4?ak=a6e81b52a268083c03651bd30db8b2d156619810e0e8a92035426c84b70ff554e60bd46899b2c0a1f4d3bafe2a03c9b1eff6a55d15491982836e42383e13363e94c2cd46489f81ab0a92c76cd46f55851e46d140b7b30f910299bee40b26a5c2d9e1e3c44585e5de5b539ccdbe8423a821b91261e44e538d2765af73aa008299a7f5cc498d43fe59a782bc973c30c066b767da1f870bc890754ea6567cb70ca9830b67d08aac63e1ac0c534090a89323f6fd9d4e9030d5d8cb0cb4b5fcb8e77c";s0.needKeyTimeValidate=false;s0.playerCollection=3;s0.signature=null;s0.srtKeys=null;s0.start=0;s0.status=null;s0.videoDecryptData=null;s0.videoId=1005315192;s0.videoImgUrl="http://nos.netease.com/mooc-video/nos/mp4/2016/11/20/1005315192_big.jpg";s0.videoProtectedDataDto=null; 
dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:1,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"",textPageWhRatio:null,textPages:0,textUrl:"",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:0,videoSHDUrl:null,videoUrl:null,videoVo:s0});
//#DWR-INSERT
//#DWR-REPLY
dwr.engine._remoteHandleCallback('1611048630815','0',{contentId:null,contentType:null,duration:null,hdMp4Url:null,htmlContent:null,id:null,learnedPosition:1,origSrtUrl:null,paper:null,parsedSrtUrl:null,post:null,randomKey:null,sdMp4Url:null,shdMp4Url:null,srtKeys:null,textOrigUrl:"http://nos.netease.com/edu-lesson-pdfsrc/428CBC84B647B86597A5B0C8CB68D11B-1483014878796?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2%E9%A2%98%E7%9B%AE%E8%A7%A3%E7%AD%94.pdf&Signature=fZSsBxNZoCACEaXAh%2F5BGvxnDAD0epuuDJOZ7w80AME%3D&Expires=1611115770&NOSAccessKeyId=7db2f370ff9a412987155d36d55a6ead",textPageWhRatio:1.41,textPages:6,textUrl:"http://nos.netease.com/edu-lesson-pdfsrc/428CBC84B647B86597A5B0C8CB68D11B-1483014878796?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2%E9%A2%98%E7%9B%AE%E8%A7%A3%E7%AD%94.pdf&Signature=fZSsBxNZoCACEaXAh%2F5BGvxnDAD0epuuDJOZ7w80AME%3D&Expires=1611115770&NOSAccessKeyId=7db2f370ff9a412987155d36d55a6ead",type:null,unitId:null,videoHDUrl:null,videoId:null,videoImgUrl:null,videoLearnTime:0,videoSHDUrl:null,videoUrl:null,videoVo:null});
# …… rest omitted
The next step is the same as when extracting the IDs: use the re library to pull the key URLs out of the reply.
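To keep the key/value mistake described earlier from creeping back into the request that produces this reply, the per-unit payload can be built by a small helper. This is only a sketch: BASE_PAYLOAD and build_unit_payload are hypothetical names, the session value is a placeholder, and urlencode is used purely to display the resulting key=value pairs, not because the site is known to require that exact encoding.

```python
from urllib.parse import urlencode

# Template based on the payload shown earlier. The session value is a
# placeholder and would have to come from your own browser session.
BASE_PAYLOAD = {
    'callCount': '1',
    'scriptSessionId': '${scriptSessionId}190',
    'httpSessionId': 'YOUR_HTTP_SESSION_ID',
    'c0-scriptName': 'CourseBean',
    'c0-methodName': 'getLessonUnitLearnVo',
    'c0-id': '0',
    'c0-param2': 'number:0',
    'batchId': '1611048630815',
}


def build_unit_payload(content_id, unit_id, content_type):
    """Return a fresh payload dict for one lesson unit (hypothetical helper)."""
    payload = dict(BASE_PAYLOAD)  # copy, so the template is never mutated
    # The key stays 'c0-paramN'; the 'number:' prefix belongs to the value.
    payload['c0-param0'] = 'number:{}'.format(content_id)
    payload['c0-param1'] = 'number:{}'.format(content_type)
    payload['c0-param3'] = 'number:{}'.format(unit_id)
    return payload


payload = build_unit_payload('1004450070', '1256673006', '3')
print(urlencode(payload))
```

Building a new dict per unit also avoids carrying values over from the previous loop iteration, which the shared global dict in the original code would allow.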
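First, compare the available quality levels; press Ctrl+F and search for mp4.
mp4HdUrl video:
mp4SdUrl video:
mp4ShdUrl video:
By default we take the URL of the highest-definition MP4, i.e. mp4ShdUrl, but this raises another problem.
Some mp4ShdUrl links would not open, yet after I came back from a meal they opened again... So instead I collect the URLs for all three quality levels.
The code is as follows: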
def get_Video_func(url, ID_list):Video_list = []kv = {'cookie': 'EDUWEBDEVICE=0ab7115b3d854dae8ce1b0d71a1c966a; WM_TID=g5T0XKWOfThBQUFFAQI%2BPb3bZ726b4qo; __yadk_uid=MbnLeOb6ON3kezNoejvPZ46bbsVQNVq7; bpmns=1; MOOC_PRIVACY_INFO_APPROVED=true; hasVolume=true; videoVolume=0.8; WM_NI=4nhovjBd8NMvGeMBnAwhablVmi43V9kTLY8augfa4lpHGMj21Vz3g2oz3nACXnlCJIfo0ptu6Taonh3idP9qGnB%2FbqbpowUvtYBUz6dHRcdX%2BGfV9JAxBoSHLEMy4tt%2BbW4%3D; WM_NIKE=9ca17ae2e6ffcda170e2e6eed0f63489b5a1adee6288928fb6c45f939f8fbbf56ebcba969bef52a3898a85d02af0fea7c3b92af8b2a890f56282bcf8afd07e85bb83a3ed4bb5b8af83ef5eafb7fcd4d533a98c84babb6bb7b9aa82b768b49889aee93af38ab6d4e63aada6fd8fc4499cb2a4a4c943a6b10091f77ff68caf90ed64f1b7acb1bc7b85b28ed0e17ba2b0a198d87eb8878eb6f93ea8989e84e754e999aab7bc4283a88e90d244909b839ab780b8f5aeb8b737e2a3; hb_MA-A976-948FFA05E931_source=cn.bing.com; NTESSTUDYSI=12f6d9e2101a4fcf8f9cce372aa5a138; utm="eyJjIjoiIiwiY3QiOiIiLCJpIjoiIiwibSI6IiIsInMiOiIiLCJ0IjoiIn0=|aHR0cHM6Ly9jbi5iaW5nLmNvbS8="; STUDY_INFO="1967325755@qq.com|11|1030121983|1611107359528"; STUDY_SESS="n1oSkC6ko6uwOFVCPjH6m8uTeduOsKbkwyDflnyQJXOl3vVC3smdubFKwXv7eRBshzRpplicxt/Lt+S95t5iVP2HLxIDVPCEj+mrsdXzBxFBRrud3Ty9KuRSRuYgfdtF8eBNDDHujPECpAnL16GoLfPvtLQ9YjxiCYO+hzQ+WHknppr6KrivyjY6FmKs/Qou"; STUDY_PERSIST="aB3nOkxcPNgwAe5iJFKKSMbImXt1xp+axKXoEPw5DZ3lUpL+NmaTG1meuRlRql2LBBR5z8N4snwDLC1Hk/JCU4tzj384k/jS5ImlAFuGLeFPfk4QSkfUmixD8VSvXvR1MOptyNyayqNARnneZ4gm5kgJbHWNPJAhxlLhIBcVZRL8/IlsIGclqeQxlFgxlY5u5P1u5JBpOK2FN6WFsEHtKMGIeqgGaqrbui2Xkv/Gn+9OSdFA6J5jrZLCRv8JU8qN8WQLi3xTJ45sq/acjsEWiA=="; NETEASE_WDA_UID=1030121983#|#1505730973277; Hm_lvt_77dc9a9d49448cf5e629e5bebaa5500b=1611060970,1611066948,1611068902,1611107360; Hm_lpvt_77dc9a9d49448cf5e629e5bebaa5500b=1611107366','user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'}for i in range(len(ID_list)):data_get_VideoUrl['c0-param0'] = 'number:{}'.format(ID_list[i][1])data_get_VideoUrl['c0-param3'] = 
'number:{}'.format(ID_list[i][2])data_get_VideoUrl['c0-param1'] = 'number:{}'.format(ID_list[i][3])try:HTML = postHTMLText(url, headers=kv, data=data_get_VideoUrl)Video_list = parse_Video(HTML, ID_list[i][0])print(Video_list)except:print('Get Video Error!')def parse_Video(html, name):video_url_list =[]video_url_list.append(name)re_video = r's\d*\.mp4[S|H]?h?dUrl=\".*?\.mp4' # 最小匹配video_url = re.findall(re_video, html)for item in video_url:item = re.sub(r's0.mp4[S|H]?h?dUrl=', '', item)video_url_list.append(item)return video_url_list結果:
['第1講題目']
['序與主要知識回顧', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_d60d29305cd041d18f4f43afb90f8041_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_561be13927f9499f89818520ed65507b_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_57a9189bfa2b4151b985836d9a039661_shd.mp4']
['位置矢量與位移', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_454463d885e341f9868c7b6098b6661f_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_8de73c893c8646ebb7515071b52477e9_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_688a167061fd4dfcabb9fec729eaa23d_shd.mp4']
['拋體運動的加速度', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_61b2b85a784e4e61a5bb5ac969beef37_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_71e392eea48e407785e8235303d96acb_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005316207_65c062fc1b2c4dc1a5a7cd87981c0c46_shd.mp4']
['軌道的曲率半徑', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_492d3b65668d4ad9b45f89f5c4ac9220_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_24fbaee5af9d401d9c02ffff8f48dc52_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/20/1005315192_81dd9c14e18c49e39021ae6de73b0a65_shd.mp4']
['第1講題目解答']
['第2講題目']
…… # too many to list, rest omitted
The entries with nothing after the name are PDF files. Taking '第1講題目' as an example, F12 shows that the PDF's download URL is: https://nos.netease.com/edu-lesson-pdfsrc/A6EE5575B44A392358FE74FF27BDCC2D-1483014849819?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2.pdf&Signature=ddBAI%2F%2B88KDZKfDGNVV6ZwhZK9K0hBJKUbHkJpT9Ezc%3D&Expires=1611115325&NOSAccessKeyId=7db2f370ff9a412987155d36d55a6ead
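Problem:
Clicking the link above is enough to download the file, but opened as a page the site looks like this:
So how do we download the PDF file from code?
First, change the original video matching rule so that it also handles PDFs: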
import re

def parse_Video(html, name):
    video_url_list = []
    video_url_list.append(name)
    re_video = r's\d*\.mp4[S|H]?h?dUrl=\".*?\.mp4' # non-greedy match, stops at the first ".mp4"
    video_url = re.findall(re_video, html)
    if video_url == []: # no video URL found, so this unit is a PDF
        re_pdf = r':".*?\.pdf'
        pdf_url = re.search(re_pdf, html)
        pdf_url = pdf_url.group(0).replace(':"','')
        video_url_list.append(pdf_url)
    else:
        for item in video_url:
            # strips the 's0.mp4XxUrl=' prefix; the opening double quote is left in place
            item = re.sub(r's0.mp4[S|H]?h?dUrl=', '', item)
            video_url_list.append(item)
    return video_url_list
Result:
['第1講題目', 'http://nos.netease.com/edu-lesson-pdfsrc/A6EE5575B44A392358FE74FF27BDCC2D-1483014849819?download=%E5%8A%9B%E5%AD%A6%E7%AC%AC1%E8%AE%B2.pdf']
['序與主要知識回顧', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_d60d29305cd041d18f4f43afb90f8041_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_561be13927f9499f89818520ed65507b_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2017/03/10/1005926247_57a9189bfa2b4151b985836d9a039661_shd.mp4']
['位置矢量與位移', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_454463d885e341f9868c7b6098b6661f_hd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_8de73c893c8646ebb7515071b52477e9_sd.mp4', '"http://v.stu.126.net/mooc-video/nos/mp4/2016/11/16/1005318028_688a167061fd4dfcabb9fec729eaa23d_shd.mp4']
…… # too many to list, rest omitted
Problem:
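The video cannot be fetched with a GET on this URL. My guess is that the URL cannot simply stop at .mp4: the long query string after it has to be captured as well.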
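One way to act on that guess is to match up to the closing double quote instead of stopping at .mp4, which also drops the stray leading quote. The sketch below is an assumption-laden illustration, not the code used in this post: MP4_URL_RE, extract_mp4_urls and pick_first_available are hypothetical names, the sample reply is made up in the same shape as the DWR reply shown earlier, and the Shd, Hd, Sd preference order is only a guess at "best first".

```python
import re

# Matches e.g.  s0.mp4HdUrl="http://...mp4?ak=..."  and captures the quality
# tag (Hd / Sd / Shd) plus everything up to the closing double quote.
MP4_URL_RE = re.compile(r's\d+\.mp4(Hd|Sd|Shd)Url="([^"]+)"')


def extract_mp4_urls(reply):
    """Map quality tag -> full URL, query string included."""
    return {quality: url for quality, url in MP4_URL_RE.findall(reply)}


def pick_first_available(urls, order=('Shd', 'Hd', 'Sd')):
    """Return the first URL present in `order`, or None if there is none."""
    for quality in order:
        if quality in urls:
            return urls[quality]
    return None


# Shortened, made-up reply; null entries are skipped because they lack quotes.
sample = ('s0.mp4Caption=null;'
          's0.mp4HdUrl="http://example.invalid/a_hd.mp4?ak=abc";'
          's0.mp4SdUrl="http://example.invalid/a_sd.mp4?ak=abc";'
          's0.mp4ShdUrl=null;')

urls = extract_mp4_urls(sample)
print(urls)
print(pick_first_available(urls))
```

Keeping all three qualities in a dict makes it easy to fall back to another one when a link refuses to open, as happened with some mp4ShdUrl links above.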
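Download: see the article "Python網絡爬蟲——爬取視頻網站源視頻".
Problem:
Improvements:
1. Show download progress
For example, the video above took about a minute to download, and during that time the program just sat there, which makes it look as if it were stuck in an infinite loop. It was not: I was tethered to a phone hotspot and could see the data usage climbing quickly, otherwise I would have killed the program long before.
Method I (based on the article "python3 爬蟲利用Requests 實現下載進度條"):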
elif len(list)==5: # video download
    try:
        url = list[4]
        path = root + list[0].replace('序號:','') + list[1] + '.mp4'
        if not os.path.exists(path):
            r = getHTMLText(url, headers=kv_video, stream=True) # stream=True requests a chunked download
            if 'Content-Length' not in r.headers:
                raise requests.HTTPError('No Content Length')
            content_size = int(r.headers['Content-Length']) # total size in bytes
            data_count = 0
            with open(path, 'wb') as f:
                for data in r.iter_content(chunk_size=1024):
                    f.write(data)
                    data_count += len(data)
                    download_progress = (data_count / content_size) * 100
                    # prints "video download progress"; str.format leaves '%%' as two literal percent signs
                    print('視頻下載進度:{}%%'.format(download_progress))
            print('文件保存成功') # "file saved"
        else:
            print('文件已存在') # "file already exists"
    except:
        print('Download Video Error!')
Partial output:
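視頻下載進度:69.17667138720871%%
視頻下載進度:69.17746162551836%%
視頻下載進度:69.178251863828%%
視頻下載進度:69.17904210213763%%
視頻下載進度:69.17983234044726%%
The progress lines flash by too quickly to be of any use, and most of the time seems to be spent receiving the video from the URL rather than in the chunked read/write loop. One likely cause: the getHTMLText helper in the full code passes stream=False to requests.get no matter what the caller asks for, so the whole file is fetched before iter_content yields its first chunk.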
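Printing once per 1 KB chunk is what floods the console. A small generator can report progress only when another step of the total has been crossed. This is a sketch under assumptions: progress_updates is a hypothetical helper, and the list of chunk sizes stands in for the lengths of the chunks that r.iter_content(chunk_size=1024) would yield in the loop above.

```python
def progress_updates(chunk_sizes, total_size, step=10):
    """Yield a percentage each time the download crosses another `step` %."""
    done = 0
    next_mark = step
    for size in chunk_sizes:
        done += size
        percent = done * 100 // total_size
        if percent >= next_mark:
            yield percent
            # Move the threshold to the next multiple of `step` above percent.
            next_mark = (percent // step + 1) * step


# 100 chunks of 1024 bytes stand in for the sizes of the streamed chunks.
for p in progress_updates([1024] * 100, 1024 * 100, step=25):
    print('download progress: {}%'.format(p))
```

In the real loop the same threshold check would sit next to f.write(data), so that only a handful of lines are printed per file.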
Method II:
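try:
    url = list[4]
    path = root + list[0].replace('序號:', '') + list[1] + '.mp4'
    if not os.path.exists(path):
        urllib.urlretrieve(url, root, progress_callfunc)
        print('文件保存成功') # "file saved"
    else:
        print('文件已存在') # "file already exists"
except:
    print('Download Video Error!')
I could not get this to work and did not manage to resolve the problem, so I moved on to another approach.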
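For reference, two things in the snippet above would stop it from working on Python 3: urlretrieve lives in urllib.request rather than directly on urllib, and its second argument is the target file name, whereas root is a directory. The report hook receives the block count, the block size and the total size. The sketch below shows a hook of that shape; percent_done is a hypothetical helper, and the actual call is left as a comment because it needs a real URL. Note also that urlretrieve takes no headers argument, so the cookie headers used elsewhere in this post would need separate handling.

```python
import urllib.request


def percent_done(block_num, block_size, total_size):
    """Percentage downloaded so far, or None when the size is unknown."""
    if total_size <= 0:  # urlretrieve may pass -1 when no size is reported
        return None
    return min(100.0, block_num * block_size * 100.0 / total_size)


def progress_callfunc(block_num, block_size, total_size):
    """Report hook matching the (block count, block size, total size) call."""
    percent = percent_done(block_num, block_size, total_size)
    if percent is not None:
        print('download progress: {:.1f}%'.format(percent))


# Usage (not executed here; url and path are whatever the caller has built):
# urllib.request.urlretrieve(url, path, progress_callfunc)
```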
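Method III:
This refines Method I, following the article "python 爬蟲下載視頻加進度條". In practice only one line of code needs to change:
Comment out the original requests.get call and replace it with the line above.
2. Let the user choose between downloading every video, a single video, or a single PDF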
Sample run: [screenshot]
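The selection feature in item 2 could be sketched roughly as follows. This is an illustration only, not the logic of the full code: select_items is a hypothetical helper, the catalogue entries are made up, and the choice format ('all' or a 1-based index) is an assumption.

```python
def select_items(items, choice):
    """Return the sub-list of `items` named by `choice`.

    choice: 'all', or a 1-based index given as a string such as '3'.
    """
    choice = choice.strip().lower()
    if choice == 'all':
        return list(items)
    if choice.isdigit() and 1 <= int(choice) <= len(items):
        return [items[int(choice) - 1]]
    raise ValueError('unknown selection: {!r}'.format(choice))


# Made-up entries in the [name, url, ...] shape that parse_Video returns.
catalogue = [['unit A', 'http://example.invalid/a.pdf'],
             ['unit B', 'http://example.invalid/b_hd.mp4']]
print(select_items(catalogue, '2'))
```

Returning a list in every case lets the download loop stay the same whether one item or the whole course was chosen.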
Full code:
import requests
import re
import time
import os
from contextlib import closing

count = 0

# Headers for the video download. The cookie values belong to one login session,
# so copy your own from the browser (F12, Network, request headers).
kv_video = {
    'cookie': 'STUDY_PERSIST="YOUR_STUDY_PERSIST"; STUDY_SESS="YOUR_STUDY_SESS"; STUDY_INFO="YOUR_STUDY_INFO"',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    'content-type': 'application/x-www-form-urlencoded',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9',
    'origin': 'https://www.icourse163.org',
    'referer': 'https://www.icourse163.org/',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'cross-site'
}
kv_pdf = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'
}

# Request payload for
# https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLastLearnedMocTermDto.dwr
data_get_ID = {
    'callCount': '1',
    'scriptSessionId': '${scriptSessionId}190',
    'httpSessionId': 'e41c678f2cb044c0a12c4778412c7344',
    'c0-scriptName': 'CourseBean',
    'c0-methodName': 'getLastLearnedMocTermDto',
    # 'c0-methodName':'getMocTermDto',
    # The commented-out method is the call used before MOOC was updated. It still works,
    # but the pre-update POST payload is unknown, so it is of little reference value.
    'c0-id': '0',
    'c0-param0': 'number:1460672441',
    # 'c0-param1': 'number:1',
    # 'c0-param2': 'boolean:true',
    'batchId': '1611048630796'
}

# Request payload for
# https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLessonUnitLearnVo.dwr
data_get_VideoUrl = {
    'callCount': '1',
    'scriptSessionId': '${scriptSessionId}190',
    'httpSessionId': 'e41c678f2cb044c0a12c4778412c7344',
    'c0-scriptName': 'CourseBean',
    'c0-methodName': 'getLessonUnitLearnVo',
    'c0-id': '0',
    'c0-param0': 'number:1005926247',
    'c0-param1': 'number:1',  # content type (original note: 1 for pdf or 3 or video); overwritten per item
    'c0-param2': 'number:0',
    'c0-param3': 'number:1256673007',
    'batchId': '1611048630815'
}


def batch_id():
    # Current time in milliseconds, used as the 'batchId' value
    return round(time.time() * 1000)


def postHTMLText(url, headers=None, data=None):
    try:
        r = requests.post(url, headers=headers, data=data)  # note: this is a POST
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        print('Get HTML Error!')
        return ''


def get_ID_func(url, tid):
    data_get_ID['c0-param0'] = 'number:{}'.format(tid)
    # The reference implementation generates the batch time as the 'batchId' parameter;
    # personally I do not think it is necessary.
    data_get_ID['batchId'] = batch_id()
    kv = {
        # Full cookie string copied from your own logged-in browser session
        'cookie': 'PASTE_YOUR_COOKIE_STRING_HERE',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'
    }
    try:
        HTML = postHTMLText(url, headers=kv, data=data_get_ID)
        return HTML
    except requests.HTTPError:
        print('Get ID Error!')
        raise


def get_Video_func(url, ID_list):
    kv = {
        # Full cookie string copied from your own logged-in browser session
        'cookie': 'PASTE_YOUR_COOKIE_STRING_HERE',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'
    }
    for i in range(len(ID_list)):
        # Each entry of ID_list is [name, contentId, id, contentType]
        data_get_VideoUrl['c0-param0'] = 'number:{}'.format(ID_list[i][1])
        data_get_VideoUrl['c0-param3'] = 'number:{}'.format(ID_list[i][2])
        data_get_VideoUrl['c0-param1'] = 'number:{}'.format(ID_list[i][3])
        try:
            HTML = postHTMLText(url, headers=kv, data=data_get_VideoUrl)
            Video_list = parse_Video(HTML, ID_list[i][0], i)
            # print(Video_list)
            download_Video(Video_list)
        except Exception:
            print('Get Video Error!')


def getHTMLText(url, headers=None, stream=False):
    try:
        r = requests.get(url, headers=headers, stream=stream)  # pass the caller's stream flag through
        # r.raise_for_status()  # raise HTTPError if the status is not 200
        # r.encoding = r.apparent_encoding  # questionable here
        return r
    except requests.RequestException:
        print('Request failed')


def download_Video(info):
    # info is [index, name, pdf_url] for a PDF or [index, name, url, url, url] for a video
    root = 'E://My_Downloads/MOOC/'
    os.makedirs(root, exist_ok=True)  # also creates missing parent folders
    if len(info) == 3:  # PDF download
        try:
            url = info[2]
            # print(url)
            path = root + info[0].replace('Index:', '') + info[1] + '.pdf'
            if not os.path.exists(path):
                r = getHTMLText(url, headers=kv_pdf, stream=True)
                with open(path, 'wb') as f:
                    f.write(r.content)
                print('File saved')
            else:
                print('File already exists')
        except Exception:
            print('Download PDF Error!')
    elif len(info) == 5:  # video download
        try:
            url = info[4]
            path = root + info[0].replace('Index:', '') + info[1] + '.mp4'
            if not os.path.exists(path):
                # r = getHTMLText(url, headers=kv_video, stream=True)
                # stream=True makes requests fetch the body in chunks
                with closing(requests.get(url, headers=kv_video, stream=True)) as r:
                    content_size = int(r.headers['content-length'])
                    chunk_size = 1024 * 128
                    print('Download started')
                    with open(path, 'wb') as f:
                        count = 1
                        for chunk in r.iter_content(chunk_size=chunk_size):
                            # count * chunk_size overestimates on the final, shorter chunk,
                            # so the printed value can end slightly above 100%
                            download_progress = count * chunk_size * 100.0 / content_size
                            f.write(chunk)
                            print('\rVideo download progress: {}%'.format(download_progress), end='')
                            count += 1
                print('File saved')
            else:
                print('File already exists')
        except Exception:
            print('Download Video Error!')
    return ''


def progress_callfunc(blocknum, blocksize, totalsize):
    '''Progress callback (not called by this script).
    @blocknum : number of blocks downloaded so far
    @blocksize : size of one block
    @totalsize: size of the remote file
    '''
    percent = 100.0 * blocknum * blocksize / totalsize
    if percent > 100:
        percent = 100
    print('Progress %.2f%%' % percent, end='\r')


def parse_Video(html, name, count):
    video_url_list = []
    video_url_list.append('Index:' + str(count))
    video_url_list.append(name)
    re_video = r's\d*\.mp4[SH]?h?dUrl=\".*?\"'  # non-greedy match
    video_url = re.findall(re_video, html)
    if not video_url:  # no video URL found, so this item is a PDF
        re_pdf = r'textOrigUrl:".*?"'
        pdf_url = re.search(re_pdf, html)
        pdf_url = pdf_url.group(0).replace('textOrigUrl:"', '')
        pdf_url = pdf_url.replace('\"', '')
        video_url_list.append(pdf_url)
    else:
        for item in video_url:
            # Strip the leading variable name and key, whatever its number is
            item = re.sub(r's\d*\.mp4[SH]?h?dUrl="', '', item)
            item = item.replace('\"', '')
            video_url_list.append(item)
    return video_url_list


def parse_ID(html):
    ID_list = []

    # Unit names (returned as \uXXXX escapes)
    tmp = []
    re_name = r'liveInfoDto=null;s\d*\.name=\".*\"'
    name_list = re.findall(re_name, html)
    for item in name_list:
        item = re.sub(r'liveInfoDto=null;s\d*\.name=\"', '', item)
        item = item.replace('\"', '')
        item = item.encode('utf-8').decode("unicode-escape")
        tmp.append(item)
    ID_list.append(tmp)

    # contentId of each unit
    tmp = []
    re_contentId = r'attachments=.*;s\d*.chapterId=\d*;s\d*.contentId=\d*'
    contentId = re.findall(re_contentId, html)
    for item in contentId:
        item = re.sub(r'attachments=.*;s\d*.chapterId=\d*;s\d*.contentId=', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # id of each unit
    tmp = []
    re_ID = r's\d*\.id=\d*;s\d*.jsonContent=.*;s\d*\.learnCount'
    ID = re.findall(re_ID, html)
    for item in ID:
        item = re.sub(r's\d*\.id=', '', item)
        item = re.sub(r';s\d*.jsonContent=.*;s\d*\.learnCount', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # contentType of each unit: decides whether the download is a PDF or a video
    tmp = []
    re_type = r's\d*\.contentId=\d*;s\d*\.contentType=\d;s\d*\.durationInSeconds'
    type_list = re.findall(re_type, html)
    for item in type_list:
        item = re.sub(r's\d*\.contentId=\d*;s\d*\.contentType=', '', item)
        item = re.sub(r';s\d*\.durationInSeconds', '', item)
        tmp.append(item)
    ID_list.append(tmp)

    # Turn the four parallel lists into one [name, contentId, id, contentType] row per unit
    transpose_ID_list = []
    if len(ID_list[0]) != len(ID_list[1]) or len(ID_list[0]) != len(ID_list[2]) or len(ID_list[0]) != len(ID_list[3]):
        print('Length Error!')
        return ''
    for i in range(len(ID_list[0])):
        transpose_ID_list.append([ID_list[0][i], ID_list[1][i], ID_list[2][i], ID_list[3][i]])
    return transpose_ID_list


def show_list(ID_list):
    tmpl = '{:6}\t{:6}'
    for i in range(len(ID_list)):
        print(tmpl.format(i, ID_list[i][0]))


def main():
    tid = input('Enter the tid of the course to download (e.g. 1460672441): ')
    method = input('Download everything? Enter yes/no\n')
    get_ID_url = 'https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLastLearnedMocTermDto.dwr'
    ID_HTML = get_ID_func(get_ID_url, int(tid))
    ID_list = parse_ID(ID_HTML)
    get_VideoUrl_url = 'https://www.icourse163.org/dwr/call/plaincall/CourseBean.getLessonUnitLearnVo.dwr'
    if method == 'yes':
        get_Video_func(get_VideoUrl_url, ID_list)
    elif method == 'no':
        show_list(ID_list)
        key = input("Enter the index to download (0-{}): ".format(len(ID_list) - 1))
        get_Video_func(get_VideoUrl_url, [ID_list[int(key)]])


if __name__ == '__main__':
    main()

Run result:
Enter the tid of the course to download (e.g. 1460672441): 1460672441
Download everything? Enter yes/no
yes
File already exists
File already exists
File already exists
File already exists
File already exists
File already exists
File already exists
File already exists
File already exists
Download started
Video download progress: 5.485017069369092%

Enter the tid of the course to download (e.g. 1460672441): 1460672441
Download everything? Enter yes/no
no
0 第1講題目
1 序與主要知識回顧
2 位置矢量與位移
3 拋體運動的加速度
4 軌道的曲率半徑
5 第1講題目解答
6 第2講題目
7 最基本的運動學問題
8 光點在河岸上的移動速率
9 旋輪線1
10 旋輪線2
11 第2講題目解答
12 第3講題目
13 跳水運動員入水后的運動——一類基本的運動學問題
14 猴子與獵人
15 第3講題目解答
16 第4講題目
17 滑輪邊緣處的速度與加速度
18 質點的圓周運動
19 相對運動
20 第4講題目解答
21 第5講題目
22 主要知識點回顧
23 轉動圓環上的小珠子
24 靜止在轉臺上的物塊受到的摩擦力
25 自由下落的單擺
26 第5講題目解答
27 第6講題目解答
28 受打擊后繩中張力
29 盤繞的繩索
30 基本的動力學問題
31 第6講題目解答
32 第7講題目
33 星系中的引力
34 滑輪組
35 第7講題目解答
36 第8講題目
37 當拋體受空氣阻力
38 拖拉與彈簧相連的物塊
39 第8講題目解答
40 第9講題目
41 主要知識點回顧
42 一道圖線題
43 三角形軌道上的質點
44 系 統 的 質 心
45 第9講題目解答
46 第10講題目
47 質心的計算
48 圓錐擺擺繩拉力的沖量
49 臺子的位移
50 第10講題目解答
51 第11講題目
52 提起繩子
53 變質量問題的一般動力學方程
54 探索火星
55 松開的繩子
56 第11講題目解答
57 題目
58 角動量部分主要知識點回顧
59 猴子與獵人(2)——角動量
60 圓 錐 擺 ——力矩
61 在豎直圓軌道上的滑動
62 人造地球衛星的運動——有心力
63 第12講題目解答
64 第13講題目
65 主要知識點回顧
66 摩擦力的功
67 保守力的判定
68 下 落 的 鏈 條
69 第13講題目解答
70 第14講題目
71 系在彈簧上的滑塊
72 楔塊與滑塊的滑動
73 第14講題目解答
74 第15講題目
75 在豎直彈簧上的碰撞
76 圓軌道上的碰撞
77 粒子的質量
78 第15講解答
79 第16講題目
80 衛星到地面的最遠與最近距離
81 兩個質點
82 第16講解答
83 第17講題目
84 主要知識點回顧
85 一道基本的運動學題
86 轉動慣量的計算(1)
87 轉動慣量的計算(2)
88 第17講題目解答
89 第18講題目
90 一道基本題
91 滑輪問題
92 細桿的轉動
93 第18講題目解答
94 第19講題目
95 轉動的圓盤(1)
96 轉動的圓盤(2)
97 兩 輪 磨 合
98 第19講題目解答
99 第20講題目
100 誰更快呢?
101 子彈打細桿
102 細桿與小蟲
103 第20講題目解答
104 第21講題目
105 轉動的細管與滑動的小球 (1)
106 轉動的細管與滑動的小球 (2)
107 轉軸的作用力(1)
108 轉軸的作用力(2)
109 轉軸的作用力(3)
110 第21講題目解答
111 第22講題目
112 細桿的傾倒
113 小泥團與細桿的碰撞
114 第22講題目解答
115 題目
116 內容復習
117 Q1. 氧氣瓶中的氧氣
118 Q2. 水銀氣壓計中混入氣泡
119 Q3. 容器內壁吸附的氣體分子影響真空度
120 Q4. 混合氣體的宏觀狀態參量
121 解答
122 題目
123 內容復習
124 Q1. 氣體壓強的產生
125 Q2. 理想氣體的宏觀特征
126 Q3. 氣體宏觀量與微觀量的關聯
127 Q4. 分子速度分量的平均值
128 Q5. 特高溫下質子的動能和速率
129 Q6. 分子方均根速率與氣體宏觀量的關系
130 Q7. 溫度的微觀意義
131 解答
132 題目
133 內容復習
134 Q1. 分子的轉動對壓強沒有貢獻
135 Q2. 分子的平均動能與宏觀量的關系
136 Q3. 混合氣體的內能和溫度
137 Q4. “風”與“寒”
138 Q5. 高鐵車廂內的分子
139 解答
140 題目
141 內容復習
142 Q1. 含有速率分布函數的統計表達式
143 Q2. 方均根速率與平均速率的大小關系
144 解答
145 題目
146 內容復習
147 Q1. 求一個平均值
148 Q2. 簡化的麥克斯韋速率分布函數
149 解答
150 題目
151 Q1. 不同氣體分子的特征速率的比較
152 Q2. 兩個速率區間內兩個溫度下的統計
153 Q3. 分子按速率大小排序,排列一半時的速率是多少?
154 Q4. 按平動動能對分子進行統計
155 解答
156 題目
157 內容復習
158 Q1. 分子直徑及氣體壓強和溫度對平均自由程的影響
159 Q2. 分子擴散的平均速度
160 Q3. 氧氣在肺泡壁和毛細血管壁中的擴散
161 解答
162 題目
163 內容復習
164 Q1. 車胎爆裂的熱力學
165 Q2. 受熱膨脹的橡皮球
166 Q3. 用p-V圖中的面積來表示
167 Q4. p-V圖中負斜率直線過程的性質
168 解答
169 題目
170 內容復習
171 Q1. 理想氣體摩爾定壓熱容大于摩爾定體熱容的原因
172 Q2. 比熱容比與等體過程和等壓過程
173 Q3. 固體的熱容
174 Q4. p-V圖中負斜率直線過程的熱容
175 解答
176 題目
177 內容復習
178 Q1. 等體過程與等壓過程
179 Q2. 初末態分別相同的兩個過程(一)
180 Ans. 初末態分別相同的兩個過程(一)思考題解答
181 Q3. 初末態分別相同的兩個過程(二)
182 解答
183 題目
184 內容復習
185 Q1.用絕熱過程討論其他準靜態過程
186 Q2.壓縮氦氣出現過熱現象
187 Q3.準靜態絕熱過程與絕熱自由膨脹過程
188 Q4.單原子和雙原子分子氣體的絕熱過程
189 解答
190 題目
191 內容復習
192 Q1. 判斷一個過程吸熱還是放熱
193 Q2. 兩個卡諾循環比較
194 Q3. 兩個可逆熱機比較
195 解答
196 題目
197 內容復習
198 Q1. 由等體、等壓、絕熱過程構成的循環
199 Q2. 由等體、等壓、等溫、絕熱過程構成的循環
200 Q3. 狄塞爾熱機的效率
201 Q4. 三角形循環的效率
202 Q5. 用負斜率直線過程構成循環
203 解答
204 題目
205 內容復習
206 Q1. 熱循環與制冷循環的關系
207 Q2. 由等體過程和等溫過程構成的循環
208 Q3. 用空調器波保持室內恒溫
209 解答
210 題目
211 內容復習
212 Q1. 熱力學第一定律和第二定律
213 Q2. 熱力學第二定律的內涵
214 Q3. 熱力學過程方向性的討論
215 Q4. 證明電流生熱過程是不可逆過程
216 解答
217 題目
218 內容復習
219 Q1. 判斷可逆過程
220 Q2. 微觀機制與玻耳茲曼熵
221 Q3. 玻耳茲曼熵的一個示例
222 解答
223 題目
224 內容復習
225 Q1. 熵變的判斷
226 Q2. 熱機效率與熱力學第二定律
227 Q3. 溫熵圖
228 解答
229 題目
230 內容復習
231 Q1. p-V 圖中的可逆過程
232 Q2. 初態相同,末態相同的三個可逆過程
233 Q3. 可逆與不可逆過程的熵變
234 Q4. 物體在恒溫環境中熱傳導的熵變
235 Q5. 熱機引起的熵變
236 解答
237 結語
Enter the index to download (0-237): 22
Download started
Video download progress: 100.10316132939414%File saved

Summary
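The most important questions when scraping are whether the request needs headers, whether to call r.raise_for_status() and set r.encoding = r.apparent_encoding, and whether a new library is needed to extend the script, for example to show a progress bar. In short, what matters is the ability to solve the problem in front of you.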
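On the progress-bar point: the second run above ends at 100.10316132939414% because the script multiplies the chunk count by the chunk size, and the last chunk is usually shorter than the rest. One possible alternative, sketched below, is to count the bytes actually written. The helper name `copy_with_progress` and its `report` parameter are hypothetical and not part of the original script; in the script, `chunks` would be `r.iter_content(chunk_size=chunk_size)` and `total_size` the `content-length` value.

```python
import io


def copy_with_progress(chunks, out, total_size, report=print):
    """Write each chunk to `out` and report the percentage of bytes written so far."""
    written = 0
    for chunk in chunks:
        out.write(chunk)
        written += len(chunk)
        # Percentage from real byte counts; guard against a missing or zero total.
        percent = 100.0 * written / total_size if total_size else 0.0
        report(min(percent, 100.0))
    return written


# Stand-in for a download: three chunks, the last one shorter than the others.
demo_chunks = [b"a" * 128, b"b" * 128, b"c" * 44]
demo_file = io.BytesIO()
demo_seen = []
copy_with_progress(demo_chunks, demo_file, total_size=300, report=demo_seen.append)
print(demo_seen[-1])
```

Because the percentage comes from `len(chunk)`, a short final chunk no longer pushes the value past 100, and the `min` call only matters if the server reports a smaller `content-length` than it actually sends.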