Python Demo 01: Scraping University Names
This demo extracts university names from pages of icourses, xuetangx (学堂在线), and similar sites:
# Clean the data saved from icourses
ls = []
with open("icourses.txt", "r", encoding="utf-8") as fi:
    for line in fi:
        if "alt" in line:
            # The name is the text between the last pair of double quotes
            tokens = line.split('"')
            uname = tokens[-2]
            if "大学生" in uname:
                continue
            if "大学" in uname or "学院" in uname:
                ls.append(uname)
print("\n".join(ls))  # one name per line
print(len(ls))

# Clean the data saved from xuetangx
U = set()  # use a set to remove duplicates
with open("xuetangx.txt", "r", encoding="utf-8") as fi:
    for line in fi:
        if "慕课" in line:
            continue
        if "大学" in line or "学院" in line:
            U.add(line.strip("\n"))
print("\n".join(U))
print(len(U))

# Clean the data saved from cnmooc
U = set()
with open("cnmooc.txt", "r", encoding="utf-8") as fi:
    for line in fi:
        if "大学" in line or "学院" in line:
            U.add(line.strip("\n"))
print("\n".join(U))
print(len(U))

# Merge the results
# ic, xt and cm hold the names printed by the three steps above;
# they are left empty in the original post.
ic = ''' '''
xt = ''' '''
cm = ''' '''
U = set()
U |= set(ic.split())
U |= set(xt.split())
U |= set(cm.split())
ls = list(U)
ls.sort()
print("\n".join(ls))
print(len(ls))
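The two core ideas in the script, taking the name from the last quoted value on a line that contains "alt" and merging several name lists through a set, can be tried without the data files. The following is only a minimal sketch: the function names extract_alt_names and merge_names and the sample lines are hypothetical, not part of the original post, and the sample strings assume the pages use simplified Chinese.

```python
def extract_alt_names(lines):
    """Return university/college names found in the last quoted value of lines containing 'alt'."""
    names = []
    for line in lines:
        if "alt" not in line:
            continue
        tokens = line.split('"')
        if len(tokens) < 2:
            continue  # no quoted value on this line
        name = tokens[-2]  # text between the last pair of double quotes
        if "大学生" in name:  # skip entries that mention "university students"
            continue
        if "大学" in name or "学院" in name:
            names.append(name)
    return names


def merge_names(*blocks):
    """Union whitespace-separated blocks of names and return them sorted."""
    merged = set()
    for block in blocks:
        merged |= set(block.split())
    return sorted(merged)


# Hypothetical sample lines, not taken from the real pages.
sample = [
    '<img src="a.png" alt="北京大学">',
    '<img src="b.png" alt="大学生创业基础">',
    '<img src="c.png" alt="某某职业技术学院">',
    '<p>no image here</p>',
]

print(extract_alt_names(sample))
print(merge_names("北京大学 清华大学", "清华大学\n北京大学"))
```

Note that taking tokens[-2] only works when the alt attribute is the last quoted attribute on the line; for pages where that does not hold, an HTML parser would be a safer choice.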