Getting a list of GitHub repositories with Python
1. Background
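A project required access to GitHub's repo API so that repo data could be extracted for analysis. After a day of research I finally solved the problem, although the solution is still rather inefficient.
GitHub's API for listing repos returns detailed information about every repo, in JSON format. So far I have not found a way to parse data that contains multiple JSON objects, so I fell back on a rather clumsy approach: split plus re. If you know a better method, feel free to leave a comment and discuss!
2. Code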
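    import re
    import os

    def GetUrl(num):
        # Fetch one batch of public repos with curl; the URL is quoted so the shell leaves the "?" alone
        response = os.popen('curl -G "https://api.github.com/repositories?since=%d"' % num).read()
        pattern = '"url"'
        pattern1 = 'repos'
        # In the pretty-printed JSON each field sits on its own line, ending in ",\n"
        urls = response.split(',\n')
        for i in urls:
            # Keep only the "url" fields that point at /repos/
            if pattern in i and pattern1 in i:
                # The second quoted string on the line is the URL itself
                text = re.compile('"(.*?)"').findall(i)[1]
                print(text)

    if __name__ == '__main__':
        GetUrl(1000)

Here num is passed as the since parameter, which is a repository ID: the API returns the repos whose IDs are greater than it. We can write a loop that keeps increasing num and so extract repos indefinitely. Because GitHub's API rate-limits requests, fetching one batch at a time like this is a workable approach.
The result looks like this (the API URLs of the extracted repos):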
https://api.github.com/repos/wycats/merb-core
https://api.github.com/repos/rubinius/rubinius
https://api.github.com/repos/mojombo/god
https://api.github.com/repos/vanpelt/jsawesome
https://api.github.com/repos/wycats/jspec
https://api.github.com/repos/defunkt/exception_logger
https://api.github.com/repos/defunkt/ambition
https://api.github.com/repos/technoweenie/restful-authentication
https://api.github.com/repos/technoweenie/attachment_fu
https://api.github.com/repos/topfunky/bong
https://api.github.com/repos/Caged/microsis
https://api.github.com/repos/anotherjesse/s3
https://api.github.com/repos/anotherjesse/taboo
https://api.github.com/repos/anotherjesse/foxtracs
https://api.github.com/repos/anotherjesse/fotomatic
https://api.github.com/repos/mojombo/glowstick
https://api.github.com/repos/defunkt/starling
https://api.github.com/repos/wycats/merb-more
https://api.github.com/repos/macournoyer/thin
https://api.github.com/repos/jamesgolick/resource_controller
https://api.github.com/repos/jamesgolick/markaby
https://api.github.com/repos/jamesgolick/enum_field
https://api.github.com/repos/defunkt/subtlety
https://api.github.com/repos/defunkt/zippy
https://api.github.com/repos/defunkt/cache_fu
https://api.github.com/repos/KirinDave/phosphor
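
As an alternative to the split plus re approach, the response can probably be handled with Python's standard json module, since the whole body is a single JSON array of repo objects. The sketch below is only an illustration under that assumption, not the method used above: the function names (extract_repo_urls, next_since, fetch_page, crawl) and the sample payload with its made-up IDs and repo names are hypothetical, and only the "id" and "url" fields are relied on.

```python
import json
import urllib.request

# Made-up sample in the shape of the API response (an array of repo objects).
# The ids and names here are placeholders, not real data.
SAMPLE_PAYLOAD = """
[
  {"id": 101, "url": "https://api.github.com/repos/example-user/first-repo"},
  {"id": 102, "url": "https://api.github.com/repos/example-user/second-repo"}
]
"""

def extract_repo_urls(payload):
    """Return the API URL of every repo in a JSON response body."""
    repos = json.loads(payload)
    return [repo["url"] for repo in repos]

def next_since(payload, current):
    """Return the largest repo id in the batch, or `current` if the batch is empty."""
    repos = json.loads(payload)
    return max((repo["id"] for repo in repos), default=current)

def fetch_page(since):
    """Download one batch of public repos whose ids are greater than `since`."""
    url = "https://api.github.com/repositories?since=%d" % since
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")

def crawl(start, pages):
    """Print repo URLs for up to `pages` batches, advancing `since` each time."""
    since = start
    for _ in range(pages):
        payload = fetch_page(since)
        for url in extract_repo_urls(payload):
            print(url)
        new_since = next_since(payload, since)
        if new_since == since:  # empty batch: nothing more to fetch
            break
        since = new_since

if __name__ == "__main__":
    # Offline demonstration on the sample payload; no network access needed.
    for url in extract_repo_urls(SAMPLE_PAYLOAD):
        print(url)
```

Calling crawl(1000, 3) would request up to three batches starting after repo ID 1000. Using the largest ID seen as the next since value avoids guessing how far to increase num by hand, and keeping the number of batches small helps stay within the rate limit.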