XPath parsing: a hands-on Lianjia second-hand housing project
Goal: collect information about second-hand home listings (residential community, address, layout, year built, price, and so on).
1. Fetch the page data
    import requests
    from lxml import html

    etree = html.etree  # lxml's etree module, reached through lxml.html

    # Send a browser-like User-Agent with the request
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'}
    url = 'https://hf.lianjia.com/ershoufang/'
    page_text = requests.get(url=url, headers=headers).text

2. Parse the data
    tree = etree.HTML(page_text)
    # Each <li> under the sellListContent list is one listing
    li_list = tree.xpath('//ul[@class="sellListContent"]/li')

3. Extract the details with XPath
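As a warm-up before querying the real page, the sketch below shows why the per-listing queries start with `.//`. It runs against a made-up HTML fragment, not the live site: the fragment, its text values, and the names `SAMPLE_HTML`, `rows`, and `all_titles` are assumptions for illustration, and only the class names are taken from the queries in this article.

```python
from lxml import etree

# Hypothetical fragment that mimics the structure the article's queries expect.
SAMPLE_HTML = """
<ul class="sellListContent">
  <li>
    <div class="title"><a>Listing A</a></div>
    <div class="positionInfo"><a>Community A</a></div>
  </li>
  <li>
    <div class="title"><a>Listing B</a></div>
    <div class="positionInfo"><a>Community B</a></div>
  </li>
</ul>
"""

tree = etree.HTML(SAMPLE_HTML)
li_list = tree.xpath('//ul[@class="sellListContent"]/li')

rows = []
for li in li_list:
    # './/' searches only inside the current <li>; a text() query returns a list of strings.
    title = li.xpath('.//div[@class="title"]/a/text()')
    target = li.xpath('.//div[@class="positionInfo"]/a/text()')
    rows.append((title, target))

# A leading '//' ignores the context node and searches the whole document.
all_titles = li_list[0].xpath('//div[@class="title"]/a/text()')

print(rows)
print(all_titles)
```

Dropping the leading dot would make every iteration return the titles of all listings rather than the title of the current one, which is why the relative form is used when looping over `li_list`.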
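    for li in li_list:
        # './/' keeps each query relative to the current <li>; every xpath() call returns a list
        title = li.xpath('.//div[@class="title"]/a/text()')
        houseIcon = li.xpath('.//div[@class="houseInfo"]/text()')
        target = li.xpath('.//div[@class="positionInfo"]/a/text()')
        money = li.xpath('.//div[@class="totalPrice totalPrice2"]/span/text()')
        # Concatenate the four lists and append the price unit '萬' (10,000 CNY)
        titles = str(target + title + houseIcon + money) + '萬' + '\n'
        print(titles)

The complete code: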
    import requests
    from lxml import html

    etree = html.etree  # lxml's etree module, reached through lxml.html

    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'}
    url = 'https://hf.lianjia.com/ershoufang/'
    page_text = requests.get(url=url, headers=headers).text

    # Parse the data
    tree = etree.HTML(page_text)
    li_list = tree.xpath('//ul[@class="sellListContent"]/li')
    # print(li_list)

    # Raw string so the backslashes in the Windows path are not read as escapes;
    # the with block closes the file even if a query fails
    with open(r'D:\各種資料\合肥二手房.txt', 'w', encoding='utf-8') as fp:
        for li in li_list:
            title = li.xpath('.//div[@class="title"]/a/text()')
            houseIcon = li.xpath('.//div[@class="houseInfo"]/text()')
            target = li.xpath('.//div[@class="positionInfo"]/a/text()')
            money = li.xpath('.//div[@class="totalPrice totalPrice2"]/span/text()')
            # '萬' is the price unit (10,000 CNY)
            titles = str(target + title + houseIcon + money) + '萬' + '\n'
            print(titles)
            fp.write(titles)
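The script above writes each listing as the `str()` of a Python list, so the output file contains brackets and quotes. One possible alternative, sketched here with the standard library only, is to join each xpath() result into a plain string and write one CSV row per listing. The helper names `join_text` and `build_row`, the column names in `FIELDS`, and the sample values are all hypothetical and not part of the original script; an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical column names for the four fields the article extracts.
FIELDS = ["community", "title", "house_info", "total_price_wan"]


def join_text(fragments):
    """Join the non-empty, stripped strings of an xpath() result list with spaces."""
    return " ".join(f.strip() for f in fragments if f.strip())


def build_row(target, title, house_info, money):
    """Turn the four xpath() result lists into one flat dict keyed by FIELDS."""
    return {
        "community": join_text(target),
        "title": join_text(title),
        "house_info": join_text(house_info),
        "total_price_wan": join_text(money),
    }


# Made-up values shaped like xpath() results (lists of strings).
row = build_row(
    ["Community A ", " District B"],
    ["Listing A"],
    [" 3 rooms | 98 sqm "],
    ["150"],
)

# Stand-in for open(path, "w", newline="", encoding="utf-8").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

In the real loop, `build_row(target, title, houseIcon, money)` could take the place of the `str(...)` concatenation, with the writer created once before the loop.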