GeoPandas入门 | 03-空间关系和操作
03-空間關(guān)系和操作
源代碼 請看此處
%matplotlib inlineimport pandas as pd import geopandas countries = geopandas.read_file("zip://./data/ne_110m_admin_0_countries.zip") cities = geopandas.read_file("zip://./data/ne_110m_populated_places.zip") rivers = geopandas.read_file("zip://./data/ne_50m_rivers_lake_centerlines.zip")3.1 空間關(guān)系
地理空間數(shù)據(jù)的一個(gè)重要方面是我們可以觀察 spatial relationships :兩個(gè)空間對象如何相互關(guān)聯(lián)(它們是否重疊、相交、包含……)
地理信息系統(tǒng)中的拓?fù)潢P(guān)系、集合論關(guān)系通?;贒E-9IM模型,更多信息請見 https://en.wikipedia.org/wiki/Spatial_relation
(Image by Krauss, CC BY-SA 3.0 )
3.2 單個(gè)對象之間的空間關(guān)系
首先創(chuàng)建一些空間對象:
多邊形:使用 .squeeze() 從 GeoSeries 中提取幾何對象向量
belgium=countries.loc[countries['name']=='Belgium','geometry']#.squeeze() # .squeeze()從數(shù)組的形狀中刪除單維度條目,即把shape中為1的維度去掉 type(belgium) geopandas.geoseries.GeoSeries belgium=countries.loc[countries['name']=='Belgium','geometry'].squeeze() type(belgium) shapely.geometry.polygon.Polygon belgium
兩個(gè)點(diǎn)對象
paris=cities.loc[cities['name']=='Paris','geometry'].squeeze() brussels = cities.loc[cities['name'] == 'Brussels', 'geometry'].squeeze() parisbrussels
線段
from shapely.geometry import LineString line = LineString([paris, brussels]) line
把這4個(gè)幾何對象一起可視化(把它們放在一個(gè)GeoSeries中,以便于用 geopandas.plot() 方法把它們一起顯示出來)
geopandas.GeoSeries([belgium,paris,brussels,line]).plot(cmap='tab10') <matplotlib.axes._subplots.AxesSubplot at 0x7efc106f38d0>
可以認(rèn)出比利時(shí)的抽象形狀
比利時(shí)首都布魯塞爾位于比利時(shí)境內(nèi)。這是一種空間關(guān)系,我們可以使用以下單個(gè)的幾何對象進(jìn)行測試
brussels.within(belgium) True belgium.contains(brussels) True belgium.contains(paris) False paris.within(belgium) False從巴黎到布魯塞爾所畫的直線并不完全位于比利時(shí)境內(nèi),但它確實(shí)與比利時(shí)相交
belgium.contains(line) False line.intersects(belgium) True3.3 GeoDataFrame中的空間關(guān)系
如上所述,在單個(gè) shapely 矢量對象上可用的方法,也可作為 GeoSeries / GeoDataFrame 對象的方法
例如,如果我們在世界數(shù)據(jù)集上調(diào)用帶有 paris 點(diǎn)的 contains 方法,它將對 world 數(shù)據(jù)框架中的每個(gè)國家進(jìn)行這種空間檢查
countries.contains(paris) 0 False 1 False 2 False 3 False 4 False... 172 False 173 False 174 False 175 False 176 False Length: 177, dtype: bool因?yàn)樯厦娼o了我們一個(gè)布爾結(jié)果,我們可以用它來篩選GeoDataFrame
countries[countries.contains(paris)]<div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {vertical-align: top; }.dataframe thead th {text-align: right; }</style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>iso_a3</th> <th>name</th> <th>continent</th> <th>pop_est</th> <th>gdp_md_est</th> <th>geometry</th> </tr> </thead> <tbody> <tr> <th>55</th> <td>FRA</td> <td>France</td> <td>Europe</td> <td>67106161.0</td> <td>2699000.0</td> <td>MULTIPOLYGON (((2.51357 51.14851, 2.65842 50.7...</td> </tr> </tbody> </table> </div>
事實(shí)上,法國是世界上唯一一個(gè)巴黎所在的國家
又如,提取南美洲亞馬遜河的線路圖,我們可以查詢河流流經(jīng)哪些國家。
amazon = rivers[rivers['name'] == 'Amazonas'].geometry.squeeze() amazoncountries[countries.crosses(amazon)]
<div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {vertical-align: top; }.dataframe thead th {text-align: right; }</style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>iso_a3</th> <th>name</th> <th>continent</th> <th>pop_est</th> <th>gdp_md_est</th> <th>geometry</th> </tr> </thead> <tbody> <tr> <th>22</th> <td>BRA</td> <td>Brazil</td> <td>South America</td> <td>207353391.0</td> <td>3081000.0</td> <td>POLYGON ((-57.62513 -30.21629, -56.29090 -28.8...</td> </tr> <tr> <th>35</th> <td>COL</td> <td>Colombia</td> <td>South America</td> <td>47698524.0</td> <td>688000.0</td> <td>POLYGON ((-66.87633 1.25336, -67.06505 1.13011...</td> </tr> <tr> <th>124</th> <td>PER</td> <td>Peru</td> <td>South America</td> <td>31036656.0</td> <td>410400.0</td> <td>POLYGON ((-69.52968 -10.95173, -68.66508 -12.5...</td> </tr> </tbody> </table> </div>
countries[countries.intersects(amazon)]<div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {vertical-align: top; }.dataframe thead th {text-align: right; }</style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>iso_a3</th> <th>name</th> <th>continent</th> <th>pop_est</th> <th>gdp_md_est</th> <th>geometry</th> </tr> </thead> <tbody> <tr> <th>22</th> <td>BRA</td> <td>Brazil</td> <td>South America</td> <td>207353391.0</td> <td>3081000.0</td> <td>POLYGON ((-57.62513 -30.21629, -56.29090 -28.8...</td> </tr> <tr> <th>35</th> <td>COL</td> <td>Colombia</td> <td>South America</td> <td>47698524.0</td> <td>688000.0</td> <td>POLYGON ((-66.87633 1.25336, -67.06505 1.13011...</td> </tr> <tr> <th>124</th> <td>PER</td> <td>Peru</td> <td>South America</td> <td>31036656.0</td> <td>410400.0</td> <td>POLYGON ((-69.52968 -10.95173, -68.66508 -12.5...</td> </tr> </tbody> </table> </div>
空間關(guān)系函數(shù) :
檢查空間關(guān)系的不同函數(shù)概述( 空間謂詞函數(shù) )
- equals
- contains
- crosses
- disjoint
- intersects
- overlaps
- touches
- within
- covers
關(guān)于這些方法的概述,見 https://shapely.readthedocs.io/en/stable/manual.html#predicates-and-relationships
關(guān)于這些操作的語義,詳見 https://en.wikipedia.org/wiki/DE-9IM
3.3 練習(xí)
我們將再次使用巴黎的數(shù)據(jù)集來做一些練習(xí),再次導(dǎo)入
districts = geopandas.read_file("data/paris_districts.geojson").to_crs(epsg=2154) stations = geopandas.read_file("data/paris_bike_stations.geojson").to_crs(epsg=2154)3.3.1 練習(xí)一:埃菲爾鐵塔
埃菲爾鐵塔是一座建于19世紀(jì)的鐵格子塔,可能是巴黎最具代表性的景觀
埃菲爾鐵塔的位置是:x:648237.3,y:6862271.9
- 以埃菲爾鐵塔的坐標(biāo)創(chuàng)建一個(gè) Shapely 點(diǎn)對象,并將其分配給一個(gè)名為 eiffel_tower 的變量,輸出結(jié)果
- 檢查埃菲爾鐵塔是否位于Montparnasse區(qū)
- 檢查 Montparnasse 區(qū)是否包含自行車站
- 計(jì)算埃菲爾鐵塔和自行車站之間的距離(注意:在這種情況下,距離以米為單位返回)。
# 檢查埃菲爾鐵塔是否位于Montparnasse區(qū) district_montparnasse.contains(eiffel_tower) False bike_station = stations.loc[293, 'geometry'] bike_station
# 檢查 Montparnasse 區(qū)是否包含自行車站 district_montparnasse.contains(bike_station) True # 計(jì)算埃菲爾鐵塔和自行車站之間的距離 eiffel_tower.distance(bike_station) 3540.1534488921966
3.3.2 練習(xí)二:埃菲爾鐵塔位于哪個(gè)區(qū)?
在上一個(gè)練習(xí)中,我們?yōu)榘7茽栬F塔的位置構(gòu)造了一個(gè) Point 對象,我們檢查了它不在montparnasse蒙帕納斯區(qū),現(xiàn)在我們的問題是確定它位于巴黎的哪個(gè)區(qū)
- 創(chuàng)建一個(gè)布爾掩模(或篩選器),表示每個(gè)區(qū)是否包含埃菲爾鐵塔。調(diào)用結(jié)果 mask 。
- 用布爾掩碼篩選 districts 數(shù)據(jù)框并打印結(jié)果。
<div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {vertical-align: top; }.dataframe thead th {text-align: right; }</style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>id</th> <th>district_name</th> <th>population</th> <th>geometry</th> </tr> </thead> <tbody> <tr> <th>27</th> <td>28</td> <td>Gros-Caillou</td> <td>25156</td> <td>POLYGON ((649336.752 6861767.761, 649110.815 6...</td> </tr> </tbody> </table> </div>
3.3.3 練習(xí)三:找出距離埃菲爾鐵塔最近的自行車站點(diǎn)
現(xiàn)在,我們可能對埃菲爾鐵塔附近的自行車站感興趣。為了探索它們,讓我們將埃菲爾鐵塔本身以及1公里內(nèi)的自行車站可視化。
要做到這一點(diǎn),我們可以計(jì)算出每個(gè)站點(diǎn)到埃菲爾鐵塔的距離。根據(jù)這個(gè)結(jié)果,我們就可以創(chuàng)建一個(gè)掩模,如果站點(diǎn)在1km以內(nèi),則取 True ,否則取 False ,然后用它來過濾站點(diǎn)的GeoDataFrame。最后,我們對這個(gè)子集進(jìn)行可視化處理。
- 計(jì)算每個(gè)站點(diǎn)到埃菲爾鐵塔的距離,并調(diào)用結(jié)果 dist_eiffel
- 打印距離最近的車站的距離(這是 dist_eiffel 的最小值)
- 選擇距離埃菲爾鐵塔小于1公里的 "站點(diǎn) "GeoDataFrame行(注意,距離是以米為單位)。調(diào)用結(jié)果 stations_eiffel
import contextily contextily.add_basemap(ax) ---------------------------------------------------------------------------KeyboardInterrupt Traceback (most recent call last)<ipython-input-43-2923a97d25c0> in <module>1 import contextily ----> 2 contextily.add_basemap(ax)/usr/local/lib/python3.6/dist-packages/contextily/plotting.py in add_basemap(ax, zoom, source, interpolation, attribution, attribution_size, reset_extent, crs, resampling, url, **extra_imshow_args)142 # Download image143 image, extent = bounds2img( --> 144 left, bottom, right, top, zoom=zoom, source=source, ll=False145 )146 # Warping/usr/local/lib/python3.6/dist-packages/contextily/tile.py in bounds2img(w, s, e, n, zoom, source, ll, wait, max_retries, url)246 x, y, z = t.x, t.y, t.z247 tile_url = _construct_tile_url(provider, x, y, z) --> 248 image = _fetch_tile(tile_url, wait, max_retries)249 tiles.append(t)250 arrays.append(image)/usr/local/lib/python3.6/dist-packages/joblib/memory.py in __call__(self, *args, **kwargs)563 564 def __call__(self, *args, **kwargs): --> 565 return self._cached_call(args, kwargs)[0]566 567 def __getstate__(self):/usr/local/lib/python3.6/dist-packages/joblib/memory.py in _cached_call(self, args, kwargs, shelving)529 530 if must_call: --> 531 out, metadata = self.call(*args, **kwargs)532 if self.mmap_mode is not None:533 # Memmap the output at the first call to be consistent with/usr/local/lib/python3.6/dist-packages/joblib/memory.py in call(self, *args, **kwargs)725 if self._verbose > 0:726 print(format_call(self.func, args, kwargs)) --> 727 output = self.func(*args, **kwargs)728 self.store_backend.dump_item(729 [func_id, args_id], output, verbose=self._verbose)/usr/local/lib/python3.6/dist-packages/contextily/tile.py in _fetch_tile(tile_url, wait, max_retries)301 @memory.cache302 def _fetch_tile(tile_url, wait, max_retries): --> 303 request = _retryer(tile_url, wait, max_retries)304 with io.BytesIO(request.content) as image_stream:305 image = Image.open(image_stream).convert("RGB")/usr/local/lib/python3.6/dist-packages/contextily/tile.py in _retryer(tile_url, wait, max_retries)441 """442 try: --> 443 request = requests.get(tile_url, headers={"user-agent": USER_AGENT})444 request.raise_for_status()445 except requests.HTTPError:/usr/local/lib/python3.6/dist-packages/requests/api.py in get(url, params, **kwargs)74 75 kwargs.setdefault('allow_redirects', True) ---> 76 return request('get', url, params=params, **kwargs)77 78 /usr/local/lib/python3.6/dist-packages/requests/api.py in request(method, url, **kwargs)59 # cases, and look like a memory leak in others.60 with sessions.Session() as session: ---> 61 return session.request(method=method, url=url, **kwargs)62 63 /usr/local/lib/python3.6/dist-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)528 }529 send_kwargs.update(settings) --> 530 resp = self.send(prep, **send_kwargs)531 532 return resp/usr/local/lib/python3.6/dist-packages/requests/sessions.py in send(self, request, **kwargs)641 642 # Send the request --> 643 r = adapter.send(request, **kwargs)644 645 # Total elapsed time of the request (approximately)/usr/local/lib/python3.6/dist-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)447 decode_content=False,448 retries=self.max_retries, --> 449 timeout=timeout450 )451 /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)675 body=body,676 headers=headers, --> 677 chunked=chunked,678 )679 /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)379 # Trigger any extra validation we need to do.380 try: --> 381 self._validate_conn(conn)382 except (SocketTimeout, BaseSSLError) as e:383 # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout./usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _validate_conn(self, conn)974 # Force connect early to allow us to validate the connection.975 if not getattr(conn, "sock", None): # AppEngine might not have `.sock` --> 976 conn.connect()977 978 if not conn.is_verified:/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in connect(self)368 ca_cert_data=self.ca_cert_data,369 server_hostname=server_hostname, --> 370 ssl_context=context,371 )372 /usr/local/lib/python3.6/dist-packages/urllib3/util/ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data)375 ) or IS_SECURETRANSPORT:376 if HAS_SNI and server_hostname is not None: --> 377 return context.wrap_socket(sock, server_hostname=server_hostname)378 379 warnings.warn(/usr/lib/python3.6/ssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)405 suppress_ragged_eofs=suppress_ragged_eofs,406 server_hostname=server_hostname, --> 407 _context=self, _session=session)408 409 def wrap_bio(self, incoming, outgoing, server_side=False,/usr/lib/python3.6/ssl.py in __init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _context, _session)815 # non-blocking816 raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets") --> 817 self.do_handshake()818 819 except (OSError, ValueError):/usr/lib/python3.6/ssl.py in do_handshake(self, block)1075 if timeout == 0.0 and block:1076 self.settimeout(None) -> 1077 self._sslobj.do_handshake()1078 finally:1079 self.settimeout(timeout)/usr/lib/python3.6/ssl.py in do_handshake(self)687 def do_handshake(self):688 """Start the SSL/TLS handshake.""" --> 689 self._sslobj.do_handshake()690 if self.context.check_hostname:691 if not self.server_hostname:KeyboardInterrupt:
3.4 空間操作
除了返回布爾值的空間謂詞判斷外,Shapely和GeoPandas還提供了返回新幾何對象的操作
二元操作
<table><tr> <td> <img src="img/spatial-operations-base.png"/> </td> <td> <img src="img/spatial-operations-intersection.png"/> </td> </tr> <tr> <td> <img src="img/spatial-operations-union.png"/> </td> <td> <img src="img/spatial-operations-difference.png"/> </td> </tr></table>
緩沖區(qū):
<table><tr> <td> <img src="img/spatial-operations-buffer-point1.png"/> </td> <td> <img src="img/spatial-operations-buffer-point2.png"/> </td> </tr> <tr> <td> <img src="img/spatial-operations-buffer-line.png"/> </td> <td> <img src="img/spatial-operations-buffer-polygon.png"/> </td> </tr></table>
詳見 https://shapely.readthedocs.io/en/stable/manual.html#spatial-analysis-methods。
例如,使用上面的實(shí)驗(yàn)數(shù)據(jù),在布魯塞爾周圍構(gòu)造一個(gè)緩沖區(qū)(返回一個(gè)多邊形)
geopandas.GeoSeries([belgium,brussels.buffer(1)]).plot(alpha=0.5,cmap='tab10') <matplotlib.axes._subplots.AxesSubplot at 0x7efc0e60bfd0>
現(xiàn)在取這兩個(gè)多邊形的交、并、差
brussels.buffer(1).intersection(belgium)brussels.buffer(1).union(belgium)
brussels.buffer(1).difference(belgium)
另一個(gè)有用的屬性是 .unary_union ,它通過取所有這些矢量對象的并集,將GeoDataFrame中的一組矢量對象轉(zhuǎn)換為一個(gè)單一的矢量對象(即矢量數(shù)據(jù)的 融合 )
例如,我們可以為非洲大陸構(gòu)造一個(gè)單一對象
africa_countries = countries[countries['continent'] == 'Africa'] africa=africa_countries.unary_union africaprint(str(africa)[:1000]) MULTIPOLYGON (((32.83012047702888 -26.7421916643362, 32.58026492689768 -27.47015756603182, 32.46213260267845 -28.30101124442056, 32.20338870619304 -28.75240488049007, 31.52100141777888 -29.25738697684626, 31.325561150851 -29.40197763439891, 30.90176272962535 -29.90995696382804, 30.62281334811382 -30.42377573010613, 30.05571618014278 -31.14026946383296, 28.92555260591954 -32.1720411109725, 28.2197558936771 -32.77195281344886, 27.46460818859597 -33.2269637997788, 26.41945234549283 -33.61495045342619, 25.90966434093349 -33.6670402971764, 25.7806282895007 -33.94464609144834, 25.17286176931597 -33.79685149509358, 24.67785322439212 -33.98717579522455, 23.59404340993464 -33.79447437920815, 22.98818891774474 -33.91643075941698, 22.57415734222224 -33.86408253350531, 21.54279910654103 -34.25883879978294, 20.689052768647 -34.41717538832523, 20.07126102059763 -34.79513681410799, 19.61640506356457 -34.81916635512371, 19.19327843595872 -34.46259897230979, 18.85531456876987 -34.44430551527847, 18.424
<div class="alert alert-info" style="font-size:120%">
注意
GeoPandas(和Shapely的單個(gè)對象)提供了大量的基本方法來分析地理空間數(shù)據(jù)(距離、長度、中心點(diǎn)、邊界、凸殼、簡化、變換......),比我們在本教程中能接觸到的幾種方法多得多。
- GeoPandas提供的所有方法的概述可以在這里找到: http://geopandas.readthedocs.io/en/latest/reference.html
</div>
3.5 空間操作練習(xí)
3.5.1 練習(xí)一:塞納河流經(jīng)哪些區(qū)
下面,以類似于GeoJSON的特征字典(創(chuàng)建于 http://geojson.io) 的方式提供了巴黎附近的塞納河的坐標(biāo)
根據(jù)這個(gè) seine 對象,我們想知道哪些地區(qū)離塞納河很近(最多150米)
- 在塞納河周圍建立一個(gè)150米的緩沖區(qū)
- 檢查哪些區(qū)域與該緩沖對象相交
- 將各區(qū)的情況可視化,說明哪些區(qū)位于塞納河附近
seine=s_seine_utm.geometry.squeeze() seine_buffer=seine.buffer(150) seine_buffer
districts_seine = districts[districts.intersects(seine_buffer)] districts_seine.plot() <matplotlib.axes._subplots.AxesSubplot at 0x7efc06f1d748>
fig, ax = plt.subplots(figsize=(16, 7)) districts.plot(ax=ax,color='grey',alpha=0.4,edgecolor='k') districts_seine.plot(ax=ax,color='red',alpha=0.4,edgecolor='k') s_seine_utm.plot(ax=ax,color='blue') <matplotlib.axes._subplots.AxesSubplot at 0x7efc06ea8630>
總結(jié)
以上是生活随笔為你收集整理的GeoPandas入门 | 03-空间关系和操作的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 数据结构试卷及答案(一)
- 下一篇: 安卓模拟器封包抓取加解密