[Trajectory Datasets] A Roundup of GPS Trajectory Datasets
This post collects free repositories of GPS trajectory data. All of the data can be downloaded at no cost, and for each dataset the post also notes the file format and gives a short description. If you know of better resources, please share them :)
1. GeoLife GPS Trajectories
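This GPS trajectory dataset comes from the Microsoft Research GeoLife project. It covers 182 users, with data collected from April 2007 to August 2012. Each trajectory is a time-ordered sequence of points, and each point carries latitude, longitude, altitude and related information. The dataset contains 17,621 trajectories, with a total distance of more than 1.2 million km and a total duration of more than 48,000 hours. It records not only users' movements at home and at work but also a wide range of outdoor activities such as shopping, sightseeing, hiking and cycling.
The dataset can be used for user activity similarity estimation, mobility pattern mining, activity recommendation, location-based social networks, location privacy and location recommendation.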
- Time span: April 2007 to August 2012
- Size: about 300 MB
- Download: https://www.microsoft.com/en-us/download/details.aspx?id=52367
- Data format:
Each folder stores the GPS logs of one user, and all log files have been converted to PLT format. To avoid time zone problems, every timestamp is expressed in GMT. The detailed format is:
Lines 1 to 6 are useless in this dataset and can be ignored. Points are described in the following lines, one per line.
Field 1: Latitude in decimal degrees.
Field 2: Longitude in decimal degrees.
Field 3: All set to 0 for this dataset.
Field 4: Altitude in feet (-777 if not valid).
Field 5: Date, as the number of days (with fractional part) that have passed since 12/30/1899.
Field 6: Date as a string.
Field 7: Time as a string.
Note that field 5 and fields 6 and 7 represent the same date/time in this dataset. You may use either of them.
Example:
39.906631,116.385564,0,492,40097.5864583333,2009-10-11,14:04:30
39.906554,116.385625,0,492,40097.5865162037,2009-10-11,14:04:35
Transportation mode label format:
Possible transportation modes are: walk, bike, bus, car, subway, train, airplane, boat, run and motorcycle. Note again that although most of the data was collected in China, all dates and times are expressed in GMT.
For example:
Start Time           End Time             Transportation Mode
2008/04/02 11:24:21  2008/04/02 11:50:45  bus
See the user guide included in the downloaded archive for details.
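The point format above is simple enough to parse by hand. Here is a minimal Python sketch, assuming the seven comma-separated fields described above; the function and constant names (`parse_plt_line`, `parse_plt`, `PLT_EPOCH`) are made up for illustration and are not part of any official tooling. It also converts field 5 (days since 12/30/1899) to a timestamp, so you can check that it agrees with fields 6 and 7.

```python
from datetime import datetime, timedelta

# Field 5 counts days from this epoch (12/30/1899), per the format description.
PLT_EPOCH = datetime(1899, 12, 30)
HEADER_LINES = 6          # lines 1 to 6 carry no usable data
INVALID_ALTITUDE = -777   # sentinel for a missing altitude


def parse_plt_line(line):
    """Parse one point line of a .plt file into a dict."""
    lat, lon, _unused, alt, days, date_str, time_str = line.strip().split(",")
    altitude = float(alt)
    # Some files write the date as 2008/04/01 rather than 2008-04-01.
    date_str = date_str.replace("/", "-")
    return {
        "lat": float(lat),
        "lon": float(lon),
        "altitude_ft": None if altitude == INVALID_ALTITUDE else altitude,
        # Fields 6 and 7 together give the GMT timestamp as text.
        "time": datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S"),
        # Field 5 encodes the same instant as fractional days since the epoch.
        "time_from_days": PLT_EPOCH + timedelta(days=float(days)),
    }


def parse_plt(text):
    """Parse the full text of a .plt file, skipping the six header lines."""
    lines = text.splitlines()[HEADER_LINES:]
    return [parse_plt_line(line) for line in lines if line.strip()]


sample = "39.906631,116.385564,0,492,40097.5864583333,2009-10-11,14:04:30"
point = parse_plt_line(sample)
print(point["time"], point["altitude_ft"])
```

Because field 5 is a decimal fraction of a day, the converted value can differ from the string timestamp by a fraction of a second, so compare the two with a small tolerance rather than with strict equality.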
- Papers that use this dataset:
Q. Li, Y. Zheng, X. Xie, Y. Chen, W. Liu, and M. Ma. 2008. Mining user similarity based on location history. In Proceedings of the 16th Annual ACM International Conference on Advances in Geographic Information Systems. ACM, 34.
Z. Chen, H. T. Shen, X. Zhou, Y. Zheng, and X. Xie. 2010. Searching trajectories by locations: An efficient study. In Proceedings of the 29th ACM SIGMOD International Conference on Management of Data. ACM, 255–266.
[1] Yu Zheng, Lizhu Zhang, Xing Xie, Wei-Ying Ma. Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of International conference on World Wide Web (WWW 2009), Madrid, Spain. ACM Press: 791-800.
[2] Yu Zheng, Quannan Li, Yukun Chen, Xing Xie, Wei-Ying Ma. Understanding Mobility Based on GPS Data. In Proceedings of ACM conference on Ubiquitous Computing (UbiComp 2008), Seoul, Korea. ACM Press: 312-321.
[3] Yu Zheng, Xing Xie, Wei-Ying Ma. GeoLife: A Collaborative Social Networking Service among User, location and trajectory. Invited paper, in IEEE Data Engineering Bulletin. 33, 2, 2010, pp. 32-40.
2. T-Drive Taxi Trajectories
This dataset comes from the Microsoft T-Drive project and contains one week of trajectories from more than 10,000 taxis in Beijing in 2008. It holds about 15 million points, and the trajectories cover more than 9 million km in total.
- Time: 2008
- Size: about 80 MB
- Download:
https://www.microsoft.com/en-us/research/publication/t-drive-trajectory-data-sample/
- Detailed description:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/User_guide_T-drive.pdf
- Data format:
Here is a sample from one file:
1,2008-02-02 15:36:08,116.51172,39.92123
1,2008-02-02 15:46:08,116.51135,39.93883
1,2008-02-02 15:46:08,116.51135,39.93883
1,2008-02-02 15:56:08,116.51627,39.91034
1,2008-02-02 16:06:08,116.47186,39.91248
1,2008-02-02 16:16:08,116.47217,39.92498
1,2008-02-02 16:26:08,116.47179,39.90718
1,2008-02-02 16:36:08,116.45617,39.90531
1,2008-02-02 17:00:24,116.47191,39.90577
1,2008-02-02 17:10:24,116.50661,39.9145
1,2008-02-02 20:30:34,116.49625,39.9146
The fields are:
taxi id, date time, longitude, latitude
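As a rough illustration of how these records can be read, here is a small Python sketch. It assumes each line has exactly the four comma-separated fields listed above, with the timestamp written as `YYYY-MM-DD HH:MM:SS`. The helper names (`parse_tdrive_line`, `sampling_intervals`) are hypothetical. It groups points by taxi and reports the gap in seconds between consecutive points, which is a quick way to see how sparse the sampling is.

```python
from collections import defaultdict
from datetime import datetime


def parse_tdrive_line(line):
    """Parse 'taxi id,date time,longitude,latitude' into a tuple."""
    taxi_id, stamp, lon, lat = line.strip().split(",")
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return int(taxi_id), when, float(lon), float(lat)


def sampling_intervals(lines):
    """Return, per taxi, the gaps in seconds between consecutive points."""
    by_taxi = defaultdict(list)
    for line in lines:
        if line.strip():
            taxi_id, when, _lon, _lat = parse_tdrive_line(line)
            by_taxi[taxi_id].append(when)
    gaps = {}
    for taxi_id, times in by_taxi.items():
        times.sort()
        gaps[taxi_id] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])
        ]
    return gaps


sample = [
    "1,2008-02-02 15:36:08,116.51172,39.92123",
    "1,2008-02-02 15:46:08,116.51135,39.93883",
    "1,2008-02-02 15:46:08,116.51135,39.93883",
    "1,2008-02-02 15:56:08,116.51627,39.91034",
]
print(sampling_intervals(sample))
```

The sample contains a repeated record, which shows up as a zero-second gap. You may want to drop such duplicates before computing speeds or distances.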
- Papers that use this dataset:
J. Yuan, Y. Zheng, and X. Xie. 2012. Discovering regions of different functions in a city using human mobility and POIs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 186–194.
J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, and Y. Huang. 2010a. T-Drive: Driving directions based on taxi trajectories. In Proceedings of the 18th Annual ACM International Conference on Advances in Geographic Information Systems. ACM, 99–108.
J. Yuan, Y. Zheng, X. Xie, and G. Sun. 2011a. Driving with knowledge from the physical world. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 316–324.
J. Yuan, Y. Zheng, X. Xie, and G. Sun. 2013a. T-Drive: Enhancing driving directions with taxi drivers’ intelligence. IEEE Transactions on Knowledge and Data Engineering 25, 1 (2013), 220–232.
N. J. Yuan, Y. Zheng, L. Zhang, and X. Xie. 2013b. T-Finder: A recommender system for finding passengers and vacant taxis. IEEE Transactions on Knowledge and Data Engineering 25, 10 (2013), 2390–2403.
N. J. Yuan, Y. Zheng, X. Xie, Y. Wang, K. Zheng, and H. Xiong. 2015. Discovering urban functional zones using latent activity trajectories. IEEE Transactions on Knowledge and Data Engineering 27, 3 (2015), 1041–4347.
S. Ma, Y. Zheng, and O. Wolfson. 2013. T-Share: A large-scale dynamic taxi ridesharing service. In Proceedings of the 29th IEEE International Conference on Data Engineering. IEEE, 410–421.
S. Ma, Y. Zheng, and O. Wolfson. 2015. Real-time city-scale taxi ridesharing. IEEE Transactions on Knowledge and Data Engineering 99. DOI: http://doi.ieeecomputersociety.org/10.1109/TKDE.2014.2334313
Jing Yuan, Yu Zheng, Xing Xie, and Guangzhong Sun. Driving with knowledge from the physical world. In The 17th ACM SIGKDD international conference on Knowledge Discovery and Data mining, KDD ’11, New York, NY, USA, 2011. ACM.
Jing Yuan, Yu Zheng, Chengyang Zhang, Wenlei Xie, Xing Xie, Guangzhong Sun, and Yan Huang. T-drive: driving directions based on taxi trajectories. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’10, pages 99-108, New York, NY, USA, 2010. ACM.
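3. GPS Trajectories with transportation mode labels
This dataset is part of the GPS trajectory data used by the GeoLife project at Microsoft Research Asia. Each trajectory is a time-ordered set of points, and each point carries latitude, longitude, altitude, speed, heading and so on. The trajectories were recorded by different GPS devices with different sampling rates. 95% of the trajectories are densely sampled, for example one point every 2 to 5 seconds or every 5 to 10 meters.
The trajectory files have been converted to .plt format, and each trajectory comes with a separate label file giving the transportation mode, such as driving, taking a bus, cycling or walking.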
- Time: 2008
- Size: about 80 MB
- Download:
https://www.microsoft.com/en-us/research/publication/gps-trajectories-with-transportation-mode-labels/
- Data format:
Transportation mode label format:
Date      Start Time  End Time  Transportation modes
2008/3/1  11:07:00    11:40:00  walk
2008/3/1  11:44:00    12:07:00  bus
2008/3/1  12:07:00    13:30:00  walk
2008/3/1  13:30:00    13:55:00  car
2008/3/1  13:55:00    14:16:00  walk
PLT file format:
39.977685,116.3276249,1,0,39539.1428935185,2008/04/01,03:25:46
39.9777233,116.3276216,0,0,39539.1429050926,2008/04/01,03:25:47
39.9778499,116.3276266,0,0,39539.1429398148,2008/04/01,03:25:50
39.9779866,116.3276249,0,0,39539.142974537,2008/04/01,03:25:53
39.97812,116.3276133,0,0,39539.1430092593,2008/04/01,03:25:56
Field 1: latitude in decimal degrees
Field 2: longitude in decimal degrees
Field 3: 0 means normal, 1 marks a break in the track
Field 4: altitude in feet (-777 if not valid)
Field 5: date as a number; if blank, a preset date is used (the date format is described in the official guide)
Field 6: date as a string
Field 7: time as a string
For details, see the official user guide:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/User20Guide-with20labels.pdf
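To use the labels, you usually need to look up which mode was active at a given point's timestamp. Below is a minimal Python sketch of that lookup. It assumes the label rows are whitespace-separated in the order date, start time, end time, mode, as in the sample above, and that no segment crosses midnight. The names `parse_label_rows` and `mode_at` are hypothetical.

```python
from datetime import datetime

LABEL_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"


def parse_label_rows(text):
    """Parse 'Date Start End Mode' rows, skipping the header line."""
    segments = []
    for row in text.strip().splitlines()[1:]:
        date_str, start, end, mode = row.split()
        segments.append((
            datetime.strptime(f"{date_str} {start}", LABEL_TIME_FORMAT),
            datetime.strptime(f"{date_str} {end}", LABEL_TIME_FORMAT),
            mode,
        ))
    return segments


def mode_at(segments, when):
    """Return the mode whose [start, end) interval contains `when`, else None."""
    for start, end, mode in segments:
        if start <= when < end:
            return mode
    return None


labels = """Date Start Time End Time Transportation modes
2008/3/1 11:07:00 11:40:00 walk
2008/3/1 11:44:00 12:07:00 bus
2008/3/1 12:07:00 13:30:00 walk
"""
segments = parse_label_rows(labels)
print(mode_at(segments, datetime(2008, 3, 1, 11, 50, 0)))
```

The intervals are treated as half-open, so a point stamped exactly 12:07:00 is assigned to the segment that starts then (walk) rather than the one that ends then (bus). Points that fall in a gap between segments, such as 11:42:00, get no label.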
- Papers that use this dataset:
Y. Zheng, Q. Li, Y. Chen, and X. Xie. 2008a. Understanding mobility based on GPS data. In Proceedings of the 11th International Conference on Ubiquitous Computing. ACM, 312–321.
Y. Zheng, L. Liu, L. Wang, and X. Xie. 2008b. Learning transportation mode from raw GPS data for geographic application on the Web. In Proceedings of the 17th International Conference on World Wide Web. ACM, 247–256.
[1] Yu Zheng, Like Liu, Longhao Wang, Xing Xie. Learning Transportation Modes from Raw GPS Data for Geographic Application on the Web. In Proceedings of International conference on World Wide Web (WWW 2008), Beijing, China. ACM Press: 247-256.
[2] Yu Zheng, Quannan Li, Yukun Chen, Xing Xie. Understanding Mobility Based on GPS Data. In Proceedings of ACM conference on Ubiquitous Computing (UbiComp 2008), Seoul, Korea. ACM Press: 312–321.
[3] Yu Zheng, Yukun Chen, Quannan Li, Xing Xie, Wei-Ying Ma. Understanding transportation modes based on GPS data for Web applications. ACM Transactions on the Web. Volume 4, Issue 1, January, 2010. pp. 1-36.
4. Social network check-in datasets
This is a check-in dataset from a location-based social networking website (Gowalla), hosted on the Stanford University website. The friendship network is undirected and was collected through the site's public API; it consists of 196,591 nodes and 950,327 edges. A total of 6,442,890 check-ins were collected from February 2009 to October 2010.
- Size: the file of check-in times and locations is 101 MB; the friendship network file is 6.1 MB.
- Data format: described on the download page.
- Download: https://snap.stanford.edu/data/loc-gowalla.html
Another dataset generated by the same website, also social network data, is about 300 MB. Details and download: http://www.yongliu.org/datasets
- Papers that use this dataset:
E. Cho, S. A. Myers, J. Leskovec. Friendship and Mobility: User Movement in Location-Based Social Networks. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2011.
- Papers that use check-in data:
L. Wei, Y. Zheng, and W. Peng. 2012. Constructing popular routes from uncertain trajectories. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 195–203.
J. Bao, Y. Zheng, and M. F. Mokbel. 2012. Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 199–208.
2013 Foursquare dataset (150 MB):
- Details: https://archive.org/details/201309_foursquare_dataset_umn
- Download: https://archive.org/download/201309_foursquare_dataset_umn
- Other check-in datasets: https://sites.google.com/site/yangdingqi/home/foursquare-dataset
5. National Hurricane Center data
(1) The Atlantic hurricane database, covering 1851 to 2015. This version was released on July 6, 2016 and includes the revisions for 1956 to 1960. The dataset is called HURDAT2; it replaces the earlier HURDAT.
- Size: 5.9 MB
- Download: http://www.nhc.noaa.gov/data/hurdat/hurdat2-1851-2015-070616.txt
- More information: http://www.nhc.noaa.gov/data/
- See also: http://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf
- Data format:
The dataset is comma-delimited text with six-hourly information on the location, maximum winds, central pressure and (beginning in 2004) size of all known tropical cyclones and subtropical cyclones.
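For a sense of how such a file might be read, here is a hedged Python sketch. This post does not spell out the column layout, so everything about it below is an assumption to verify against the format PDF linked above. The code assumes that each storm starts with a header row (storm id, name, number of rows) and that data rows begin with date, time, a record flag, status, latitude, longitude, maximum wind and minimum pressure. The sample text is synthetic, written only to exercise the parser, and is not taken from the real file. All function names are hypothetical.

```python
def parse_coordinate(token):
    """Convert a value such as '31.5N' or '77.6W' to signed decimal degrees."""
    token = token.strip()
    value, hemisphere = float(token[:-1]), token[-1].upper()
    return -value if hemisphere in ("S", "W") else value


def parse_hurdat2(text):
    """Group data rows under their storm header; returns {storm_id: [rows]}."""
    storms, current = {}, None
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3 or not fields[0]:
            continue
        if fields[0][:2].isalpha():
            # Assumed header row: storm id, name, number of data rows.
            current = storms.setdefault(fields[0], [])
        elif current is not None and len(fields) >= 8:
            # Assumed data row: date, time, record flag, status,
            # latitude, longitude, max wind, min pressure, ...
            current.append({
                "date": fields[0],
                "time": fields[1],
                "status": fields[3],
                "lat": parse_coordinate(fields[4]),
                "lon": parse_coordinate(fields[5]),
                "max_wind": int(fields[6]),
                "min_pressure": int(fields[7]),
            })
    return storms


# Synthetic sample in the assumed layout, not real storm data.
synthetic = """AL012015, EXAMPLE, 2,
20150508, 0000,  , TS, 31.5N,  77.6W,  40, 1005,
20150508, 0600,  , TS, 31.6N,  77.7W,  45, 1004,
"""
storms = parse_hurdat2(synthetic)
print(storms["AL012015"][0])
```

Western longitudes and southern latitudes come out negative, which is the convention most mapping libraries expect.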
(2) The Northeast and North Central Pacific hurricane database, covering 1949 to 2015, about 3.2 MB.
- Download: http://www.nhc.noaa.gov/data/hurdat/hurdat2-nepac-1949-2015-050916.txt
- The data format is the same as for the dataset above.
- See also: http://www.nhc.noaa.gov/data/hurdat/hurdat2-format-nencpac.pdf
6. Other data
Spatio-temporal data, network data, data streams, neuroimaging data and bioinformatics (gene expression) datasets: http://dm.uestc.edu.cn/resource/
Natural Earth: http://www.naturalearthdata.com/
Machine Learning Repository: http://archive.ics.uci.edu/ml/
Google Trends Datastore: http://googletrends.github.io/data/
Open Data Network: https://www.opendatanetwork.com/