Characteristics of dictionary data in Python: working with Python dictionaries in data analysis

Published: 2024/3/7

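Today let's look at how Python dictionaries are used in data analysis. To stay close to real-world work, simple flat dictionaries are skipped.

The dictionaries discussed here are nested structures like the following: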

{
    "id": "2406124091",
    "type": "node",
    "visible": "true",
    "created": {
        "version": "2",
        "changeset": "17206049",
        "timestamp": "2013-08-03T16:43:42Z",
        "user": "linuxUser16",
        "uid": "1219059"
    },
    "pos": [41.9757030, -87.6921867],
    "address": {
        "housenumber": "5157",
        "postcode": "60625",
        "street": "North Lincoln Ave"
    },
    "amenity": "restaurant",
    "cuisine": "mexican",
    "name": "La Cabana De Don Luis",
    "phone": "1 (773)-271-5176"
}

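This kind of structure exists so that the data can easily be written out as JSON or stored in MongoDB. To make nested dictionaries like this easier to understand and handle, let's run a few small experiments: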
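As a minimal sketch of the JSON use case mentioned above, the standard `json` module can serialize a nested dictionary and parse it back. The record here is a trimmed copy of the example, and the variable names (`record`, `text`, `restored`) are chosen only for illustration.

```python
import json

# A trimmed copy of the record above (values taken from the article).
record = {
    "id": "2406124091",
    "type": "node",
    "created": {"version": "2", "user": "linuxUser16"},
    "pos": [41.9757030, -87.6921867],
    "address": {"street": "North Lincoln Ave"},
    "name": "La Cabana De Don Luis",
}

# Serialize the nested dict to a JSON string, then parse it back.
text = json.dumps(record, indent=2)
restored = json.loads(text)

# Nested values are reached by chaining keys and list indexes.
street = restored["address"]["street"]
latitude = restored["pos"][0]
print(street, latitude)
```

Nested dicts become JSON objects and the list becomes a JSON array, so the round trip returns an equal dictionary.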

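1. Can a complex dictionary be split into simple dictionaries?

If the nested structure is split into several small, simple dictionaries or lists, each piece becomes much easier to handle: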

## First small dictionary
{"id": "2406124091",
 "type": "node",
 "visible": "true"}

## Second small dictionary
{"version": "2",
 "changeset": "17206049",
 "timestamp": "2013-08-03T16:43:42Z",
 "user": "linuxUser16",
 "uid": "1219059"}

## A small list
[41.9757030, -87.6921867]

## Third small dictionary
{"housenumber": "5157",
 "postcode": "60625",
 "street": "North Lincoln Ave"}

## Fourth small dictionary
{"amenity": "restaurant",
 "cuisine": "mexican",
 "name": "La Cabana De Don Luis",
 "phone": "1 (773)-271-5176"}

Next, let's see which approaches can merge them back together:

d1 = {"id": "2406124091",
      "type": "node",
      "visible": "true"}

d2 = {"version": "2",
      "changeset": "17206049",
      "timestamp": "2013-08-03T16:43:42Z",
      "user": "linuxUser16",
      "uid": "1219059"}

l1 = [41.9757030, -87.6921867]

d3 = {"housenumber": "5157",
      "postcode": "60625",
      "street": "North Lincoln Ave"}

d4 = {"amenity": "restaurant",
      "cuisine": "mexican",
      "name": "La Cabana De Don Luis",
      "phone": "1 (773)-271-5176"}

d = {d1,d2,l1,d3,d4}  # braces around bare items form a set literal, so this raises TypeError
#Traceback (most recent call last):
# File "", line 1, in <module>
# d = {d1,d2,l1,d3,d4}
#TypeError: unhashable type: 'dict'

### A simple brute-force merge; unfortunately it does not work
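The TypeError above comes from the fact that `{a, b}` without `key: value` pairs is a set literal, and set members must be hashable, which dicts and lists are not. The following is a small sketch of that behavior using shortened copies of `d1` and `d2`; the names `error_message` and `labeled` are illustrative only.

```python
d1 = {"id": "2406124091", "type": "node"}
d2 = {"version": "2", "uid": "1219059"}

# {d1, d2} is a set literal, and set members must be hashable.
try:
    broken = {d1, d2}
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# A dict literal needs key: value pairs, so each part needs a label.
labeled = {"node": d1, "created": d2}
print(labeled["created"]["uid"])
```

The exact wording of the error message can differ between Python versions, but it is always a TypeError.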

### Try merging after giving each nested part a label
import pprint

d = dict(d1)  # copy d1 so the original dictionary is not modified
d['created'] = d2
d['pos'] = l1
d['address'] = d3
d = dict(d,**d4)
pprint.pprint(d)
#{'address': {'housenumber': '5157',
#             'postcode': '60625',
#             'street': 'North Lincoln Ave'},
# 'amenity': 'restaurant',
# 'created': {'changeset': '17206049',
#             'timestamp': '2013-08-03T16:43:42Z',
#             'uid': '1219059',
#             'user': 'linuxUser16',
#             'version': '2'},
# 'cuisine': 'mexican',
# 'id': '2406124091',
# 'name': 'La Cabana De Don Luis',
# 'phone': '1 (773)-271-5176',
# 'pos': [41.975703, -87.6921867],
# 'type': 'node',
# 'visible': 'true'}

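### The nested dictionary was merged successfully, but the printed order does not match the original.
### Note that pprint sorts keys alphabetically by default, and before Python 3.7 a dict did not
### guarantee any key order. Some applications strictly require a fixed field order, so the
### order has to be controlled as well.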
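To separate what the dictionary actually stores from how `pprint` displays it, the sketch below inspects the key order directly. It assumes Python 3.8 or later, where `pprint.pformat` accepts `sort_dicts`; the shortened dictionary is for illustration only.

```python
import pprint

d = {"id": "2406124091", "type": "node"}
d["created"] = {"version": "2"}
d["pos"] = [41.9757030, -87.6921867]

# Since Python 3.7 a dict remembers the order in which keys were inserted.
key_order = list(d)
print(key_order)

# pprint sorts keys by default; sort_dicts=False (Python 3.8+) keeps
# the insertion order in the printed output.
sorted_view = pprint.pformat(d)
ordered_view = pprint.pformat(d, sort_dicts=False)
print(ordered_view)
```

On older interpreters, `collections.OrderedDict` is the usual way to get a guaranteed order.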

d = {'created':d2,'pos':l1,'address':d3}
pprint.pprint(d)
#{'address': {'housenumber': '5157',
#             'postcode': '60625',
#             'street': 'North Lincoln Ave'},
# 'created': {'changeset': '17206049',
#             'timestamp': '2013-08-03T16:43:42Z',
#             'uid': '1219059',
#             'user': 'linuxUser16',
#             'version': '2'},
# 'pos': [41.975703, -87.6921867]}

### The labeled parts were merged as intended, but d1 and d4 still cannot be placed in a controlled
### position: merging them with dict() appends their keys at the end, which brings us back to
### the original problem.
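One possible way around the ordering problem, sketched here under the assumption of Python 3.7 or later (where dicts keep insertion order), is to build the result in a single literal: `**` unpacks the flat parts in place and the nested parts are labeled where they belong. This is an alternative to the `dict(d, **d4)` call used above, not the approach taken in the article.

```python
import json

d1 = {"id": "2406124091", "type": "node", "visible": "true"}
d2 = {"version": "2", "changeset": "17206049",
      "timestamp": "2013-08-03T16:43:42Z",
      "user": "linuxUser16", "uid": "1219059"}
l1 = [41.9757030, -87.6921867]
d3 = {"housenumber": "5157", "postcode": "60625",
      "street": "North Lincoln Ave"}
d4 = {"amenity": "restaurant", "cuisine": "mexican",
      "name": "La Cabana De Don Luis", "phone": "1 (773)-271-5176"}

# Unpack the flat parts with ** and label the nested parts in place;
# keys end up in exactly the order they are written here.
d = {**d1, "created": d2, "pos": l1, "address": d3, **d4}

# json.dumps keeps the dict's own order unless sort_keys=True is passed.
print(json.dumps(d, indent=2))
```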

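2. The previous experiment shows that merging dictionaries directly works, but the resulting order cannot be controlled, so some parts need to be broken down further.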

d1 = {"id": "2406124091"}
d2 = {"type": "node"}
d3 = {"visible": "true"}

d4 = {"version": "2",
      "changeset": "17206049",
      "timestamp": "2013-08-03T16:43:42Z",
      "user": "linuxUser16",
      "uid": "1219059"}

l1 = [41.9757030, -87.6921867]

d5 = {"housenumber": "5157",
      "postcode": "60625",
      "street": "North Lincoln Ave"}

d6 = {"amenity": "restaurant"}
d7 = {"cuisine": "mexican"}
d8 = {"name": "La Cabana De Don Luis"}
d9 = {"phone": "1 (773)-271-5176"}

After splitting, the pieces are put back together like this:

d = {'id':d1,'type':d2,'visible':d3,'created':d4,'pos':l1,'address':d5,
     'amenity':d6,'cuisine':d7,'name':d8,'phone':d9}

import pprint
pprint.pprint(d)
#{'address': {'housenumber': '5157',
#             'postcode': '60625',
#             'street': 'North Lincoln Ave'},
# 'amenity': {'amenity': 'restaurant'},
# 'created': {'changeset': '17206049',
#             'timestamp': '2013-08-03T16:43:42Z',
#             'uid': '1219059',
#             'user': 'linuxUser16',
#             'version': '2'},
# 'cuisine': {'cuisine': 'mexican'},
# 'id': {'id': '2406124091'},
# 'name': {'name': 'La Cabana De Don Luis'},
# 'phone': {'phone': '1 (773)-271-5176'},
# 'pos': [41.975703, -87.6921867],
# 'type': {'type': 'node'},
# 'visible': {'visible': 'true'}}

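### This is not the result we wanted. Besides the scrambled order, the rebuilt dictionary has an
### obviously wrong structure: every simple value is now wrapped in an extra one-key dictionary.
### Let's try another way of building it: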

d = {d1,d2,d3,'created':d4,'pos':l1,'address':d5,d6,d7,d8,d9}  # the intended form; not valid Python

d = {d1,d2,d3,d4,l1,d5,d6,d7,d8,d9}
#Traceback (most recent call last):
# File "", line 1, in <module>
# d = {d1,d2,d3,d4,l1,d5,d6,d7,d8,d9}
#TypeError: unhashable type: 'dict'

d = {d1,d2,d3,d4,'pos':l1,d5,d6,d7,d8,d9}
# File "", line 1
# d = {d1,d2,d3,d4,'pos':l1,d5,d6,d7,d8,d9}
# ^
#SyntaxError: invalid syntax

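### This way of building the dictionary was wishful thinking; the syntax is simply invalid.
### I will not try merging with dict() here, because l1 is a list, and a bare list cannot be
### merged by that function unless it is first given a key.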
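The list is only a problem while it has no key of its own. As a hedged sketch (the `parts` list and its shortened values are illustrative, and Python 3.7+ ordering is assumed), every piece, including the list and the nested dicts, can be wrapped as a one-key dictionary and then folded together with `dict.update()` in the desired order:

```python
parts = [
    {"id": "2406124091"},
    {"type": "node"},
    {"visible": "true"},
    {"created": {"version": "2", "uid": "1219059"}},  # nested dict gets a label
    {"pos": [41.9757030, -87.6921867]},               # so does the list
    {"address": {"housenumber": "5157"}},
    {"amenity": "restaurant"},
]

d = {}
for part in parts:
    # update() copies each key/value pair into d; a repeated key
    # would overwrite the earlier value.
    d.update(part)

print(list(d))
```

Because each simple value is stored under its key directly, the result avoids the extra wrapping seen in the `{'id': {'id': ...}}` output above.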

3. Fix the dictionary's format up front, then fill in the values

d = {
    "id": "",
    "type": "",
    "visible": "",
    "created": {
        "version": "",
        "changeset": "",
        "timestamp": "",
        "user": "",
        "uid": ""
    },
    "pos": [0, 0],
    "address": {
        "housenumber": "",
        "postcode": "",
        "street": ""
    },
    "amenity": "",
    "cuisine": "",
    "name": "",
    "phone": ""
}

d['id'] = '2406124091'
d['address']['housenumber'] = '123456'

pprint.pprint(d['id'])
#'2406124091'

pprint.pprint(d['address'])
#{'housenumber': '123456', 'postcode': '', 'street': ''}

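### The write succeeded. In practice this approach is combined with loops and conditionals.
### Its advantage is that the data structure stays fixed.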
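The combination of loops and conditionals mentioned above might look like the following sketch. `TEMPLATE` is a shortened version of the template dictionary and `fill` is a hypothetical helper written for this example, not something from the original article.

```python
import copy

# Shortened template; the full version would list every field.
TEMPLATE = {
    "id": "",
    "type": "",
    "pos": [0, 0],
    "address": {"housenumber": "", "postcode": "", "street": ""},
}


def fill(template, values):
    """Return a copy of template with matching keys taken from values."""
    result = copy.deepcopy(template)
    for key, value in values.items():
        if key not in result:
            continue  # ignore fields the template does not define
        if isinstance(result[key], dict) and isinstance(value, dict):
            # Recurse so that nested templates keep their own shape.
            result[key] = fill(result[key], value)
        else:
            result[key] = value
    return result


record = fill(TEMPLATE, {
    "id": "2406124091",
    "address": {"housenumber": "5157"},
    "unknown": "dropped",
})
print(record)
```

Working on a deep copy leaves the template untouched, so it can be reused for every record.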

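Finally, programming rarely has a single right answer; all roads lead to Rome. The best approach is the one you find most natural to work with.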

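There is actually one more approach, which is to use the eval() function. Since many experienced developers strongly dislike that function, its usage is not discussed further here.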
