Community detection in Python: Louvain, networkx and community

Published: 2023/12/10 python 41 豆豆

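Communities

Suppose a graph describes some region and is divided into many subgraphs. When the nodes inside a subgraph are connected as strongly as possible while the connections between subgraphs are as weak as possible, such a subgraph can be called a community.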
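
As a rough illustration of this definition, the sketch below counts how many edges fall inside the proposed subgraphs and how many run between them. The toy graph (two triangles joined by one edge) and the helper name `count_edges` are assumptions made for this example and do not come from the original article.

```python
def count_edges(edges, communities):
    """Return (intra, inter): edges inside one community vs. across two."""
    # Map every node to the index of the community that contains it.
    node_to_com = {}
    for idx, com in enumerate(communities):
        for node in com:
            node_to_com[node] = idx

    intra = inter = 0
    for u, v in edges:
        if node_to_com[u] == node_to_com[v]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

good_split = [{0, 1, 2}, {3, 4, 5}]  # follows the two triangles
bad_split = [{0, 1, 3}, {2, 4, 5}]   # cuts through both triangles

print(count_edges(edges, good_split))  # (6, 1)
print(count_edges(edges, bad_split))   # (2, 5)
```

The first split keeps six of the seven edges inside a subgraph, so its parts are much closer to what the text calls communities than the parts of the second split.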

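Community detection algorithms

There are many community detection algorithms, for example LPA, HANP, SLPA and Louvain, and they do not all partition a graph in the same way. Louvain is a modularity-based community detection algorithm. It performs well in both efficiency and quality and can discover hierarchical community structure. Its optimization objective is to maximize the modularity of the whole network.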

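Modularity

Modularity measures how good a partition of a network into communities is. Its physical meaning is the difference between the number of edges that connect nodes inside the communities and the number expected if the edges were placed at random. Its value lies in [-1/2, 1). Put simply, for each community it takes the share of the total edge weight that lies inside the community and subtracts the squared share of the edge weight attached to the community's nodes: Q = sum over communities c of [ Σ_in / 2m - (Σ_tot / 2m)^2 ], where m is the total edge weight. For an undirected graph this is easier to picture: Σ_in is the degree contributed by edges inside the community and Σ_tot is the total degree of the community's nodes.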
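
To make the measure concrete, here is a minimal sketch that evaluates the standard modularity definition, Q = sum over communities of [ Σ_in / 2m - (Σ_tot / 2m)^2 ], on a small graph. The dict-of-dicts graph layout, the helper names `build_graph` and `modularity`, and the toy graph are assumptions chosen for this example.

```python
def build_graph(edges):
    """Build an undirected graph as a dict of dicts: G[u][v] = weight."""
    G = {}
    for u, v, w in edges:
        G.setdefault(u, {})[v] = w
        G.setdefault(v, {})[u] = w
    return G


def modularity(G, communities):
    """Modularity Q of a partition given as a list of node sets.

    Assumes every edge is stored in both directions and there are no
    self-loops.
    """
    # The sum of all degrees equals 2m, where m is the total edge weight.
    two_m = sum(sum(nbrs.values()) for nbrs in G.values())

    q = 0.0
    for com in communities:
        # sigma_in: weight between members, each edge counted twice.
        sigma_in = sum(w for u in com for v, w in G[u].items() if v in com)
        # sigma_tot: total degree of the members.
        sigma_tot = sum(sum(G[u].values()) for u in com)
        q += sigma_in / two_m - (sigma_tot / two_m) ** 2
    return q


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
G = build_graph([(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                 (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 1.0)])

print(round(modularity(G, [{0, 1, 2}, {3, 4, 5}]), 4))  # 0.3571
print(modularity(G, [set(G)]))                          # 0.0
```

Splitting along the two triangles scores 5/14, while putting every node into one community scores exactly 0, which matches the intuition that a partition is only rewarded for having more internal edges than chance would give.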

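Louvain algorithm

Algorithm steps:
1. Initially treat every vertex as its own community, so the number of communities equals the number of vertices.
2. For each vertex in turn, tentatively merge it with each adjacent vertex and compute the modularity gain. If the largest gain is greater than 0, move the vertex into the community of the corresponding neighbor.
3. Iterate step 2 until the algorithm is stable, that is, until no vertex changes community.
4. Compress all the nodes of each community into a single node. The weights of the edges inside the community become the weight of a self-loop on the new node, and the weights between communities become the weights of the edges between the new nodes.
5. Repeat steps 1-4 on the compressed graph until the algorithm is stable.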
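
Steps 1 and 2 can be sketched for a single vertex as follows. This is only an illustration under stated assumptions: the gain is obtained by recomputing modularity before and after the tentative move, which is simple but far slower than the closed-form gain that practical implementations use, and the helper names `modularity` and `gain` as well as the toy graph are made up for this example.

```python
def modularity(G, node_to_com):
    """Modularity of a partition given as a node -> community id dict."""
    two_m = sum(sum(nbrs.values()) for nbrs in G.values())
    sigma_in = {}   # per community: internal weight, each edge counted twice
    sigma_tot = {}  # per community: total degree of its members
    for u, nbrs in G.items():
        cu = node_to_com[u]
        sigma_tot[cu] = sigma_tot.get(cu, 0.0) + sum(nbrs.values())
        for v, w in nbrs.items():
            if node_to_com[v] == cu:
                sigma_in[cu] = sigma_in.get(cu, 0.0) + w
    return sum(sigma_in.get(c, 0.0) / two_m - (tot / two_m) ** 2
               for c, tot in sigma_tot.items())


def gain(G, node_to_com, v, target_com):
    """Modularity change if vertex v is moved into community target_com."""
    trial = dict(node_to_com)
    trial[v] = target_com
    return modularity(G, trial) - modularity(G, node_to_com)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
G = {}
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    G.setdefault(u, {})[v] = 1.0
    G.setdefault(v, {})[u] = 1.0

# Step 1: every vertex starts in its own community.
node_to_com = {v: v for v in G}

# Step 2 for vertex 0: try the community of each neighbor, keep the best.
gains = {node_to_com[u]: gain(G, node_to_com, 0, node_to_com[u]) for u in G[0]}
best_com = max(gains, key=gains.get)
if gains[best_com] > 0:
    node_to_com[0] = best_com

print(gains)
print(node_to_com[0])  # 1
```

Both candidate moves have a positive gain here, but joining vertex 1 (degree 2) is better than joining vertex 2 (degree 3), so vertex 0 ends up in community 1.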

# coding=utf-8
import collections
import random


def load_graph(path):
    """Read an undirected weighted graph from lines of 'v_i v_j weight'."""
    G = collections.defaultdict(dict)
    with open(path) as text:
        for line in text:
            vertices = line.strip().split()
            v_i = int(vertices[0])
            v_j = int(vertices[1])
            w = float(vertices[2])
            G[v_i][v_j] = w
            G[v_j][v_i] = w
    return G


class Vertex():

    def __init__(self, vid, cid, nodes, k_in=0):
        self._vid = vid
        self._cid = cid
        self._nodes = nodes
        self._kin = k_in  # weight of the edges inside this vertex


class Louvain():

    def __init__(self, G):
        self._G = G
        self._m = 0  # number of edges
        # Community info to maintain: community id -> set of vertex ids in it
        self._cid_vertices = {}
        # Vertex info to maintain: vertex id -> corresponding Vertex instance
        self._vid_vertex = {}
        for vid in self._G.keys():
            self._cid_vertices[vid] = set([vid])
            self._vid_vertex[vid] = Vertex(vid, vid, set([vid]))
            self._m += sum([1 for neighbor in self._G[vid].keys() if neighbor > vid])

    def first_stage(self):
        mod_inc = False  # used to decide whether the algorithm can terminate
        # Copy the keys into a list first; shuffling a temporary copy, as the
        # original did, left the visiting order unchanged.
        visit_sequence = list(self._G.keys())
        random.shuffle(visit_sequence)
        while True:
            can_stop = True  # whether the first stage can terminate
            for v_vid in visit_sequence:
                v_cid = self._vid_vertex[v_vid]._cid
                k_v = sum(self._G[v_vid].values()) + self._vid_vertex[v_vid]._kin
                cid_Q = {}
                for w_vid in self._G[v_vid].keys():
                    w_cid = self._vid_vertex[w_vid]._cid
                    if w_cid in cid_Q:
                        continue
                    else:
                        tot = sum([sum(self._G[k].values()) + self._vid_vertex[k]._kin
                                   for k in self._cid_vertices[w_cid]])
                        if w_cid == v_cid:
                            tot -= k_v
                        k_v_in = sum([v for k, v in self._G[v_vid].items()
                                      if k in self._cid_vertices[w_cid]])
                        # Only the sign of delta_Q is needed, so the factor
                        # 1/(2*self._m) is left out.
                        delta_Q = k_v_in - k_v * tot / self._m
                        cid_Q[w_cid] = delta_Q

                cid, max_delta_Q = sorted(cid_Q.items(), key=lambda item: item[1], reverse=True)[0]
                if max_delta_Q > 0.0 and cid != v_cid:
                    self._vid_vertex[v_vid]._cid = cid
                    self._cid_vertices[cid].add(v_vid)
                    self._cid_vertices[v_cid].remove(v_vid)
                    can_stop = False
                    mod_inc = True
            if can_stop:
                break
        return mod_inc

    def second_stage(self):
        cid_vertices = {}
        vid_vertex = {}
        # Compress every non-empty community into one new vertex.
        for cid, vertices in self._cid_vertices.items():
            if len(vertices) == 0:
                continue
            new_vertex = Vertex(cid, cid, set())
            for vid in vertices:
                new_vertex._nodes.update(self._vid_vertex[vid]._nodes)
                new_vertex._kin += self._vid_vertex[vid]._kin
                for k, v in self._G[vid].items():
                    if k in vertices:
                        new_vertex._kin += v / 2.0
            cid_vertices[cid] = set([cid])
            vid_vertex[cid] = new_vertex

        # Sum the weights between communities into edges of the new graph.
        G = collections.defaultdict(dict)
        for cid1, vertices1 in self._cid_vertices.items():
            if len(vertices1) == 0:
                continue
            for cid2, vertices2 in self._cid_vertices.items():
                if cid2 <= cid1 or len(vertices2) == 0:
                    continue
                edge_weight = 0.0
                for vid in vertices1:
                    for k, v in self._G[vid].items():
                        if k in vertices2:
                            edge_weight += v
                if edge_weight != 0:
                    G[cid1][cid2] = edge_weight
                    G[cid2][cid1] = edge_weight

        self._cid_vertices = cid_vertices
        self._vid_vertex = vid_vertex
        self._G = G

    def get_communities(self):
        communities = []
        for vertices in self._cid_vertices.values():
            if len(vertices) != 0:
                c = set()
                for vid in vertices:
                    c.update(self._vid_vertex[vid]._nodes)
                communities.append(c)
        return communities

    def execute(self):
        iter_time = 1
        while True:
            iter_time += 1
            mod_inc = self.first_stage()
            if mod_inc:
                self.second_stage()
            else:
                break
        return self.get_communities()


if __name__ == '__main__':
    G = load_graph('s.txt')
    algorithm = Louvain(G)
    communities = algorithm.execute()
    # Print the communities from largest to smallest
    communities = sorted(communities, key=lambda b: -len(b))
    count = 0
    for community in communities:
        count += 1
        print("Community", count, " ", community)
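
The listing above reads its graph from `s.txt`, but the article does not show that file. Judging from `load_graph`, each line holds two integer vertex ids and a weight separated by whitespace. The sketch below writes such a file to a temporary location and parses it with the same logic. The sample edges are hypothetical data made up for this example.

```python
import collections
import os
import tempfile


def load_graph(path):
    """Same parsing logic as load_graph in the listing above."""
    G = collections.defaultdict(dict)
    with open(path) as text:
        for line in text:
            v_i, v_j, w = line.split()
            G[int(v_i)][int(v_j)] = float(w)
            G[int(v_j)][int(v_i)] = float(w)
    return G


# Hypothetical sample data: two triangles joined by the edge 2-3.
sample = "0 1 1.0\n0 2 1.0\n1 2 1.0\n3 4 1.0\n3 5 1.0\n4 5 1.0\n2 3 1.0\n"

# Write the sample to a temporary file so the example cleans up after itself.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

G = load_graph(path)
os.remove(path)

print(dict(G[2]))  # {0: 1.0, 1: 1.0, 3: 1.0}
```

Each undirected edge is stored in both directions, so vertex 2 lists vertex 3 as a neighbor and vice versa.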

Community partitioning and visualization with networkx and community

Installation

To use community, simply install python-louvain:
pip install python-louvain
pip install networkx

Usage

Best partition

community.best_partition(graph, partition=None, weight='weight', resolution=1.0)

Compute the partition of the graph nodes which maximises the modularity (or try…) using the Louvain heuristics.
This is the partition of highest modularity, i.e. the highest partition of the dendrogram generated by the Louvain algorithm.

import community
import networkx as nx
import matplotlib.pyplot as plt

#better with karate_graph() as defined in networkx example.
#erdos renyi don't have true community structure
G = nx.erdos_renyi_graph(30, 0.05)

#first compute the best partition
partition = community.best_partition(G)

#drawing
size = float(len(set(partition.values())))
pos = nx.spring_layout(G)
count = 0.
for com in set(partition.values()) :
    count = count + 1.
    list_nodes = [nodes for nodes in partition.keys()
                                if partition[nodes] == com]
    nx.draw_networkx_nodes(G, pos, list_nodes, node_size = 20,
                                node_color = str(count / size))

nx.draw_networkx_edges(G,pos, alpha=0.5)
plt.show()
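
`best_partition` returns a dict that maps each node to a community id, which is why the drawing code filters `partition.keys()` once per community. When the communities themselves are needed, the dict can be inverted in one pass. The sketch below assumes a small hand-written `partition` dict in that shape, and `group_by_community` is a hypothetical helper name, not part of python-louvain.

```python
from collections import defaultdict


def group_by_community(partition):
    """Turn a node -> community id dict into node sets, largest first."""
    groups = defaultdict(set)
    for node, com in partition.items():
        groups[com].add(node)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical result in the same shape that best_partition returns.
partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}

communities = group_by_community(partition)
print([sorted(c) for c in communities])  # [[0, 1, 2], [3, 4]]
```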
