ML HierarchicalClustering: A Custom Implementation of the Hierarchical Clustering Algorithm
Contents

Output
Implementation


Output

To be updated.

Implementation
# -*- coding: utf-8 -*-
import numpy as np

class cluster_node:
    """A node in the cluster tree: either a leaf (single sample) or a merged cluster."""
    def __init__(self, vec, left=None, right=None, distance=0.0, id=None, count=1):
        self.left = left          # left child (None for a leaf)
        self.right = right        # right child (None for a leaf)
        self.vec = vec            # feature vector (centroid for merged clusters)
        self.id = id              # >= 0 for leaves, < 0 for merged clusters
        self.distance = distance  # distance between the two merged children
        self.count = count        # only used for weighted average

def L2dist(v1, v2):
    # Euclidean (L2) distance
    return np.sqrt(np.sum((v1 - v2) ** 2))

def L1dist(v1, v2):
    # Manhattan (L1) distance
    return np.sum(np.abs(v1 - v2))

def hcluster(features, distance=L2dist):
    # cluster the rows of the "features" matrix
    distances = {}  # cache of distance calculations, keyed by id pairs
    currentclustid = -1

    # clusters are initially just the individual rows
    clust = [cluster_node(np.array(features[i]), id=i) for i in range(len(features))]

    while len(clust) > 1:
        lowestpair = (0, 1)
        closest = distance(clust[0].vec, clust[1].vec)
        # find the pair of clusters with the smallest distance
        for i in range(len(clust)):
            for j in range(i + 1, len(clust)):
                if (clust[i].id, clust[j].id) not in distances:
                    distances[(clust[i].id, clust[j].id)] = distance(clust[i].vec, clust[j].vec)
                d = distances[(clust[i].id, clust[j].id)]
                if d < closest:
                    closest = d
                    lowestpair = (i, j)

        # merge the closest pair: the new vector is the element-wise average
        mergevec = (clust[lowestpair[0]].vec + clust[lowestpair[1]].vec) / 2.0
        newcluster = cluster_node(mergevec,
                                  left=clust[lowestpair[0]],
                                  right=clust[lowestpair[1]],
                                  distance=closest,
                                  id=currentclustid)

        # ids of merged (non-leaf) clusters are negative
        currentclustid -= 1
        del clust[lowestpair[1]]
        del clust[lowestpair[0]]
        clust.append(newcluster)

    return clust[0]

def extract_clusters(clust, dist):
    # extract the list of sub-tree clusters from the hcluster tree with distance < dist
    # (clust is the tree built above, dist is the cut threshold)
    if clust.distance < dist:
        # we have found a cluster subtree
        return [clust]
    # otherwise check the left and right branches
    cl = []
    cr = []
    if clust.left is not None:
        cl = extract_clusters(clust.left, dist=dist)
    if clust.right is not None:
        cr = extract_clusters(clust.right, dist=dist)
    return cl + cr

def get_cluster_elements(clust):
    # return the ids of the elements in a cluster sub-tree
    if clust.id >= 0:
        # a non-negative id means this is a leaf
        return [clust.id]
    # otherwise check the left and right branches
    cl = []
    cr = []
    if clust.left is not None:
        cl = get_cluster_elements(clust.left)
    if clust.right is not None:
        cr = get_cluster_elements(clust.right)
    return cl + cr

def printclust(clust, labels=None, n=0):
    # indent to make a hierarchy layout
    print('  ' * n, end='')
    if clust.id < 0:
        # a negative id means this is a branch
        print('-')
    else:
        # a non-negative id means this is an endpoint (leaf)
        if labels is None:
            print(clust.id)
        else:
            print(labels[clust.id])
    if clust.left is not None:
        printclust(clust.left, labels=labels, n=n + 1)
    if clust.right is not None:
        printclust(clust.right, labels=labels, n=n + 1)

def getheight(clust):
    # height of the tree (number of leaves), computed recursively
    if clust.left is None and clust.right is None:
        return 1  # an endpoint has height 1
    # otherwise the height is the sum of the heights of the two branches
    return getheight(clust.left) + getheight(clust.right)

def getdepth(clust):
    # depth of the tree (cumulative merge distance), computed recursively
    if clust.left is None and clust.right is None:
        return 0
    return max(getdepth(clust.left), getdepth(clust.right)) + clust.distance
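
To see how the pieces fit together, here is a minimal usage sketch (not from the original post): the toy 2-D points and the dist=1.5 cut threshold are made-up values for illustration, and the function names refer to the definitions above.

import numpy as np

# Hypothetical toy data: two well-separated groups of 2-D points
features = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                     [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

tree = hcluster(features, distance=L2dist)  # build the full dendrogram

printclust(tree)  # print the tree as an indented layout; '-' marks merge nodes

# cut the dendrogram at an assumed threshold to obtain flat clusters
for subtree in extract_clusters(tree, dist=1.5):
    print(get_cluster_elements(subtree))  # row indices of one flat cluster

print(getheight(tree))  # number of leaves, 6 here
print(getdepth(tree))   # cumulative merge distance along the deepest path

Because hcluster replaces the closest pair with their element-wise average, every internal node carries a centroid; cutting the tree with extract_clusters at a threshold then turns the hierarchy into a flat clustering, one list of row indices per group.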