Data Mining: Affinity Analysis Function (General-Purpose)
# Library imports
import numpy as np
from collections import defaultdict
from operator import itemgetter


def Affinity_Analysis(dataset, features, nums_feature):
    # Walk the data and count, for every ordered pair (premise, conclusion),
    # how often the rule "premise -> conclusion" holds or fails
    valid_rules = defaultdict(int)
    invalid_rules = defaultdict(int)  # counted for completeness; not used below
    nums_occurances = defaultdict(int)
    for sample in dataset:
        for primise in range(nums_feature):
            if sample[primise] == 0:
                continue
            nums_occurances[primise] += 1
            for conclusion in range(nums_feature):
                if conclusion == primise:
                    continue
                if sample[conclusion] == 1:
                    valid_rules[(primise, conclusion)] += 1
                else:
                    invalid_rules[(primise, conclusion)] += 1

    # Support: number of samples in which the rule holds
    support = valid_rules
    # Confidence: support divided by the number of samples containing the premise
    confidence = defaultdict(float)
    for primise, conclusion in valid_rules.keys():
        confidence[(primise, conclusion)] = valid_rules[(primise, conclusion)] / nums_occurances[primise]

    # Sort by support and by confidence, both in descending order
    sorted_support = sorted(support.items(), key=itemgetter(1), reverse=True)
    sorted_confidence = sorted(confidence.items(), key=itemgetter(1), reverse=True)

    # Show the results
    print('\nSupport, highest to lowest:')
    for i in sorted_support:
        print("[{0} {1}]\t- Support: {2}".format(features[i[0][0]], features[i[0][1]], i[1]))
    print('\nConfidence, highest to lowest:')
    for i in sorted_confidence:
        print("[{0} {1}]\t- Confidence: {2:.3f}".format(features[i[0][0]], features[i[0][1]], i[1]))


if __name__ == "__main__":
    # Load the data: a 0/1 matrix in which each row is one customer's purchases
    # and each column is one product
    dataset_filename = "path to the dataset file"  # placeholder
    X = np.loadtxt(dataset_filename)
    # Feature names
    features = ['bread', 'milk', 'cheese', 'apple', 'banana']
    # Number of features
    nums_feature = 5
    # Call the function
    Affinity_Analysis(X, features, nums_feature)
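The counting loop above can also be expressed as a single matrix product, which may be easier to check by hand on a small example. The sketch below is not part of the original function: the helper name `affinity_counts` and the four-row toy matrix are made up for illustration, and it assumes the dataset contains only exact 0 and 1 values.

```python
import numpy as np


def affinity_counts(X):
    """Hypothetical helper: support and confidence matrices for a 0/1 matrix X."""
    X = np.asarray(X, dtype=int)
    # support[i, j] = number of rows in which item i and item j are both 1
    support = X.T @ X
    # The diagonal holds how many rows contain each item on its own
    occurrences = np.diag(support).astype(float)
    # confidence[i, j] = support[i, j] / occurrences[i]; items never bought stay at 0
    confidence = np.divide(
        support,
        occurrences[:, None],
        out=np.zeros(support.shape, dtype=float),
        where=occurrences[:, None] > 0,
    )
    # A rule needs a premise and a conclusion that differ, so clear the diagonal
    np.fill_diagonal(support, 0)
    np.fill_diagonal(confidence, 0.0)
    return support, confidence


# Toy data (invented): rows are customers, columns are bread, milk, cheese
X = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
])
features = ['bread', 'milk', 'cheese']
support, confidence = affinity_counts(X)

# Rule "bread -> milk": index 0 is the premise, index 1 the conclusion
print("[{0} {1}]\t- Support: {2}\t- Confidence: {3:.3f}".format(
    features[0], features[1], support[0, 1], confidence[0, 1]))
```

Unlike the dictionary-based version, this sketch also holds an entry for rules that never occur (with support 0), so filter those out before sorting if only observed rules are wanted.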