Python affinity analysis for movie recommendation (paper): reading notes, part 2, on 《Python数据挖掘入门与实战》 (affinity analysis)...

Published: 2023/12/20 python 34 豆豆


# The code in the original text is rather scattered, and most of the code found
# online is copied from one source to another, so I first consolidated it based
# on my own understanding and added comments.
import numpy as np
from collections import defaultdict
from operator import itemgetter


# Generate the association rules
def make_relation_rule(X, n_features):
    valid_rules = defaultdict(int)     # how many times each rule holds
    invalid_rules = defaultdict(int)   # how many times each rule fails
    num_occurances = defaultdict(int)  # how many times each premise item occurs

    # Complexity analysis: n_samples * n_features * n_features
    for sample in X:  # process every individual in the sample set
        for premise in range(n_features):  # process every feature of the individual
            if sample[premise] == 0:
                continue  # premise not satisfied, move on to the next one
            num_occurances[premise] += 1  # premise satisfied (value is 1), count it
            for conclusion in range(n_features):  # loop over the features again
                if premise == conclusion:
                    continue  # skip rules whose premise and conclusion are the same
                if sample[conclusion] == 1:  # the target feature is 1
                    valid_rules[(premise, conclusion)] += 1    # the rule holds
                else:
                    invalid_rules[(premise, conclusion)] += 1  # the rule fails

    support = valid_rules            # the valid-rule counts are the support
    confidence = defaultdict(float)  # confidence of each rule
    # print(valid_rules)    # print the valid-rule counts
    # print(invalid_rules)  # print the invalid-rule counts
    for premise, conclusion in valid_rules.keys():  # the keys are tuples
        rule = (premise, conclusion)
        # Convert the count from int to float to compute the rule's confidence
        confidence[rule] = float(valid_rules[rule]) / num_occurances[premise]
    return support, confidence


# Print the support and confidence for one pair of items
def print_especial_rule(premise, conclusion, support, confidence, features):
    premise_name = features[premise]        # convert index to item name
    conclusion_name = features[conclusion]  # convert index to item name
    print("Rule:If a person buys {0} they will also buy {1}".format(premise_name, conclusion_name))
    print("-Support:{0}".format(support[(premise, conclusion)]))           # print the support
    print("-Confidence:{0:.3f}".format(confidence[(premise, conclusion)]))  # print the confidence


# Print the topN rules with the highest support
def print_topN_suppor_rule(support, confidence, features, topN):
    sorted_support = sorted(support.items(), key=itemgetter(1), reverse=True)
    print('Top {0} rules by support:'.format(topN))
    for index in range(min(topN, len(sorted_support))):  # never index past the last rule
        print("Rule #{0}".format(index + 1))
        premise, conclusion = sorted_support[index][0]
        print_especial_rule(premise, conclusion, support, confidence, features)


# Print the topN rules with the highest confidence
def print_topN_confidence_rule(support, confidence, features, topN):
    sorted_confidence = sorted(confidence.items(), key=itemgetter(1), reverse=True)
    print('Top {0} rules by confidence:'.format(topN))
    for index in range(min(topN, len(sorted_confidence))):  # never index past the last rule
        print("Rule #{0}".format(index + 1))
        premise, conclusion = sorted_confidence[index][0]
        print_especial_rule(premise, conclusion, support, confidence, features)


if __name__ == '__main__':
    # Load the dataset with numpy
    X = np.loadtxt("affinity_dataset.txt")
    # Map each column index to an item name
    features = ["bread", "milk", "cheese", "apples", "bananas"]
    '''
    # Version 1: support for buying apples
    num_app_purchases = 0
    for sample in X:
        if sample[3] == 1:
            num_app_purchases += 1
    print('{0} people bought Apples'.format(num_app_purchases))
    '''
    # Get the shape of the dataset: n_samples is the number of samples,
    # n_features is the number of columns
    n_samples, n_features = X.shape
    print(n_samples, n_features)
    # Build the support and confidence collections
    support, confidence = make_relation_rule(X, n_features)
    # Choose the pair of items whose association we want to inspect
    premise = 1
    conclusion = 3
    # Print the support and confidence for these two items
    print_especial_rule(premise, conclusion, support, confidence, features)
    # Print the 5 rules with the highest support
    print_topN_suppor_rule(support, confidence, features, 5)
    # Print the 5 rules with the highest confidence
    print_topN_confidence_rule(support, confidence, features, 5)
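The script above needs affinity_dataset.txt, which is not included in this post. As a rough sketch for trying the idea without that file, the same counts can be computed with NumPy matrix operations on a small inline dataset. Everything here is an assumption for illustration: the 6x5 toy matrix is made up, and rule_matrices is a hypothetical helper, not a function from the book or from NumPy. It uses the same definitions as the loops above: support is the number of samples in which both premise and conclusion are 1, and confidence is that count divided by the number of samples in which the premise is 1. It assumes the data is strictly 0/1.

```python
import numpy as np

features = ["bread", "milk", "cheese", "apples", "bananas"]

# Hypothetical toy data (rows = shoppers, columns = items). This is NOT the
# content of affinity_dataset.txt, which the post does not show.
X = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0],
])


def rule_matrices(X):
    """Return (support, confidence) as n_features x n_features arrays.

    support[p, c]    = number of samples where items p and c are both 1
    confidence[p, c] = support[p, c] / number of samples where item p is 1
    Assumes X contains only 0 and 1.
    """
    X = np.asarray(X, dtype=int)
    support = X.T @ X                        # co-occurrence counts
    occurrences = np.diag(support).copy()    # diagonal = how often each item occurs
    np.fill_diagonal(support, 0)             # skip rules with premise == conclusion
    confidence = np.divide(
        support,
        occurrences[:, None],
        out=np.zeros(support.shape, dtype=float),
        where=occurrences[:, None] > 0,      # avoid dividing by zero for unseen items
    )
    return support, confidence


support, confidence = rule_matrices(X)
p, c = features.index("milk"), features.index("apples")
print("Rule: if a person buys {0} they will also buy {1}".format(features[p], features[c]))
print("- Support: {0}".format(support[p, c]))
print("- Confidence: {0:.3f}".format(confidence[p, c]))
```

In this toy data milk appears in 4 baskets and milk together with apples in 3, so the rule prints a support of 3 and a confidence of 0.750. The matrix product replaces the two inner loops over features, which may be more convenient for larger datasets, but the nested-loop version above is easier to follow line by line.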
