Using wordcloud to build a Chinese word cloud
1. Read the data, drop NaN rows, and segment with jieba
import pandas as pd
import jieba

df = pd.read_csv("./data/entertainment_news.csv", encoding='utf-8')
df = df.dropna()                          # drop rows with missing values
content = df.content.values.tolist()      # raw text lines as a Python list
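The load-and-clean step can be sanity-checked on a toy frame (the rows below are made up for illustration, not taken from entertainment_news.csv):

```python
import pandas as pd

# Toy stand-in for the news CSV (hypothetical rows, one with a missing value).
df = pd.DataFrame({'content': ['电影 上映', None, '明星 访谈']})
df = df.dropna()                        # the None row is removed
content = df.content.values.tolist()    # ['电影 上映', '明星 访谈']
```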
# jieba.load_userdict("data/user_dic.txt")  # optional: load a custom user dictionary
segment = []
for line in content:
    try:
        segs = jieba.lcut(line)                   # segment one line into tokens
        for seg in segs:
            if len(seg) > 1 and seg != '\r\n':    # drop single characters and line breaks
                segment.append(seg)
    except Exception:
        print(line)
        continue

2. Remove stop words
words_df=pd.DataFrame({'segment':segment})
#words_df.head()
stopwords = pd.read_csv("data/stopwords.txt", index_col=False, quoting=3, sep="\t", names=['stopword'], encoding='utf-8')  # quoting=3 (QUOTE_NONE): treat quote characters literally
#stopwords.head()
words_df=words_df[~words_df.segment.isin(stopwords.stopword)]
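The `~isin` filter above keeps only tokens that do not appear in the stopword list; it can be illustrated with toy data (the tokens and stopwords here are hypothetical):

```python
import pandas as pd

# Hypothetical tokens and stopwords to show the ~isin filter.
words_df = pd.DataFrame({'segment': ['电影', '的', '明星', '了']})
stopwords = pd.DataFrame({'stopword': ['的', '了', '是']})
words_df = words_df[~words_df.segment.isin(stopwords.stopword)]
# Only '电影' and '明星' survive the filter.
```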
3. Count word frequencies

words_stat = words_df.groupby('segment').agg(count=('segment', 'size'))
words_stat = words_stat.reset_index().sort_values(by=['count'], ascending=False)
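The count-and-sort step can be verified on toy tokens (hypothetical values; this assumes a pandas version recent enough to support named aggregation, which replaces the long-removed `agg({name: numpy.size})` dict form):

```python
import pandas as pd

# Hypothetical segmented tokens.
words_df = pd.DataFrame({'segment': ['电影', '明星', '电影', '电影', '明星', '导演']})
words_stat = words_df.groupby('segment').agg(count=('segment', 'size'))
words_stat = words_stat.reset_index().sort_values(by=['count'], ascending=False)
# Most frequent token first: 电影 (3), 明星 (2), 导演 (1).
```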
words_stat.head()

4. Draw the word cloud

from wordcloud import WordCloud
import matplotlib.pyplot as plt

wordcloud = WordCloud(font_path="data/simhei.ttf", background_color="white", max_font_size=80)
word_frequence = {x[0]:x[1] for x in words_stat.head(1000).values}
wordcloud = wordcloud.fit_words(word_frequence)   # fit_words expects a {word: frequency} dict
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
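For reference, the same word-to-frequency dict that `fit_words` consumes can also be built with `collections.Counter` instead of the pandas pipeline; the tokens below are hypothetical:

```python
from collections import Counter

# Hypothetical segmented tokens; Counter.most_common(n) mirrors
# words_stat.head(n) from the pandas pipeline.
segment = ['电影', '明星', '电影', '电影', '明星', '导演']
word_frequence = dict(Counter(segment).most_common(1000))
# {'电影': 3, '明星': 2, '导演': 1}
```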