Problem: return unicode(text, encoding, errors=errors) UnicodeDecodeError: 'utf-8' codec can't decode
Full error message:
Traceback (most recent call last):
  File "D:/xiangmu/python/test/提取詞向量.py", line 13, in <module>
    trainWordvec("評論分詞提取(云南).txt",500)
  File "D:/xiangmu/python/test/提取詞向量.py", line 8, in trainWordvec
    model=word2vec.Word2Vec(sentences,size=sizes)
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\word2vec.py", line 597, in __init__
    super(Word2Vec, self).__init__(
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\base_any2vec.py", line 745, in __init__
    self.build_vocab(sentences=sentences, corpus_file=corpus_file, trim_rule=trim_rule)
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\base_any2vec.py", line 921, in build_vocab
    total_words, corpus_count = self.vocabulary.scan_vocab(
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\word2vec.py", line 1403, in scan_vocab
    total_words, corpus_count = self._scan_vocab(sentences, progress_per, trim_rule)
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\word2vec.py", line 1372, in _scan_vocab
    for sentence_no, sentence in enumerate(sentences):
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\models\word2vec.py", line 1201, in __iter__
    words, rest = (utils.to_unicode(text[:last_token]).split(),
  File "D:\xiangmu\python\test\venv\lib\site-packages\gensim\utils.py", line 368, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 7: invalid start byte
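Solution
Re-save the txt file as UTF-8 with BOM. The byte 0xbb is not a valid UTF-8 start byte, which suggests the file was saved in a different encoding (on Chinese-language Windows, most likely GBK), while gensim decodes the corpus as UTF-8. Plain UTF-8 without a BOM should also work, and it avoids a stray U+FEFF character being attached to the first token.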
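If you would rather convert the file from a script than from a text editor, a minimal sketch follows. It assumes the original file is GBK-encoded, which the post does not confirm; `convert_to_utf8`, `source_encoding`, and `with_bom` are hypothetical names chosen for this illustration, not part of gensim.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="gbk", with_bom=True):
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption (GBK is common on Chinese Windows);
    change it if your file uses a different encoding.
    """
    # Read raw bytes so that no newline translation takes place.
    raw = Path(src).read_bytes()
    # Raises UnicodeDecodeError if the guessed source encoding is wrong.
    text = raw.decode(source_encoding)
    # "utf-8-sig" writes the BOM (EF BB BF) first; "utf-8" writes none.
    target = "utf-8-sig" if with_bom else "utf-8"
    Path(dst).write_bytes(text.encode(target))
    return dst
```

Writing to a new file and passing that path to the training function keeps the original corpus untouched in case the encoding guess turns out to be wrong.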