Experiment log 1
1. Using tagger with the wikipedia-pubmed-and-PMC-w2v word vectors
Loading pretrained embeddings from ../.local/lib/python3.5/site-packages/neuroner/data/word_vectors/wikipedia-pubmed-and-PMC-w2v.txt...
WARNING: 5443657 invalid lines
Loaded 0 pretrained embeddings.
0 / 18309 (0.0000%) words have been initialized with pretrained embeddings.
0 found directly, 0 after lowercasing, 0 after lowercasing + zero.
Compiling...
Problem: the word vectors have no effect. Every line of the file is rejected as invalid, so nothing is loaded.
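A note on reading the last counter line of that log. My understanding, which I have not checked against the tagger source, is that each vocabulary word is looked up in three stages: as written, then lowercased, then lowercased with every digit replaced by 0. The sketch below only illustrates that reading. `lookup_stage` and the toy vocabulary are hypothetical names made up for the example, not tagger code.

```python
import re


def lookup_stage(word, pretrained):
    """Return which fallback stage finds `word` in `pretrained`, or None."""
    if word in pretrained:
        return "direct"
    lowered = word.lower()
    if lowered in pretrained:
        return "lowercased"
    # Assumption: "zero" means every digit is replaced by the character 0.
    zeroed = re.sub(r"\d", "0", lowered)
    if zeroed in pretrained:
        return "lowercased+zero"
    return None


# Toy embedding table (hypothetical words and values).
pretrained = {"aspirin": [0.1, 0.2], "il-0": [0.3, 0.4]}
words = ["aspirin", "Aspirin", "IL-6", "warfarin"]

stages = [lookup_stage(w, pretrained) for w in words]

# With an empty table, as in the failed runs, no stage can ever match.
empty_stages = [lookup_stage(w, {}) for w in words]

print(stages)
print(empty_stages)
```

When zero embeddings are loaded, all three counters are necessarily 0, so that line by itself says nothing about vocabulary coverage.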
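2. Using tagger with the PMC-w2v word vectors
Loading pretrained embeddings from ./dataset/PMC-w2v.txt...
WARNING: 2515687 invalid lines
Loaded 0 pretrained embeddings.
0 / 18141 (0.0000%) words have been initialized with pretrained embeddings.
0 found directly, 0 after lowercasing, 0 after lowercasing + zero.
Compiling...
Still the same problem: the word vectors cannot be loaded.
Solution: found the cause. The dimension of the vectors in the file differs from the default dimension, so the dimension has to be given explicitly with --word_dim 200. With that set: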
Found 10407 unique words (115614 in total)
Loading pretrained embeddings from ./dataset/PMC-w2v.txt...
Found 80 unique characters
Found 9 unique named entity tags
4595 / 4598 / 4840 sentences in train / dev / test.
Saving the mappings to disk...
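To avoid guessing the value for --word_dim, the dimension can be read off the embedding file before training. The sketch below assumes a plain-text file with one word per line followed by its vector components, separated by whitespace. It also assumes the loader rejects every line whose field count is not word_dim + 1, which is my reading of the "invalid lines" warning rather than something I verified in the tagger code. `vector_lengths`, `count_valid_lines` and the toy file are hypothetical names for illustration.

```python
import os
import tempfile
from collections import Counter


def vector_lengths(path, max_lines=10000):
    """Count how often each vector length occurs in the first lines of the file."""
    counts = Counter()
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for index, line in enumerate(handle):
            if index >= max_lines:
                break
            fields = line.rstrip().split()
            if fields:
                # The first field is the word, the rest are vector components.
                counts[len(fields) - 1] += 1
    return counts


def count_valid_lines(path, word_dim):
    """Return (valid, invalid) line counts under the assumed word_dim + 1 rule."""
    valid = invalid = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if len(line.rstrip().split()) == word_dim + 1:
                valid += 1
            else:
                invalid += 1
    return valid, invalid


# Toy file: a word2vec-style header line plus three 4-dimensional vectors.
sample = (
    "3 4\n"
    "cell 0.1 0.2 0.3 0.4\n"
    "gene 0.5 0.6 0.7 0.8\n"
    "drug 0.9 1.0 1.1 1.2\n"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

lengths = vector_lengths(sample_path)
likely_dim = lengths.most_common(1)[0][0]
right = count_valid_lines(sample_path, likely_dim)  # matching dimension
wrong = count_valid_lines(sample_path, 100)         # mismatched dimension
os.remove(sample_path)

print("most common vector length:", likely_dim)
print("valid/invalid with matching dim:", right)
print("valid/invalid with mismatched dim:", wrong)
```

Running `vector_lengths` on the real embedding file should show one dominant length. If the assumption about the loader holds, that length is the value to pass to --word_dim, and a mismatched value reproduces the "Loaded 0 pretrained embeddings" symptom.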
Training currently uses the CDR dataset from Att.
3. Using tagger with the chemdner_pubmed_drug.word2vec_model_token4_d50 word vectors
Reposted from: https://www.cnblogs.com/BlueBlueSea/p/10724243.html