Tutorial: How to upload transformer weights and tokenizers from AllenNLP to HuggingFace
This is the first of a series of mini-tutorials to help you with various aspects of the AllenNLP library.
If you’re new to AllenNLP, consider first going through the official guide, as these tutorials will be focused on more advanced use cases.
Please keep in mind these tutorials are written for version 1.0 and greater of AllenNLP and may not be relevant for older versions.
AllenNLP is commonly used to fine-tune transformer models for specific tasks. We host several of these models on our demo site, such as a BERT model applied to the SQuAD v1.1 question-answering task, and a RoBERTa model applied to the SNLI textual entailment task.
You can find the code and configuration files used to train these models in the AllenNLP Models repository.
This tutorial will show you how to take a fine-tuned transformer model, like one of these, and upload the weights and/or the tokenizer to HuggingFace’s model hub.
Note that we are talking about uploading only the transformer part of your model, not including any task-specific heads that you’re using.
First of all, you’ll need to know how a transformer model and tokenizer are actually integrated into an AllenNLP model.
This is usually done by providing your dataset reader with a PretrainedTransformerTokenizer and a matching PretrainedTransformerIndexer, and then providing your model with the corresponding PretrainedTransformerEmbedder.
If your dataset reader and model are already general enough that they can accept any type of tokenizer / token indexer and token embedder, respectively, then the only thing you need to do in order to utilize a pretrained transformer in your model is tweak your training configuration file.
With the RoBERTa SNLI model, for example, the “dataset_reader” part of the config would look like this:
"dataset_reader": {"type": "snli",
"tokenizer": {
"type": "pretrained_transformer",
"model_name": "roberta-large",
"add_special_tokens": false
},
"token_indexers": {
"tokens": {
"type": "pretrained_transformer",
"model_name": "roberta-large",
"max_length": 512
}
}
}
And the “model” part of the config would look like this:
"model": {"type": "basic_classifier",
"text_field_embedder": {
"token_embedders": {
"tokens": {
"type": "pretrained_transformer",
"model_name": "roberta-large",
"max_length": 512
}
}
},
...
}
Once you’ve trained your model, just follow these 3 steps to upload the transformer part of your model to HuggingFace.
Step 1: Load your tokenizer and your trained model.
If you get a ConfigurationError during this step that says something like “foo is not a registered name for bar”, that just means you need to import any other classes that your model or dataset reader use so they get registered.
Step 2: Serialize your tokenizer and just the transformer part of your model using the HuggingFace transformers API.
Step 3: Upload the serialized tokenizer and transformer to the HuggingFace model hub.
Finally, just follow the steps from HuggingFace’s documentation to upload your new cool transformer with their CLI.
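At the time this tutorial was written, the upload looked roughly like this with the CLI that shipped with the transformers library (the hub tooling has since moved to `huggingface-cli`, so check HuggingFace's current documentation for the exact commands and flags):

```shell
# Log in with your HuggingFace account, then upload the serialized files.
transformers-cli login
transformers-cli upload ./my-serialization-dir/
```

Here `./my-serialization-dir/` is the hypothetical directory you passed to `save_pretrained` in step 2.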
That’s it! Happy NLP-ing!
If you find any issues with this tutorial please leave a comment or open a new issue in the AllenNLP repo and give it the “Tutorials” tag.
Follow @allen_ai and @ai2_allennlp on Twitter, and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.
Source: https://medium.com/ai2-blog/tutorial-how-to-upload-transformer-weights-and-tokenizers-from-allennlp-to-huggingface-ecf6c0249bf