Using Hugging Face Transformers pretrained models in TensorFlow 2

Published: 2025/3/12

A few words first
- About Hugging Face
- Portal
- pipeline
- Loading the model
- Setting the training parameters
- Data preprocessing
- Training the model
- Closing remarks

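A few words first

It has been a long time since I posted anything. Since starting work I have done little besides setting up environments, and now that the model finally runs, here is a short summary of the whole process (a filler post).

These days almost nobody in NLP can avoid fine-tuning a pretrained BERT or other transformer. The traditional approach means building the whole model yourself: feeding data through a processor, configuring paths in util, and doing the "alchemy" in bert_fine_tune.sh, which is frankly tedious. For the many tasks that don't need a complicated downstream setup, calling a pretrained model directly is the quickest and most efficient route, and the well-known Hugging Face library exists for exactly this. It is mainly geared towards PyTorch, though, so using these models effectively in TensorFlow 2 is a real headache for new hires and new students. So what exactly do you need to do? Read on.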

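About Hugging Face

Hugging Face (https://huggingface.co/) started out as a chatbot startup. It focuses on NLP technology and has a large open-source community. Its open-source library of pretrained NLP models on GitHub, Transformers, has been downloaded more than a million times and has more than 24,000 GitHub stars. Transformers provides a large number of state-of-the-art pretrained language model architectures for NLP, together with a framework for calling them (https://github.com/huggingface/transformers).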

Portal

https://huggingface.co/models
This page lists the more mature BERT-family models of recent years; pretrained models can be downloaded directly from the listing.

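pipeline

    pip install transformers==4.6.1
    pip install tensorflow-gpu==2.4.0

Note that the transformers package needs TensorFlow 2.2.0 or later, and each TensorFlow version goes with a particular CUDA version. The TensorFlow 2.4.0 used here, for example, goes with CUDA 11. A moment of silence for the two days I wasted setting up the CUDA environment...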
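Before debugging CUDA, it can help to confirm which package versions are actually installed. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `meets_minimum`, `installed_version`) are made up for this illustration, and the minimum versions are simply the ones mentioned in this post, not an official compatibility table.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.4.0' into (2, 4, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '2.4' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names follow the pip commands in this post (assumption:
    # you installed under exactly these distribution names).
    for name, minimum in [("transformers", "4.6.1"), ("tensorflow-gpu", "2.2.0")]:
        found = installed_version(name)
        if found is None:
            status = "missing"
        elif meets_minimum(found, minimum):
            status = "ok"
        else:
            status = "too old"
        print(f"{name}: {found} ({status})")
```

The comparison is deliberately simple (three numeric components only), which is enough for a quick sanity check but not a full version-specifier parser.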

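Loading the model

A fully downloaded model directory contains the following:
- config: defines the model's parameters, such as the number of layers
- tf_model.h5: the TensorFlow 2 model file
- tokenizer: the text processing module
- vocab: the vocabulary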
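A quick way to catch an incomplete download is to check the directory before loading anything. This is only a sketch: the exact file names (`config.json`, `tf_model.h5`, `vocab.txt`) are an assumption about a typical TensorFlow BERT checkpoint, and `missing_model_files` is a hypothetical helper, not part of transformers.

```python
import tempfile
from pathlib import Path

# Assumed file names for a TensorFlow BERT checkpoint; adjust them to
# match what you actually downloaded.
EXPECTED_FILES = ("config.json", "tf_model.h5", "vocab.txt")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Demo on a throwaway directory that only contains a config file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "config.json").write_text("{}")
        print("missing:", missing_model_files(tmp))
```

In practice you would call `missing_model_files('YOUR_MODEL_PATH')` and stop early if the returned list is not empty.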

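Note the directory where the model lives, then open your editor and import the packages you need. Sequence classification is used as the example here; for other downstream tasks, see the official documentation: https://huggingface.co/transformers/model_doc/bert.html

    import tensorflow as tf
    import transformers as ts
    from transformers import BertTokenizer, TFBertForSequenceClassification
    import pandas as pd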

Setting the training parameters

    data_path = 'YOUR_DATA_PATH'
    model_path = 'YOUR_MODEL_PATH'
    config_path = 'YOUR_CONFIG_PATH'
    num_labels = 10
    epoch = 10
    tokenizer = BertTokenizer.from_pretrained(model_path)

Data preprocessing

This step covers reading in and reshaping the data. Make sure the data ends up in the format the BERT model expects.
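To make "the format the BERT model expects" concrete, here is a toy sketch of the three fields a single sentence is turned into: `input_ids`, `token_type_ids`, and `attention_mask`. It is not the tokenizer's implementation. The function name `to_bert_features` is invented for illustration, and the special-token ids (101, 102, 0) are assumed to match the common BERT vocabularies; the real values come from your tokenizer.

```python
# Toy special-token ids (assumption); real values come from the
# tokenizer's vocabulary.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def to_bert_features(token_ids, max_length):
    """Build fixed-length BERT inputs for one sentence (max_length >= 2)."""
    # Truncate so that [CLS] and [SEP] still fit inside max_length.
    body = list(token_ids)[: max_length - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    # 1 marks real tokens, 0 marks padding the model should ignore.
    attention_mask = [1] * len(input_ids)
    padding = max_length - len(input_ids)
    input_ids += [PAD_ID] * padding
    attention_mask += [0] * padding
    # A single sentence belongs entirely to segment 0.
    token_type_ids = [0] * max_length
    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }


if __name__ == "__main__":
    features = to_bert_features([7, 8, 9], max_length=8)
    for name, values in features.items():
        print(name, values)
```

Every example has the same length after this step, which is what allows the examples to be stacked into batches.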

    def data_incoming(path):
        # Read a tab-separated file with one "text<TAB>label" pair per line.
        x = []
        y = []
        with open(path, 'r') as f:
            for line in f.readlines():
                line = line.strip('\n')
                line = line.split('\t')
                x.append(line[0])
                y.append(line[1])
        df_row = pd.DataFrame([x, y], index=['text', 'label'])
        df_row = df_row.T
        # Map each label name to an integer id; replace the placeholder
        # with your own list of num_labels label names.
        label_names = ['YOUR_LABEL']
        df_label = pd.DataFrame({"label": label_names, 'y': list(range(len(label_names)))})
        output = pd.merge(df_row, df_label, on='label', how='left')
        return output


    def convert_example_to_feature(review):
        # Tokenize one text, padding or truncating it to 256 tokens.
        return tokenizer.encode_plus(review,
                                     max_length=256,
                                     padding='max_length',
                                     return_attention_mask=True,
                                     truncation=True)


    def map_example_to_dict(input_ids, token_type_ids, attention_mask, label):
        # The argument order must match the tuple passed to
        # from_tensor_slices in encode_example.
        return {
            "input_ids": input_ids,
            "token_type_ids": token_type_ids,
            "attention_mask": attention_mask,
        }, label


    def encode_example(ds, limit=-1):
        input_ids_list = []
        token_type_ids_list = []
        attention_mask_list = []
        label_list = []
        if limit > 0:
            # ds is a DataFrame here, so keep only the first `limit` rows.
            ds = ds.head(limit)
        for index, row in ds.iterrows():
            review = row["text"]
            label = row['y']
            bert_input = convert_example_to_feature(review)
            input_ids_list.append(bert_input["input_ids"])
            token_type_ids_list.append(bert_input['token_type_ids'])
            attention_mask_list.append(bert_input['attention_mask'])
            label_list.append([label])
        return tf.data.Dataset.from_tensor_slices(
            (input_ids_list, token_type_ids_list, attention_mask_list, label_list)
        ).map(map_example_to_dict)
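The input layout the reader function relies on is one `text<TAB>label` pair per line. The sketch below shows the same parsing and label-to-id mapping with only the standard library, so it can be tried without pandas. The sample sentences and the label names `negative` and `positive` are invented for illustration, as are the helper names `read_tsv` and `build_label_ids`.

```python
import io


def read_tsv(handle):
    """Yield (text, label) pairs from lines of the form 'text<TAB>label'."""
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        text, label = line.split("\t", 1)
        yield text, label


def build_label_ids(label_names):
    """Assign each label name an integer id in list order."""
    return {name: index for index, name in enumerate(label_names)}


# Hypothetical two-line data file, held in memory for the demo.
sample = io.StringIO("great movie\tpositive\nterrible plot\tnegative\n")
label_ids = build_label_ids(["negative", "positive"])
rows = [(text, label_ids[label]) for text, label in read_tsv(sample)]
print(rows)  # [('great movie', 1), ('terrible plot', 0)]
```

One difference worth knowing: a pandas left merge leaves the id as NaN when a label is not in the label table, whereas the dictionary lookup here raises a KeyError, which surfaces bad labels earlier.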

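I won't go into more detail; the code is already fairly explicit. If something really doesn't make sense... I might find time to read the comments.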

Training the model

    def main():
        train = data_incoming(data_path + 'train.tsv')
        test = data_incoming(data_path + 'test.tsv')
        train = encode_example(train).shuffle(100000).batch(100)
        test = encode_example(test).batch(100)
        # Load the pretrained weights from model_path and attach a
        # classification head with num_labels outputs.
        model = TFBertForSequenceClassification.from_pretrained(model_path, num_labels=num_labels)
        optimizer = tf.keras.optimizers.Adam(1e-5)
        model.compile(optimizer=optimizer, loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
        model.fit(train, epochs=epoch, verbose=1, validation_data=test)


    if __name__ == '__main__':
        main()
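The loss above is configured with `from_logits=True` because the classification head outputs raw scores (logits) rather than probabilities. As a rough illustration of what that means for a single example, here is a standard-library sketch of softmax, the sparse cross-entropy computed from logits, and picking the predicted class. The function names are invented for this post, and the code mirrors the textbook formulas rather than TensorFlow's actual implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_cross_entropy_from_logits(logits, label):
    """Negative log-probability of the true class: logsumexp(logits) - logits[label]."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]


def predict(logits):
    """Index of the highest-scoring class."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    scores = [0.1, 2.5, -1.0]  # made-up logits for a 3-class example
    print("probabilities:", softmax(scores))
    print("loss if label is 1:", sparse_cross_entropy_from_logits(scores, 1))
    print("predicted class:", predict(scores))
```

The label is an integer class id ("sparse") instead of a one-hot vector, which is why the preprocessing step maps label names to integers.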

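Closing remarks

See, that's all it takes: one small block of code, enough to fry an egg on your GPU. If I slack off any longer, though, my team lead may wrap me in the egg and fry me too, so I'll stop here. Next post... no idea whether there will be one; 996 doesn't leave any spare time.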
