bert4keras: A Clean and Elegant Deep Learning Package

Published: 2024/1/18 | pytorch | 豆豆


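Beginner-friendly bert4keras

During my internship at Tencent (鹅厂), I followed the blog of Su Shen (苏神, author of 科学空间). It sparked an idea that let me improve a model running in production. Much of the idea generation and experimental progress came from bert4keras, which he designed: it is clear, lightweight, and built on Keras, it implements BERT concisely, and it ships with many readable examples, which makes it extremely friendly to NLP beginners. This post recommends several projects built on bert4keras, all by Su Shen, that are a good fit for newcomers getting started with BERT.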

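Project 1: Testing BERT's MLM

Project link: basic_masked_language_model

tokenizer: the tokenizer; its main methods are encode and decode.
build_transformer_model: builds the BERT model. Reading its source is recommended; it can load several kinds of weights and model structures (such as UniLM).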

import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
from bert4keras.snippets import to_array

config_path = '/root/kg/bert/chinese_L-12_H-768_A-12/bert_config.json'
checkpoint_path = '/root/kg/bert/chinese_L-12_H-768_A-12/bert_model.ckpt'
dict_path = '/root/kg/bert/chinese_L-12_H-768_A-12/vocab.txt'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(
    config_path=config_path, checkpoint_path=checkpoint_path, with_mlm=True
)  # build the model and load the weights

token_ids, segment_ids = tokenizer.encode(u'科学技术是第一生产力')

# mask out "技术"
token_ids[3] = token_ids[4] = tokenizer._token_mask_id
token_ids, segment_ids = to_array([token_ids], [segment_ids])

# use the MLM model to predict the masked positions
probas = model.predict([token_ids, segment_ids])[0]
print(tokenizer.decode(probas[3:5].argmax(axis=1)))  # the result is exactly "技术"
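Why positions 3 and 4? The following is a minimal sketch, assuming that Chinese text is split character by character and that a [CLS] token occupies position 0, as is standard for BERT-style tokenizers. The helper toy_tokenize is hypothetical and is not part of bert4keras; it only mimics the token layout before tokens are mapped to ids.

```python
# Hypothetical stand-in for the token layout produced by a BERT-style
# tokenizer on Chinese text; this is not the bert4keras Tokenizer itself.
def toy_tokenize(text):
    """Split text into characters and wrap them in [CLS] ... [SEP]."""
    return ['[CLS]'] + list(text) + ['[SEP]']


tokens = toy_tokenize('科学技术是第一生产力')

# Position 0 is [CLS], so the third and fourth characters sit at 3 and 4.
masked = list(tokens)
masked[3] = masked[4] = '[MASK]'

print(tokens[3:5])
print(masked)
```

The real example does the same thing on ids instead of strings, and then reads the model's predictions back from those two positions.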

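Project 2: Sentence-pair classification

Project link: task_sentence_similarity_lcqmc
Core model code:

Sentence 1 and sentence 2 are concatenated and fed into BERT together.
BERT's pooler output goes through dropout and a dense (MLP) projection to 2 dimensions for classification.
The final model is a standard Keras model.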

from bert4keras.backend import keras
from bert4keras.models import build_transformer_model
from bert4keras.snippets import sequence_padding, DataGenerator
from keras.layers import Dropout, Dense

# tokenizer, maxlen, config_path and checkpoint_path are defined
# earlier in the example script.


class data_generator(DataGenerator):
    """Data generator"""
    def __iter__(self, random=False):
        batch_token_ids, batch_segment_ids, batch_labels = [], [], []
        for is_end, (text1, text2, label) in self.sample(random):
            token_ids, segment_ids = tokenizer.encode(text1, text2, maxlen=maxlen)
            batch_token_ids.append(token_ids)
            batch_segment_ids.append(segment_ids)
            batch_labels.append([label])
            if len(batch_token_ids) == self.batch_size or is_end:
                batch_token_ids = sequence_padding(batch_token_ids)
                batch_segment_ids = sequence_padding(batch_segment_ids)
                batch_labels = sequence_padding(batch_labels)
                yield [batch_token_ids, batch_segment_ids], batch_labels
                batch_token_ids, batch_segment_ids, batch_labels = [], [], []


# load the pretrained model
bert = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    with_pool=True,
    return_keras_model=False,
)

output = Dropout(rate=0.1)(bert.model.output)
output = Dense(
    units=2, activation='softmax', kernel_initializer=bert.initializer
)(output)

model = keras.models.Model(bert.model.input, output)
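The generator above relies on two steps that are easy to miss: encoding a sentence pair into one sequence, and padding a batch to a common length. Here is a minimal sketch of both, assuming the usual BERT layout of [CLS] A [SEP] B [SEP] with segment ids 0 for the first part and 1 for the second, and zero padding up to the longest sequence in the batch. The helpers toy_encode_pair and toy_padding and the ids 101/102 are illustrative assumptions, not bert4keras functions, and truncation to maxlen is left out.

```python
# Illustrative helpers only; they mimic the shape of the data, not the
# bert4keras implementation.
def toy_encode_pair(ids1, ids2, cls_id=101, sep_id=102):
    """Join two id lists as [CLS] ids1 [SEP] ids2 [SEP] with segment ids."""
    token_ids = [cls_id] + ids1 + [sep_id] + ids2 + [sep_id]
    segment_ids = [0] * (len(ids1) + 2) + [1] * (len(ids2) + 1)
    return token_ids, segment_ids


def toy_padding(seqs, value=0):
    """Right-pad every sequence to the length of the longest one."""
    length = max(len(s) for s in seqs)
    return [s + [value] * (length - len(s)) for s in seqs]


pair_a = toy_encode_pair([5, 6], [7])
pair_b = toy_encode_pair([8], [9, 10, 11])
batch_token_ids = toy_padding([pair_a[0], pair_b[0]])
batch_segment_ids = toy_padding([pair_a[1], pair_b[1]])

print(batch_token_ids)
print(batch_segment_ids)
```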

Project 3: Title generation

Project link: task_seq2seq_autotitle
NLG tasks are convenient to implement with the UniLM structure, and in bert4keras switching to UniLM takes a single argument.

model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='unilm',
    keep_tokens=keep_tokens,  # keep only the tokens in keep_tokens, shrinking the original vocabulary
)
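What the UniLM switch changes is the attention mask. As a rough sketch, assuming segment id 0 marks the source text and 1 marks the target text: source positions attend to the whole source, and each target position attends to the source plus the target tokens up to and including itself. The function unilm_attention_mask below is a hypothetical pure-Python illustration of that rule, not the library's code.

```python
from itertools import accumulate


def unilm_attention_mask(segment_ids):
    """Return mask[i][j] == 1 when position i may attend to position j."""
    # The running sum is 0 across the source and grows by one per target token.
    idxs = list(accumulate(segment_ids))
    n = len(segment_ids)
    return [[1 if idxs[j] <= idxs[i] else 0 for j in range(n)] for i in range(n)]


mask = unilm_attention_mask([0, 0, 0, 1, 1])
for row in mask:
    print(row)
```

With this mask a single BERT-style encoder behaves like a sequence-to-sequence model: bidirectional over the input, left-to-right over the output.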

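The loss for NLG tasks is cross-entropy, and the example implements it neatly:

The CrossEntropy class inherits from Loss and overrides compute_loss.
The variables involved in the loss are passed through CrossEntropy, which computes the loss along the way; see the Loss class source for details.
The final model is a standard Keras model.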

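from bert4keras.backend import keras, K
from bert4keras.layers import Loss
from bert4keras.models import build_transformer_model
from bert4keras.optimizers import Adam
from keras.models import Model

# config_path, checkpoint_path and keep_tokens are defined earlier
# in the example script.


class CrossEntropy(Loss):
    """Cross-entropy as the loss, with the input part masked out"""
    def compute_loss(self, inputs, mask=None):
        y_true, y_mask, y_pred = inputs
        y_true = y_true[:, 1:]  # target token_ids
        y_mask = y_mask[:, 1:]  # segment_ids, which mark exactly the part to predict
        y_pred = y_pred[:, :-1]  # predicted sequence, shifted by one position
        loss = K.sparse_categorical_crossentropy(y_true, y_pred)
        loss = K.sum(loss * y_mask) / K.sum(y_mask)
        return loss


model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='unilm',
    keep_tokens=keep_tokens,  # keep only the tokens in keep_tokens, shrinking the original vocabulary
)

# The argument 2 selects which of the inputs the layer passes through (here y_pred).
output = CrossEntropy(2)(model.inputs + model.outputs)

model = Model(model.inputs, output)
model.compile(optimizer=Adam(1e-5))
model.summary()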
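The shift-and-mask arithmetic inside compute_loss can be checked by hand on a tiny example. The sketch below redoes it in plain Python for a single sequence, with no batch dimension; masked_cross_entropy and the probability values are made up for illustration.

```python
import math


def masked_cross_entropy(token_ids, segment_ids, probs):
    """Average next-token cross-entropy over target positions only.

    probs[i] is the predicted distribution over the vocabulary at position i.
    """
    y_true = token_ids[1:]    # the token each position should predict
    y_mask = segment_ids[1:]  # 1 where the token to predict is a target token
    y_pred = probs[:-1]       # the last position has nothing left to predict
    losses = [-math.log(p[t]) for t, p in zip(y_true, y_pred)]
    return sum(l * m for l, m in zip(losses, y_mask)) / sum(y_mask)


token_ids = [0, 1, 2, 3]
segment_ids = [0, 0, 1, 1]  # tokens 2 and 3 form the target
probs = [
    [0.25, 0.25, 0.25, 0.25],  # predicts token 1: source part, masked out
    [0.2, 0.2, 0.5, 0.1],      # predicts token 2 with probability 0.5
    [0.25, 0.25, 0.25, 0.25],  # predicts token 3 with probability 0.25
    [0.25, 0.25, 0.25, 0.25],  # dropped by the shift
]
loss = masked_cross_entropy(token_ids, segment_ids, probs)
print(loss)
```

Only the two target predictions contribute, so the result is the mean of -log(0.5) and -log(0.25).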

At prediction time, decoding is autoregressive; subclassing AutoRegressiveDecoder makes beam_search easy to implement.
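For readers new to beam search, here is a minimal standalone sketch of the idea in plain Python. It does not use the AutoRegressiveDecoder API; the beam_search function, the next_log_probs callback, and the toy transition table are all assumptions made for illustration, and scores are compared without length normalization.

```python
import math


def beam_search(next_log_probs, start, end_id, topk=2, maxlen=10):
    """Keep the topk highest-scoring partial sequences at every step."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(maxlen):
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:topk]:
            # Sequences that emitted the end token leave the beam.
            (finished if seq[-1] == end_id else beams).append((seq, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])[0]


# Toy "model": next-token probabilities depend only on the last token.
table = {
    'S': {'a': 0.6, 'b': 0.4},
    'a': {'x': 0.55, 'y': 0.45},
    'b': {'z': 0.9, 'x': 0.1},
    'x': {'E': 1.0}, 'y': {'E': 1.0}, 'z': {'E': 1.0},
}


def next_log_probs(seq):
    return {t: math.log(p) for t, p in table[seq[-1]].items()}


greedy = beam_search(next_log_probs, 'S', 'E', topk=1)
beam = beam_search(next_log_probs, 'S', 'E', topk=2)
print(greedy, beam)
```

In this toy table greedy decoding commits to the locally best first token, while a beam of two keeps the alternative alive and ends up with the more probable full sequence.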

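Project 4: SimBert

Project link: SimBert
It combines UniLM with contrastive learning. The data generator and the loss class are cleverly designed and worth reading closely; for parts that are hard to follow, open Jupyter and print the values line by line.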
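As background for the contrastive part, here is a sketch of a generic in-batch contrastive objective in plain Python. It is not taken from the SimBert source: the function names, the layout in which rows 2k and 2k+1 form a positive pair, and the scale of 30 are all assumptions made for illustration. Each sentence vector has to pick out its partner among all other vectors in the batch.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def in_batch_contrastive_loss(vectors, scale=30.0):
    """Mean cross-entropy of picking each row's partner among the other rows.

    Assumed layout: rows 2k and 2k+1 are a positive pair.
    """
    vecs = [l2_normalize(v) for v in vectors]
    n = len(vecs)
    total = 0.0
    for i in range(n):
        partner = i + 1 if i % 2 == 0 else i - 1
        # Scaled cosine similarity to every other row (self excluded).
        logits = {
            j: scale * sum(a * b for a, b in zip(vecs[i], vecs[j]))
            for j in range(n) if j != i
        }
        top = max(logits.values())
        log_norm = top + math.log(sum(math.exp(s - top) for s in logits.values()))
        total += log_norm - logits[partner]
    return total / n


aligned = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
shuffled = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]
loss_aligned = in_batch_contrastive_loss(aligned)
loss_shuffled = in_batch_contrastive_loss(shuffled)
print(loss_aligned, loss_shuffled)
```

When paired rows point in similar directions the loss is close to zero; when the pairing is wrong it is large, which is the signal that pulls similar sentences together during training.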

Project 5: SPACES: "extract-then-generate" long-text summarization

Project link: SPACES
A fairly comprehensive project whose core is BioCopyNet + UniLM.

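Summary

Strengths of bert4keras:

build_transformer_model builds a BERT model in one line, and a single argument switches it to the UniLM structure.
Subclassing Loss and overriding compute_loss makes it easy to compute a loss.
It is built deeply on Keras, so training and saving work the same way as in Keras.
Plenty of examples! Su Shen's research on cutting-edge algorithms also comes with bert4keras implementations.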
