Keras: basic structure and functionality

Published: 2024/9/20

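1. How Keras relates to TensorFlow, and how they differ

TensorFlow, Theano, and Keras are all deep learning frameworks. TensorFlow and Theano are more flexible but also harder to learn; at their core they are automatic differentiation engines.
Keras is an interface on top of TensorFlow and Theano (Keras is the front end; TensorFlow or Theano is the backend). It is also flexible, and it is easier to learn. You can think of Keras as an API that wraps TensorFlow.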

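2. Sequential, Model, and other basic functionality

Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
Official documentation: https://keras.io/
The documentation referenced here mainly covers Keras 2.0.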

Part 0. Introduction to Keras and basic model saving

The material was organized as mind maps to make it easier to browse and understand.

1. Keras network structure

2. Keras network configuration

The callbacks are arguably the heart of Keras.

3. Keras preprocessing utilities

4. Extracting a model's node information

    # Extract the node information
    config = model.get_config()             # pull the configuration out of the model (in Caffe terms, roughly the solver.prototxt and train.prototxt information)
    model = Model.from_config(config)       # put it back
    # or, for Sequential:
    model = Sequential.from_config(config)  # rebuild a new model for further training; handy for fine-tuning

5. Inspecting a model (including its weights)

    # 1. Print a summary of the model
    model.summary()

    # 2. Return a JSON string describing the model: architecture only, no weights.
    #    The original model can be rebuilt from the JSON string:
    from keras.models import model_from_json
    json_string = model.to_json()
    model = model_from_json(json_string)

    # 3. model.to_yaml works like model.to_json; the model can likewise be rebuilt from the YAML string
    from keras.models import model_from_yaml
    yaml_string = model.to_yaml()
    model = model_from_yaml(yaml_string)

    # 4. Accessing weights
    model.get_layer(name=None, index=None)  # get a layer object by name or by index
    model.get_weights()                     # return the model weights as a list of numpy arrays
    model.set_weights(weights)              # load weights from numpy arrays; shapes must match model.get_weights()

    # Inspect the layers in the model
    model.layers

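6. Saving and loading models

    model.save_weights(filepath)                 # save the model weights to the given path as an HDF5 file (.h5 extension)
    model.load_weights(filepath, by_name=False)  # load weights from an HDF5 file into the current model; by default the architecture is assumed to be unchanged
    # To load weights into a different model (one that shares some layers), set by_name=True;
    # only layers whose names match will have their weights loaded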

7. Limiting how much GPU memory Keras uses

This section comes from a collection of notes on sharing multiple GPUs among several users with Theano/TensorFlow (see: Limit the resource usage for tensorflow backend · Issue #1538 · fchollet/keras · GitHub).
Keras tends to occupy all of the GPU memory when it runs. You can adjust this by reconfiguring the GPU usage of the backend.

    import tensorflow as tf
    from keras.backend.tensorflow_backend import set_session

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.3  # fraction of GPU memory this process may use
    set_session(tf.Session(config=config))

Note that although the code or configuration sets a threshold on the fraction of GPU memory, at run time a program that needs more may still go past that threshold. In other words, a run on a large dataset can still use more GPU memory. The limit above only avoids wasting GPU memory when running on small datasets. (Added 20 February 2017.)

8. A more systematic way to train and save models

    from keras.callbacks import ModelCheckpoint

    filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'
    checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                                 save_best_only=True, mode='min')
    # fit model
    model.fit(x, y, epochs=20, verbose=2, callbacks=[checkpoint], validation_data=(x, y))

With save_best_only turned on, the log looks like this:

ETA: 3s - loss: 0.5820Epoch 00017: val_loss did not improve

The model is saved only when val_loss improves (with mode='min', that means it decreases); if it does not improve, nothing is saved.
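To make the file naming and the save-best-only rule concrete, here is a minimal pure-Python sketch. It only mimics the bookkeeping and is not Keras' actual implementation; the loss values, the helper name `checkpoints_written`, and the choice to number epochs from 1 are all assumptions made for illustration.

```python
# Illustration only: mimics the bookkeeping, not Keras' internals.
filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'


def checkpoints_written(history, template):
    """Return the file names a save-best-only policy (mode='min') would write."""
    best = float('inf')
    written = []
    for epoch, (loss, val_loss) in enumerate(history, start=1):
        if val_loss < best:  # val_loss improved, so this epoch is "saved"
            best = val_loss
            written.append(template.format(epoch=epoch, loss=loss, val_loss=val_loss))
    return written


# Hypothetical (loss, val_loss) pairs for four epochs
history = [(0.9, 0.8), (0.7, 0.85), (0.6, 0.5), (0.55, 0.5)]
saved = checkpoints_written(history, filepath)
for name in saved:
    print(name)
```

Only epochs 1 and 3 produce a file here: epoch 2 is worse than the best so far, and epoch 4 merely ties it.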

9. Using TensorBoard with Keras

    from keras.callbacks import TensorBoard, ModelCheckpoint, EarlyStopping

    RUN = RUN + 1 if 'RUN' in locals() else 1  # locals() returns the current local variables as a dict
    LOG_DIR = model_save_path + '/training_logs/run{}'.format(RUN)
    LOG_FILE_PATH = LOG_DIR + '/checkpoint-{epoch:02d}-{val_loss:.4f}.hdf5'  # where the logs and the HDF5 model files are stored

    tensorboard = TensorBoard(log_dir=LOG_DIR, write_images=True)
    checkpoint = ModelCheckpoint(filepath=LOG_FILE_PATH, monitor='val_loss', verbose=1, save_best_only=True)
    early_stopping = EarlyStopping(monitor='val_loss', patience=5, verbose=1)

    history = model.fit_generator(generator=gen.generate(True), steps_per_epoch=int(gen.train_batches / 4),
                                  validation_data=gen.generate(False), validation_steps=int(gen.val_batches / 4),
                                  epochs=EPOCHS, verbose=1, callbacks=[tensorboard, checkpoint, early_stopping])

All three take effect as callbacks:

EarlyStopping
(1) patience: once early stopping is triggered (for example, the loss has not decreased compared with the previous epoch), training stops after patience epochs.
(2) mode: one of 'auto', 'min', 'max'. In min mode, training stops when the monitored value stops decreasing; in max mode, training stops when the monitored value stops increasing.
ModelCheckpoint
(1) save_best_only: when True, only the model that performs best on the validation set is kept.
(2) mode: one of 'auto', 'min', 'max'. With save_best_only=True it decides what counts as the best model: for example, use max when monitoring val_acc and min when monitoring val_loss. In auto mode the criterion is inferred from the name of the monitored value.
(3) save_weights_only: when True, only the model weights are saved; otherwise the whole model is saved (including architecture, configuration, and so on).
(4) period: the number of epochs between checkpoints.
TensorBoard
write_images: whether to visualize the model weights as images.
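The patience rule is easier to see with numbers. The sketch below is a simplified pure-Python model of it, assuming mode='min' and assuming the monitored value is compared against the best value seen so far; the function name `stopping_epoch` and the loss values are made up for illustration and this is not the callback's real code.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, value in enumerate(val_losses, start=1):
        if value < best:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch
val_losses = [0.9, 0.7, 0.72, 0.71, 0.70, 0.69, 0.8, 0.8]
print(stopping_epoch(val_losses, patience=2))
print(stopping_epoch(val_losses, patience=5))
```

With patience=2 the run stops early, before the later improvement at epoch 6 is ever reached; a larger patience lets training ride out that plateau.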

For everything else, see the Keras Chinese documentation.

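Basic components of the Sequential model

A typical workflow needs:

1. model.add: add layers;
2. model.compile: configure how training (backpropagation) is performed;
3. model.fit: set the training parameters and train;
4. model evaluation;
5. model prediction.

1. add: adding layers (the counterpart of Caffe's train_val.prototxt)

    add(self, layer)

    # For example:
    model.add(Dense(32, activation='relu', input_dim=100))
    model.add(Dropout(0.25))

add takes only a layer. In a Sequential model you can also call model.add(other_model) to append another model; the functional API handles this differently, see the functional section.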

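2. compile: training configuration (the counterpart of Caffe's solver.prototxt)

    compile(self, optimizer, loss, metrics=None, sample_weight_mode=None, **kwargs)

Where:
optimizer: a string (the name of a predefined optimizer) or an optimizer object; see the optimizers documentation.
loss: a string (the name of a predefined loss function) or an objective function; see the losses documentation.
metrics: a list of metrics used to evaluate the network during training and testing; the typical usage is metrics=['accuracy'].
sample_weight_mode: set this to "temporal" if you need to weight samples per timestep (a 2D weight matrix). The default, None, means per-sample weighting (1D weights). The description of fit below has related notes.
kwargs: ignore this when using the TensorFlow backend; with the Theano backend, the kwargs are passed on to K.function.

Note:
A model must be compiled before it is used; otherwise calling fit or evaluate raises an exception.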

3. fit: training parameters plus training (the counterpart of train.sh and part of solver.prototxt)

    fit(self, x, y, batch_size=32, epochs=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

This function trains the model for epochs epochs. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has several inputs, x should be a list whose elements are the numpy arrays for each input.
y: labels, a numpy array.
batch_size: integer, the number of samples in each batch for gradient descent. During training, each batch triggers one gradient descent step that moves the objective function one step forward.
epochs: integer, the number of training epochs; each epoch goes through the whole training set once.
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch.
callbacks: a list of keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate moments during training; see the callbacks documentation.
validation_split: a float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set takes no part in training; the model's metrics (loss, accuracy, and so on) are measured on it at the end of every epoch. Note that the split is taken before shuffling, so if your data is ordered you need to shuffle it yourself before specifying validation_split; otherwise the validation set may be unrepresentative.
validation_data: a tuple of the form (X, y) giving an explicit validation set. This parameter overrides validation_split.
shuffle: a boolean or a string, usually a boolean, saying whether to shuffle the input samples randomly during training. The string "batch" is a special case for HDF5 data: it shuffles the data within each batch.
class_weight: a dict mapping classes to weights, used to adjust the loss function during training (training only).
sample_weight: a numpy array of weights, used to adjust the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight them one to one, or, for temporal data, a matrix of shape (samples, sequence_length) to weight every timestep of every sample differently. In that case make sure you passed sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful when resuming an earlier run.
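The warning about validation_split and ordered data can be shown with a tiny pure-Python sketch. It assumes, as described above, that the held-out part is simply the trailing fraction of the data taken before any shuffling; the helper name `split_validation` and the toy labels are made up for illustration.

```python
def split_validation(samples, validation_split):
    """Hold out the trailing fraction of the samples, without shuffling first."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


# Labels sorted by class: six samples of class 0 followed by two of class 1
labels = [0] * 6 + [1] * 2
train_labels, val_labels = split_validation(labels, validation_split=0.25)
print(train_labels)
print(val_labels)
```

Because the data was sorted, the training part contains no class 1 at all and the validation part contains only class 1, which is exactly why shuffling by hand first matters.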

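fit returns a History object. Its History.history attribute records how the loss and the other metrics change from epoch to epoch; if there is a validation set, it also records the same metrics on the validation set.
Note:
Keep this apart from fit_generator, described later: the two take their x/y input differently.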

4. evaluate: model evaluation

    evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

This function computes the model's loss on some input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays.
y: labels, a numpy array.
batch_size: integer, same meaning as in fit.
verbose: same meaning as in fit, but only 0 or 1 is allowed.
sample_weight: numpy array, same meaning as in fit.

The function returns a scalar test loss (if the model has no other metrics) or a list of scalars (if the model has other metrics as well). model.metrics_names tells you what each value in the list means.
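Since evaluate may return either a bare scalar or a list, it can help to normalize the result and pair it with the metric names. This is a small pure-Python sketch; the helper name `label_scores` and the numbers are hypothetical, and in real use the two arguments would be model.metrics_names and the return value of model.evaluate.

```python
def label_scores(metrics_names, scores):
    """Pair metric names with scores, accepting a scalar or a list of scalars."""
    if not isinstance(scores, (list, tuple)):
        scores = [scores]  # a model with no extra metrics returns a single scalar
    return dict(zip(metrics_names, scores))


# Hypothetical results for a model compiled with metrics=['accuracy']
print(label_scores(['loss', 'acc'], [0.35, 0.9]))
# Hypothetical result for a model compiled without extra metrics
print(label_scores(['loss'], 0.35))
```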

Unless stated otherwise, the parameters of the functions below have the same meaning as the parameters of the same name in fit.
Unless stated otherwise, the verbose parameter of the functions below (where present) only accepts 0 or 1.

5. predict: model prediction

    predict(self, x, batch_size=32, verbose=0)
    predict_classes(self, x, batch_size=32, verbose=1)
    predict_proba(self, x, batch_size=32, verbose=1)

predict computes the outputs for the input data, batch by batch. Its parameters have the same meaning as in fit.

The return value is a numpy array of predictions.
predict_classes: produces class predictions for the input data, batch by batch.
predict_proba: produces, batch by batch, the probability that each input belongs to each class.
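The relationship between the probabilities and the predicted classes can be sketched in plain Python. This assumes the usual convention: argmax over a softmax row, and a 0.5 threshold for a single sigmoid output. The helper name `classes_from_probabilities` and the sample probabilities are made up, and the real methods work on numpy arrays rather than lists.

```python
def classes_from_probabilities(probabilities, threshold=0.5):
    """Turn rows of class probabilities into class indices."""
    classes = []
    for row in probabilities:
        if len(row) == 1:
            # single sigmoid unit: class 1 if the probability exceeds the threshold
            classes.append(int(row[0] > threshold))
        else:
            # softmax over several classes: pick the most probable one
            classes.append(max(range(len(row)), key=lambda i: row[i]))
    return classes


multi_class = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]  # hypothetical softmax outputs
binary = [[0.8], [0.3]]                            # hypothetical sigmoid outputs
print(classes_from_probabilities(multi_class))
print(classes_from_probabilities(binary))
```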

6. The on_batch methods: per-batch results and checks

    train_on_batch(self, x, y, class_weight=None, sample_weight=None)
    test_on_batch(self, x, y, sample_weight=None)
    predict_on_batch(self, x)

train_on_batch: performs one parameter update on a single batch of data. It returns the training loss as a scalar or a list of scalars, as evaluate does.
test_on_batch: evaluates the model on a single batch of samples. The return value is the same as for evaluate.
predict_on_batch: runs the model on a single batch of samples and returns the predictions for that batch.
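When you drive training by hand with train_on_batch, you have to slice the data into batches yourself. A minimal pure-Python sketch of that slicing follows; the helper name `iterate_batches` and the toy data are assumptions, and the commented-out call marks where a compiled model would be used.

```python
def iterate_batches(x, y, batch_size):
    """Yield consecutive (x, y) batches; the last one may be smaller."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]


# Toy data: ten samples with a binary label each
x = list(range(10))
y = [value % 2 for value in x]

batch_sizes = []
for batch_x, batch_y in iterate_batches(x, y, batch_size=4):
    # loss = model.train_on_batch(batch_x, batch_y)  # with a compiled Keras model
    batch_sizes.append(len(batch_x))
print(batch_sizes)
```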

7. fit_generator

    # Trains the model on batches of data produced one at a time by a Python generator.
    # The generator runs in parallel with the model for efficiency.
    # For example, this lets you do real-time data augmentation on the CPU while the model trains on the GPU.
    # Reference: http://keras-cn.readthedocs.io/en/latest/models/sequential/

With this function, training an image classifier becomes very simple.

    fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)

    # Example:
    def generate_arrays_from_file(path):
        while 1:
            f = open(path)
            for line in f:
                # create Numpy arrays of input data
                # and labels, from each line in the file
                x, y = process_line(line)
                yield (x, y)
            f.close()

    model.fit_generator(generate_arrays_from_file('/my_file.txt'),
                        steps_per_epoch=10000, epochs=10)
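Two details of fit_generator are easy to miss: the generator is expected to loop forever, and steps_per_epoch says how many batches make up one epoch. The pure-Python sketch below shows both. The helper name `batch_generator` and the toy data are made up, and for brevity it yields only inputs, whereas a real generator for fit_generator would yield (x, y) tuples.

```python
import math


def batch_generator(samples, batch_size):
    """Loop over the data forever, yielding one batch at a time."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


samples = list(range(10))
batch_size = 4
# Number of batches needed to see every sample once
steps_per_epoch = math.ceil(len(samples) / batch_size)

gen = batch_generator(samples, batch_size)
first_epoch = [next(gen) for _ in range(steps_per_epoch)]
next_batch = next(gen)  # the generator wraps around and starts the data again
print(steps_per_epoch, first_epoch, next_batch)
```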

The other two helper methods:

    evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)
    predict_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False, verbose=0)

evaluate_generator: evaluates the model using a generator as the data source. The generator should return data of the same type as the input to test_on_batch. The parameters have the same meaning as the parameters of the same name in fit_generator; steps is the number of batches the generator should yield.
predict_generator: produces predictions using a generator as the data source. The generator should return data of the same type as the input to test_on_batch. The parameters have the same meaning as the parameters of the same name in fit_generator; steps is the number of batches the generator should yield.

Case 1: simple binary classification

For a single-input model with 2 classes (binary classification):

    from keras.models import Sequential
    from keras.layers import Dense, Activation

    # Build the model
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=100))  # Dense(32) is a fully-connected layer with 32 hidden units.
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

Where:
Sequential() instantiates the model class;
Dense is a fully connected layer: here one layer with 32 units followed by a relu activation, taking 100-dimensional input;
model.add appends a new fully connected layer;
compile sets the training parameters, much like Caffe's solver.prototxt.

    # Generate dummy data
    import numpy as np
    data = np.random.random((1000, 100))
    labels = np.random.randint(2, size=(1000, 1))

    # Train the model, iterating on the data in batches of 32 samples
    model.fit(data, labels, nb_epoch=10, batch_size=32)  # nb_epoch is the Keras 1.2 name; Keras 2.0 uses epochs=10

I ran into the error below because of a version mismatch: Keras 1.2 uses nb_epoch, while Keras 2.0 uses epochs=10.

error:TypeError: Received unknown keyword arguments: {'epochs': 10}

Where:
One epoch covers batch_size × iterations samples, that is, the whole training set once; 10 epochs means going through the training set ten times.
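The arithmetic behind epochs, batches, and iterations can be worked through with the numbers from this case (1000 samples, batches of 32, 10 epochs). This is plain arithmetic rather than anything Keras-specific; the variable names are only for illustration.

```python
import math

num_samples = 1000
batch_size = 32
epochs = 10

# One iteration processes one batch; the last batch of an epoch may be smaller.
iterations_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (iterations_per_epoch - 1) * batch_size
total_updates = iterations_per_epoch * epochs

print(iterations_per_epoch, last_batch_size, total_updates)
```

So each epoch takes 32 iterations (31 full batches plus one batch of 8 samples), and the whole run performs 320 parameter updates.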

Case 2: multi-class classification with a VGG-like convolutional network

    import numpy as np
    import keras
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Flatten
    from keras.layers import Conv2D, MaxPooling2D
    from keras.optimizers import SGD
    from keras.utils import np_utils

    # Generate dummy data
    x_train = np.random.random((100, 100, 100, 3))  # 100 images, each 100*100*3
    y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)  # 100*10
    x_test = np.random.random((20, 100, 100, 3))
    y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)  # 20*10

    model = Sequential()
    # input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
    # this applies 32 convolution filters of size 3x3 each.
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))

    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd)

    model.fit(x_train, y_train, batch_size=32, epochs=10)
    score = model.evaluate(x_test, y_test, batch_size=32)

This is a standard Sequential network trained on labeled data.
Note:
One point matters a lot here, especially for a beginner like me: what does this step do?

keras.utils.to_categorical

In multi-class classification in particular, I used to think the labels could be passed as a single column of shape (100,). Keras does not accept that form for a multi-class task with categorical_crossentropy, so this extra step is needed to convert the labels into the one-hot format Keras expects.
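What to_categorical does to the labels can be sketched in a few lines of plain Python. This is a simplified stand-in, not the Keras function itself: the helper name `one_hot` is made up, it works on lists rather than numpy arrays, and it assumes the labels are integers from 0 to num_classes - 1.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        encoded.append(row)
    return encoded


labels = [0, 2, 1]
for row in one_hot(labels, num_classes=3):
    print(row)
```

A column of 100 integer labels with 10 classes therefore becomes a 100×10 matrix, which matches the 10-unit softmax output layer in the model above.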

Case 3: sequence classification with an LSTM

    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.layers import Embedding
    from keras.layers import LSTM

    model = Sequential()
    model.add(Embedding(max_, output_dim=256))
    model.add(LSTM(128))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])

    model.fit(x_train, y_train, batch_size=16, epochs=10)
    score = model.evaluate(x_test, y_test, batch_size=16)

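Part 3. The Model (functional) API

From the Keras Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
It is more complex than the Sequential model but very capable: inputs can be fed in together or at different stages, and outputs can be taken at different stages.
In short, whenever your model is not a single straight path from input to output like VGG, or whenever it needs more than one output, you should choose the functional model.

The difference:
The way the model is written is completely different.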

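Basic attributes and training workflow of the functional model

A typical workflow needs:
1. model.layers: the layer information;
2. model.compile: configure how training (backpropagation) is performed;
3. model.fit: set the training parameters and train;
4. evaluate: model evaluation;
5. predict: model prediction.

1. Commonly used Model attributes

    model.layers   # the layers that make up the model graph
    model.inputs   # the list of input tensors of the model
    model.outputs  # the list of output tensors of the model

2. compile: training configuration (the counterpart of solver.prototxt)

    compile(self, optimizer, loss, metrics=None, loss_weights=None, sample_weight_mode=None)

This function compiles the model for training. Its parameters are:

optimizer: the optimizer, either the name of a predefined optimizer or an optimizer object; see the optimizers documentation.
loss: the loss function, either the name of a predefined loss or an objective function; see the losses documentation.
metrics: a list of metrics used to evaluate the model during training and testing; the typical usage is metrics=['accuracy']. To use different metrics for different outputs of a multi-output model, pass a dict to this parameter, for example metrics={'output_a': 'accuracy'}.
sample_weight_mode: set this to "temporal" if you need to weight samples per timestep (a 2D weight matrix). The default, None, means per-sample weighting (1D weights). If the model has several outputs, you can pass a dict or a list of sample_weight_mode values. The description of fit below has related notes.

[Tips] If you only load a model and call predict on it, you do not need to compile it. In Keras, compile mainly configures the loss function and the optimizer, which serve training. predict compiles its symbolic function internally (by calling _make_predict_function).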

3. fit: training parameters plus training

    fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

This function trains the model. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has several inputs, x should be a list whose elements are the numpy arrays for each input. If every input of the model has a name, you can pass a dict that maps input names to their data.
y: labels, a numpy array. If the model has several outputs, you can pass a list of numpy arrays. If the outputs have names, you can pass a dict that maps output names to their labels.
batch_size: integer, the number of samples in each batch for gradient descent. During training, each batch triggers one gradient descent step that moves the objective function one step forward.
epochs: integer, the number of training epochs; the training data is traversed epochs times. (Keras 1.x called this nb_epoch; in Keras, variables starting with nb mean "number of".)
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch.
callbacks: a list of keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate moments during training; see the callbacks documentation.
validation_split: a float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set takes no part in training; the model's metrics (loss, accuracy, and so on) are measured on it at the end of every epoch. Note that the split is taken before shuffling, so if your data is ordered you need to shuffle it yourself before specifying validation_split; otherwise the validation set may be unrepresentative.
validation_data: a tuple of the form (X, y) or (X, y, sample_weights) giving an explicit validation set. This parameter overrides validation_split.
shuffle: boolean, whether to shuffle the input samples randomly before each epoch during training.
class_weight: a dict mapping classes to weights, used to adjust the loss function during training (training only). With imbalanced training data (some classes have very few samples), it makes the loss function pay more attention to the under-represented classes.
sample_weight: a numpy array of weights, used to adjust the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight them one to one, or, for temporal data, a matrix of shape (samples, sequence_length) to weight every timestep of every sample differently. In that case make sure you passed sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful when resuming an earlier run.

An error is raised when the input data does not match what the model expects.

fit returns a History object. Its History.history attribute records how the loss and the other metrics change from epoch to epoch; if there is a validation set, it also records the same metrics on the validation set.
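The effect of class_weight on an imbalanced batch can be illustrated with a little arithmetic. The sketch below is a deliberately simplified model, assuming the weighted loss is the plain mean of weight × per-sample loss; Keras' own normalization may differ in details. The helper name `weighted_mean_loss` and all the numbers are hypothetical.

```python
def weighted_mean_loss(per_sample_losses, labels, class_weight):
    """Average the per-sample losses after scaling each by its class weight."""
    weights = [class_weight[label] for label in labels]
    weighted = [w * loss for w, loss in zip(weights, per_sample_losses)]
    return sum(weighted) / len(weighted)


# Three samples of the common class 0 and one of the rare class 1,
# all with the same (hypothetical) per-sample loss
per_sample_losses = [0.5, 0.5, 0.5, 0.5]
labels = [0, 0, 0, 1]

plain = weighted_mean_loss(per_sample_losses, labels, {0: 1.0, 1: 1.0})
reweighted = weighted_mean_loss(per_sample_losses, labels, {0: 1.0, 1: 3.0})
print(plain, reweighted)
```

With a weight of 3 on the rare class, its single sample contributes as much to the loss as the three common samples together.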

4. evaluate: model evaluation

    evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

This function computes the model's loss on some input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays.
y: labels, a numpy array.
batch_size: integer, same meaning as in fit.
verbose: same meaning as in fit, but only 0 or 1 is allowed.
sample_weight: numpy array, same meaning as in fit.

5. predict: model prediction

    predict(self, x, batch_size=32, verbose=0)

This function computes the outputs for the input data, batch by batch. Its parameters have the same meaning as in fit.

The return value is a numpy array of predictions.

Checking the model: the on_batch methods

    train_on_batch(self, x, y, class_weight=None, sample_weight=None)
    test_on_batch(self, x, y, sample_weight=None)
    predict_on_batch(self, x)

train_on_batch: performs one parameter update on a single batch of data. It returns the training loss as a scalar or a list of scalars, as evaluate does.
test_on_batch: evaluates the model on a single batch of samples. The return value is the same as for evaluate.
predict_on_batch: runs the model on a single batch of samples and returns the predictions for that batch.

The generator methods

    fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)
    evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)
Case 1: a simple fully connected network

    from keras.layers import Input, Dense
    from keras.models import Model

    # This returns a tensor
    inputs = Input(shape=(784,))

    # a layer instance is callable on a tensor, and returns a tensor
    x = Dense(64, activation='relu')(inputs)  # takes inputs, returns x; (inputs) is the input
    x = Dense(64, activation='relu')(x)       # takes x, returns x
    predictions = Dense(10, activation='softmax')(x)  # takes x, returns the class scores

    # This creates a model that includes
    # the Input layer and three Dense layers
    model = Model(inputs=inputs, outputs=predictions)
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(data, labels)  # starts training

Where:
The structure is completely different from the Sequential model. In x = Dense(64, activation='relu')(inputs), (inputs) is the input and x is the output.
model = Model(inputs=inputs, outputs=predictions) is the signature line of the functional API: the same constructor can take several inputs at once and return several outputs.

Case 2: video processing

    x = Input(shape=(784,))
    # This works, and returns the 10-way softmax we defined above.
    y = model(x)  # model holds the weights; feed it x to get the result, useful for fine-tuning

    # Classification -> video, real-time processing
    from keras.layers import TimeDistributed

    # Input tensor for sequences of 20 timesteps,
    # each containing a 784-dimensional vector
    input_sequences = Input(shape=(20, 784))  # 20 timesteps, each a 784-dimensional input

    # This applies our previous model to every timestep in the input sequences.
    # the output of the previous model was a 10-way softmax,
    # so the output of the layer below will be a sequence of 20 vectors of size 10.
    processed_sequences = TimeDistributed(model)(input_sequences)  # model is already trained

Where:
model is already trained and is reused here for transfer learning;
TimeDistributed can also be used to make predictions frame by frame;
in TimeDistributed(model)(input_sequences), input_sequences is the sequence input and model is the trained model.
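Conceptually, TimeDistributed applies the same model, with the same weights, to every timestep of a sequence. The pure-Python sketch below shows only that idea on lists; `time_distributed` and `toy_model` are hypothetical stand-ins, not Keras objects, and the toy model just sums two halves of its input.

```python
def time_distributed(step_fn, sequence):
    """Apply the same function to every timestep of a sequence."""
    return [step_fn(step) for step in sequence]


def toy_model(vector):
    """Hypothetical stand-in for a trained model: maps 4 values to 2 outputs."""
    half = len(vector) // 2
    return [sum(vector[:half]), sum(vector[half:])]


# A sequence of 3 timesteps, each a 4-dimensional vector
sequence = [[1, 2, 3, 4], [0, 0, 1, 1], [5, 5, 5, 5]]
outputs = time_distributed(toy_model, sequence)
print(outputs)
```

Three timesteps of size 4 go in and three vectors of size 2 come out, in the same way that 20 timesteps of size 784 become 20 vectors of size 10 in the example above.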

Case 3: two inputs and two outputs, LSTM-based prediction

This is a good case: it shows that the strength of Model lies in how freely it can be wired, which is a great convenience for whoever builds the model.

Inputs:
the news text; the time associated with the news text
Outputs:
a prediction from the news text alone; a prediction from the news text plus its time

Model 1: an LSTM on the news text only

    import keras
    from keras.layers import Input, Embedding, LSTM, Dense
    from keras.models import Model

    # Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
    # Note that we can name any layer by passing it a "name" argument.
    main_input = Input(shape=(100,), dtype='int32', name='main_input')  # a sequence of 100 word indices

    # This embedding layer will encode the input sequence
    # into a sequence of dense 512-dimensional vectors.
    x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)  # each of the 100 indices becomes a 512-dimensional vector; 10000 is the vocabulary size

    # A LSTM will transform the vector sequence into a single vector,
    # containing information about the entire sequence
    lstm_out = LSTM(32)(x)  # 32 is the size of the LSTM output vector (the number of units)

    # Next, we insert an auxiliary loss so that the LSTM and Embedding layers
    # can train smoothly even when the main loss is high.
    auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
    # Model 1: a prediction made from the sequence above only.
    # After that, the LSTM output is concatenated with the extra input data and fed into the rest of the model.

Combined model: news text plus time

    # Model 2: the combined model
    auxiliary_input = Input(shape=(5,), name='aux_input')  # a new, 5-dimensional Input
    x = keras.layers.concatenate([lstm_out, auxiliary_input])  # join the two together

    # We stack a deep densely-connected network on top
    x = Dense(64, activation='relu')(x)
    x = Dense(64, activation='relu')(x)
    x = Dense(64, activation='relu')(x)

    # And finally we add the main logistic regression layer
    main_output = Dense(1, activation='sigmoid', name='main_output')(x)

    # Finally, we define the whole model with 2 inputs and 2 outputs:
    model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
    # The model is now defined; the next step is to compile it.
    # We give the auxiliary loss a weight of 0.2. The keyword arguments loss_weights and loss
    # let us set different loss weights or loss functions for different outputs.
    # Both accept a Python list or dict. Here we pass a single loss function to loss,
    # and it is applied to all outputs.

Where: Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output]) is the core line: two inputs and two outputs.

    # Training option 1: two outputs, one loss function
    model.compile(optimizer='rmsprop', loss='binary_crossentropy',
                  loss_weights=[1., 0.2])
    # Once compiled, we train the model by passing the training data and the targets:
    model.fit([headline_data, additional_data], [labels, labels],
              epochs=50, batch_size=32)

    # Training option 2: two outputs, two loss functions
    # Because the inputs and outputs are named (we passed the "name" argument when defining them),
    # we can also compile and train the model like this:
    model.compile(optimizer='rmsprop',
                  loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
                  loss_weights={'main_output': 1., 'aux_output': 0.2})

    # And trained it via:
    model.fit({'main_input': headline_data, 'aux_input': additional_data},
              {'main_output': labels, 'aux_output': labels},
              epochs=50, batch_size=32)

Because there are two inputs and two outputs, each output can be given its own training settings.
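How loss_weights combines the two losses into the single quantity being minimized is a weighted sum. The sketch below works that out in plain Python; the helper name `total_loss` and the individual loss values are hypothetical, while the weights 1.0 and 0.2 are the ones used above.

```python
def total_loss(losses, loss_weights):
    """Weighted sum of the per-output losses."""
    return sum(loss_weights[name] * value for name, value in losses.items())


losses = {'main_output': 0.5, 'aux_output': 0.25}      # hypothetical loss values
loss_weights = {'main_output': 1.0, 'aux_output': 0.2}  # weights from the example above

combined = total_loss(losses, loss_weights)  # 1.0 * 0.5 + 0.2 * 0.25
print(round(combined, 4))
```

The auxiliary output still pushes gradients into the LSTM and Embedding layers, but with only a fifth of the influence of the main output.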

Case 4: shared layers for correspondence and similarity

One layer is shared by two branches.

    import keras
    from keras.layers import Input, LSTM, Dense
    from keras.models import Model

    # 140 words per tweet, each word a 256-dimensional word vector
    tweet_a = Input(shape=(140, 256))
    tweet_b = Input(shape=(140, 256))

    # To share a layer across different inputs, instantiate it once and call it several times.
    # This layer can take as input a matrix
    # and will return a vector of size 64
    shared_lstm = LSTM(64)

    # When we reuse the same layer instance
    # multiple times, the weights of the layer
    # are also being reused
    # (it is effectively *the same* layer)
    encoded_a = shared_lstm(tweet_a)
    encoded_b = shared_lstm(tweet_b)

    # We can then concatenate the two vectors:
    # axis=-1 joins them along the last (feature) axis, giving a vector of size 128
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

    # And add a logistic regression on top
    predictions = Dense(1, activation='sigmoid')(merged_vector)  # the 1 is the number of output units: a single probability

    # We define a trainable model linking the
    # tweet inputs to the predictions
    model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)

    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    model.fit([data_a, data_b], labels, epochs=10)  # train the model, then predict

Case 5: retrieving the contents of layer nodes

    from keras.layers import Input, LSTM, Conv2D

    # 1. A single node
    a = Input(shape=(140, 256))
    lstm = LSTM(32)
    encoded_a = lstm(a)
    assert lstm.output == encoded_a  # the layer's output tensor is encoded_a

    # 2. Multiple nodes
    a = Input(shape=(140, 256))
    b = Input(shape=(140, 256))

    lstm = LSTM(32)
    encoded_a = lstm(a)
    encoded_b = lstm(b)

    assert lstm.get_output_at(0) == encoded_a
    assert lstm.get_output_at(1) == encoded_b

    # 3. Nodes of an image layer
    # The same holds for input_shape and output_shape: if a layer has only one node,
    # or all of its nodes have the same input or output shape,
    # then input_shape and output_shape are unambiguous and return a single value.
    # But if you apply the same Conv2D to data of size (3, 32, 32)
    # and then to data of size (3, 64, 64), the layer has several input and output shapes,
    # and you have to give the node index explicitly to say which one you want.
    a = Input(shape=(3, 32, 32))
    b = Input(shape=(3, 64, 64))

    conv = Conv2D(16, (3, 3), padding='same')
    conved_a = conv(a)

    # Only one input so far, the following will work:
    assert conv.input_shape == (None, 3, 32, 32)

    conved_b = conv(b)
    # now the `.input_shape` property wouldn't work, but this does:
    assert conv.get_input_shape_at(0) == (None, 3, 32, 32)
    assert conv.get_input_shape_at(1) == (None, 3, 64, 64)

Case 6: a visual question answering model

    # This model maps a natural-language question and an image to feature vectors,
    # merges the two, and trains a logistic regression layer on top
    # that picks one answer from a set of candidates.
    import keras
    from keras.layers import Conv2D, MaxPooling2D, Flatten
    from keras.layers import Input, LSTM, Embedding, Dense
    from keras.models import Model, Sequential

    # First, let's define a vision model using a Sequential model.
    # This model will encode an image into a vector.
    vision_model = Sequential()
    vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(3, 224, 224)))
    vision_model.add(Conv2D(64, (3, 3), activation='relu'))
    vision_model.add(MaxPooling2D((2, 2)))
    vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
    vision_model.add(Conv2D(128, (3, 3), activation='relu'))
    vision_model.add(MaxPooling2D((2, 2)))
    vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
    vision_model.add(Conv2D(256, (3, 3), activation='relu'))
    vision_model.add(Conv2D(256, (3, 3), activation='relu'))
    vision_model.add(MaxPooling2D((2, 2)))
    vision_model.add(Flatten())

    # Now let's get a tensor with the output of our vision model:
    image_input = Input(shape=(3, 224, 224))
    encoded_image = vision_model(image_input)

    # Next, let's define a language model to encode the question into a vector.
    # Each question will be at most 100 word long,
    # and we will index words as integers from 1 to 9999.
    question_input = Input(shape=(100,), dtype='int32')
    embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
    encoded_question = LSTM(256)(embedded_question)

    # Let's concatenate the question vector and the image vector:
    merged = keras.layers.concatenate([encoded_question, encoded_image])

    # And let's train a logistic regression over 1000 words on top:
    output = Dense(1000, activation='softmax')(merged)

    # This is our final model:
    vqa_model = Model(inputs=[image_input, question_input], outputs=output)

    # The next stage would be training this model on actual data.
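It can be useful to trace how the image shrinks on its way through the vision model. The sketch below is plain shape arithmetic, assuming 3x3 kernels with stride 1, 2x2 pooling with stride 2, and a channels-first layout as implied by input_shape=(3, 224, 224). The helper names and the layer list are written out by hand for illustration; nothing here calls Keras.

```python
def conv_out(size, kernel=3, padding='valid'):
    """Spatial size after a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


# (type, filters, padding) for each layer of the vision model above
layers = [
    ('conv', 64, 'same'), ('conv', 64, 'valid'), ('pool',),
    ('conv', 128, 'same'), ('conv', 128, 'valid'), ('pool',),
    ('conv', 256, 'same'), ('conv', 256, 'valid'), ('conv', 256, 'valid'), ('pool',),
]

channels, size = 3, 224
for layer in layers:
    if layer[0] == 'conv':
        channels, size = layer[1], conv_out(size, padding=layer[2])
    else:
        size = pool_out(size)

flattened = channels * size * size  # length of the vector produced by Flatten()
print(channels, size, flattened)
```

Under these assumptions the image vector handed to the concatenation is far longer than the 256-dimensional question vector, which is worth keeping in mind when sizing the layers on top.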

Extension 1: loading no-top weights for fine-tuning

If you need to load weights into a different architecture (one that shares some layers), for example for fine-tuning or transfer learning, you can load them by layer name:

    model.load_weights('my_model_weights.h5', by_name=True)

For example, suppose the original model is:

    model = Sequential()
    model.add(Dense(2, input_dim=3, name="dense_1"))
    model.add(Dense(3, name="dense_2"))
    ...
    model.save_weights(fname)

    # new model
    model = Sequential()
    model.add(Dense(2, input_dim=3, name="dense_1"))  # will be loaded
    model.add(Dense(10, name="new_dense"))            # will not be loaded

    # load weights from first model; will only affect the first layer, dense_1.
    model.load_weights(fname, by_name=True)
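The name-matching rule behind by_name=True can be pictured with two dictionaries. This is only a sketch of the idea, using plain dicts in place of models and HDF5 files; the helper name `load_weights_by_name` and the weight values are made up, while the layer names are the ones from the example above.

```python
def load_weights_by_name(target, saved):
    """Copy weights into target for every layer name that also exists in saved."""
    loaded = []
    for name in target:
        if name in saved:
            target[name] = saved[name]
            loaded.append(name)
    return loaded


# Stand-ins for the saved weights file and the new model
saved = {'dense_1': [0.1, 0.2], 'dense_2': [0.3, 0.4, 0.5]}
new_model = {'dense_1': None, 'new_dense': None}

loaded = load_weights_by_name(new_model, saved)
print(loaded, new_model)
```

Only dense_1 receives weights; new_dense keeps its initial state and the saved dense_2 weights are simply ignored.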

Reposted from:
https://blog.csdn.net/sinat_26917383/article/details/72857454?locationNum=1&fps=1
