Visualizing Hyperparameter Tuning with Ray Tune and TensorFlow 2.0
Ray Tune official documentation

Tuning hyperparameters is often the most expensive part of the machine learning workflow. Tune is designed to address exactly this, providing an efficient and scalable solution for this pain point. Note that this example depends on TensorFlow 2.0.
Code: ray/python/ray/tune at master · ray-project/ray · GitHub
Examples: https://github.com/ray-project/ray/tree/master/python/ray/tune/examples
Documentation: Tune: Scalable Hyperparameter Tuning — Ray v1.6.0
Mailing List: https://groups.google.com/forum/#!forum/ray-dev
```python
## If you are running on Google Colab, uncomment below to install the necessary
## dependencies before beginning the exercise.

# print("Setting up colab environment")
# !pip uninstall -y -q pyarrow
# !pip install -q https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev5-cp36-cp36m-manylinux1_x86_64.whl
# !pip install -q ray[debug]

# # A hack to force the runtime to restart, needed to include the above dependencies.
# print("Done installing! Restarting via forced crash (this is not an issue).")
# import os
# os._exit(0)

## If you are running on Google Colab, please install TensorFlow 2.0 by uncommenting below.

# try:
#     # %tensorflow_version only exists in Colab.
#     %tensorflow_version 2.x
# except Exception:
#     pass
```

This tutorial walks step by step through the key stages of hyperparameter tuning with Tune.
Note that this uses Tune's function-based API, which is mainly intended for prototyping. A later tutorial covers Tune's more powerful class-based Trainable API.
```python
import numpy as np
np.random.seed(0)

import tensorflow as tf
try:
    tf.get_logger().setLevel('INFO')
except Exception as exc:
    print(exc)

import warnings
warnings.simplefilter("ignore")

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.callbacks import ModelCheckpoint

import ray
from ray import tune
from ray.tune.examples.utils import get_iris_data

import inspect
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
```
Visualize your data
First, let's look at the distribution of the dataset.

The Iris dataset consists of petal and sepal measurements for three species of iris (Setosa, Versicolour, and Virginica), stored in a 150x4 NumPy array.

The rows are samples, and the columns are: sepal length, sepal width, petal length, and petal width.

The goal of this tutorial is to produce a model that accurately predicts the true label given a 4-tuple of sepal length, sepal width, petal length, and petal width.
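The cell below loads the Iris dataset and defines the plot_data helper used to visualize it; the names and feature_names variables are reused later when plotting predictions.

```python
from sklearn.datasets import load_iris

iris = load_iris()
true_data = iris['data']
true_label = iris['target']
names = iris['target_names']
feature_names = iris['feature_names']

def plot_data(X, y):
    # Visualize the data sets: sepal measurements on the left, petal on the right.
    plt.figure(figsize=(16, 6))
    plt.subplot(1, 2, 1)
    for target, target_name in enumerate(names):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 0], X_plot[:, 1], linestyle='none', marker='o', label=target_name)
    plt.xlabel(feature_names[0])
    plt.ylabel(feature_names[1])
    plt.axis('equal')
    plt.legend()

    plt.subplot(1, 2, 2)
    for target, target_name in enumerate(names):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 2], X_plot[:, 3], linestyle='none', marker='o', label=target_name)
    plt.xlabel(feature_names[2])
    plt.ylabel(feature_names[3])
    plt.axis('equal')
    plt.legend()

plot_data(true_data, true_label)
```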
Creating the model training process (with Keras)

Now let's define a function that takes in a few hyperparameters and returns a model ready for training.
```python
def create_model(learning_rate, dense_1, dense_2):
    assert learning_rate > 0 and dense_1 > 0 and dense_2 > 0, "Did you set the right configuration?"
    model = Sequential()
    model.add(Dense(int(dense_1), input_shape=(4,), activation='relu', name='fc1'))
    model.add(Dense(int(dense_2), activation='relu', name='fc2'))
    model.add(Dense(3, activation='softmax', name='output'))
    optimizer = SGD(lr=learning_rate)
    model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
```

Below is a function that uses create_model to train a model and returns the trained model.
```python
def train_on_iris():
    train_x, train_y, test_x, test_y = get_iris_data()
    model = create_model(learning_rate=0.1, dense_1=2, dense_2=2)
    # This saves the top model. `accuracy` is only available in TF2.0.
    checkpoint_callback = ModelCheckpoint(
        "model.h5", monitor='accuracy', save_best_only=True, save_freq=2)

    # Train the model
    model.fit(
        train_x, train_y,
        validation_data=(test_x, test_y),
        verbose=0, batch_size=10, epochs=20,
        callbacks=[checkpoint_callback])
    return model
```

Let's quickly train a model on the dataset. The accuracy should be quite low.
```python
original_model = train_on_iris()  # This trains the model and returns it.
train_x, train_y, test_x, test_y = get_iris_data()
original_loss, original_accuracy = original_model.evaluate(test_x, test_y, verbose=0)
print("Loss is {:0.4f}".format(original_loss))
print("Accuracy is {:0.4f}".format(original_accuracy))
```
Integrating with Tune

Now let's use Tune to optimize a model that learns to classify irises. This will happen in two parts: modifying the training function to support Tune, then configuring Tune.

Let's first define a callback that reports intermediate training progress back to Tune.
```python
import tensorflow.keras as keras
from ray.tune import track

class TuneReporterCallback(keras.callbacks.Callback):
    """Tune Callback for Keras.

    The callback is invoked every epoch."""

    def __init__(self, logs={}):
        self.iteration = 0
        super(TuneReporterCallback, self).__init__()

    def on_epoch_end(self, batch, logs={}):
        self.iteration += 1
        track.log(keras_info=logs, mean_accuracy=logs.get("accuracy"), mean_loss=logs.get("loss"))
```
Integration Part 1: Modify the training function

Instructions: follow the next two steps to modify the train_on_iris function to support Tune: accept a config dict of hyperparameters and pass them into create_model, then attach the TuneReporterCallback defined above to model.fit.
```python
def tune_iris(config):
    model = create_model(
        learning_rate=config["lr"],
        dense_1=config["dense_1"],
        dense_2=config["dense_2"])
    # TODO: train the model and report progress with TuneReporterCallback.
```
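For reference, a completed version might look like the sketch below, assuming the same data loading, checkpointing, and fit settings as train_on_iris above (monitoring 'loss' for the checkpoint is an assumption; adjust to taste):

```python
def tune_iris(config):
    train_x, train_y, test_x, test_y = get_iris_data()
    model = create_model(
        learning_rate=config["lr"],
        dense_1=config["dense_1"],
        dense_2=config["dense_2"])
    # Checkpoint the best model seen so far ('loss' as the monitored metric is an assumption).
    checkpoint_callback = ModelCheckpoint(
        "model.h5", monitor='loss', save_best_only=True, save_freq=2)

    # TuneReporterCallback reports metrics back to Tune after every epoch.
    model.fit(
        train_x, train_y,
        validation_data=(test_x, test_y),
        verbose=0, batch_size=10, epochs=20,
        callbacks=[checkpoint_callback, TuneReporterCallback()])
```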
Part 2: Configure Tune to tune the hyperparameters

Instructions: follow the next two steps to configure Tune to identify the best hyperparameters: define the hyperparameter search space, then set the number of configurations to sample.
```python
hyperparameter_space = {
    "lr": tune.loguniform(0.001, 0.1),
    "dense_1": tune.uniform(2, 128),
    "dense_2": tune.uniform(2, 128),
}

num_samples = 20
```
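Note that tune.uniform samples floats, which is why create_model casts dense_1 and dense_2 with int(). If you prefer sampling integers directly, here is a sketch using tune.sample_from (assuming that helper is available in this version of Tune):

```python
# Same bounds as above, but the layer sizes come out as integers.
hyperparameter_space = {
    "lr": tune.loguniform(0.001, 0.1),
    "dense_1": tune.sample_from(lambda spec: np.random.randint(2, 128)),
    "dense_2": tune.sample_from(lambda spec: np.random.randint(2, 128)),
}
```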
FAQ: How does parallelism work in Tune?

Setting num_samples runs 20 trials (sampled hyperparameter configurations) in total. However, not all of them run at once: the maximum training concurrency is the number of CPU cores on the machine you are running on. On a 2-core machine, two models train concurrently; as each finishes, a new training run starts from a freshly sampled hyperparameter configuration.

Each trial runs in a new Python process, and that process is killed once the trial completes.
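If you want fewer trials running at once than there are cores, one option is to reserve more than one CPU per trial. A minimal sketch, using the resources_per_trial argument of tune.run (the numbers here are illustrative):

```python
# Reserve 2 CPUs per trial: on a 4-core machine, at most 2 trials run at once.
analysis = tune.run(
    tune_iris,
    config=hyperparameter_space,
    num_samples=num_samples,
    resources_per_trial={"cpu": 2})
```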
FAQ: How do I debug things in Tune?

An error file column will show up in the output. Run the cell below with the path to the error file to diagnose your issue.
```python
! cat /home/ubuntu/tune_iris/tune_iris_c66e1100_2019-10-09_17-13-24x_swb9xs/error_2019-10-09_17-13-29.txt
```
Launching the Tune hyperparameter search
```python
# This seeds the hyperparameter sampling.
import numpy as np; np.random.seed(5)
hyperparameter_space = {}  # TODO: Fill me out.
num_samples = 1  # TODO: Fill me out.

####################################################################
# This is just a validation function for tutorial purposes only.   #
####################################################################
HP_KEYS = ["lr", "dense_1", "dense_2"]
assert all(key in hyperparameter_space for key in HP_KEYS), (
    "The hyperparameter space is not fully designated. "
    "It must include all of {}".format(HP_KEYS))
####################################################################

ray.shutdown()  # Restart Ray defensively in case the ray connection is lost.
ray.init(log_to_driver=False)

# We clean out the logs before running for a clean visualization later.
! rm -rf ~/ray_results/tune_iris

analysis = tune.run(
    tune_iris,
    verbose=1,
    config=hyperparameter_space,
    num_samples=num_samples)

assert len(analysis.trials) == 20, "Did you set the correct number of samples?"
```
Analyzing the best tuned model

Let's compare the true labels against the predicted labels.
```python
_, _, test_data, test_labels = get_iris_data()
plot_data(test_data, test_labels.argmax(1))

# Obtain the directory where the best model is saved.
print("You can use any of the following columns to get the best model: \n{}.".format(
    [k for k in analysis.dataframe() if k.startswith("keras_info")]))
print("=" * 10)
logdir = analysis.get_best_logdir("keras_info/val_loss", mode="min")
# We saved the model as `model.h5` in the logdir of the trial.
from tensorflow.keras.models import load_model
tuned_model = load_model(logdir + "/model.h5")

tuned_loss, tuned_accuracy = tuned_model.evaluate(test_data, test_labels, verbose=0)
print("Loss is {:0.4f}".format(tuned_loss))
print("Tuned accuracy is {:0.4f}".format(tuned_accuracy))
print("The original un-tuned model had an accuracy of {:0.4f}".format(original_accuracy))
predicted_label = tuned_model.predict(test_data)
plot_data(test_data, predicted_label.argmax(1))
```

We can assess the best model's performance by visualizing its predictions against the ground truth.
```python
def plot_comparison(X, y):
    # Visualize the data sets: correct vs. incorrect predictions.
    plt.figure(figsize=(16, 6))
    plt.subplot(1, 2, 1)
    for target, target_name in enumerate(["Incorrect", "Correct"]):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 0], X_plot[:, 1], linestyle='none', marker='o', label=target_name)
    plt.xlabel(feature_names[0])
    plt.ylabel(feature_names[1])
    plt.axis('equal')
    plt.legend()

    plt.subplot(1, 2, 2)
    for target, target_name in enumerate(["Incorrect", "Correct"]):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 2], X_plot[:, 3], linestyle='none', marker='o', label=target_name)
    plt.xlabel(feature_names[2])
    plt.ylabel(feature_names[3])
    plt.axis('equal')
    plt.legend()

plot_comparison(test_data, test_labels.argmax(1) == predicted_label.argmax(1))
```
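Beyond the plots, the trial results can be ranked directly from the results table. A small sketch using the analysis object from the search above (keras_info/val_loss is the same column used earlier to pick the best logdir):

```python
# One row per trial; metrics logged by the callback appear as columns.
df = analysis.dataframe()
print(df.sort_values("keras_info/val_loss")[["keras_info/val_loss"]].head())
```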
Extra: Viewing results with TensorBoard

You can use TensorBoard to view trial performance. If the graphs do not load, click "Toggle All Runs".
```python
%load_ext tensorboard
# Tune writes results under ~/ray_results by default; point TensorBoard there.
%tensorboard --logdir ~/ray_results/tune_iris
```
Summary

With a handful of changes, an ordinary Keras training function becomes tunable: report metrics to Tune from a Keras callback, define a hyperparameter search space, launch the search with tune.run, then load and visualize the best checkpointed model.