
tensorflow tf.data.Dataset.from_tensor_slices() (creates a "dataset" whose elements are slices of the given tensors)

Published: 2025/3/20


Source: tensorflow\python\data\ops\dataset_ops.py

    @staticmethod
    def from_tensor_slices(tensors):
      """Creates a `Dataset` whose elements are slices of the given tensors.

      Note that if `tensors` contains a NumPy array, and eager execution is not
      enabled, the values will be embedded in the graph as one or more
      `tf.constant` operations. For large datasets (> 1 GB), this can waste
      memory and run into byte limits of graph serialization. If tensors contains
      one or more large NumPy arrays, consider the alternative described in
      [this guide](https://tensorflow.org/guide/datasets#consuming_numpy_arrays).

      Args:
        tensors: A nested structure of tensors, each having the same size in the
          0th dimension.

      Returns:
        Dataset: A `Dataset`.
      """
      return TensorSliceDataset(tensors)
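The `Args` entry above says that `tensors` may be a nested structure whose components all share the same size in the 0th dimension. The following is a minimal sketch of what that means in practice, assuming TensorFlow 2.x with eager execution (the docstring above comes from an older source file). The dict keys and array values are made up for illustration, and the exact exception type raised for mismatched sizes is an assumption.

```python
import numpy as np
import tensorflow as tf

# Hypothetical nested structure: a dict whose components all have 4 rows.
nested = {
    "x": np.arange(8, dtype=np.float32).reshape(4, 2),
    "y": np.arange(4, dtype=np.int64),
}
ds = tf.data.Dataset.from_tensor_slices(nested)

# Each element keeps the dict structure; the 0th dimension is sliced away.
elements = list(ds.as_numpy_iterator())
print(len(elements))  # equals the shared 0th-dimension size
print(elements[0]["x"].shape, elements[0]["y"].shape)

# Components whose 0th dimensions differ are rejected when the dataset is built.
# Assumption: the error surfaces as ValueError (InvalidArgumentError is also
# caught here in case a given version reports it that way).
try:
    tf.data.Dataset.from_tensor_slices((np.zeros((4, 2)), np.zeros((3, 1))))
    mismatch_rejected = False
except (ValueError, tf.errors.InvalidArgumentError):
    mismatch_rejected = True
print(mismatch_rejected)
```

A dict is used here only to show that the nesting is preserved per element; a tuple or list of arrays behaves the same way.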

Usage of tf.data.Dataset.from_tensor_slices

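This function is one of the core functions of the dataset API. It slices the given tuples, lists, tensors, and similar data along the outermost (0th) dimension. When several components are combined, for example features and labels, a single slice cuts every component at the same index of its outermost dimension and groups the pieces into one dataset element.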
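To make "slicing from the outermost dimension" concrete, here is a small sketch, assuming TensorFlow 2.x with eager execution. The array `cube` is an illustrative value, not taken from the article.

```python
import numpy as np
import tensorflow as tf

# A (2, 3, 4) array: the outermost dimension has size 2.
cube = np.arange(24).reshape(2, 3, 4)
ds = tf.data.Dataset.from_tensor_slices(cube)

slices = list(ds.as_numpy_iterator())
print(len(slices))      # 2: one element per index along axis 0
print(slices[0].shape)  # (3, 4): the inner dimensions are left intact

# Each element equals the corresponding block cube[i] of the original array.
same = all(np.array_equal(s, cube[i]) for i, s in enumerate(slices))
print(same)
```

Only axis 0 is consumed by the slicing; splitting along any other axis would require transposing the array first.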

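Suppose we have two sets of data, features and labels. To keep the explanation simple, assume that every two features correspond to one label. We combine the features and labels into a tuple, and we want each label to line up with exactly its two features, as if the data were sliced directly, for example [f11, f12] [t1]. Here f11 is the first feature of the first sample, f12 is the second feature of the first sample, and t1 is the label of the first sample. tf.data.Dataset.from_tensor_slices does exactly this (the code below uses three features per sample rather than two):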

    import tensorflow as tf
    import numpy as np

    features, labels = (np.random.sample((6, 3)),  # simulate 6 samples with 3 features each
                        np.random.sample((6, 1)))  # simulate 6 samples with 1 label each; the 0th dimensions must match
    print('features：\n{}\nlabels：\n{}'.format(features, labels))  # print the combined data
    data = tf.data.Dataset.from_tensor_slices((features, labels))
    print(data)  # print the dataset's shape and type information

Result:

    features：
    [[0.51145285 0.01551108 0.8964333 ]
     [0.0320744  0.78853224 0.5573909 ]
     [0.41400104 0.79518971 0.08757085]
     [0.77886924 0.38981308 0.27692796]
     [0.0143111  0.98848222 0.80941925]
     [0.53292765 0.22526956 0.75004045]]
    labels：
    [[0.13517541]
     [0.11131747]
     [0.47719857]
     [0.8386204 ]
     [0.61713735]
     [0.80272754]]
    <TensorSliceDataset shapes: ((3,), (1,)), types: (tf.float64, tf.float64)>

As the result shows, the function splits the data into elements of shape ((3,), (1,)), that is, each element pairs three features with one label.
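The printed dataset only reports shapes and types. To see the individual (features, label) pairs, one can iterate over the dataset. The sketch below assumes TensorFlow 2.x with eager execution and uses deterministic `np.arange` data instead of random values so the pairs are easy to check by eye; these values are illustrative, not from the article.

```python
import numpy as np
import tensorflow as tf

# 6 samples with 3 features each, and 6 matching labels.
features = np.arange(18, dtype=np.float64).reshape(6, 3)
labels = np.arange(6, dtype=np.float64).reshape(6, 1)
data = tf.data.Dataset.from_tensor_slices((features, labels))

# Each element is a tuple: one row of features and the label at the same index.
pairs = list(data.as_numpy_iterator())
for f, t in pairs:
    print(f, t)

print(len(pairs))                          # 6 elements, one per sample
print(pairs[0][0].shape, pairs[0][1].shape)  # (3,) (1,)
```

The exact text shown by `print(data)` may differ between TensorFlow versions, but the per-element shapes are the same as in the result above.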

Reference article: tf.data.Dataset.from_tensor_slices的用法 (Usage of tf.data.Dataset.from_tensor_slices)
