PyTorch's Tensor (Part 2)

Published: 2023/12/10


Background

In the previous article of this PyTorch Tensor series:

Gemfield: PyTorch's Tensor (Part 1) (zhuanlan.zhihu.com)

Gemfield walked through the creation of a Tensor, in particular how the call stack travels from Python to C++ and back to Python when a Tensor is created. Meanwhile, what happens in memory is the creation of a Variable instance (strictly speaking, one of the fields of a Variable instance is itself a Variable instance).

In this article, Gemfield covers the autograd-related parts of PyTorch's Tensor. autograd is an important reason why PyTorch is a neural network framework: the autograd mechanism provides automatic differentiation for all operations on Tensors. As we know, a Variable's only data member is impl_. This impl_ member is of type TensorImpl, and during initialization it is instantiated as a Variable::Impl instance (a subclass of TensorImpl):

Variable --> impl_ = Variable::Impl instance

As far as autograd is concerned, a Variable's autograd state lives in the autograd_meta_ member of impl_. During initialization, autograd_meta_ is set either to a Variable::AutogradMeta instance (a subclass of AutogradMetaInterface) or to a Variable::DifferentiableViewMeta instance (a subclass of Variable::AutogradMeta), and it is accessed through the Variable's get_autograd_meta(). In fact, autograd_meta_ is the only thing that distinguishes a plain tensor from a tensor with autograd capability:

#1 The Variable is a plain Tensor, without requires_grad
Variable --> impl_ --> autograd_meta_ = None
#2
Variable --> impl_ --> autograd_meta_ = Variable::AutogradMeta instance
#3
Variable --> impl_ --> autograd_meta_ = Variable::DifferentiableViewMeta instance

A Variable::AutogradMeta instance has the following members, which form the backbone of PyTorch's autograd system:

# Variable::AutogradMeta and Variable::DifferentiableViewMeta
Variable grad_;
std::shared_ptr<Function> grad_fn_;
std::weak_ptr<Function> grad_accumulator_;
VariableVersion version_counter_;
std::vector<std::shared_ptr<FunctionPreHook>> hooks_;
bool requires_grad_;
bool is_view_;
uint32_t output_nr_;
# Variable::DifferentiableViewMeta only
Variable base_;
uint32_t attr_version;

1. grad_ is another Variable that stores the gradient of the current Variable instance;
2. grad_fn_ is a Function instance that only non-leaf variables have. It is accessed through the Variable's grad_fn(). In fact, PyTorch decides whether a Variable is a leaf variable by checking whether grad_fn_ == nullptr;
3. grad_accumulator_ is a Function instance that only leaf variables have. It is accessed through the Variable's grad_accumulator();
4. version_counter_ holds a version number;
5. hooks_ can hold a group of hooks;
6. requires_grad_ is a flag indicating whether this Variable instance requires grad;
7. is_view_ is a flag indicating whether this Variable instance is a view (it has no storage of its own and is based on a base variable);
8. output_nr_ is a number: the index of this Variable among the outputs of the function that produced it;
9. base_ is the base variable of a view;
10. attr_version is a number.
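To make the member list easier to keep in mind, here is a minimal Python sketch of the two metadata classes and the three autograd_meta_ cases. It is a toy model, not PyTorch code: the class names are reused from the text, while the Python types, the describe() helper and the is_leaf() method are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class AutogradMeta:
    """Toy stand-in for Variable::AutogradMeta (field names follow the text)."""
    grad: Optional[Any] = None              # gradient, itself a Variable
    grad_fn: Optional[Any] = None           # set only on non-leaf variables
    grad_accumulator: Optional[Any] = None  # set only on leaf variables
    version_counter: int = 0
    hooks: List[Any] = field(default_factory=list)
    requires_grad: bool = False
    is_view: bool = False
    output_nr: int = 0

    def is_leaf(self) -> bool:
        # Mirrors the rule in the text: leaf <=> grad_fn_ == nullptr.
        return self.grad_fn is None


@dataclass
class DifferentiableViewMeta(AutogradMeta):
    """Adds the two members that only views carry."""
    base: Optional[Any] = None
    attr_version: int = 0


def describe(autograd_meta: Optional[AutogradMeta]) -> str:
    """Hypothetical helper naming the three cases #1, #2 and #3."""
    if autograd_meta is None:
        return "plain tensor"
    if isinstance(autograd_meta, DifferentiableViewMeta):
        return "differentiable view"
    return "autograd variable"


print(describe(None))
print(describe(AutogradMeta(requires_grad=True)))
print(describe(DifferentiableViewMeta(is_view=True)))
```

The subclass relationship is the only structural point the sketch tries to capture: a view's metadata is an ordinary AutogradMeta plus base and attr_version.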

We use the following short snippet to demonstrate this capability:

import torch

gemfield = torch.ones(2, 2, requires_grad=True)
syszux = gemfield + 2
civilnet = syszux * syszux * 3
gemfieldout = civilnet.mean()
gemfieldout.backward()

In particular, for every step in the Python session, Gemfield maps it to the changes in the members and structs of the class instances in memory.
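Before looking at the internals, it can help to know what numbers this snippet is expected to produce. The sketch below redoes the arithmetic in plain Python without PyTorch, treating the 2x2 tensor as a flat list of four values. The function names are made up for this illustration, and the finite-difference check is only a sanity test, not how autograd computes gradients.

```python
def forward(x):
    """out = mean(3 * (x + 2)^2) over a flat list of inputs."""
    syszux = [v + 2 for v in x]
    civilnet = [s * s * 3 for s in syszux]
    return sum(civilnet) / len(civilnet)


def analytic_grad(x):
    """d(out)/d(x_i) = 6 * (x_i + 2) / n, by the chain rule."""
    n = len(x)
    return [6 * (v + 2) / n for v in x]


def numeric_grad(x, eps=1e-6):
    """Central finite differences, used only as a cross-check."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append((forward(hi) - forward(lo)) / (2 * eps))
    return grads


gemfield = [1.0, 1.0, 1.0, 1.0]   # torch.ones(2, 2), flattened
print(forward(gemfield))          # 27.0
print(analytic_grad(gemfield))    # [4.5, 4.5, 4.5, 4.5]
```

So the intermediate values are 3 for syszux and 27 for civilnet, and the gradient that backward() should eventually store for gemfield is 4.5 in every position.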

Tensor creation: gemfield = torch.ones(2, 2, requires_grad=True)

We create a tensor with the statement gemfield = torch.ones(2, 2, requires_grad=True). As already described in https://zhuanlan.zhihu.com/p/54896021, this call produces a Variable instance in memory.

This gemfield variable is a leaf of the graph. Why? Because it is created directly by the user (not obtained through computation) and sits at the "bottom/outer" end of the graph, with no child nodes. At this point, the grad of Tensor gemfield is None and its grad_fn is None. output_nr_ is 0, which means this Variable is the first output of a function.

A simple Tensor addition: syszux = gemfield + 2

We use syszux = gemfield + 2 to obtain a new Tensor named syszux. During initialization, this addition was already bound to the C++ function THPVariable_add and registered on the Python symbol torch._C._TensorBase:

PyMethodDef variable_methods[] = {
  {"__add__", (PyCFunction)THPVariable_add, METH_VARARGS | METH_KEYWORDS, NULL},
  ......

THPVariable_add is defined as follows:

static PyObject * THPVariable_add(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  ......
  return wrap(dispatch_add(self, r.tensor(0), r.scalar(1)));
}

1. scalar to tensor

In this function, the 2 in syszux = gemfield + 2 first has to be converted from a scalar to a tensor. The conversion logic is:

auto tensor = scalar_to_tensor(scalar);
tensor.unsafeGetTensorImpl()->set_wrapped_number(true);
return autograd::make_variable(tensor);

The scalar 2 has now become a Variable instance in memory. Before add actually executes, there are already two Variable instances in memory, gemfield and 2:

#gemfield
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[1., 1.],[1., 1.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = None
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = True
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist

#scalar 2
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [2.]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = None
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist

Note, however, that because gemfield is a leaf variable, the addition described below triggers a call to Variable::grad_accumulator() on gemfield. This initializes gemfield's grad_accumulator_ member, so from then on gemfield's grad_accumulator_ in memory is no longer None.

2. dispatch to add

Since this is a dispatch, its purpose is to reach the matching kind of Type. The core dispatch logic is:

dispatch_type().add(*this, other, alpha)

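dispatch_type is a method of the at::Tensor class. Based on the id, dtype and other information of its impl (a TensorImpl), it derives the Type (here VariableType). In this example, based on the types of the input arguments, the call is finally dispatched to the add method of VariableType in torch/csrc/autograd/generated/VariableType.cpp (see autograd's code generation):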

Tensor VariableType::add(const Tensor & self, const Tensor & other, Scalar alpha) const {
  std::shared_ptr<AddBackward0> grad_fn;
  grad_fn = std::shared_ptr<AddBackward0>(new AddBackward0(), deleteFunction);
  grad_fn->set_next_edges(collect_next_edges( self, other ));
  grad_fn->alpha = alpha;
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return baseType->add(self_, other_, alpha);
  })();
  auto result = as_variable(tmp);
  set_history(flatten_tensor_args( result ), grad_fn);
  return result;
}

The baseType->add inside the lambda in turn goes through the call stack below. Even a simple addition becomes fairly involved in a graph-based design:

VariableType::add
    |
    V
baseType->add(self_, other_, alpha)
    |
    V
#ATen/TypeDefault.cpp
TypeDefault::add
    |
    V
#aten/src/ATen/native/BinaryOps.cpp
add(const Tensor& self, const Tensor& other, Scalar alpha)
    |
    V
#relies on the REGISTER_DISPATCH work done during initialization
add_stub
    |
    V
#aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.AVX2.cpp
add_kernel
    |
    V
#aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.AVX2.cpp
binary_kernel_vec
    |
    V
binary_loop
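The add_stub step in the stack relies on kernels having been registered ahead of time. The following Python sketch shows that general registry idea only. It is a loose analogy under stated assumptions: the registry dictionary, the decorator and the device string are invented for this example and say nothing about how REGISTER_DISPATCH is actually implemented in ATen.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a device name to an add kernel.
Kernel = Callable[[List[float], List[float], float], List[float]]
_ADD_KERNELS: Dict[str, Kernel] = {}


def register_add_kernel(device: str):
    """Decorator playing the role of a start-up registration step."""
    def wrap(fn: Kernel) -> Kernel:
        _ADD_KERNELS[device] = fn
        return fn
    return wrap


@register_add_kernel("cpu")
def add_kernel_cpu(a, b, alpha):
    # Element-wise a + alpha * b, the meaning of add with an alpha scale.
    return [x + alpha * y for x, y in zip(a, b)]


def add_stub(device, a, b, alpha=1.0):
    """Look up the kernel registered for `device` and call it."""
    try:
        kernel = _ADD_KERNELS[device]
    except KeyError:
        raise NotImplementedError(f"no add kernel registered for {device!r}")
    return kernel(a, b, alpha)


# gemfield + 2 on flattened data, with the scalar already expanded.
print(add_stub("cpu", [1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]))
```

The caller only names the operation; which concrete kernel runs is decided by what was registered for the device, which is the property the call stack above depends on.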

3. Building autograd

3.1 Building grad_fn (AddBackward0)

AddBackward0 is the backward function of the addition. The grad_fn instance is built with the following line of code:

std::shared_ptr<AddBackward0> grad_fn = std::shared_ptr<AddBackward0>(new AddBackward0(), deleteFunction);

As you can see, for addition the corresponding grad_fn is AddBackward0. Before discussing AddBackward0, you need to know the Function class hierarchy in PyTorch (see https://zhuanlan.zhihu.com/p/61765561). As a reminder:

class AddBackward0 : public TraceableFunction
class TraceableFunction : public Function

When gemfield + 2 is evaluated, this AddBackward0 instance (a Function) is created, and collect_next_edges() collects the grad_fn or grad_accumulator of gemfield and of 2. gemfield is a leaf variable, so what is collected is its grad_accumulator (2 has neither), whose type is AccumulateGrad (an instance of a Function). The result is then registered on the AddBackward0 instance:

grad_fn->set_next_edges(collect_next_edges( self, other ));

The next edges are the gradient_edge() of self and of other. gradient_edge() returns an Edge instance. Once set_next_edges has been called, the Function's next_edges_ member (of type std::vector<Edge>) is initialized.

1. If a Variable was created internally (obtained through computation, like syszux), the function of its gradient edge is grad_fn_, the Variable's gradient function;

2. If a Variable was created by the user (like gemfield), grad_fn_ is null and the function of its gradient edge is the Variable's gradient accumulator, an instance of the AccumulateGrad class (a subclass of Function).

Either way, an Edge instance is built here and returned by gradient_edge():

Edge gradient_edge() const {
  if (const auto& gradient = grad_fn()) {
    return Edge(gradient, output_nr());
  } else {
    return Edge(grad_accumulator(), 0);
  }
}

For a leaf Variable, grad_fn is null. In that case the Variable uses a gradient accumulator, which accumulates the gradients flowing into this Variable. Note that gradient accumulators exist only when the Variable has requires_grad = True. After construction, the memory layout of AddBackward0 is as follows:

#grad_fn AddBackward0, requires_grad == True
Function instance
    --> sequence_nr_ (uint64_t) = 0
    --> next_edges_ (edge_list) --> std::vector<Edge> = [(AccumulateGrad instance 0x55ca7f304500, 0), (0, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = None
    --> alpha (Scalar) = 1
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses AddBackward0's apply

#grad_fn, requires_grad == False
Function instance --> None
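The branch inside gradient_edge() and the resulting next_edges_ list can be mimicked in a few lines of Python. This is a sketch under assumptions, not a port: ToyVariable, the tuple used as a fake accumulator, and the choice to return None instead of raising for non-leaf variables are all simplifications introduced here.

```python
from collections import namedtuple

Edge = namedtuple("Edge", ["function", "input_nr"])


class ToyVariable:
    """Hypothetical stand-in carrying only what gradient_edge() needs."""

    def __init__(self, grad_fn=None, output_nr=0, requires_grad=False):
        self.grad_fn = grad_fn
        self.output_nr = output_nr
        self.requires_grad = requires_grad
        self._grad_accumulator = None

    def grad_accumulator(self):
        # Only leaves that require grad get an accumulator.
        if self.grad_fn is not None or not self.requires_grad:
            return None
        if self._grad_accumulator is None:
            self._grad_accumulator = ("AccumulateGrad", id(self))
        return self._grad_accumulator

    def gradient_edge(self):
        # Non-leaf: edge to grad_fn. Leaf: edge to the accumulator (or nothing).
        if self.grad_fn is not None:
            return Edge(self.grad_fn, self.output_nr)
        return Edge(self.grad_accumulator(), 0)


def collect_next_edges(*variables):
    """One edge per forward input, in argument order."""
    return [v.gradient_edge() for v in variables]


gemfield = ToyVariable(requires_grad=True)   # leaf created by the user
two = ToyVariable()                          # wrapped scalar, no grad
edges = collect_next_edges(gemfield, two)
print(edges)
```

The second edge has no function, which corresponds to the (0, 0) entry in the layout above: the scalar 2 contributes nothing to the backward graph.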

3.2 Building grad_accumulator_ (AccumulateGrad)

As mentioned above, when grad_fn calls collect_next_edges to collect the edges of the input arguments (Variable instances), the leaf gemfield triggers a call to Variable::grad_accumulator(). The first time this API is called on a Variable, it initializes the Variable's grad_accumulator_ member:

result = std::make_shared<AccumulateGrad>(Variable(std::move(intrusive_from_this)));
autograd_meta->grad_accumulator_ = result;

This creates a new AccumulateGrad object and initializes the Function's sequence_nr_ member with UINT64_MAX (that is, 18446744073709551615). After construction, grad_accumulator_ (0x55ca7f304500) looks like this in memory:

# class AccumulateGrad inherits from Function, so this is a Function instance
Function instance
    --> sequence_nr_ (uint64_t) = UINT64_MAX
    --> next_edges_ (edge_list) --> None
    --> input_metadata_ --> [(type, shape, device)...] = [(CPUFloatType, [2, 2], cpu)]
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses AccumulateGrad's apply
    --> variable = gemfield

After initialization, the grad_accumulator_ of the Variable gemfield has been set to the AccumulateGrad instance (0x55ca7f304500).
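Since grad_accumulator_ is declared as a std::weak_ptr, the accumulator is created lazily and cached without being owned by the Variable. A rough Python equivalent, using the standard weakref module, might look like the sketch below. The Leaf class and its attribute names are hypothetical; only the create-on-first-call behaviour and the UINT64_MAX sequence number come from the text.

```python
import weakref

UINT64_MAX = 2**64 - 1


class AccumulateGrad:
    """Toy accumulator: remembers which variable it accumulates into."""

    def __init__(self, variable):
        self.variable = variable
        self.sequence_nr = UINT64_MAX   # as described in the text
        self.next_edges = []


class Leaf:
    """Hypothetical leaf variable holding its accumulator weakly."""

    def __init__(self, name):
        self.name = name
        self._accumulator_ref = None    # plays the role of the weak_ptr

    def grad_accumulator(self):
        # Reuse the accumulator if it is still alive.
        alive = self._accumulator_ref() if self._accumulator_ref else None
        if alive is not None:
            return alive
        # First call (or the old one was freed): create and cache weakly.
        result = AccumulateGrad(self)
        self._accumulator_ref = weakref.ref(result)
        return result


gemfield = Leaf("gemfield")
first = gemfield.grad_accumulator()
second = gemfield.grad_accumulator()
print(first is second, first.sequence_nr)
```

As long as something else (in PyTorch, an edge of a backward function) keeps the accumulator alive, repeated calls return the same object.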

4. Building the Variable (syszux)

Once the addition expression completes, syszux exists in memory. The freshly created syszux instance looks like this:

//syszux
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[3., 3.],[3., 3.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = None
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist

After that, set_history(flatten_tensor_args( result ), grad_fn) is used to set the gradient_edge of syszux. This function does two things:

1. The (type, shape, device) of the Variable syszux is appended to the input_metadata_ of the AddBackward0 instance. After appending, the index of the syszux entry in input_metadata_ is the output_nr_ used below; here it is 0 (the first element). The AddBackward0 instance then looks like this in memory:

//grad_fn->add_input_metadata(variable);
grad_fn_ --> input_metadata_ += (variable.type, variable.shape, variable.device) = []

#AddBackward0 instance, requires_grad == True
Function instance
    --> sequence_nr_ (uint64_t) = 0
    --> next_edges_ (edge_list) --> std::vector<Edge> = [(AccumulateGrad instance, 0), (0, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = [(CPUFloatType, [2, 2], cpu)]
    --> alpha (Scalar) = 1
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses AddBackward0's apply

2. The grad_fn_ in the autograd_meta of syszux is set to the AddBackward0 instance above, and the output_nr_ in autograd_meta is set to the "index of the current Variable's entry in input_metadata_" mentioned above. The syszux instance then looks like this in memory:

//as_variable_ref(variable).set_gradient_edge({grad_fn, output_nr});
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[3., 3.],[3., 3.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = AddBackward0 instance 0x55ca7f872e90 (see above)
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist
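The two things set_history does can be written down as a short Python sketch. It assumes a deliberately simple representation: variables are plain dictionaries and ToyFunction is a hypothetical backward node, so none of these names correspond to real PyTorch Python APIs.

```python
class ToyFunction:
    """Hypothetical backward node that records metadata of its outputs."""

    def __init__(self, name):
        self.name = name
        self.input_metadata = []

    def add_input_metadata(self, variable):
        # Append (type, shape, device) and return the index of the new entry.
        self.input_metadata.append(
            (variable["type"], variable["shape"], variable["device"])
        )
        return len(self.input_metadata) - 1


def set_history(variables, grad_fn):
    """Toy version of the two steps listed in the text."""
    for variable in variables:
        output_nr = grad_fn.add_input_metadata(variable)   # step 1
        variable["grad_fn"] = grad_fn                      # step 2
        variable["output_nr"] = output_nr


syszux = {
    "type": "CPUFloatType",
    "shape": [2, 2],
    "device": "cpu",
    "grad_fn": None,
    "output_nr": 0,
}
add_backward = ToyFunction("AddBackward0")
set_history([syszux], add_backward)
print(syszux["grad_fn"].name, syszux["output_nr"], add_backward.input_metadata)
```

For a function with several outputs, each output would receive the next index, which is why output_nr_ identifies a Variable among the outputs of its producing function.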

Once this step is done, printing the grad_fn attribute of syszux in a Python session calls THPVariable_get_grad_fn, which calls the Variable's grad_fn() to obtain grad_fn_ and then uses functionToPyObject() to look up the corresponding Python representation in the cpp_function_types table:

>>> syszux.grad_fn
<AddBackward object at 0x7f1550f2e0d0>

civilnet = syszux * syszux * 3

We now move on to the expression civilnet = syszux * syszux * 3. While it executes, the Python object's __mul__ (multiplication) is called twice, and a new Variable, civilnet, is created.

1. From Python to C++

During initialization, the Python binding of Variable already set up the following:

PyMethodDef variable_methods[] = {
  {"__add__", (PyCFunction)THPVariable_add, METH_VARARGS | METH_KEYWORDS, NULL},
  {"__mul__", (PyCFunction)THPVariable_mul, METH_VARARGS | METH_KEYWORDS, NULL},
  ......
}

So this expression results in calls to the C++ function THPVariable_mul, twice: once per multiplication.

2. dispatch to mul

As with addition, the call stack of the multiplication (one of the two) looks like this:

#torch/csrc/autograd/generated/python_variable_methods.cpp
THPVariable_mul
    |
    V
#torch/csrc/autograd/generated/python_variable_methods_dispatch.h
dispatch_mul
    |
    V
#aten/src/ATen/core/TensorMethods.h
Tensor::mul
    |
    V
#a dispatch on the type is needed here
#torch/csrc/autograd/generated/VariableType_4.cpp
Tensor VariableType::mul(const Tensor & self, const Tensor & other)
    |
    V
#the operator is simple, so it is handed back to the default implementation of the base
#build/aten/src/ATen/TypeDefault.cpp
Tensor TypeDefault::mul(const Tensor & self, const Tensor & other)
    |
    V
#aten/src/ATen/native/BinaryOps.cpp
Tensor mul(const Tensor& self, const Tensor& other)
    |
    V
#aten/src/ATen/native/TensorIterator.cpp
TensorIterator::binary_op
    |
    V
#relies on the REGISTER_DISPATCH work done during initialization
#build/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.AVX2.cpp (if the CPU build of PyTorch is used and the CPU supports AVX2)
void mul_kernel(TensorIterator& iter)
    |
    V
#aten/src/ATen/native/cpu/Loops.h
void binary_kernel_vec(TensorIterator& iter, func_t op, vec_func_t vop)

With the CPU build of PyTorch, the multiplication is finally dispatched to the native CPU implementation.

3. Building the autograd information

When the multiplication is dispatched to VariableType::mul, PyTorch builds the autograd information inside this function.

Tensor VariableType::mul(const Tensor & self, const Tensor & other) const {
  std::shared_ptr<MulBackward0> grad_fn;
  grad_fn = std::shared_ptr<MulBackward0>(new MulBackward0(), deleteFunction);
  grad_fn->set_next_edges(collect_next_edges( self, other ));
  if (grad_fn->should_compute_output(1)) {
    grad_fn->self_ = SavedVariable(self, false);
  }
  if (grad_fn->should_compute_output(0)) {
    grad_fn->other_ = SavedVariable(other, false);
  }
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return baseType->mul(self_, other_);
  })();
  auto result = as_variable(tmp);
  set_history(flatten_tensor_args( result ), grad_fn);
  return result;
}

3.1 Building grad_fn (MulBackward0)

After the expression grad_fn = std::shared_ptr<MulBackward0>(new MulBackward0(), deleteFunction), a Function instance of type MulBackward0 (0x55ca7ebba2a0) exists.

struct MulBackward0 : public TraceableFunction {
  variable_list apply(variable_list&& grads) override;
  SavedVariable self_;
  SavedVariable other_;
};

Unlike AddBackward0, MulBackward0 also has two SavedVariable members. For example, when SavedVariable self_ is initialized from syszux, the following copies are made:

#copied from syszux into SavedVariable self_
variable.output_nr()       --> output_nr_
variable.requires_grad()   --> requires_grad_
variable.data()            --> data_
variable.grad_fn()         --> grad_fn_
variable.version_counter() --> version_counter_
version_counter_.current_version() --> saved_version_

In memory, the layout of this MulBackward0 instance is as follows:

#grad_fn MulBackward0, requires_grad == True
Function instance
    --> sequence_nr_ (uint64_t) = 1 (incremented per thread)
    --> next_edges_ (edge_list) --> std::vector<Edge> = None
    --> input_metadata_ --> [(type, shape, device)...] = None
    --> self_ (SavedVariable) = shallow copy of syszux
    --> other_ (SavedVariable) = another shallow copy of syszux
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses MulBackward0's apply
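One reason a SavedVariable records saved_version_ next to the version counter is that it allows a later check on whether the saved data was changed in place between the forward and backward passes. The Python sketch below illustrates that idea with toy classes. The class names, the add_ method and the error message are assumptions for this example and do not reproduce PyTorch's actual implementation or wording.

```python
class ToyTensor:
    """Hypothetical tensor with a version counter."""

    def __init__(self, data):
        self.data = data
        self.version = 0          # bumped by every in-place change

    def add_(self, value):
        # In-place update: new contents, same object, version + 1.
        self.data = [v + value for v in self.data]
        self.version += 1
        return self


class ToySavedVariable:
    """Keeps a reference to the tensor plus the version seen at save time."""

    def __init__(self, tensor):
        self.tensor = tensor                  # shallow: no data copy
        self.saved_version = tensor.version

    def unpack(self):
        if self.tensor.version != self.saved_version:
            raise RuntimeError(
                "saved variable was modified in place after being saved"
            )
        return self.tensor


syszux = ToyTensor([3.0, 3.0, 3.0, 3.0])
self_ = ToySavedVariable(syszux)
other_ = ToySavedVariable(syszux)     # second shallow copy of the same input
print(self_.unpack() is other_.unpack())
```

Both saved copies point at the same underlying object, which matches the two "shallow copy of syszux" entries in the layout above.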

3.2 Initializing the next_edges_ of MulBackward0

As mentioned above, when grad_fn calls collect_next_edges to collect the edges of the input arguments (Variable instances, here syszux), the non-leaf syszux triggers a call to Variable::grad_fn(), which returns the grad_fn of syszux, namely the AddBackward0 instance. Once the MulBackward0 instance has been initialized with the edges built from the grad_fn of the two syszux inputs, it looks like this in memory:

#grad_fn MulBackward0, requires_grad == True, 0x55ca7ebba2a0
Function instance
    --> sequence_nr_ (uint64_t) = 1 (incremented per thread)
    --> next_edges_ (edge_list) = [(AddBackward0 instance 0x55ca7f872e90, 0), (AddBackward0 instance 0x55ca7f872e90, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = None
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses MulBackward0's apply

After initialization, the next_edges_ of the MulBackward0 Function (0x55ca7ebba2a0) has been set to the edges built from the grad_fn of syszux.

4. Building the Variable (tmp)

Once the first multiplication completes, a temporary Variable tmp exists in memory. The freshly created tmp instance looks like this:

//tmp
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[9., 9.],[9., 9.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = None
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist

After that, set_history(flatten_tensor_args( result ), grad_fn) is used to set the gradient_edge of tmp. This function does two things:

1. The (type, shape, device) of the Variable tmp is appended to the input_metadata_ of the MulBackward0 instance. After appending, the index of the tmp entry in input_metadata_ is the output_nr_ used below; here it is 0 (the first element). The MulBackward0 instance then looks like this in memory:

//grad_fn->add_input_metadata(variable);
grad_fn_ --> input_metadata_ += (variable.type, variable.shape, variable.device) = []

#MulBackward0 instance, requires_grad == True
Function instance
    --> sequence_nr_ (uint64_t) = 1
    --> next_edges_ (edge_list) = [(AddBackward0 instance 0x55ca7f872e90, 0), (AddBackward0 instance 0x55ca7f872e90, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = [(CPUFloatType, [2, 2], cpu)]
    --> self_ (SavedVariable) = shallow copy of syszux
    --> other_ (SavedVariable) = another shallow copy of syszux
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses MulBackward0's apply

2. The grad_fn_ in the autograd_meta of the Variable tmp is set to the MulBackward0 instance above, and the output_nr_ in autograd_meta is set to the "index of the current Variable's entry in input_metadata_" mentioned above. The tmp instance then looks like this in memory:

//as_variable_ref(variable).set_gradient_edge({grad_fn, output_nr});
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[9., 9.],[9., 9.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = MulBackward0 instance 0x55ca7ebba2a0 (see above)
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist

5. The second multiplication

The tmp obtained from syszux * syszux still has to go through tmp * 3. After one addition (gemfield + 2) and one multiplication (syszux * syszux), the pattern should be clear. Every such operation goes through the following steps:

1. The call stack of the addition/multiplication is eventually dispatched to the implementation for some device. If an input of the operation is a scalar, a Variable is built from the scalar;
2. When the call is dispatched to VariableType, the autograd information is built along the way;
2.1. A backward function instance for the addition/multiplication is built (for example AddBackward0, MulBackward0);
2.2. The next_edges_ and other related members of the backward function instance are initialized. The values of next_edges_ come from the input arguments of the forward pass: if an input Variable is a leaf, the edge comes from its grad_accumulator_; if it is a non-leaf, the edge comes from its grad_fn_;
2.3. The Variable instance from step 3 is used to initialize the input_metadata_ of the backward function instance;
3. The operation yields a new Variable, built with Variable::Impl, whose grad_fn_ member is initialized with the backward function instance from step 2.
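The steps above can be sketched as a tiny graph builder in Python that replays the whole example and prints the resulting backward graph. It is a simplified model under explicit assumptions: Node, Var, apply_op and dump are invented names, step 2.3 (input_metadata_) is left out, and no numeric data is computed.

```python
class Node:
    """Toy backward function: a name plus edges to the nodes it feeds."""

    def __init__(self, name, next_edges):
        self.name = name
        self.next_edges = next_edges      # list of Node or None


class Var:
    """Toy Variable carrying only its autograd links."""

    def __init__(self, requires_grad=False, grad_fn=None):
        self.requires_grad = requires_grad
        self.grad_fn = grad_fn
        self.accumulator = None

    def edge(self):
        if self.grad_fn is not None:          # non-leaf: use grad_fn
            return self.grad_fn
        if self.requires_grad:                # leaf that needs a gradient
            if self.accumulator is None:
                self.accumulator = Node("AccumulateGrad", [])
            return self.accumulator
        return None                           # e.g. a wrapped scalar


def apply_op(backward_name, *inputs):
    grad_fn = Node(backward_name, [v.edge() for v in inputs])  # steps 2.1, 2.2
    return Var(grad_fn=grad_fn)                                # step 3


def dump(node, depth=0):
    """Return the graph as indented lines, starting from the given node."""
    if node is None:
        return ["  " * depth + "(none)"]
    lines = ["  " * depth + node.name]
    for nxt in node.next_edges:
        lines.extend(dump(nxt, depth + 1))
    return lines


gemfield = Var(requires_grad=True)
syszux = apply_op("AddBackward0", gemfield, Var())      # gemfield + 2
tmp = apply_op("MulBackward0", syszux, syszux)          # syszux * syszux
civilnet = apply_op("MulBackward0", tmp, Var())         # tmp * 3
gemfieldout = apply_op("MeanBackward0", civilnet)       # civilnet.mean()
print("\n".join(dump(gemfieldout.grad_fn)))
```

Reading the printed tree from the top shows the order in which a backward pass would have to visit the functions. The AddBackward0 node appears twice because both inputs of the first multiplication are syszux.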

For the operation civilnet = tmp * 3 (the second step of civilnet = syszux * syszux * 3), these steps are:

1. The dispatch of THPVariable_mul, which is not repeated here. Along the way, a Variable is built from the scalar 3;

2. When the call stack reaches VariableType::mul, another MulBackward0 instance (0x55ca7fada2f0) is built and its next_edges_ member is initialized:

#grad_fn MulBackward0, requires_grad == True, 0x55ca7fada2f0
Function instance
    --> sequence_nr_ (uint64_t) = 2 (incremented per thread)
    --> next_edges_ (edge_list) = [(MulBackward0 instance 0x55ca7ebba2a0, 0), (0, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = [(CPUFloatType, [2, 2], cpu)]
    --> self_ (SavedVariable) = shallow copy of tmp
    --> other_ (SavedVariable) = shallow copy of 3
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses MulBackward0's apply

Note that sequence_nr_ has been incremented by 1 again.

3. Building the Variable civilnet:

//as_variable_ref(variable).set_gradient_edge({grad_fn, output_nr});
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = [[27., 27.],[27., 27.]]
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = MulBackward0 instance 0x55ca7fada2f0 (see above)
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist
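The sequence_nr_ values seen so far (0 for AddBackward0, 1 and 2 for the two MulBackward0 instances, UINT64_MAX for AccumulateGrad) are consistent with a counter that is kept per thread and incremented every time a backward function is constructed. A small Python sketch of such a counter follows. The helper and class names are hypothetical, and threading.local is used here only as a stand-in for a thread-local variable.

```python
import threading

UINT64_MAX = 2**64 - 1
_local = threading.local()


def next_sequence_nr():
    """Return the current per-thread counter value, then increment it."""
    current = getattr(_local, "sequence_nr", 0)
    _local.sequence_nr = current + 1
    return current


class BackwardNode:
    """Toy backward function that only records its sequence number."""

    def __init__(self, name, sequence_nr=None):
        self.name = name
        # Ordinary nodes draw from the counter; the accumulator passes UINT64_MAX.
        if sequence_nr is None:
            sequence_nr = next_sequence_nr()
        self.sequence_nr = sequence_nr


nodes = [
    BackwardNode("AccumulateGrad", sequence_nr=UINT64_MAX),
    BackwardNode("AddBackward0"),
    BackwardNode("MulBackward0"),
    BackwardNode("MulBackward0"),
    BackwardNode("MeanBackward0"),
]
for node in nodes:
    print(node.name, node.sequence_nr)
```

Because the accumulator is given an explicit value, it does not consume a number from the counter, so the four operations of the example are numbered 0 to 3.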

gemfieldout = civilnet.mean()

The steps are much the same:

1. Python calls into the C++ function THPVariable_mean:

static PyObject * THPVariable_mean(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  ......
  return wrap(dispatch_mean(self));
}

The call stack is as follows:

#torch/csrc/autograd/generated/python_variable_methods.cpp
static PyObject * THPVariable_mean(PyObject* self_, PyObject* args, PyObject* kwargs)
    |
    V
#aten/src/ATen/core/TensorMethods.h
Tensor Tensor::mean()
    |
    V
#torch/csrc/autograd/generated/VariableType_2.cpp
Tensor VariableType::mean(const Tensor & self)
    |
    V
#build/aten/src/ATen/TypeDefault.cpp
Tensor TypeDefault::mean(const Tensor & self)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
Tensor mean(const Tensor &self)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
static inline Tensor mean(const Tensor &self, optional<ScalarType> dtype)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
static inline Tensor mean(const Tensor &self, IntArrayRef dim, bool keepdim, optional<ScalarType> dtype)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
Tensor &mean_out(Tensor &result, const Tensor &self, IntArrayRef dim, bool keepdim, optional<ScalarType> opt_dtype)
    |
    V
at::sum_out(result, self, dim, keepdim, dtype).div_(dim_prod);
    |
    V
#build/aten/src/ATen/TypeDefault.cpp
Tensor & TypeDefault::sum_out(Tensor & out, const Tensor & self, IntArrayRef dim, bool keepdim, ScalarType dtype)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
Tensor& sum_out(Tensor& result, const Tensor& self, IntArrayRef dim, bool keepdim, ScalarType dtype)
    |
    V
#aten/src/ATen/native/ReduceOps.cpp
Tensor& sum_out(Tensor& result, const Tensor& self, IntArrayRef dim, bool keepdim, optional<ScalarType> opt_dtype)
    |
    V
sum_stub, which relies on the REGISTER_DISPATCH work done during initialization

2. Building autograd

When the call stack reaches VariableType::mean, the autograd information is built along the way. The main task is to build the backward function instance of the op: a MeanBackward0 instance (0x55ca7eb358b0). The MeanBackward0 type is defined as follows:

struct MeanBackward0 : public TraceableFunction {
  variable_list apply(variable_list&& grads) override;
  std::vector<int64_t> self_sizes;
  int64_t self_numel = 0;
};

It adds the members self_sizes and self_numel. After construction, the MeanBackward0 instance looks like this in memory:

#grad_fn MeanBackward0, requires_grad == True, 0x55ca7eb358b0
Function instance
    --> sequence_nr_ (uint64_t) = 3 (incremented per thread)
    --> next_edges_ (edge_list) = [(MulBackward0 instance 0x55ca7fada2f0, 0)]
    --> input_metadata_ --> [(type, shape, device)...] = [(CPUFloatType, [], cpu)]
    --> self_sizes (std::vector<int64_t>) = (2, 2)
    --> self_numel = 4
    --> pre_hooks_ = None
    --> post_hooks_ = None
    --> anomaly_metadata_ = None
    --> apply() --> uses MeanBackward0's apply

Note that the value of sequence_nr_ has been incremented by 1 again, and that the shape in input_metadata_ is empty because the result of mean() is a scalar.

3. Building the Variable gemfieldout

The layout of gemfieldout in memory is as follows:

//as_variable_ref(variable).set_gradient_edge({grad_fn, output_nr});
Variable instance --> Variable::Impl instance
    --> tensor data --> TensorImpl instance --> Storage instance = (27,)
    --> autograd_meta
        --> grad_ (another Variable instance) = None
        --> grad_fn_ (Function instance) = MeanBackward0 instance 0x55ca7eb358b0 (see above)
        --> grad_accumulator_ (Function instance) = None
        --> version_counter_ = 0
        --> hooks_ len = 0
        --> requires_grad_ = False
        --> is_view_ = false
        --> output_nr_ = 0
        --> base_ = Not exist
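A plausible reading of why MeanBackward0 stores self_sizes and self_numel is that the gradient of a mean is the incoming gradient divided by the number of elements and spread over the input shape. The Python sketch below shows that arithmetic on flat lists. ToyMeanBackward is a hypothetical class written for this illustration, and the actual backward pass is the topic of the next article.

```python
def mean_forward(values):
    """Mean over all elements of a flat list."""
    return sum(values) / len(values)


class ToyMeanBackward:
    """Stores what the text lists: the input sizes and element count."""

    def __init__(self, self_sizes):
        self.self_sizes = tuple(self_sizes)
        numel = 1
        for size in self.self_sizes:
            numel *= size
        self.self_numel = numel

    def apply(self, grad_output):
        # Every input element contributes 1/numel to the mean, so the
        # scalar incoming gradient is spread evenly (returned flat here).
        return [grad_output / self.self_numel] * self.self_numel


civilnet = [27.0, 27.0, 27.0, 27.0]          # the 2x2 tensor, flattened
gemfieldout = mean_forward(civilnet)
grad_fn = ToyMeanBackward((2, 2))
print(gemfieldout, grad_fn.self_numel, grad_fn.apply(1.0))
```

With an incoming gradient of 1.0, each of the four elements of civilnet receives 0.25, which is the factor 1/4 that appears when the gradient of the whole example is worked out by hand.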

Summary

This article, "PyTorch's Tensor (Part 2)", covered the autograd part of Tensor, specifically how the autograd information is built and linked together during the forward pass:

1. The call stack of the op is eventually dispatched to the implementation for some device. If an input of the operation is a scalar, a Variable is built from the scalar;
2. When the call is dispatched to VariableType, the autograd information is built along the way;
2.1. A backward function instance for the op is built (for example AddBackward0, MulBackward0);
2.2. The next_edges_ and other related members of the backward function instance are initialized. The values of next_edges_ come from the input arguments of the forward pass: if an input Variable is a leaf, the edge comes from its grad_accumulator_; if it is a non-leaf, the edge comes from its grad_fn_;
2.3. The Variable instance from step 3 is used to initialize the input_metadata_ of the backward function instance;
3. The operation yields a new Variable, built with Variable::Impl, whose grad_fn_ member is initialized with the backward function instance from step 2.

The next article, "PyTorch's Tensor (Part 3)", will mainly cover how the autograd part of Tensor runs during backward. Unlike the Variable instances in this article, their grad_ members will then no longer be None.
