Variable-Sized Video Mini-Batching
The most important step towards training and testing an efficient machine learning model is gathering a lot of data and using it to train the model effectively. Mini-batches help here by letting the model train on a small subset of the data during each iteration. But with many machine learning tasks now being performed on video datasets, there is an inherent problem: how to batch videos of unequal length effectively. Most methods rely on trimming the videos to an equal length so that the same number of frames can be extracted and sent to the batch during an iteration. This is not particularly helpful in scenarios where we need information from every frame to predict something effectively, as in self-driving cars and action recognition. This was the motivation for creating a mini-batching scheme for video datasets that can deal with videos of different lengths.
I used the LoadStreams class from Glenn Jocher's PyTorch implementation of YOLOv3 (https://github.com/ultralytics/yolov3) as the base on which I modeled my LoadStreamsBatch class.
Class Initialization
In the __init__ function, I take in four parameters. img_size is the same as in the original version; the other three parameters are defined below:
sources: Takes a directory path or a text file as input.
batch_size: The size of the mini-batch required.
subdir_search: When enabled, all sub-directories are searched for relevant files if a directory is passed as the sources parameter.
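The four parameters could be laid out roughly as in the skeleton below. This is an illustrative sketch only, not the original code: the default values (416, 2, False) are assumptions, and the body does nothing except record the arguments.

```python
class LoadStreamsBatch:
    """Skeleton showing only the constructor parameters described above."""

    def __init__(self, sources, img_size=416, batch_size=2, subdir_search=False):
        # The default values are assumptions chosen for illustration.
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.sources = sources              # directory path or text file of video paths
        self.img_size = img_size            # target size, same meaning as in LoadStreams
        self.batch_size = batch_size        # number of videos read side by side
        self.subdir_search = subdir_search  # also search sub-directories if True


# Example call with a hypothetical directory name.
loader = LoadStreamsBatch("videos/", batch_size=4, subdir_search=True)
```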
I first check whether the sources parameter is a directory or a text file. If it is a directory, I read all of its contents (sub-directories are included when subdir_search is True); otherwise I read the video paths listed in the text file. The paths are stored in a list, and a pointer, cur_pos, keeps track of the current position in the list.
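The directory-or-text-file check can be sketched with the standard library alone. The function name collect_video_paths and the VIDEO_EXTS extension filter are assumptions made for this example; the original code may decide which files are relevant differently.

```python
import os

VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}  # assumed set of video extensions


def collect_video_paths(sources, subdir_search=False):
    """Return a list of video paths from a directory or a text file."""
    if os.path.isdir(sources):
        paths = []
        if subdir_search:
            # os.walk visits the directory and every sub-directory below it.
            for root, _dirs, files in os.walk(sources):
                paths.extend(os.path.join(root, name) for name in files)
        else:
            # Only regular files directly inside the directory.
            paths = [os.path.join(sources, name) for name in os.listdir(sources)
                     if os.path.isfile(os.path.join(sources, name))]
        # Keep only files that look like videos, in a stable order.
        return sorted(p for p in paths
                      if os.path.splitext(p)[1].lower() in VIDEO_EXTS)
    # Otherwise treat `sources` as a text file with one video path per line.
    with open(sources, "r") as f:
        return [line.strip() for line in f if line.strip()]
```

The returned list is what a cur_pos pointer would then index into, starting at 0.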
The list is iterated with batch_size as the maximum, with an additional check to skip faulty or non-existent videos. The frames are then sent to the letterbox function, which resizes the images, and all the components are stacked together; this part is unchanged from the original version unless all the videos are faulty or unavailable.
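One way to picture the "up to batch_size, skipping bad videos" loop is the sketch below. fill_batch and open_video are hypothetical names: open_video stands in for whatever opens a video and is assumed to return None when the file is missing or cannot be decoded.

```python
def fill_batch(paths, cur_pos, batch_size, open_video):
    """Open up to `batch_size` videos starting at paths[cur_pos].

    `open_video(path)` is a hypothetical helper that returns a reader object,
    or None when the file is missing or cannot be decoded.
    Returns (readers, new_cur_pos).
    """
    readers = []
    while len(readers) < batch_size and cur_pos < len(paths):
        reader = open_video(paths[cur_pos])
        cur_pos += 1          # advance even when the video is skipped
        if reader is None:
            continue          # faulty or non-existent video
        readers.append(reader)
    return readers, cur_pos
```

With OpenCV, open_video could wrap cv2.VideoCapture(path) and return None when isOpened() is False. If readers comes back empty, every remaining video was faulty or unavailable.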
Function to retrieve frames at regular intervals
The update function, which runs in a thread, has one small change: we additionally store the default image size. It is used when all the videos have been picked up for processing but one finishes before the others because of the unequal lengths. This will become clearer when I explain the next part of the code, the __next__ function.
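A simplified sketch of such a worker is shown here. It is not the original code: FrameSlots, default_shape and the lock are hypothetical names, and cap is assumed to behave like cv2.VideoCapture, whose read() returns an (ok, frame) pair. Unlike a real loader, it does not pace itself against the consumer.

```python
import threading


class FrameSlots:
    """Holds the latest frame per batch slot plus a fallback frame shape."""

    def __init__(self, batch_size):
        self.imgs = [None] * batch_size
        self.default_shape = None      # shape of the first decoded frame
        self.lock = threading.Lock()

    def update(self, index, cap):
        """Thread worker: keep reading frames until the video is exhausted."""
        while True:
            ok, frame = cap.read()
            with self.lock:
                if not ok:
                    self.imgs[index] = None   # None signals "video finished"
                    return
                if self.default_shape is None:
                    # Remember one frame shape so a blank frame of the
                    # same size can be built later.
                    self.default_shape = frame.shape
                self.imgs[index] = frame
```

Each slot would get its own thread, for example threading.Thread(target=slots.update, args=(i, cap), daemon=True).start().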
Iterator
After the frames for the initial run have been obtained, each video may run out of frames at a different iteration. If a frame is present, it is passed to the letterbox function as usual. If the frame is None, the video has been completely processed, and we check whether all the videos in the list have been processed. If there are more videos to process, the cur_pos pointer is used to obtain the location of the next available video. If there are no more videos to pick up from the list but some videos are still being processed, a blank frame is sent in place of the finished video, i.e. the batch is adjusted dynamically based on the frames remaining in the other batch components.
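The per-slot decision described above can be sketched as a small pure function. resolve_slots and make_blank are hypothetical names, and sending a blank frame for a slot that has just been handed a new video is an assumption of this sketch, not something the original code is known to do.

```python
def resolve_slots(frames, paths, cur_pos, make_blank):
    """Decide what each batch slot contributes to the next mini-batch.

    frames     : latest frame per slot, None when that slot's video has ended
    paths      : full list of video paths
    cur_pos    : index of the next video that has not been started yet
    make_blank : hypothetical callable returning a blank placeholder frame
    Returns (batch, restart, new_cur_pos), where `restart` maps a slot index
    to the path of the video it should load next, or None when all is done.
    """
    restart = {}
    for i, frame in enumerate(frames):
        if frame is None and cur_pos < len(paths):
            restart[i] = paths[cur_pos]   # reuse the finished slot
            cur_pos += 1
    if not restart and all(f is None for f in frames):
        return None                       # every video has been processed
    # Finished (or just restarted) slots are padded with a blank frame.
    batch = [f if f is not None else make_blank() for f in frames]
    return batch, restart, cur_pos
```

In a real __next__ method the None result would become a StopIteration, and make_blank could build an all-zero image using the default image size stored by the update function.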
Conclusion
Given how much time is spent on data collection and preprocessing, I believe this can help in a small way by reducing the time spent on making videos fit the model, so that we can concentrate instead on making the model fit the data.
I am attaching the complete source code here. Hope this helps!
Translated from: https://towardsdatascience.com/variable-sized-video-mini-batching-c4b1a47c043b