Using multiprocessing to reduce the memory usage of multiple files
This article shows how to use multiprocessing to reduce the memory usage of several DataFrames loaded from multiple files at once; it is shared here as a reference.
The code is as follows:
import pandas as pd


def memory_usage_mb(df, *args, **kwargs):
    """DataFrame memory usage in MB."""
    return df.memory_usage(*args, **kwargs).sum() / 1024**2


def reduce_memory_usage(df, deep=True, verbose=True, categories=True):
    # All types that we want to change for "lighter" ones.
    # int8 and float16 are not included because we cannot reduce
    # those data types any further.
    # float32 is not included because float16 has too low precision.
    numeric2reduce = ["int16", "int32", "int64", "float64"]
    start_mem = 0
    if verbose:
        start_mem = memory_usage_mb(df, deep=deep)

    for col, col_type in df.dtypes.items():  # .iteritems() was removed in pandas 2.0
        best_type = None
        if col_type == "object":
            df[col] = df[col].astype("category")
            best_type = "category"
        elif col_type in numeric2reduce:
            downcast = "integer" if "int" in str(col_type) else "float"
            df[col] = pd.to_numeric(df[col], downcast=downcast)
            best_type = df[col].dtype.name
        # Log the conversion performed.
        if verbose and best_type is not None and best_type != str(col_type):
            print(f"Column '{col}' converted from {col_type} to {best_type}")

    if verbose:
        end_mem = memory_usage_mb(df, deep=deep)
        diff_mem = start_mem - end_mem
        percent_mem = 100 * diff_mem / start_mem
        print(f"Memory usage decreased from"
              f" {start_mem:.2f}MB to {end_mem:.2f}MB"
              f" ({diff_mem:.2f}MB, {percent_mem:.2f}% reduction)")
    return df
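Before wiring this into multiprocessing, it can help to see the function on a single frame. The sketch below is not from the original post: it builds a small made-up DataFrame (the column names and sizes are arbitrary) and runs it through reduce_memory_usage to show the kind of dtype conversions and savings it reports.

import numpy as np
import pandas as pd

# Hypothetical toy data: an int64 column, a float64 column, and an object column.
toy = pd.DataFrame({
    "user_id": np.arange(100_000, dtype="int64"),
    "score": np.random.rand(100_000),                       # float64 by default
    "city": np.random.choice(["sh", "bj", "gz"], 100_000),  # object dtype
})

toy = reduce_memory_usage(toy)
# Typical output: user_id downcast to int32, score to float32, city to category,
# plus the before/after memory printed by the verbose branch.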
%%time
import multiprocessing

lists = [train, test]
with multiprocessing.Pool() as pool:
    # map passes each DataFrame in the list as the argument to reduce_memory_usage
    train, test = pool.map(reduce_memory_usage, lists)
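The cell above already has train and test fully loaded before they are handed to the pool. Since the title is about multiple files, a related variant (a sketch, not from the original post; the CSV paths and the load_and_reduce helper are assumptions) is to read each file and shrink it inside the worker, so each full-size frame only ever exists in its own process.

import multiprocessing
import pandas as pd

def load_and_reduce(path):
    """Read one CSV in a worker process and downcast it right away."""
    df = pd.read_csv(path)
    return reduce_memory_usage(df)  # the function defined above

if __name__ == "__main__":
    paths = ["train.csv", "test.csv"]  # hypothetical file names
    with multiprocessing.Pool(processes=len(paths)) as pool:
        train, test = pool.map(load_and_reduce, paths)

Compared with the original cell, the parent process never holds the full-size float64/int64 frames, only the downcast copies returned by pool.map.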
That is the full content on using multiprocessing to reduce the memory usage of multiple files; hopefully it helps you solve the problems you run into.