6. Using NumPy (Detailed Guide)
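NumPy: an efficient computation tool
    Advantages of NumPy
    ndarray attributes
    Basic operations
        ndarray.method()
        numpy.function_name()
    ndarray operations
        Logical operations
        Statistical operations
        Operations between arrays
    Merging, splitting, IO operations, data processing

3.1 Advantages of NumPy
    3.1.1 Introducing NumPy: a numerical computing library
        num - numerical
        py - Python
        ndarray
            n - any number of
            d - dimension
            array - array
    3.1.2 Introducing ndarray
    3.1.3 Speed comparison: ndarray vs. native Python list
    3.1.4 Advantages of ndarray
        1) Storage style
            ndarray - elements share one type - less general
            list - elements may have different types - more general
        2) Parallelized operations
            ndarray supports vectorized operations
        3) Underlying language
            NumPy is implemented in C, and many of its operations release the GIL

3.2 Understanding N-dimensional arrays: ndarray attributes
    3.2.1 ndarray attributes
        shape, ndim, size, dtype, itemsize
        If no type is specified when an ndarray is created, the defaults are
            integers: int64 (int32 on some platforms, such as Windows with older NumPy versions, which is what the output in section 3.2.1 shows)
            floats: float64
    3.2.2 ndarray shapes
        [1, 2, 3, 4]
        [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
        [[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]],
         [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]],
         [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]]
    3.2.3 ndarray types

3.3 Basic operations
    ndarray.method()
    np.function_name()
    np.array()
    3.3.1 Creating arrays
        1) Arrays of zeros and ones
            np.zeros(shape)
            np.ones(shape)
        2) From an existing array
            np.array(), np.copy(): copy the data (deep copy)
            np.asarray(): makes no copy when the input is already an ndarray of a matching dtype, so the result shares data with the original
        3) Arrays over a fixed range
            np.linspace(0, 10, 100): 100 evenly spaced values over the closed interval [0, 10]
            np.arange(a, b, c): like range(a, b, c), half-open interval [a, b), c is the step
        4) Random arrays
            Distribution: visualized with a histogram
            1) Uniform distribution: every bin is equally likely
            2) Normal distribution: σ describes the amplitude, fluctuation, concentration, stability, and dispersion
    3.3.2 Indexing and slicing
    3.3.3 Changing shape
        ndarray.reshape(shape): returns a new ndarray; the original is unchanged
        ndarray.resize(shape): returns nothing; modifies the original ndarray in place
        ndarray.T: transpose; rows become columns and columns become rows
    3.3.4 Changing type
        ndarray.astype(type)
        Serializing an ndarray to bytes: ndarray.tostring() (deprecated; use ndarray.tobytes())
    3.3.5 Removing duplicates
        set()

3.4 ndarray operations
    Logical operations
        Boolean indexing
        General test functions
            np.all(booleans): returns False if any element is False; returns True only when every element is True
            np.any(booleans): returns True if any element is True; returns False only when every element is False
        np.where (ternary operator)
            np.where(booleans, value where True, value where False)
    Statistical operations
        Statistical functions: min, max, mean, median, var, std
            np.function_name
            ndarray.method_name
        Positions of the maximum and minimum
            np.argmax(temp, axis=)
            np.argmin(temp, axis=)

3.5 Operations between arrays
    3.5.1 Scenario
    3.5.2 Operations between an array and a scalar
    3.5.3 Operations between arrays
    3.5.4 Broadcasting
    3.5.5 Matrix operations
        1 What is a matrix
            matrix: a two-dimensional array
            matrix vs. two-dimensional array
            Two ways to store a matrix
                1) ndarray, as a 2-D array
                    Matrix multiplication: np.matmul, np.dot
                2) the matrix data structure
        2 Matrix multiplication
            Shapes: (m, n) * (n, l) = (m, l)
            Rule: A (2, 3), B (3, 2), A * B = (2, 2)

3.6 Merging and splitting

3.7 IO operations and data processing
    3.7.1 Reading files with NumPy
    3.7.2 Handling missing values
        Two approaches:
            delete the samples that contain missing values
            replace/impute: compute the mean of each column and fill the gaps with it

3.1.2 Introducing ndarray

    import numpy as np

    # Create an ndarray
    score = np.array([[80, 89, 86, 67, 79],
                      [78, 97, 89, 67, 81],
                      [90, 94, 78, 67, 74],
                      [91, 91, 90, 67, 69],
                      [76, 87, 75, 67, 86],
                      [70, 79, 84, 67, 84],
                      [94, 92, 93, 67, 64],
                      [86, 85, 83, 67, 80]])
    score
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    type(score)
    numpy.ndarray

3.1.3 Speed comparison: ndarray vs. native Python list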
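    import random
    import time
    import numpy as np

    # Build a large Python list of random floats
    a = []
    for i in range(100000000):
        a.append(random.random())

    # Time the built-in sum over the list
    t1 = time.time()
    sum1 = sum(a)
    t2 = time.time()

    # Convert to an ndarray (conversion is not timed), then time np.sum
    b = np.array(a)
    t4 = time.time()
    sum3 = np.sum(b)
    t5 = time.time()

    print(t2 - t1, t5 - t4)
    5.195146083831787 0.23642754554748535

3.2.1 ndarray attributes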
    score = np.array([[80, 89, 86, 67, 79],
                      [78, 97, 89, 67, 81],
                      [90, 94, 78, 67, 74],
                      [91, 91, 90, 67, 69],
                      [76, 87, 75, 67, 86],
                      [70, 79, 84, 67, 84],
                      [94, 92, 93, 67, 64],
                      [86, 85, 83, 67, 80]])
    type(score)
    numpy.ndarray
    score.dtype      # type of the array elements
    dtype('int32')
    score.shape      # tuple of the array dimensions
    (8, 5)
    score.ndim       # number of dimensions
    2
    score.size       # number of elements in the array
    40
    score.itemsize   # size of one element in bytes
    4

3.2.2 ndarray shapes
    # Create arrays with different shapes
    a = np.array([[1, 2, 3], [4, 5, 6]])
    b = np.array([1, 2, 3, 4])
    c = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
    a
    array([[1, 2, 3],
           [4, 5, 6]])
    a.shape   # two-dimensional array
    (2, 3)
    b
    array([1, 2, 3, 4])
    b.shape   # one-dimensional array
    (4,)
    c
    array([[[1, 2, 3],
            [4, 5, 6]],
           [[1, 2, 3],
            [4, 5, 6]]])
    c.shape   # three-dimensional array
    (2, 2, 3)

3.2.3 ndarray types
    data = np.array([1.1, 2.2, 3.3])
    data.dtype
    dtype('float64')

Specifying the type when creating an array
    a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
    # Equivalent: a = np.array([[1, 2, 3], [4, 5, 6]], dtype='float32')
    a.dtype
    dtype('float32')
    # np.bytes_ replaces the older alias np.string_, which was removed in NumPy 2.0
    arr = np.array(['python', 'tensorflow', 'scikit-learn', 'numpy'], dtype=np.bytes_)
    arr
    array([b'python', b'tensorflow', b'scikit-learn', b'numpy'], dtype='|S12')

3.3 Basic operations
1. Arrays of zeros and ones

    zero = np.zeros([3, 4])
    zero
    array([[0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.]])
    zero = np.zeros((3, 4))   # a tuple works the same way as a list
    zero
    array([[0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.]])
    one = np.ones([3, 4])
    # Equivalent: one = np.ones((3, 4))
    one
    array([[1., 1., 1., 1.],
           [1., 1., 1., 1.],
           [1., 1., 1., 1.]])
    np.ones(shape=[3, 4], dtype=np.int32)
    array([[1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 1, 1]])

2. Creating arrays from an existing array
    score
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    data1 = np.array(score)     # copies the data (deep copy)
    data1
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    data2 = np.asarray(score)   # no copy: score is already an ndarray, so data2 shares its data and follows later changes
    data2
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    data3 = np.copy(score)      # copies the data (deep copy)
    data3
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    score[3, 1]
    91
    score[3, 1] = 100000
    data1   # unchanged
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])
    data2   # changes along with the original array
    array([[    80,     89,     86,     67,     79],
           [    78,     97,     89,     67,     81],
           [    90,     94,     78,     67,     74],
           [    91, 100000,     90,     67,     69],
           [    76,     87,     75,     67,     86],
           [    70,     79,     84,     67,     84],
           [    94,     92,     93,     67,     64],
           [    86,     85,     83,     67,     80]])
    data3   # unchanged
    array([[80, 89, 86, 67, 79],
           [78, 97, 89, 67, 81],
           [90, 94, 78, 67, 74],
           [91, 91, 90, 67, 69],
           [76, 87, 75, 67, 86],
           [70, 79, 84, 67, 84],
           [94, 92, 93, 67, 64],
           [86, 85, 83, 67, 80]])

3. Arrays over a fixed range
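    np.linspace(0, 10, 5)   # closed interval [0, 10]; the third argument is the number of evenly spaced values (5 here)
    array([ 0. ,  2.5,  5. ,  7.5, 10. ])
    for i in range(0, 10, 1):
        print(i)            # range(0, 10, 1): half-open interval [0, 10) with step 1
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    np.arange(0, 10, 1)     # half-open interval [0, 10) with step 1
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

4. Random arrays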
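    # Generate uniformly distributed random numbers
    x1 = np.random.uniform(-1, 1, 100000)   # uniform(low, high, size)
    x1
    array([ 0.55046079,  0.37804729, -0.89677218, ...,  0.35451722,
            0.34995045,  0.01961797])

    import matplotlib.pyplot as plt
    %matplotlib inline   # Jupyter magic; omit outside a notebook

    # 1. Create the canvas
    plt.figure(figsize=(20, 8), dpi=100)
    # 2. Draw the histogram with 1000 bins
    plt.hist(x1, 1000)
    # 3. Show the figure
    plt.show()

    # Generate normally distributed random numbers
    # (the standard normal distribution has mean 0 and variance 1)
    # loc is the mean, scale is the standard deviation
    data4 = np.random.normal(loc=1.75, scale=0.1, size=1000000)
    data4
    array([1.82548844, 1.91684274, 1.48534258, ..., 1.75064937, 1.8181808 ,
           1.81005547])

    # 1. Create the canvas
    plt.figure(figsize=(20, 8), dpi=100)
    # 2. Draw the histogram with 1000 bins
    plt.hist(data4, 1000)
    # 3. Show the figure
    plt.show()

Case study: randomly generating price-change data for 8 stocks over 2 weeks of trading days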
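How can we obtain price-change data for 8 stocks over two weeks (10 trading days)?
- Two weeks contain 2 * 5 = 10 trading days.
- The price changes are drawn at random from a normal distribution, for example one with mean 0 and variance 1.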
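The two points above can be sketched as follows. The call is the same `np.random.normal` call that appears in section 3.4.1 of this article; because no seed is set, the values differ on every run.

```python
import numpy as np

# 8 stocks (rows) x 10 trading days (columns),
# drawn from a normal distribution with mean 0 and standard deviation 1
stock_change = np.random.normal(loc=0, scale=1, size=(8, 10))

print(stock_change.shape)  # (8, 10)
```

Each row holds one stock and each column holds one trading day, which is the layout the later sections rely on.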
3.3.2 Array indexing and slicing
- Get the price changes of the first stock over its first 3 trading days
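A minimal sketch of that slice, assuming the 8 x 10 layout of the case study (rows are stocks, columns are trading days). The data here is freshly generated stand-in data, not the article's values.

```python
import numpy as np

# Stand-in data: 8 stocks x 10 trading days (random, hypothetical values)
stock_change = np.random.normal(loc=0, scale=1, size=(8, 10))

# Row 0 is the first stock; columns 0, 1, 2 are the first three trading days
first_three = stock_change[0, :3]

print(first_three.shape)  # (3,)
```

The general pattern for a 2-D array is `array[row_selector, column_selector]`, where each selector can be an index or a slice.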
How are one-, two-, and three-dimensional arrays indexed?

    a1 = np.array([[[1, 2, 3], [4, 5, 6]], [[12, 3, 34], [5, 6, 7]]])
    a1
    array([[[ 1,  2,  3],
            [ 4,  5,  6]],
           [[12,  3, 34],
            [ 5,  6,  7]]])
    a1.shape
    (2, 2, 3)
    a1[1, 0, 2]   # one index per dimension
    34
    a1[1, 0, 2] = 1000000
    a1
    array([[[      1,       2,       3],
            [      4,       5,       6]],
           [[     12,       3, 1000000],
            [      5,       6,       7]]])

3.3.3 Changing shape
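Requirement: swap the stock rows and date columns from the data above, so that dates become the rows and stocks become the columns.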
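Before the full-size output, a small sketch with a hypothetical 2 x 3 array (not the article's data) shows why the transpose meets this requirement and reshape does not: reshape keeps the elements in the same flat order and only re-cuts the rows, while `.T` swaps the axes.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])      # hypothetical 2 x 3 example

reshaped = m.reshape((3, 2))   # same flat order 1..6, re-cut into 3 rows
transposed = m.T               # rows and columns swapped

print(reshaped)    # rows: [1 2], [3 4], [5 6]
print(transposed)  # rows: [1 4], [2 5], [3 6]
```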
    stock_change.shape
    (8, 10)
    stock_change
    array([[-0.61330497,  0.55840141,  0.41709496,  1.27999683, -1.00183693,  1.19508749, -1.30481202, -0.32462183,  0.1629303 , -0.37215778],
           [-0.67655708, -0.24960482, -0.26775897, -1.54340984, -1.7202066 ,  1.38874363, -0.0149956 ,  0.66870059, -0.04502848,  0.63144735],
           [-0.28952395, -1.70484263,  0.61871199,  0.61306774,  0.22872944,  1.1493577 ,  2.48623902,  0.18940315, -0.44105589,  1.49241966],
           [ 0.33087272, -0.67879541, -0.6040623 , -1.20256264, -0.76551783,  1.31036346, -0.46289576, -0.44254887, -0.20934797,  0.13978528],
           [ 0.58783968, -2.67898464, -1.41139208,  1.07009707, -2.23082484,  0.69616862,  0.38991086, -1.10458314, -1.85230749, -1.59066425],
           [ 1.46959111, -0.91715307,  0.08142567,  2.86350894,  0.83436522, -2.01224295, -0.28835842, -1.28407105,  1.52191189, -0.09642856],
           [-0.82991129,  0.83983885, -1.10666366,  0.06332958,  0.42674457,  1.491716  , -0.81436095, -0.85603011,  0.72720565, -2.60215313],
           [ 0.42427358,  0.81760609,  2.48509044,  0.41373531, -0.5184894 ,  0.76798932,  0.01676593, -1.35196338,  1.216088  ,  0.39931822]])

    # reshape((10, 8)) returns a new ndarray object and leaves stock_change itself unchanged.
    # It only changes the shape: the elements keep their order and are re-cut into rows,
    # so rows and columns are NOT swapped.
    reshape_stock_change = stock_change.reshape((10, 8))
    reshape_stock_change.shape
    (10, 8)
    reshape_stock_change
    array([[-0.61330497,  0.55840141,  0.41709496,  1.27999683, -1.00183693,  1.19508749, -1.30481202, -0.32462183],
           [ 0.1629303 , -0.37215778, -0.67655708, -0.24960482, -0.26775897, -1.54340984, -1.7202066 ,  1.38874363],
           [-0.0149956 ,  0.66870059, -0.04502848,  0.63144735, -0.28952395, -1.70484263,  0.61871199,  0.61306774],
           [ 0.22872944,  1.1493577 ,  2.48623902,  0.18940315, -0.44105589,  1.49241966,  0.33087272, -0.67879541],
           [-0.6040623 , -1.20256264, -0.76551783,  1.31036346, -0.46289576, -0.44254887, -0.20934797,  0.13978528],
           [ 0.58783968, -2.67898464, -1.41139208,  1.07009707, -2.23082484,  0.69616862,  0.38991086, -1.10458314],
           [-1.85230749, -1.59066425,  1.46959111, -0.91715307,  0.08142567,  2.86350894,  0.83436522, -2.01224295],
           [-0.28835842, -1.28407105,  1.52191189, -0.09642856, -0.82991129,  0.83983885, -1.10666366,  0.06332958],
           [ 0.42674457,  1.491716  , -0.81436095, -0.85603011,  0.72720565, -2.60215313,  0.42427358,  0.81760609],
           [ 2.48509044,  0.41373531, -0.5184894 ,  0.76798932,  0.01676593, -1.35196338,  1.216088  ,  0.39931822]])

    # resize((10, 8)) returns nothing and modifies the original ndarray in place.
    # The effect on the layout is the same as reshape(): only the shape changes,
    # and rows and columns are not swapped.
    stock_change.resize((10, 8))
    stock_change
    (prints the same 10 x 8 array as reshape_stock_change above)
    stock_change.shape
    (10, 8)

    stock_change.T   # transpose: rows and columns are swapped
    array([[-0.61330497,  0.1629303 , -0.0149956 ,  0.22872944, -0.6040623 ,  0.58783968, -1.85230749, -0.28835842,  0.42674457,  2.48509044],
           [ 0.55840141, -0.37215778,  0.66870059,  1.1493577 , -1.20256264, -2.67898464, -1.59066425, -1.28407105,  1.491716  ,  0.41373531],
           [ 0.41709496, -0.67655708, -0.04502848,  2.48623902, -0.76551783, -1.41139208,  1.46959111,  1.52191189, -0.81436095, -0.5184894 ],
           [ 1.27999683, -0.24960482,  0.63144735,  0.18940315,  1.31036346,  1.07009707, -0.91715307, -0.09642856, -0.85603011,  0.76798932],
           [-1.00183693, -0.26775897, -0.28952395, -0.44105589, -0.46289576, -2.23082484,  0.08142567, -0.82991129,  0.72720565,  0.01676593],
           [ 1.19508749, -1.54340984, -1.70484263,  1.49241966, -0.44254887,  0.69616862,  2.86350894,  0.83983885, -2.60215313, -1.35196338],
           [-1.30481202, -1.7202066 ,  0.61871199,  0.33087272, -0.20934797,  0.38991086,  0.83436522, -1.10666366,  0.42427358,  1.216088  ],
           [-0.32462183,  1.38874363,  0.61306774, -0.67879541,  0.13978528, -1.10458314, -2.01224295,  0.06332958,  0.81760609,  0.39931822]])
    stock_change.T.shape
    (8, 10)

3.3.4 Changing type
    stock_change.astype(np.int32)   # returns a converted copy; fractional parts are truncated toward zero
    array([[ 0,  0,  0,  1, -1,  1, -1,  0],
           [ 0,  0,  0,  0,  0, -1, -1,  1],
           [ 0,  0,  0,  0,  0, -1,  0,  0],
           [ 0,  1,  2,  0,  0,  1,  0,  0],
           [ 0, -1,  0,  1,  0,  0,  0,  0],
           [ 0, -2, -1,  1, -2,  0,  0, -1],
           [-1, -1,  1,  0,  0,  2,  0, -2],
           [ 0, -1,  1,  0,  0,  0, -1,  0],
           [ 0,  1,  0,  0,  0, -2,  0,  0],
           [ 2,  0,  0,  0,  0, -1,  1,  0]])
    type(stock_change)
    numpy.ndarray
    # Serialize the array to bytes.
    # The original notebook called stock_change.tostring(); that method is deprecated
    # and was removed in NumPy 2.0, and tobytes() returns the same result.
    stock_change.tobytes()
    b'\x9a\xa38\xc11\xa0\xe3\xbf\x10\xa0\t\xa3l\xde\xe1?9\xfaO\x11\xaf\xb1\xda?   (output truncated; 640 bytes in total)

3.3.5 Removing duplicates
    temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])
    temp
    array([[1, 2, 3, 4],
           [3, 4, 5, 6]])
    np.unique(temp)          # sorted unique values
    array([1, 2, 3, 4, 5, 6])
    temp.flatten()           # flatten to a one-dimensional array
    array([1, 2, 3, 4, 3, 4, 5, 6])
    type(temp.flatten())
    numpy.ndarray
    set(temp.flatten())      # then remove duplicates with set
    {1, 2, 3, 4, 5, 6}

3.4 ndarray operations
3.4.1 Logical operations
    stock_change = np.random.normal(loc=0, scale=1, size=(8, 10))
    stock_change
    array([[-1.28396641, -2.01191074, -0.18834465,  2.42922844, -0.70687122,  0.58481125,  0.55148057,  1.28943409, -1.44445438,  0.87934969],
           [ 0.12013781, -1.43581686, -0.63207426,  1.63806518,  1.17037384, -0.44528328,  1.23718753, -1.08925098, -0.26050859, -0.69753153],
           [-2.36635008, -2.62254681,  0.22101136,  0.81108448, -0.66006311, -0.15948853,  1.58475241, -0.81268957, -1.45337789, -0.06213791],
           [ 0.45162183,  0.55933576, -0.065766  , -0.40962168,  2.08206249, -0.84223895, -0.57720066,  1.79367669, -0.97694251, -0.33250153],
           [ 0.60649904, -0.59661935, -0.90621156,  1.79910292, -1.20565147,  0.08852257, -0.99133308,  0.96236294, -0.9192948 , -0.03587398],
           [ 0.43325825,  0.48811556,  1.12822497, -1.27967886,  0.7919012 , -0.38423972,  0.72962012,  1.74817488,  1.56455728, -1.72640669],
           [-0.38688515,  0.40048111,  2.51085027, -0.61192208,  0.70982823, -0.14795647,  0.30593344, -0.06915128, -1.34996629, -1.08573709],
           [-0.04277865,  0.60692697,  0.90975811, -0.5889982 ,  0.25598235, -0.88764388,  0.10974295,  0.45449013, -1.03761231, -2.7914244 ]])

    # Logical test: mark a price change as True if it is greater than 0.5, otherwise False
    stock_change > 0.5
    array([[False, False, False,  True, False,  True,  True,  True, False,  True],
           [False, False, False,  True,  True, False,  True, False, False, False],
           [False, False, False,  True, False, False,  True, False, False, False],
           [False,  True, False, False,  True, False, False,  True, False, False],
           [ True, False, False,  True, False, False, False,  True, False, False],
           [False, False,  True, False,  True, False,  True,  True,  True, False],
           [False, False,  True, False,  True, False, False, False, False, False],
           [False,  True,  True, False, False, False, False, False, False, False]])

    stock_change[stock_change > 0.5]   # boolean indexing: select the elements where the mask is True
    array([2.42922844, 0.58481125, 0.55148057, 1.28943409, 0.87934969,
           1.63806518, 1.17037384, 1.23718753, 0.81108448, 1.58475241,
           0.55933576, 2.08206249, 1.79367669, 0.60649904, 1.79910292,
           0.96236294, 1.12822497, 0.7919012 , 0.72962012, 1.74817488,
           1.56455728, 2.51085027, 0.70982823, 0.60692697, 0.90975811])

    stock_change[stock_change > 0.5] = 1.1   # assign to every selected element in place
    stock_change
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       , -0.70687122,  1.1       ,  1.1       ,  1.1       , -1.44445438,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ,  1.1       , -0.44528328,  1.1       , -1.08925098, -0.26050859, -0.69753153],
           [-2.36635008, -2.62254681,  0.22101136,  1.1       , -0.66006311, -0.15948853,  1.1       , -0.81268957, -1.45337789, -0.06213791],
           [ 0.45162183,  1.1       , -0.065766  , -0.40962168,  1.1       , -0.84223895, -0.57720066,  1.1       , -0.97694251, -0.33250153],
           [ 1.1       , -0.59661935, -0.90621156,  1.1       , -1.20565147,  0.08852257, -0.99133308,  1.1       , -0.9192948 , -0.03587398],
           [ 0.43325825,  0.48811556,  1.1       , -1.27967886,  1.1       , -0.38423972,  1.1       ,  1.1       ,  1.1       , -1.72640669],
           [-0.38688515,  0.40048111,  1.1       , -0.61192208,  1.1       , -0.14795647,  0.30593344, -0.06915128, -1.34996629, -1.08573709],
           [-0.04277865,  1.1       ,  1.1       , -0.5889982 ,  0.25598235, -0.88764388,  0.10974295,  0.45449013, -1.03761231, -2.7914244 ]])

3.4.2 General test functions
    stock_change[0:2, 0:5]
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       , -0.70687122],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ,  1.1       ]])

    # Did every value in stock_change[0:2, 0:5] go up?
    # np.all returns False as soon as one element is False; it returns True only if all are True
    np.all(stock_change[0:2, 0:5] > 0)
    False

    stock_change[0:5, :]
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       , -0.70687122,  1.1       ,  1.1       ,  1.1       , -1.44445438,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ,  1.1       , -0.44528328,  1.1       , -1.08925098, -0.26050859, -0.69753153],
           [-2.36635008, -2.62254681,  0.22101136,  1.1       , -0.66006311, -0.15948853,  1.1       , -0.81268957, -1.45337789, -0.06213791],
           [ 0.45162183,  1.1       , -0.065766  , -0.40962168,  1.1       , -0.84223895, -0.57720066,  1.1       , -0.97694251, -0.33250153],
           [ 1.1       , -0.59661935, -0.90621156,  1.1       , -1.20565147,  0.08852257, -0.99133308,  1.1       , -0.9192948 , -0.03587398]])

    # Did any of the first 5 stocks go up at any point in this period?
    # np.any returns True as soon as one element is True; it returns False only if all are False
    np.any(stock_change[0:5, :] > 0)
    True

3.4.3 np.where (ternary operator)
    stock_change[:4, :4]
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ],
           [-2.36635008, -2.62254681,  0.22101136,  1.1       ],
           [ 0.45162183,  1.1       , -0.065766  , -0.40962168]])

    # For the first four stocks over the first four days:
    # set the value to 1 where the price change is greater than 0, otherwise 0
    temp = stock_change[:4, :4]
    np.where(temp > 0, 1, 0)
    array([[0, 0, 0, 1],
           [1, 0, 0, 1],
           [0, 0, 1, 1],
           [1, 1, 0, 0]])
    temp   # np.where returns a new array; temp itself is unchanged
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ],
           [-2.36635008, -2.62254681,  0.22101136,  1.1       ],
           [ 0.45162183,  1.1       , -0.065766  , -0.40962168]])

    # Set the value to 1 where the price change is greater than 0.5 AND less than 1, otherwise 0
    np.logical_and(temp > 0.5, temp < 1)
    array([[False, False, False, False],
           [False, False, False, False],
           [False, False, False, False],
           [False, False, False, False]])
    np.where(np.logical_and(temp > 0.5, temp < 1), 1, 0)
    array([[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]])

    # Set the value to 1 where the price change is greater than 0.5 OR less than -0.5, otherwise 0
    np.logical_or(temp > 0.5, temp < -0.5)
    array([[ True,  True, False,  True],
           [False,  True,  True,  True],
           [ True,  True, False,  True],
           [False,  True, False, False]])
    np.where(np.logical_or(temp > 0.5, temp < -0.5), 1, 0)
    array([[1, 1, 0, 1],
           [0, 1, 1, 1],
           [1, 1, 0, 1],
           [0, 1, 0, 0]])

3.4.4 Statistical operations
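2. Statistical operations on stock price changes
When computing statistics, the axis argument selects the dimension that is collapsed. For a two-dimensional array, axis=0 aggregates down the rows and returns one result per column, while axis=1 aggregates across the columns and returns one result per row. In this example each row is a stock, so axis=1 produces per-stock statistics.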
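A small sketch of the axis rule on a hypothetical 2 x 3 array (`t`, `col_max`, and `row_max` are illustrative names, not from the article):

```python
import numpy as np

t = np.array([[1, 2, 3],
              [4, 5, 6]])   # hypothetical 2 x 3 example

col_max = t.max(axis=0)  # collapse the rows: one value per column
row_max = t.max(axis=1)  # collapse the columns: one value per row

print(col_max)  # [4 5 6]
print(row_max)  # [3 6]
```

The result always loses the axis that was passed in: a (2, 3) array gives shape (3,) for axis=0 and shape (2,) for axis=1.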
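    temp
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ],
           [-2.36635008, -2.62254681,  0.22101136,  1.1       ],
           [ 0.45162183,  1.1       , -0.065766  , -0.40962168]])
    temp.max()
    1.1
    np.max(temp)
    1.1

    # Next, compute some statistics on the 4 days of data for these 4 stocks
    # axis=1: one result per row, that is, per stock
    print("Largest gain of the first four stocks over the first four days: {}".format(np.max(temp, axis=1)))
    Largest gain of the first four stocks over the first four days: [1.1 1.1 1.1 1.1]

    # Using min, std, and mean
    print("Largest drop of the first four stocks over the first four days: {}".format(np.min(temp, axis=1)))
    Largest drop of the first four stocks over the first four days: [-2.01191074 -1.43581686 -2.62254681 -0.40962168]
    print("Volatility of the first four stocks over the first four days: {}".format(np.std(temp, axis=1)))
    Volatility of the first four stocks over the first four days: [1.17480848 0.93619571 1.61034658 0.56932139]
    print("Mean price change of the first four stocks over the first four days: {}".format(np.mean(temp, axis=1)))
    Mean price change of the first four stocks over the first four days: [-0.59605545 -0.21193833 -0.91697138  0.26905854]

Positions of the maximum and minimum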
- np.argmax(temp,axis=)
- np.argmin(temp,axis=)
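The article lists these two functions without running them. As a sketch, here they are applied to the `temp` values printed in the statistics example above, with axis=1 so that each result refers to one stock:

```python
import numpy as np

# The 4 x 4 temp array from the statistics example above
temp = np.array([[-1.28396641, -2.01191074, -0.18834465,  1.1],
                 [ 0.12013781, -1.43581686, -0.63207426,  1.1],
                 [-2.36635008, -2.62254681,  0.22101136,  1.1],
                 [ 0.45162183,  1.1,        -0.065766,   -0.40962168]])

# Column index (day) of the largest value in each row (stock)
day_of_max = np.argmax(temp, axis=1)
# Column index (day) of the smallest value in each row (stock)
day_of_min = np.argmin(temp, axis=1)

print(day_of_max)  # [3 3 3 1]
print(day_of_min)  # [1 1 1 3]
```

Indices are zero-based, so a result of 3 means the fourth day.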
3.5.2 Operations between an array and a scalar

    arr = np.array([[1, 2, 3, 2, 1, 4], [5, 6, 1, 2, 3, 111]])
    arr
    array([[  1,   2,   3,   2,   1,   4],
           [  5,   6,   1,   2,   3, 111]])
    arr + 10   # the scalar is applied to every element
    array([[ 11,  12,  13,  12,  11,  14],
           [ 15,  16,  11,  12,  13, 121]])
    arr * 10
    array([[  10,   20,   30,   20,   10,   40],
           [  50,   60,   10,   20,   30, 1110]])

3.5.3 Operations between arrays
    arr1 = np.array([[1, 2, 3, 2, 1, 4], [5, 6, 1, 2, 3, 1]])
    arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])
    arr1
    array([[1, 2, 3, 2, 1, 4],
           [5, 6, 1, 2, 3, 1]])
    arr2
    array([[1, 2, 3, 4],
           [3, 4, 5, 6]])
    arr1 + arr2
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-93-d972d21b639e> in <module>
    ----> 1 arr1 + arr2

    ValueError: operands could not be broadcast together with shapes (2,6) (2,4)

Broadcasting decides whether two arrays can be combined. Compare the two shapes dimension by dimension, starting from the last one; every pair of dimensions must satisfy one of the following:
- the two dimensions are equal, or
- one of the two dimensions is 1.
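A minimal sketch of both outcomes. The array `a` reuses the values of `arr1` above; `b` and the `np.ones((2, 4))` operand are hypothetical arrays chosen to illustrate the rule.

```python
import numpy as np

a = np.array([[1, 2, 3, 2, 1, 4],
              [5, 6, 1, 2, 3, 1]])   # shape (2, 6)
b = np.array([[1], [3]])             # shape (2, 1), hypothetical

# Last dimensions: 6 vs 1 (one of them is 1); first dimensions: 2 vs 2 (equal)
result = a + b
print(result.shape)  # (2, 6)

# Shapes (2, 6) and (2, 4): the last dimensions differ and neither is 1
try:
    a + np.ones((2, 4))
    compatible = True
except ValueError as err:
    compatible = False
    print("cannot broadcast:", err)
```

In the successful case the single column of `b` is stretched across all six columns of `a`, so 1 is added to the first row and 3 to the second.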
3.5.5 Matrix operations

    # Storing a matrix as an ndarray
    a = np.array([[80, 86], [82, 80], [85, 78], [90, 90], [86, 82], [82, 98], [78, 80], [92, 94]])
    a
    array([[80, 86],
           [82, 80],
           [85, 78],
           [90, 90],
           [86, 82],
           [82, 98],
           [78, 80],
           [92, 94]])
    b = np.array([[0.3], [0.7]])
    b
    array([[0.3],
           [0.7]])

    # Storing a matrix with the matrix type
    # (the original notebook used np.mat, an alias of np.asmatrix that NumPy 2.0 removed)
    a_mat = np.asmatrix([[80, 86], [82, 80], [85, 78], [90, 90], [86, 82], [82, 98], [78, 80], [92, 94]])
    a_mat
    matrix([[80, 86],
            [82, 80],
            [85, 78],
            [90, 90],
            [86, 82],
            [82, 98],
            [78, 80],
            [92, 94]])
    type(a_mat)
    numpy.matrix
    b_mat = np.asmatrix([[0.3], [0.7]])
    b_mat
    matrix([[0.3],
            [0.7]])
    a_mat * b_mat   # for the matrix type, * is matrix multiplication: (8, 2) * (2, 1) = (8, 1)
    matrix([[84.2],
            [80.6],
            [80.1],
            [90. ],
            [83.2],
            [93.2],
            [79.4],
            [93.4]])

    type(a)
    numpy.ndarray
    np.matmul(a, b)   # matrix product of two ndarrays
    array([[84.2],
           [80.6],
           [80.1],
           [90. ],
           [83.2],
           [93.2],
           [79.4],
           [93.4]])
    np.dot(a, b)      # np.dot also computes the matrix product of two 2-D ndarrays
    array([[84.2],
           [80.6],
           [80.1],
           [90. ],
           [83.2],
           [93.2],
           [79.4],
           [93.4]])
    a @ b             # the @ operator is equivalent to np.matmul
    array([[84.2],
           [80.6],
           [80.1],
           [90. ],
           [83.2],
           [93.2],
           [79.4],
           [93.4]])

3.6 Merging and splitting
    a = np.array((1, 2, 3))
    a
    array([1, 2, 3])
    b = np.array((2, 3, 4))
    b
    array([2, 3, 4])

3.6.1 Merging
    np.hstack((a, b))   # horizontal concatenation
    array([1, 2, 3, 2, 3, 4])
    a = np.array([1, 2, 3])
    a
    array([1, 2, 3])
    a1 = np.array([[1], [2], [3]])
    a1
    array([[1],
           [2],
           [3]])
    b1 = np.array([[2], [3], [4]])
    b1
    array([[2],
           [3],
           [4]])
    np.hstack((a1, b1))
    array([[1, 2],
           [2, 3],
           [3, 4]])
    np.vstack((a, b))   # vertical concatenation
    array([[1, 2, 3],
           [2, 3, 4]])

    a = np.array([[1, 2], [3, 4]])
    a
    array([[1, 2],
           [3, 4]])
    b = np.array([[5, 6]])
    b
    array([[5, 6]])
    np.concatenate((a, b), axis=0)     # axis=0: vertical concatenation
    array([[1, 2],
           [3, 4],
           [5, 6]])
    b.T
    array([[5],
           [6]])
    a
    array([[1, 2],
           [3, 4]])
    np.concatenate((a, b.T), axis=1)   # axis=1: horizontal concatenation
    array([[1, 2, 5],
           [3, 4, 6]])

3.6.2 Splitting
    x = np.arange(9.0)
    x
    array([0., 1., 2., 3., 4., 5., 6., 7., 8.])
    np.split(x, 3)        # an integer: split into 3 equal parts
    [array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7., 8.])]
    np.split(x, [3, 6])   # a list: split before indices 3 and 6
    [array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7., 8.])]

3.7 IO operations and data processing
3.7.1 Reading files with NumPy
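    # delimiter="," means the fields are separated by commas.
    # dtype='U75' reads every field as a string of up to 75 characters;
    # without dtype, fields that cannot be parsed as floats are read as nan.
    data = np.genfromtxt("test.csv", delimiter=",", dtype='U75')
    data
    array([['id', 'value1.value2', 'value3', ''],
           ['1', '123', '1.4', '23'],
           ['2', '110', '', '18'],
           ['3', '', '2.1', '19']], dtype='<U75')

3.7.2 Handling missing values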
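There are two common approaches: drop the samples that contain missing values, or impute them, for example with the column mean. Both can be sketched compactly as below. This is an illustrative alternative to the loop-based function in this section, not the article's own code: the array `t` is a hypothetical copy of the three data rows of test.csv without the header row, and `np.nanmean` replaces the manual sum and count.

```python
import numpy as np

# Hypothetical data mirroring the data rows of test.csv (header row left out)
t = np.array([[1.0, 123.0, 1.4, 23.0],
              [2.0, 110.0, np.nan, 18.0],
              [3.0, np.nan, 2.1, 19.0]])

# Approach 1: drop every row that contains at least one NaN
dropped = t[~np.isnan(t).any(axis=1)]

# Approach 2: replace each NaN with the mean of its column (NaNs are ignored)
col_mean = np.nanmean(t, axis=0)
filled = np.where(np.isnan(t), col_mean, t)

print(dropped)
print(filled)
```

Dropping keeps only the first row here, which shows how quickly that approach discards data when samples are few; imputation keeps all three rows.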
    data = np.genfromtxt("test.csv", delimiter=",")
    data   # the header row and the empty fields are read as nan
    array([[  nan,   nan,   nan,   nan],
           [  1. , 123. ,   1.4,  23. ],
           [  2. , 110. ,   nan,  18. ],
           [  3. ,   nan,   2.1,  19. ]])
    data[2, 2]
    nan
    type(data[2, 2])
    numpy.float64

    def fill_nan_by_column_mean(t):
        # Visit each column in turn
        for i in range(t.shape[1]):
            # Count the NaNs in this column (NaN is the only value for which x != x)
            nan_num = np.count_nonzero(t[:, i][t[:, i] != t[:, i]])
            if nan_num > 0:
                now_col = t[:, i]
                # Sum of the non-NaN values
                now_col_not_nan = now_col[~np.isnan(now_col)].sum()
                # Mean = sum / number of non-NaN values
                now_col_mean = now_col_not_nan / (t.shape[0] - nan_num)
                # Replace the NaNs in the column with the mean
                now_col[np.isnan(now_col)] = now_col_mean
                # Write the column back into t, updating the current column
                t[:, i] = now_col
        return t

    data
    array([[  nan,   nan,   nan,   nan],
           [  1. , 123. ,   1.4,  23. ],
           [  2. , 110. ,   nan,  18. ],
           [  3. ,   nan,   2.1,  19. ]])
    fill_nan_by_column_mean(data)   # modifies data in place; the all-nan header row is filled as well
    array([[  2.  , 116.5 ,   1.75,  20.  ],
           [  1.  , 123.  ,   1.4 ,  23.  ],
           [  2.  , 110.  ,   1.75,  18.  ],
           [  3.  , 116.5 ,   2.1 ,  19.  ]])

    data[0, 0] = np.nan
    # np.count_nonzero counts the non-zero elements of an array; True counts as non-zero
    nan_num = np.count_nonzero(data[:, 0][data[:, 0] != data[:, 0]])
    nan_num
    1
    data[:, 0]
    array([nan,  1.,  2.,  3.])
    data[:, 0] != data[:, 0]
    array([ True, False, False, False])
    np.nan != np.nan   # NaN ("not a number") compares unequal to everything, including itself
    True

    # Here a apparently holds the first two rows and first four columns of the earlier stock_change
    a
    array([[-1.28396641, -2.01191074, -0.18834465,  1.1       ],
           [ 0.12013781, -1.43581686, -0.63207426,  1.1       ]])
    a.shape
    (2, 4)
    a.reshape(-1, 2)   # -1 asks NumPy to compute that dimension automatically
    array([[-1.28396641, -2.01191074],
           [-0.18834465,  1.1       ],
           [ 0.12013781, -1.43581686],
           [-0.63207426,  1.1       ]])

3.8 Summary