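Some study notes on the Linux SCSI subsystem
These are memos from my recent reading of the SCSI handling code. They are fragmentary and meant for reference only.
Let's start with the most visible output, that of lsscsi: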
[0:0:0:0]   disk    ATA      INTEL SSDSC2BX20 0150  -
[0:0:1:0]   disk    ATA      INTEL SSDSC2BX20 0150  -
[0:1:0:0]   disk    LSI      Logical Volume   3000  /dev/sda
[5:0:0:0]   enclosu AIC 12G  4U60: Hub        0c29  -
[5:0:1:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sdb
[5:0:2:0]   disk    SEAGATE  ST4000NM0025     N004  /dev/sdc
[5:0:3:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sdd
[5:0:4:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sde
[5:0:5:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sdf
[5:0:6:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sdg
[5:0:7:0]   disk    SEAGATE  ST4000NM0025     N004  /dev/sdh
[5:0:8:0]   disk    SEAGATE  ST4000NM0025     N003  /dev/sdi
[5:0:9:0]   disk    SEAGATE  ST4000NM0025     N004  /dev/sdj
[5:0:10:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdk
[5:0:11:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdl
[5:0:12:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdm
[5:0:13:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdn
[5:0:14:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdo
[5:0:15:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdp
[5:0:16:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdq
[5:0:17:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdr
[5:0:18:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sds
[5:0:19:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdt
[5:0:20:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdu
[5:0:21:0]  enclosu AIC 12G  4U60: Edge-C     0c2a  -
[5:0:22:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdv
[5:0:23:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdw
[5:0:24:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdx
[5:0:25:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdy
[5:0:26:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdz
[5:0:27:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdaa
[5:0:28:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdab
[5:0:29:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdac
[5:0:30:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdad
[5:0:31:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdae
[5:0:32:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdaf
[5:0:33:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdag
[5:0:34:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdah
[5:0:35:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdai
[5:0:36:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdaj
[5:0:37:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdak
[5:0:38:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdal
[5:0:39:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdam
[5:0:40:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdan
[5:0:41:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdao
[5:0:42:0]  enclosu AIC 12G  4U60: Edge-R     0c2a  -
[5:0:43:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdap
[5:0:44:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdaq
[5:0:45:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdar
[5:0:46:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdas
[5:0:47:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdat
[5:0:48:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdau
[5:0:49:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdav
[5:0:50:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sdaw
[5:0:51:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdax
[5:0:52:0]  disk    SEAGATE  ST4000NM0025     N004  /dev/sday
[5:0:53:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdaz
[5:0:54:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdba
[5:0:55:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbb
[5:0:56:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbc
[5:0:57:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbd
[5:0:58:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbe
[5:0:59:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbf
[5:0:60:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbg
[5:0:61:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbh
[5:0:62:0]  disk    SEAGATE  ST4000NM0025     N003  /dev/sdbi
[5:0:63:0]  enclosu AIC 12G  4U60: Edge-L     0c2a  -
[6:0:0:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbj
[6:0:1:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbk
[6:0:2:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbl
[6:0:3:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbm
[6:0:4:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbn
[6:0:5:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbo
[6:0:6:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbp
[6:0:7:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbq
[7:0:0:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbr
[7:0:1:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbs
[7:0:2:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbt
[7:0:3:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbu
[7:0:4:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbv
[7:0:5:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbw
[7:0:6:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdbx
[7:0:7:0]   disk    HGST     SDLL1DLR960GCAA1 W150  /dev/sdby
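What are the numbers in the first column? How do the individual numbers relate to one another? How does the kernel abstract the SCSI layer? What is the abstraction for a SCSI command?
What happens when an issued SCSI command hits an error, or times out? What does the normal completion path look like? Let's read the code with these questions in mind.
What are the numbers in the first column?
The first column printed by lsscsi is the multi-level numbering the kernel assigns to each SCSI device; together the numbers identify a device uniquely.
Looking at it through cat /proc/scsi/scsi makes it a little easier to understand: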
cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 01 Id: 00 Lun: 00 Vendor: LSI Model: Logical Volume Rev: 3000 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: ATA Model: INTEL SSDSC2BX20 Rev: 0150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi0 Channel: 00 Id: 01 Lun: 00 Vendor: ATA Model: INTEL SSDSC2BX20 Rev: 0150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 00 Lun: 00 Vendor: AIC 12G Model: 4U60: Hub Rev: 0c29 Type: Enclosure ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 02 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 03 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 04 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 05 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 06 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 07 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 08 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 09 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 10 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 11 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 12 Lun: 00 Vendor: 
SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 13 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 14 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 15 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 16 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 17 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 18 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 19 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 20 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 21 Lun: 00 Vendor: AIC 12G Model: 4U60: Edge-C Rev: 0c2a Type: Enclosure ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 22 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 23 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 24 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 25 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 26 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 27 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 
Host: scsi5 Channel: 00 Id: 28 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 29 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 30 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 31 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 32 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 33 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 34 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 35 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 36 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 37 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 38 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 39 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 40 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 41 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 42 Lun: 00 Vendor: AIC 12G Model: 4U60: Edge-R Rev: 0c2a Type: Enclosure ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 43 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: 
N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 44 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 45 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 46 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 47 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 48 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 49 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 50 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 51 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 52 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N004 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 53 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 54 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 55 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 56 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 57 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 58 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 59 
Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 60 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 61 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 62 Lun: 00 Vendor: SEAGATE Model: ST4000NM0025 Rev: N003 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi5 Channel: 00 Id: 63 Lun: 00 Vendor: AIC 12G Model: 4U60: Edge-L Rev: 0c2a Type: Enclosure ANSI SCSI revision: 05 Host: scsi6 Channel: 00 Id: 00 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 01 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 02 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 03 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 04 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 05 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 06 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi6 Channel: 00 Id: 07 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 00 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 01 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 02 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: 
Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 03 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 04 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 05 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 06 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06 Host: scsi7 Channel: 00 Id: 07 Lun: 00 Vendor: HGST Model: SDLL1DLR960GCAA1 Rev: W150 Type: Direct-Access ANSI SCSI revision: 06
The numbering shows that the first level is the host, the second the channel, the third the target ID, and the fourth the LUN:
h == hostadapter id (first one being 0)
c == SCSI channel on hostadapter (first one being 0)
t == ID
l == LUN (first one being 0)
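To make the link between the two listings concrete, here is a minimal C sketch that turns a "Host:" line from /proc/scsi/scsi into the bracketed h:c:t:l form that lsscsi prints. The names hctl, parse_proc_scsi_line and format_hctl are made up for this illustration and are not kernel or lsscsi APIs; the line format is assumed to be exactly what the listing above shows.

```c
#include <assert.h>
#include <stdio.h>

/* Four-level SCSI address: host, channel, target id, LUN. */
struct hctl {
    unsigned int host;
    unsigned int channel;
    unsigned int id;
    unsigned int lun;
};

/*
 * Parse one "Host: scsiN Channel: CC Id: TT Lun: LL" line.
 * Returns 0 on success, -1 if the line does not have that shape
 * (in which case *out may be partially written).
 */
static int parse_proc_scsi_line(const char *line, struct hctl *out)
{
    int n;

    assert(line != NULL && out != NULL);
    /* %u reads "04" as decimal 4; the leading zero is not octal here. */
    n = sscanf(line, "Host: scsi%u Channel: %u Id: %u Lun: %u",
               &out->host, &out->channel, &out->id, &out->lun);
    return n == 4 ? 0 : -1;
}

/* Format the address the way lsscsi shows it, e.g. "[5:0:4:0]". */
static int format_hctl(const struct hctl *a, char *buf, size_t len)
{
    assert(a != NULL && buf != NULL);
    return snprintf(buf, len, "[%u:%u:%u:%u]",
                    a->host, a->channel, a->id, a->lun);
}
```

Feeding it "Host: scsi5 Channel: 00 Id: 04 Lun: 00" gives host 5, channel 0, id 4, lun 0, which format_hctl renders as [5:0:4:0], the address of /dev/sde in the lsscsi listing.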
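How do the individual numbers relate to one another?
A motherboard may carry several hosts. On the server above, for example, there are several SAS chips and therefore several hosts. A SAS chip can in turn be split into several channels, also called buses. A channel holds multiple targets, and a target holds multiple LUNs.
If a disk is dual-ported, it shows up at the SCSI layer under two SCSI addresses.
How does the kernel abstract the SCSI layer?
A device is represented by scsi_device. Its host member points to the Scsi_Host it belongs to, and its siblings member is linked into the host's __devices list. In addition, the parent of its sdev_gendev member points to the dev member of the corresponding scsi_target,
which is easy to follow once you are familiar with the Linux driver model.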
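The linkage described above (a list node embedded in the device, chained on a list head in the host) can be sketched in user space as follows. This is a simplified model and not kernel code: list_head and container_of are re-implemented here, and mock_host, mock_sdev, mock_attach and mock_count_target are hypothetical names that keep only the fields discussed in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's list_head and container_of(). */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Heavily trimmed mock-ups: only the linkage fields are kept. */
struct mock_host {
    struct list_head devices;   /* plays the role of Scsi_Host.__devices */
    unsigned int host_no;
};

struct mock_sdev {
    struct mock_host *host;     /* back pointer, like scsi_device.host */
    struct list_head siblings;  /* node chained on host->devices */
    unsigned int id, lun;
};

static void mock_attach(struct mock_host *h, struct mock_sdev *sdev)
{
    assert(h != NULL && sdev != NULL);
    sdev->host = h;
    list_add_tail(&sdev->siblings, &h->devices);
}

/* Walk the host's device list and count the entries with a given id. */
static unsigned int mock_count_target(struct mock_host *h, unsigned int id)
{
    unsigned int n = 0;
    struct list_head *pos;

    for (pos = h->devices.next; pos != &h->devices; pos = pos->next) {
        struct mock_sdev *sdev =
            container_of(pos, struct mock_sdev, siblings);
        if (sdev->id == id)
            n++;
    }
    return n;
}
```

The point of the sketch is that the list stores pointers to the embedded siblings node, not to the device itself; container_of recovers the enclosing structure, which is the same idea crash relies on when you subtract a member offset from a list pointer.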
Here is a real scsi_device, dumped with crash:
crash> scsi_device ffff881fcee44800
struct scsi_device {
host = 0xffff883fd0e38000,-----------------points to the Scsi_Host, described later
request_queue = 0xffff883fc1e28828,--------the request_queue allocated earlier to hold the I/O being issued; note the distinction between single-queue and multi-queue
siblings = {-------------------------------every scsi_device under this host is chained through this member; they are siblings, hence the name
next = 0xffff881fcece9810,
prev = 0xffff881fcee44010
},
same_target_siblings = {------------------chains the scsi_device entries under the same target. One issue: linking an entry here also requires taking the host lock, which could be optimized.
next = 0xffff883fc1e21c18,
prev = 0xffff883fc1e21c18
},
{
device_busy = {
counter = 6
},
__UNIQUE_ID_rh_kabi_hide20 = {
device_busy = 6
},
{<No data fields>}
},
list_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 1215842424,
tickets = {
head = 18552,
tail = 18552
}
}
}
}
}
},
cmd_list = {
next = 0xffff881f49a2d508,
prev = 0xffff883eeccee308
},
starved_entry = {
next = 0xffff881fcee44848,
prev = 0xffff881fcee44848
},
current_cmnd = 0x0,
queue_depth = 254,
max_queue_depth = 254,
last_queue_full_depth = 0,
last_queue_full_count = 0,
last_queue_full_time = 0,
queue_ramp_up_period = 120000,
last_queue_ramp_up = 0,
id = 4,--------------------------------normally set to the target's id
lun = 0,-------------------------------the last of the four levels in the address, the LUN
channel = 0,---------------------------channel number
manufacturer = 0,
sector_size = 512,
hostdata = 0xffff883fca92ed20,
type = 0 '\000',
scsi_level = 7 '\a',
inq_periph_qual = 0 '\000',
inquiry_len = 144 '\220',
inquiry = 0xffff883fc1e60b40 "",
vendor = 0xffff883fc1e60b48 "SEAGATE ST4000NM0025 N003ZC18ASFP",
model = 0xffff883fc1e60b50 "ST4000NM0025 N003ZC18ASFP",
rev = 0xffff883fc1e60b60 "N003ZC18ASFP",
current_tag = 0 '\000',
sdev_target = 0xffff883fc1e21c00,------points to the scsi_target. The code comment says it is only used for single_lun, yet the target's single_lun is 0 here, which is odd; to be safe, it is better not to use this field to get the scsi_target
sdev_bflags = 0,
eh_timeout = 10000,
writeable = 1,
removable = 0,
changed = 0,
busy = 0,
lockable = 0,
locked = 0,
borken = 0,
disconnect = 0,
soft_reset = 0,
sdtr = 0,
wdtr = 0,
ppr = 1,
tagged_supported = 1,
simple_tags = 0,
ordered_tags = 0,
was_reset = 0,
expecting_cc_ua = 0,
use_10_for_rw = 1,
use_10_for_ms = 0,
no_report_opcodes = 0,
no_write_same = 0,
use_16_for_rw = 1,
skip_ms_page_8 = 0,
skip_ms_page_3f = 0,
skip_vpd_pages = 0,
use_192_bytes_for_3f = 0,
no_start_on_add = 0,
allow_restart = 0,
manage_start_stop = 0,
start_stop_pwr_cond = 0,
no_uld_attach = 0,
select_no_atn = 0,
fix_capacity = 0,
guess_capacity = 0,
retry_hwerror = 0,
last_sector_bug = 0,
no_read_disc_info = 0,
no_read_capacity_16 = 0,
try_rc_10_first = 0,
is_visible = 1,
wce_default_on = 0,
no_dif = 0,
broken_fua = 0,
vpd_reserved = 0,
xcopy_reserved = 0,
lun_in_cdb = 0,
disk_events_disable_depth = {
counter = 0
},
supported_events = {0},
pending_events = {0},
event_list = {
next = 0xffff881fcee44900,
prev = 0xffff881fcee44900
},
event_work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff881fcee44918,
prev = 0xffff881fcee44918
},
func = 0xffffffff814241a0 <scsi_evt_thread>
},
{
device_blocked = {
counter = 0
},
__UNIQUE_ID_rh_kabi_hide21 = {
device_blocked = 0
},
{<No data fields>}
},
max_device_blocked = 3,
iorequest_cnt = {------------I/Os issued
counter = 4641
},
iodone_cnt = {-----------------I/Os completed
counter = 4635
},
ioerr_cnt = {
counter = 283----------------worth watching: the count of I/Os that completed with an error; it is exported to user space through sysfs as the device's ioerr_cnt attribute
},
sdev_gendev = {----------------driver model: the parent of scsi_device's sdev_gendev points to the dev member of scsi_target, which is how the driver tree is expressed.
parent = 0xffff883fc1e21c28,
p = 0xffff883fd0cf2b40,
kobj = {
name = 0xffff883fc21d9790 "5:0:4:0",----------the four-level name: host_no 5, channel 0, target id 4, lun 0
entry = {
next = 0xffff881fcee44c00,
prev = 0xffff883fc1e21c40
},
parent = 0xffff883fc1e21c38,
kset = 0xffff881fff86b6c0,
ktype = 0xffffffff81a14e40 <device_ktype>,
sd = 0xffff883fccc5ee70,
kref = {
refcount = {
counter = 25
}
},
state_initialized = 1,
state_in_sysfs = 1,
state_add_uevent_sent = 1,
state_remove_uevent_sent = 0,
uevent_suppress = 0
},
init_name = 0x0,
type = 0xffffffff81a19760 <scsi_dev_type>,
mutex = {
count = {
counter = 1
},
wait_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
wait_list = {
next = 0xffff881fcee449b0,
prev = 0xffff881fcee449b0
},
owner = 0x0,
{
osq = 0x0,
__UNIQUE_ID_rh_kabi_hide0 = {
spin_mlock = 0x0
},
{<No data fields>}
}
},
bus = 0xffffffff81a19320 <scsi_bus_type>,---------pointer to the bus type, also held in sdev_gendev
driver = 0xffffffffa011e008 <sd_template+8>,
platform_data = 0x0,
power = {
power_state = {
event = 0
},
can_wakeup = 0,
async_suspend = 1,
is_prepared = false,
is_suspended = false,
ignore_children = false,
early_init = true,
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 1310740,
tickets = {
head = 20,
tail = 20
}
}
}
}
}
},
entry = {
next = 0xffff881fcee44c98,
prev = 0xffff883fc1e21cd8
},
completion = {
done = 2147483647,
wait = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 131074,
tickets = {
head = 2,
tail = 2
}
}
}
}
}
},
task_list = {
next = 0xffff881fcee44a18,
prev = 0xffff881fcee44a18
}
}
},
wakeup = 0x0,
wakeup_path = false,
syscore = false,
suspend_timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
base = 0xffff883fd0c94000,
function = 0xffffffff81402e90 <pm_suspend_timer_fn>,
data = 18446612268929272136,
slack = -1,
start_pid = -1,
start_site = 0x0,
start_comm = "000000000000000000000000000000"
},
timer_expires = 0,
work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff881fcee44a98,
prev = 0xffff881fcee44a98
},
func = 0xffffffff81402f10 <pm_runtime_work>
},
wait_queue = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
task_list = {
next = 0xffff881fcee44ab8,
prev = 0xffff881fcee44ab8
}
},
usage_count = {
counter = 2
},
child_count = {
counter = 0
},
disable_depth = 0,
idle_notification = 0,
request_pending = 0,
deferred_resume = 0,
run_wake = 0,
runtime_auto = 0,
no_callbacks = 0,
irq_safe = 0,
use_autosuspend = 1,
timer_autosuspends = 0,
memalloc_noio = 1,
request = RPM_REQ_NONE,
runtime_status = RPM_ACTIVE,
runtime_error = 0,
autosuspend_delay = -1,
last_busy = 4295244282,
active_jiffies = 0,
suspended_jiffies = 0,
accounting_timestamp = 4294683149,
subsys_data = 0x0,
qos = 0x0
},
pm_domain = 0x0,
numa_node = 1,
dma_mask = 0x0,
coherent_dma_mask = 0,
dma_parms = 0x0,
dma_pools = {
next = 0xffff881fcee44b40,
prev = 0xffff881fcee44b40
},
dma_mem = 0x0,
archdata = {
dma_ops = 0x0,
iommu = 0x0
},
of_node = 0x0,
acpi_node = {
{
companion = 0x0,
__UNIQUE_ID_rh_kabi_hide9 = {
handle = 0x0
},
{<No data fields>}
}
},
devt = 0,
id = 0,
devres_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
devres_head = {
next = 0xffff881fcee44b88,
prev = 0xffff881fcee44b88
},
knode_class = {
n_klist = 0x0,
n_node = {
next = 0x0,
prev = 0x0
},
n_ref = {
refcount = {
counter = 0
}
}
},
class = 0x0,
groups = 0x0,
release = 0x0,
iommu_group = 0x0,
offline_disabled = false,
offline = false,
device_rh = 0xffff881fcee33378
},
sdev_dev = {
parent = 0xffff881fcee44948,
p = 0xffff883fd0cf2cc0,
kobj = {
name = 0xffff883fc21d9798 "5:0:4:0",---set in scsi_sysfs_device_initialize to the same name as scsi_device.sdev_gendev
entry = {
next = 0xffff883fc1854418,
prev = 0xffff881fcee44960
},
parent = 0xffff881fcee4cde0,
kset = 0xffff881fff86b6c0,
ktype = 0xffffffff81a14e40 <device_ktype>,
sd = 0xffff883fcac120e0,
kref = {
refcount = {
counter = 3
}
},
state_initialized = 1,
state_in_sysfs = 1,
state_add_uevent_sent = 1,
state_remove_uevent_sent = 0,
uevent_suppress = 0
},
init_name = 0x0,
type = 0x0,
mutex = {
count = {
counter = 1
},
wait_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
wait_list = {
next = 0xffff881fcee44c50,
prev = 0xffff881fcee44c50
},
owner = 0x0,
{
osq = 0x0,
__UNIQUE_ID_rh_kabi_hide0 = {
spin_mlock = 0x0
},
{<No data fields>}
}
},
bus = 0x0,
driver = 0x0,
platform_data = 0x0,
power = {
power_state = {
event = 0
},
can_wakeup = 0,
async_suspend = 1,
is_prepared = false,
is_suspended = false,
ignore_children = false,
early_init = true,
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
entry = {
next = 0xffff883fc18544b0,
prev = 0xffff881fcee449f8
},
completion = {
done = 2147483647,
wait = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 131074,
tickets = {
head = 2,
tail = 2
}
}
}
}
}
},
task_list = {
next = 0xffff881fcee44cb8,
prev = 0xffff881fcee44cb8
}
}
},
wakeup = 0x0,
wakeup_path = false,
syscore = false,
suspend_timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
base = 0xffff883fd0c94000,
function = 0xffffffff81402e90 <pm_suspend_timer_fn>,
data = 18446612268929272808,
slack = -1,
start_pid = -1,
start_site = 0x0,
start_comm = "000000000000000000000000000000"
},
timer_expires = 0,
work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff881fcee44d38,
prev = 0xffff881fcee44d38
},
func = 0xffffffff81402f10 <pm_runtime_work>
},
wait_queue = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
task_list = {
next = 0xffff881fcee44d58,
prev = 0xffff881fcee44d58
}
},
usage_count = {
counter = 0
},
child_count = {
counter = 0
},
disable_depth = 1,
idle_notification = 0,
request_pending = 0,
deferred_resume = 0,
run_wake = 0,
runtime_auto = 1,
no_callbacks = 0,
irq_safe = 0,
use_autosuspend = 0,
timer_autosuspends = 0,
memalloc_noio = 0,
request = RPM_REQ_NONE,
runtime_status = RPM_SUSPENDED,
runtime_error = 0,
autosuspend_delay = 0,
last_busy = 0,
active_jiffies = 0,
suspended_jiffies = 0,
accounting_timestamp = 4294680236,
subsys_data = 0x0,
qos = 0x0
},
pm_domain = 0x0,
numa_node = 1,
dma_mask = 0x0,
coherent_dma_mask = 0,
dma_parms = 0x0,
dma_pools = {
next = 0xffff881fcee44de0,
prev = 0xffff881fcee44de0
},
dma_mem = 0x0,
archdata = {
dma_ops = 0x0,
iommu = 0x0
},
of_node = 0x0,
acpi_node = {
{
companion = 0x0,
__UNIQUE_ID_rh_kabi_hide9 = {
handle = 0x0
},
{<No data fields>}
}
},
devt = 0,
id = 0,
devres_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
devres_head = {
next = 0xffff881fcee44e28,
prev = 0xffff881fcee44e28
},
knode_class = {
n_klist = 0xffff883fcff106a8,
n_node = {
next = 0xffff881fcece9e40,
prev = 0xffff881fcee44640
},
n_ref = {
refcount = {
counter = 1
}
}
},
class = 0xffffffff81a193e0 <sdev_class>,
groups = 0x0,
release = 0x0,
iommu_group = 0x0,
offline_disabled = false,
offline = false,
device_rh = 0xffff881fcee33398
},
ew = {
work = {
data = {
counter = 0
},
entry = {
next = 0x0,
prev = 0x0
},
func = 0x0
}
},
requeue_work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff881fcee44eb0,
prev = 0xffff881fcee44eb0
},
func = 0xffffffff814236e0 <scsi_requeue_run_queue>
},
scsi_dh_data = 0x0,
sdev_state = SDEV_RUNNING,---------------the device is currently in the running state
{
vpd_pg83 = 0xffff883fc1e62400 "",
__UNIQUE_ID_rh_kabi_hide22 = {
vpd_reserved1 = 0xffff883fc1e62400
},
{<No data fields>}
},
{
vpd_pg83_len = 76,
__UNIQUE_ID_rh_kabi_hide23 = {
vpd_reserved2 = 0x4c
},
{<No data fields>}
},
{
vpd_pg80 = 0xffff883fc1e62300 "",
__UNIQUE_ID_rh_kabi_hide24 = {
vpd_reserved3 = 0xffff883fc1e62300
},
{<No data fields>}
},
{
vpd_pg80_len = 24,
__UNIQUE_ID_rh_kabi_hide25 = {
vpd_reserved4 = 0x18
},
{<No data fields>}
},
vpd_reserved5 = 0 '\000',
vpd_reserved6 = 0 '\000',
vpd_reserved7 = 0 '\000',
vpd_reserved8 = 0 '\000',
vpd_reserved9 = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
rh_reserved1 = 0x0,
rh_reserved2 = 0x0,
rh_reserved3 = 0x0,
rh_reserved4 = 0x0,
rh_reserved5 = 0x0,
rh_reserved6 = 0x0,
scsi_mq_reserved1 = {
counter = 0
},
scsi_mq_reserved2 = {
counter = 0
},
sdev_data = 0xffff881fcee44f38
}
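A quick sanity check on the counters in this dump: iorequest_cnt is 4641 and iodone_cnt is 4635, so 6 commands were outstanding, which is consistent with device_busy = 6 near the top of the same structure. The helpers below sketch that arithmetic. The function names are invented for illustration, and the use of base 0 in strtoull is there because I assume, without having checked every kernel version, that the sysfs attributes of the same names print the counters in hexadecimal with a 0x prefix.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a counter string. Base 0 lets strtoull accept both "0x1221"
 * and "4641", so the exact formatting of the source does not matter.
 */
static unsigned long long parse_counter(const char *s)
{
    assert(s != NULL);
    return strtoull(s, NULL, 0);
}

/* Commands issued but not yet completed. */
static unsigned long long in_flight(unsigned long long issued,
                                    unsigned long long done)
{
    return issued >= done ? issued - done : 0;
}

/* Share of completed commands that ended with an error, in percent. */
static double error_percent(unsigned long long errors,
                            unsigned long long done)
{
    return done ? 100.0 * (double)errors / (double)done : 0.0;
}
```

With the values from the dump, in_flight(4641, 4635) is 6 and error_percent(283, 4635) is a little over 6 percent, which is high enough to justify a closer look at this disk.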
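How do we get from a scsi_device to the scsi_target it belongs to? Going by the dump above there are two ways: read sdev_target directly, or take sdev_gendev.parent and subtract the offset of dev inside scsi_target: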
crash> scsi_device.sdev_target ffff881fcee44800
sdev_target = 0xffff883fc1e21c00
crash> struct -xo scsi_device.sdev_gendev ffff881fcee44800
struct scsi_device {
[ffff881fcee44948] struct device sdev_gendev;
}
crash> device.parent ffff881fcee44948
parent = 0xffff883fc1e21c28
crash> struct -xo scsi_target.dev
struct scsi_target {
[0x28] struct device dev;
}
crash> px 0xffff883fc1e21c28-0x28
$4 = 0xffff883fc1e21c00--------------the same value as sdev_target read directly, though I still recommend the second method
The parent can also be read in one step instead of level by level:
crash> scsi_device.sdev_gendev.parent ffff881fcee44800
sdev_gendev.parent = 0xffff883fc1e21c28,
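That parent is the dev member of scsi_target. In other words, from the Linux driver-model point of view, scsi_target and scsi_device are managed as parent and child in the device tree.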
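The subtraction done by hand in the crash session is the container_of pattern. As far as I can tell, the kernel's own scsi_target(sdev) helper does the same thing via to_scsi_target(sdev->sdev_gendev.parent), which would explain why the parent-pointer route is the dependable one. Below is a user-space sketch of both views; the mock_* structures and function names are illustrative only and keep just the fields needed to show the linkage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock-ups: just enough fields to show the parent/child linkage. */
struct mock_device {
    struct mock_device *parent;
};

struct mock_target {
    unsigned int channel, id;
    struct mock_device dev;         /* embedded, like scsi_target.dev */
};

struct mock_sdev {
    unsigned int lun;
    struct mock_device sdev_gendev; /* parent points at the target's dev */
};

/* Recover the owning target from the device-model parent pointer. */
static struct mock_target *mock_sdev_to_target(struct mock_sdev *sdev)
{
    assert(sdev != NULL && sdev->sdev_gendev.parent != NULL);
    return container_of(sdev->sdev_gendev.parent, struct mock_target, dev);
}

/* The same step on raw addresses, as done by hand in the crash session. */
static uint64_t target_addr_from_parent(uint64_t parent, uint64_t dev_offset)
{
    return parent - dev_offset;
}
```

Plugging in the numbers from the session, target_addr_from_parent(0xffff883fc1e21c28, 0x28) yields 0xffff883fc1e21c00, the same address that sdev_target holds.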
A target is represented by scsi_target. Its starget_sdev_user member points to the LUN that is currently active:
/*
* scsi_target: representation of a scsi target, for now, this is only
* used for single_lun devices. If no one has active IO to the target,-------is this comment out of date?
* starget_sdev_user is NULL, else it points to the active sdev.
*/
struct scsi_target {
struct scsi_device *starget_sdev_user;---either points to the currently active scsi_device or is NULL; used when the target only supports one LUN
...
struct device dev;---the target's node in the device tree
....
Here is a scsi_target example:
crash> scsi_target 0xffff883fc1e21c00----this is the scsi_target that owns the scsi_device shown earlier
struct scsi_target {
starget_sdev_user = 0x0,
siblings = {
next = 0xffff883fc1e5f008,-------------siblings is linked into the host's __targets list
prev = 0xffff883fcaa4b408
},
devices = {-------------------the chain of scsi_device entries under this target
next = 0xffff881fcee44820,
prev = 0xffff881fcee44820
},
dev = {-----------------------in driver-model terms, the parent of scsi_device's sdev_gendev points to this dev
parent = 0xffff883fcaa4fc00,
p = 0xffff883fd0cf29c0,
kobj = {
name = 0xffff883fc1e00660 "target5:0:4",
entry = {
next = 0xffff881fcee44960,
prev = 0xffff883fc1853018
},
parent = 0xffff883fcaa4fc10,
kset = 0xffff881fff86b6c0,
ktype = 0xffffffff81a14e40 <device_ktype>,
sd = 0xffff883fccc5ea10,
......-----------------------------the rest of the long-winded struct device is omitted
reap_ref = 0,
channel = 0,
id = 4,
create = 0,
single_lun = 0,
pdt_1f_for_no_lun = 0,
no_report_luns = 0,
expecting_lun_change = 0,
{
target_busy = {
counter = 0
},
__UNIQUE_ID_rh_kabi_hide19 = {
target_busy = 0
},
{<No data fields>}
},
can_queue = 0,
{
target_blocked = {
counter = 0
},
__UNIQUE_ID_rh_kabi_hide20 = {
target_blocked = 0
},
{<No data fields>}
},
max_target_blocked = 3,
scsi_level = 7 '\a',
ew = {
work = {
data = {
counter = 0
},
entry = {
next = 0x0,
prev = 0x0
},
func = 0x0
}
},
state = STARGET_RUNNING,
hostdata = 0xffff883fc1e22000,
rh_reserved1 = 0x0,
rh_reserved2 = 0x0,
rh_reserved3 = 0x0,
rh_reserved4 = 0x0,
scsi_mq_reserved1 = {
counter = 0
},
scsi_mq_reserved2 = {
counter = 0
},
starget_data = 0xffff883fc1e21f48
}
When scsi_request_fn sends an I/O request to a device, it also checks whether the scsi_target that the device belongs to is busy.
static void scsi_request_fn(struct request_queue *q)
__releases(q->queue_lock)
__acquires(q->queue_lock)
{
...
if (!scsi_target_queue_ready(shost, sdev))
goto not_ready;
...
}
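To give a feel for what such a readiness check involves, here is a deliberately simplified, single-threaded model written from my reading of the code. It is not the kernel function: locking, atomics and the starved-list handling are left out, the way target_blocked is treated is reduced to a plain test, and every mock_* name is invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_sdev;

/* Only the scsi_target fields that matter for the busy check. */
struct mock_target {
    struct mock_sdev *sdev_user;  /* like starget_sdev_user */
    bool single_lun;
    int can_queue;                /* <= 0 means no per-target limit */
    int target_busy;              /* commands currently outstanding */
    int target_blocked;           /* > 0 while the target is backing off */
};

struct mock_sdev {
    struct mock_target *target;
};

/*
 * Return true (and account for one more outstanding command when a
 * limit applies) if the target can take another command; return false
 * if the caller should back off and retry later.
 */
static bool mock_target_queue_ready(struct mock_sdev *sdev)
{
    struct mock_target *t;

    assert(sdev != NULL && sdev->target != NULL);
    t = sdev->target;

    if (t->single_lun) {
        if (t->sdev_user != NULL && t->sdev_user != sdev)
            return false;         /* another LUN currently owns the target */
        t->sdev_user = sdev;
    }

    if (t->can_queue <= 0)
        return true;              /* unlimited, nothing to count */

    if (t->target_blocked > 0 || t->target_busy >= t->can_queue)
        return false;

    t->target_busy++;
    return true;
}
```

For the target dumped above, single_lun and can_queue are both 0, so under this model the check always passes and starget_sdev_user is never set, which fits the NULL value seen in the dump.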
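From the kernel's management point of view scsi_target to scsi_device is one-to-many, but what I actually observed on this machine was one-to-one. The starget_sdev_user member points to the active scsi_device, but only transiently;
most of the time it is NULL.
There is no structure for the bus/channel; it is represented only by an id. The host has a max_channel member holding the largest channel number, which is what delimits the channels under one host.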
A SCSI host is represented by Scsi_Host. It chains every scsi_device it manages through its __devices member and every target through its __targets member; a host is added to the system with scsi_add_host.
Here is a host example:
crash> struct Scsi_Host 0xffff883fd0e38000
struct Scsi_Host {
__devices = {
next = 0xffff881fcee42810,
prev = 0xffff883fc18a6010
},
__targets = {
next = 0xffff883fc1e19008,
prev = 0xffff883fc18f2408
},
cmd_pool = 0xffffffff81a18680 <scsi_cmd_pool>,
free_list_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
free_list = {
next = 0xffff883fc8db0008,
prev = 0xffff883fc8db0008
},
starved_list = {
next = 0xffff883fd0e38040,
prev = 0xffff883fd0e38040
},
default_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 93193614,
tickets = {
head = 1422,
tail = 1422
}
}
}
}
}
},
host_lock = 0xffff883fd0e38050,
scan_mutex = {
count = {
counter = 1
},
wait_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
wait_list = {
next = 0xffff883fd0e38068,
prev = 0xffff883fd0e38068
},
owner = 0x0,
{
osq = 0x0,
__UNIQUE_ID_rh_kabi_hide1 = {
spin_mlock = 0x0
},
{<No data fields>}
}
},
eh_cmd_q = {
next = 0xffff883fd0e38088,
prev = 0xffff883fd0e38088
},
ehandler = 0xffff881fcc24a280,---- this corresponds to PID: 680 TASK: ffff881fcc24a280 CPU: 37 COMMAND: "scsi_eh_5"
eh_action = 0x0,
host_wait = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
task_list = {
next = 0xffff883fd0e380b0,
prev = 0xffff883fd0e380b0
}
},
hostt = 0xffffffffa00d01c0,-------- the host's own template
transportt = 0xffff883fcd090000,---- this is mpt3sas_transport_template; each host type has its own transport template
{
bqt = 0x0,
tag_set = 0x0
},
{
host_busy = {
counter = 8---------------- 8 busy I/Os, i.e. the count of I/Os that have already left the request_queue
},
__UNIQUE_ID_rh_kabi_hide30 = {
host_busy = 8
},
{<No data fields>}
},
host_failed = 0,-------------- no failed commands at the moment
host_eh_scheduled = 0,
host_no = 5,------------------ this ties to the error-handling kernel thread; there is one such thread per host -- 680 2 37 ffff881fcc24a280 IN 0.0 0 0 [scsi_eh_5]
eh_deadline = -1,
last_reset = 0,
max_id = 4294967295,
max_lun = 16895,
max_channel = 0,
unique_id = 1,
max_cmd_len = 32,
this_id = -1,
can_queue = 2936,
cmd_per_lun = 7,
sg_tablesize = 128,
sg_prot_tablesize = 0,
max_sectors = 32767,
dma_boundary = 4294967295,
cmd_serial_number = 0,
active_mode = 1,
unchecked_isa_dma = 0,
use_clustering = 1,
use_blk_tcq = 0,
host_self_blocked = 0,
reverse_ordering = 0,
ordered_tag = 0,
tmf_in_progress = 0,
async_scan = 0,
eh_noresume = 0,
no_write_same = 0,
use_blk_mq = 0,------------------ whether multi-queue (blk-mq) is used
no_scsi2_lun_in_cdb = 0,
work_q_name = "00000000000000000000000000000000000000",
work_q = 0x0,
tmf_work_q = 0xffff881fcef98800,
{
host_blocked = {
counter = 0
},
__UNIQUE_ID_rh_kabi_hide31 = {
host_blocked = 0
},
{<No data fields>}
},
max_host_blocked = 7,
prot_capabilities = 7,
prot_guard_type = 3 '\003',
uspace_req_q = 0x0,
base = 0,
io_port = 0,
n_io_port = 0 '\000',
dma_channel = 255 '\377',
irq = 0,
shost_state = SHOST_RUNNING,--------- the host with 60 disks is in the running state too
shost_gendev = {
parent = 0xffff883fcfded098,
p = 0xffff883fd0c8b500,
kobj = {
name = 0xffff883fcd0661c0 "host5",--------- the device's name in the driver model
entry = {
next = 0xffff883fd0e38448,
prev = 0xffff881fca1cd818
},
parent = 0xffff883fcfded0a8,
kset = 0xffff881fff86b6c0,
ktype = 0xffffffff81a14e40 <device_ktype>,
sd = 0xffff883fccc9e690,
kref = {
refcount = {
counter = 39
}
},
state_initialized = 1,
state_in_sysfs = 1,
state_add_uevent_sent = 1,
state_remove_uevent_sent = 0,
uevent_suppress = 0
},
init_name = 0x0,
type = 0xffffffff81a18a80 <scsi_host_type>,
mutex = {
count = {
counter = 1
},
wait_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
wait_list = {
next = 0xffff883fd0e381f8,
prev = 0xffff883fd0e381f8
},
owner = 0x0,
{
osq = 0x0,
__UNIQUE_ID_rh_kabi_hide1 = {
spin_mlock = 0x0
},
{<No data fields>}
}
},
bus = 0xffffffff81a19320 <scsi_bus_type>,
driver = 0x0,
platform_data = 0x0,
power = {
power_state = {
event = 0
},
can_wakeup = 0,
async_suspend = 1,
is_prepared = false,
is_suspended = false,
ignore_children = false,
early_init = true,
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 9568402,
tickets = {
head = 146,
tail = 146
}
}
}
}
}
},
entry = {
next = 0xffff883fd0e384e0,
prev = 0xffff881fca1cd8b0
},
completion = {
done = 2147483647,
wait = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 131074,
tickets = {
head = 2,
tail = 2
}
}
}
}
}
},
task_list = {
next = 0xffff883fd0e38260,
prev = 0xffff883fd0e38260
}
}
},
wakeup = 0x0,
wakeup_path = false,
syscore = false,
suspend_timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
base = 0xffff883fd0e24000,
function = 0xffffffff81402e90 <pm_suspend_timer_fn>,
data = 18446612406401728912,
slack = -1,
start_pid = -1,
start_site = 0x0,
start_comm = "000000000000000000000000000000"
},
timer_expires = 0,
work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff883fd0e382e0,
prev = 0xffff883fd0e382e0
},
func = 0xffffffff81402f10 <pm_runtime_work>
},
wait_queue = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 262148,
tickets = {
head = 4,
tail = 4
}
}
}
}
}
},
task_list = {
next = 0xffff883fd0e38300,
prev = 0xffff883fd0e38300
}
},
usage_count = {
counter = 0
},
child_count = {
counter = 0
},
disable_depth = 0,
idle_notification = 0,
request_pending = 0,
deferred_resume = 0,
run_wake = 0,
runtime_auto = 1,
no_callbacks = 0,
irq_safe = 0,
use_autosuspend = 0,
timer_autosuspends = 0,
memalloc_noio = 1,
request = RPM_REQ_NONE,
runtime_status = RPM_SUSPENDED,
runtime_error = 0,
autosuspend_delay = 0,
last_busy = 0,
active_jiffies = 9752,
suspended_jiffies = 0,
accounting_timestamp = 4294683155,
subsys_data = 0x0,
qos = 0x0
},
pm_domain = 0x0,
numa_node = 1,
dma_mask = 0x0,
coherent_dma_mask = 0,
dma_parms = 0x0,
dma_pools = {
next = 0xffff883fd0e38388,
prev = 0xffff883fd0e38388
},
dma_mem = 0x0,
archdata = {
dma_ops = 0x0,
iommu = 0x0
},
of_node = 0x0,
acpi_node = {
{
companion = 0x0,
__UNIQUE_ID_rh_kabi_hide7 = {
handle = 0x0
},
{<No data fields>}
}
},
devt = 0,
id = 0,
devres_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
devres_head = {
next = 0xffff883fd0e383d0,
prev = 0xffff883fd0e383d0
},
knode_class = {
n_klist = 0x0,
n_node = {
next = 0x0,
prev = 0x0
},
n_ref = {
refcount = {
counter = 0
}
}
},
class = 0x0,
groups = 0x0,
release = 0x0,
iommu_group = 0x0,
offline_disabled = false,
offline = false,
device_rh = 0xffff883fccc84738
},
shost_dev = {
parent = 0xffff883fd0e38190,
p = 0xffff883fd0c8b5c0,
kobj = {
name = 0xffff883fcd0661c8 "host5",---------- the device's name in the driver model
entry = {
next = 0xffff881fce507818,
prev = 0xffff883fd0e381a8
},
parent = 0xffff883fcd039180,
kset = 0xffff881fff86b6c0,
ktype = 0xffffffff81a14e40 <device_ktype>,
sd = 0xffff883fccc9eb60,
kref = {
refcount = {
counter = 3
}
},
state_initialized = 1,
state_in_sysfs = 1,
state_add_uevent_sent = 1,
state_remove_uevent_sent = 0,
uevent_suppress = 0
},
init_name = 0x0,
type = 0x0,
mutex = {
count = {
counter = 1
},
wait_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
wait_list = {
next = 0xffff883fd0e38498,
prev = 0xffff883fd0e38498
},
owner = 0x0,
{
osq = 0x0,
__UNIQUE_ID_rh_kabi_hide1 = {
spin_mlock = 0x0
},
{<No data fields>}
}
},
bus = 0x0,
driver = 0x0,
platform_data = 0x0,
power = {
power_state = {
event = 0
},
can_wakeup = 0,
async_suspend = 1,
is_prepared = false,
is_suspended = false,
ignore_children = false,
early_init = true,
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
entry = {
next = 0xffff881fce5078b0,
prev = 0xffff883fd0e38240
},
completion = {
done = 2147483647,
wait = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 131074,
tickets = {
head = 2,
tail = 2
}
}
}
}
}
},
task_list = {
next = 0xffff883fd0e38500,
prev = 0xffff883fd0e38500
}
}
},
wakeup = 0x0,
wakeup_path = false,
syscore = false,
suspend_timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
base = 0xffff883fd0e24000,
function = 0xffffffff81402e90 <pm_suspend_timer_fn>,
data = 18446612406401729584,
slack = -1,
start_pid = -1,
start_site = 0x0,
start_comm = "000000000000000000000000000000"
},
timer_expires = 0,
work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff883fd0e38580,
prev = 0xffff883fd0e38580
},
func = 0xffffffff81402f10 <pm_runtime_work>
},
wait_queue = {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
task_list = {
next = 0xffff883fd0e385a0,
prev = 0xffff883fd0e385a0
}
},
usage_count = {
counter = 0
},
child_count = {
counter = 0
},
disable_depth = 1,
idle_notification = 0,
request_pending = 0,
deferred_resume = 0,
run_wake = 0,
runtime_auto = 1,
no_callbacks = 0,
irq_safe = 0,
use_autosuspend = 0,
timer_autosuspends = 0,
memalloc_noio = 0,
request = RPM_REQ_NONE,
runtime_status = RPM_SUSPENDED,
runtime_error = 0,
autosuspend_delay = 0,
last_busy = 0,
active_jiffies = 0,
suspended_jiffies = 0,
accounting_timestamp = 4294671976,
subsys_data = 0x0,
qos = 0x0
},
pm_domain = 0x0,
numa_node = 1,
dma_mask = 0x0,
coherent_dma_mask = 0,
dma_parms = 0x0,
dma_pools = {
next = 0xffff883fd0e38628,
prev = 0xffff883fd0e38628
},
dma_mem = 0x0,
archdata = {
dma_ops = 0x0,
iommu = 0x0
},
of_node = 0x0,
acpi_node = {
{
companion = 0x0,
__UNIQUE_ID_rh_kabi_hide7 = {
handle = 0x0
},
{<No data fields>}
}
},
devt = 0,
id = 0,
devres_lock = {
{
rlock = {
raw_lock = {
{
head_tail = 0,
tickets = {
head = 0,
tail = 0
}
}
}
}
}
},
devres_head = {
next = 0xffff883fd0e38670,
prev = 0xffff883fd0e38670
},
knode_class = {
n_klist = 0xffff883fcff102a8,
n_node = {
next = 0xffff883fd0cb2688,
prev = 0xffff881fce2b6688
},
n_ref = {
refcount = {
counter = 1
}
}
},
class = 0xffffffff81a18ac0 <shost_class>,
groups = 0xffffffff81a19470 <scsi_sysfs_shost_attr_groups>,
release = 0x0,
iommu_group = 0x0,
offline_disabled = false,
offline = false,
device_rh = 0xffff883fccc84758
},
sht_legacy_list = {
next = 0x0,
prev = 0x0
},
shost_data = 0xffff883fcd0391e0,
dma_dev = 0xffff883fcfded098,
rh_reserved1 = 0x0,
rh_reserved2 = 0x0,
rh_reserved3 = 0x0,
rh_reserved4 = 0x0,
rh_reserved5 = 0x0,
rh_reserved6 = 0x0,
scsi_mq_reserved1 = 0,
scsi_mq_reserved2 = 0,
scsi_mq_reserved3 = 0x0,
scsi_mq_reserved4 = 0x0,
scsi_mq_reserved5 = {
counter = 0
},
scsi_mq_reserved6 = {
counter = 0
},
hostdata = 0xffff883fd0e38740--- usually holds the controller's private data, e.g. MPT3SAS_ADAPTER or MPT2SAS_ADAPTER
}
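In a SCSI host adapter driver's probe routine, the usual sequence is scsi_host_alloc, then scsi_add_host, followed immediately by scsi_scan_host to scan the SCSI bus.
The purpose of the bus scan is to discover, by protocol-specific or chip-specific means, the target nodes and logical units attached behind the host adapter, build the corresponding data structures in memory, and add them to the system.
The SCSI mid layer builds an INQUIRY command for each possible ID and LUN in turn and submits it to the block I/O subsystem, which eventually calls the mid layer's strategy routine; that routine extracts the SCSI command again and calls the low-level driver's queuecommand callback. In general, wherever the kernel has a registration step, it establishes relationships with both the layer above and the layer below.
How are the Scsi_Host objects related to each other?
From the driver-model point of view, each host's shost_gendev.parent points to that host's own parent device (the four hosts below have four different parents); apart from that, the hosts are unrelated.
crash> device.parent ffff883fd0cb4190
parent = 0xffff883fcfdef098
crash> device.parent 0xffff883fd0e38190
parent = 0xffff883fcfded098
crash> device.parent 0xffff883fd0cb2190
parent = 0xffff883fcfdee098
crash> device.parent 0xffff881fce2b6190
parent = 0xffff883fcfdb0098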
How the SCSI subsystem handles block access requests
Once the generic block layer calls the SCSI subsystem's request-queue handler, the SCSI mid layer generates, initializes, and submits a SCSI command (struct scsi_cmnd) to the SCSI target according to the contents of the block request.
SCSI describes the devices it talks to level by level: host, bus, target, and device. The scsi_device mentioned earlier is the device-level abstraction; it corresponds to a LUN, which may be a disk, an optical drive, and so on.
For a disk, a scsi_disk object is also created; for an optical drive, a scsi_cd object is created to pair with the scsi_device.
During the bus scan, scsi_alloc_sdev() is called for each device that is probed, and it in turn calls scsi_alloc_queue(). In other words, once the kernel has identified a SCSI device it must set up a request_queue for it, which happens in the function below. How a scsi_device is actually detected involves a long probing sequence that is not covered here.
struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
{
struct request_queue *q;
q = __scsi_alloc_queue(sdev->host, scsi_request_fn);---------- allocates an ordinary request_queue and sets up its members; scsi_request_fn is the function that processes requests
if (!q)
return NULL;
blk_queue_prep_rq(q, scsi_prep_fn);------------------- scsi_prep_fn prepares the SCSI command
blk_queue_unprep_rq(q, scsi_unprep_fn);
blk_queue_softirq_done(q, scsi_softirq_done);
blk_queue_rq_timed_out(q, scsi_times_out);
blk_queue_lld_busy(q, scsi_lld_busy);
return q;
}
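The SCSI command abstraction:
The kernel uses scsi_cmnd to manage a generated SCSI command, including its timestamps, retry count, context pointers, and the command body that carries the CDB. A typical scsi_cmnd belonging to a request issued by a filesystem looks like this: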
crash> scsi_cmnd 0xffff881f49a2d500
struct scsi_cmnd {
device = 0xffff881fcee44800,------ pointer to the scsi_device this command belongs to
list = {
next = 0xffff881f49a2cfc8,
prev = 0xffff881fcee44838
},
eh_entry = {---- links the command into the error-handling list; used when the command fails or times out
next = 0x0,
prev = 0x0
},
abort_work = {---- used when the command times out; it is queued on one of the Scsi_Host's workqueues
work = {
data = {
counter = 68719476704
},
entry = {
next = 0xffff881f49a2d530,
prev = 0xffff881f49a2d530
},
func = 0xffffffff8141eee0 <scmd_eh_abort_handler>---- the handler function of the work_struct
},
timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
base = 0xffff881fd2d8c002,
function = 0xffffffff8109c100 <delayed_work_timer_fn>,
data = 18446612266693612840,
slack = -1,
start_pid = -1,
start_site = 0x0,
start_comm = "000000000000000000000000000000"
},
wq = 0x0,
cpu = 0
},
eh_eflags = 0,
serial_number = 0,------------------ command serial number
jiffies_at_alloc = 4298774713,------ jiffies timestamp taken when the command was allocated
retries = 0,
allowed = 5,
prot_op = 0 '\000',
prot_type = 0 '\000',
cmd_len = 16,
sc_data_direction = DMA_FROM_DEVICE,
cmnd = 0xffff883e9d3f7e98 "\210",
sdb = {
table = {
sgl = 0xffff880cdf50fe00,
nents = 1,
orig_nents = 1
},
length = 4096,
resid = 0
},
prot_sdb = 0x0,
underflow = 4096,
transfersize = 512,
request = 0xffff883e9d3f7d80,------------------ the block-layer request for this command
sense_buffer = 0xffff880168be0f00 "",
scsi_done = 0xffffffff81420a90 <scsi_done>,--- callback invoked when the command completes
SCp = {
ptr = 0x0,
this_residual = 0,
buffer = 0x0,
buffers_residual = 0,
dma_handle = 0,
Status = 0,
Message = 0,
have_data_in = 0,
sent_command = 0,
phase = 0
},
host_scribble = 0x0,
result = 0,
tag = 255 '\377',
rh_reserved1 = 0x0,
rh_reserved2 = 0x0,
rh_reserved3 = 0x0,
rh_reserved4 = 0x0
}
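In the dump above, cmd_len is 16, the data direction is DMA_FROM_DEVICE, and the first CDB byte prints as octal 210, i.e. 0x88, which is the READ(16) opcode. The sketch below shows how such a CDB is laid out; build_read16 is a hypothetical helper written for this note, not a kernel function.

```c
#include <stdint.h>
#include <string.h>

#define READ_16_OPCODE  0x88 /* octal 210, as printed by crash */
#define READ_16_CDB_LEN 16

/*
 * Fill a 16-byte READ(16) CDB:
 *   byte 0      opcode
 *   bytes 2-9   logical block address, big-endian
 *   bytes 10-13 transfer length in blocks, big-endian
 * Flags (byte 1), group number (byte 14) and control (byte 15) stay 0.
 */
static void build_read16(uint8_t cdb[READ_16_CDB_LEN], uint64_t lba,
                         uint32_t nblocks)
{
    memset(cdb, 0, READ_16_CDB_LEN);
    cdb[0] = READ_16_OPCODE;
    for (int i = 0; i < 8; i++)
        cdb[2 + i] = (uint8_t)(lba >> (56 - 8 * i));
    for (int i = 0; i < 4; i++)
        cdb[10 + i] = (uint8_t)(nblocks >> (24 - 8 * i));
}
```

For the command in the dump, sdb.length = 4096 and transfersize = 512, so the transfer length field would be 8 blocks.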
SCSI command initialization and submission
Besides SCSI commands that originate from the generic block layer, commands can also be issued through sg.
Error handling in the SCSI subsystem
Low-level disk drivers are implemented by the vendors, so they are not discussed here. Apart from that, error handling in the SCSI subsystem is done mainly by the SCSI mid layer. In the first callback, the low-level driver returns the command's result and the SCSI status it obtained to the mid layer. The mid layer first examines the result reported by the low-level driver; if that is inconclusive, it goes on to examine the returned SCSI status, sense data, and so on. Commands judged successful go straight to the second callback; commands judged to need a retry are put back on the block device request queue and processed again. This can be called the mid layer's basic method for judging the result of a SCSI command.
It all looks simple, but in practice it is not: some errors offer no clear basis for a decision, such as bad sense data or a TIMEOUT. To deal with this, the Linux SCSI subsystem has a dedicated error-handling thread, and every command whose failure cannot be classified is handed to it. The thread works with two queues: the error-handling work queue (eh_work_q) and the error-handling done queue (eh_done_q). The work queue holds the commands that need error handling; the done queue holds the commands that have been dealt with during error handling. The function below shows how the thread processes the commands on the work queue.
The error-handling procedure:
static void scsi_unjam_host(struct Scsi_Host *shost)
{
unsigned long flags;
LIST_HEAD(eh_work_q);
LIST_HEAD(eh_done_q);
spin_lock_irqsave(shost->host_lock, flags);
list_splice_init(&shost->eh_cmd_q, &eh_work_q);
spin_unlock_irqrestore(shost->host_lock, flags);
SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->eh_deadline != -1)
shost->last_reset = 0;
spin_unlock_irqrestore(shost->host_lock, flags);
scsi_eh_flush_done_q(&eh_done_q);
}
The whole procedure can be summarized in four stages:
Sense data query stage
Sense data is requested to give a fresh basis for judging the command, and the basic method described above is applied again. If the verdict is success or retry, the command moves from the work queue to the done queue. If no verdict can be reached, the command stays on the work queue and error handling moves on to the ABORT stage.
ABORT stage
In this stage the commands on the work queue are actively aborted. Aborted commands are added to the done queue. If commands remain unhandled on the work queue when the ABORT pass ends, the START STOP UNIT stage follows.
START STOP UNIT stage
In this stage a START STOP UNIT command is sent to each SCSI device related to a command on the work queue, in an attempt to recover the device. If commands still remain on the work queue after this stage, the RESET stage follows.
RESET stage
The RESET stage has four levels: DEVICE RESET, TARGET RESET, BUS RESET, and HOST RESET. First, each SCSI device related to a command on the work queue is reset; if the device is healthy after DEVICE RESET, its failed commands are moved to the done queue. If DEVICE RESET cannot clear all failed commands, TARGET RESET is tried; if that fails too, BUS RESET resets the buses related to the remaining commands. If BUS RESET still leaves commands on the work queue, HOST RESET resets the related hosts. HOST RESET may of course fail as well, in which case the SCSI devices related to the remaining commands can only be considered unusable: they are marked as such and their failed commands are moved to the done queue. In scsi_unjam_host above, scsi_eh_ready_devs() is the call that drives the START STOP UNIT and RESET stages.
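The escalation order described above can be sketched as a simple ladder: try the cheapest recovery first and climb only while commands remain unrecovered. Everything here (the enum, eh_escalate, and the recovers_at callback) is a hypothetical illustration; the kernel works on lists of commands rather than a single verdict, so treat this as a simplified model only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Recovery stages in the order the text describes them. */
enum eh_stage {
    EH_GET_SENSE,
    EH_ABORT,
    EH_START_UNIT,
    EH_DEVICE_RESET,
    EH_TARGET_RESET,
    EH_BUS_RESET,
    EH_HOST_RESET,
    EH_OFFLINE /* nothing worked: the device is marked unusable */
};

/* Callback: returns true if the given stage recovered everything. */
typedef bool (*eh_try_fn)(void *ctx, enum eh_stage stage);

/* Walk the ladder and report the stage at which recovery succeeded. */
static enum eh_stage eh_escalate(eh_try_fn try_stage, void *ctx)
{
    for (int s = EH_GET_SENSE; s < EH_OFFLINE; s++) {
        if (try_stage(ctx, (enum eh_stage)s))
            return (enum eh_stage)s;
    }
    return EH_OFFLINE;
}

/* Example callback: succeeds only at the stage stored in ctx. */
static bool recovers_at(void *ctx, enum eh_stage stage)
{
    return stage == *(const enum eh_stage *)ctx;
}
```

Calling eh_escalate(recovers_at, &stage) with stage set to EH_TARGET_RESET walks through sense, abort, START UNIT, and device reset before stopping at the target reset.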
Some abbreviations:
stu-------START_UNIT, as in the scsi_eh_stu function.
tmf-------Task management function
scsi_abort_command--- aborts a command after it has timed out. If the host is already in EH recovery it returns failure; otherwise it queues the command on the host's tmf_work_q member:
queue_delayed_work(shost->tmf_work_q, &scmd->abort_work, HZ / 100);
Eventually this workqueue calls back scmd_eh_abort_handler.
What is the timeout member of request_queue for? Every request handed to the driver is linked on the timeout_list member, and the timeout timer runs blk_rq_timed_out_timer to find the commands that have timed out and handle them.
void blk_rq_timed_out_timer(unsigned long data)
{
struct request_queue *q = (struct request_queue *) data;
unsigned long flags, next = 0;
struct request *rq, *tmp;
int next_set = 0;
spin_lock_irqsave(q->queue_lock, flags);
list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list)
blk_rq_check_expired(rq, &next, &next_set);------- walks the requests handed to the driver, all of which are linked on timeout_list, and checks whether each has timed out
if (next_set)
mod_timer(&q->timeout, round_jiffies_up(next));
spin_unlock_irqrestore(q->queue_lock, flags);
}
One thing to note here: according to what I have read online, there used to be one timer per request. That could mean a great many timers, most of which never fired because timeouts are fairly rare, yet they had to be created, inserted, and deleted constantly. Now there is one timer per request_queue; it scans for expired requests and stays resident in memory.
How blk_rq_check_expired works:
static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout,
unsigned int *next_set)
{
if (time_after_eq(jiffies, rq->deadline)) {------- this request has timed out
list_del_init(&rq->timeout_list);------- unlink it from the request_queue's timeout_list
/*
* Check if we raced with end io completion
*/
if (!blk_mark_rq_complete(rq))--- guards against a concurrent completion
blk_rq_timed_out(rq);------------ handle the timed-out request
} else if (!*next_set || time_after(*next_timeout, rq->deadline)) {
*next_timeout = rq->deadline;
*next_set = 1;
}
}
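The single-timer scan can be modelled in user space as follows. This is a sketch under simplifying assumptions: the toy_ names are hypothetical, an array stands in for timeout_list, and a flag stands in for unlinking the request. The comparison helpers follow the same wraparound-safe idea as the kernel's time_after macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Wraparound-safe comparisons on a free-running tick counter. */
static bool toy_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

static bool toy_time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

struct toy_rq {
    unsigned long deadline; /* tick at which the request expires */
    bool timed_out;         /* set once the scan has expired it */
};

/*
 * One pass of the per-queue timer: expire overdue requests and find the
 * earliest remaining deadline. Returns true if *next was set, meaning
 * the timer should be re-armed for that tick.
 */
static bool toy_scan_timeouts(struct toy_rq *rqs, size_t n,
                              unsigned long now, unsigned long *next)
{
    bool next_set = false;

    for (size_t i = 0; i < n; i++) {
        if (rqs[i].timed_out)
            continue; /* already removed from the list */
        if (toy_time_after_eq(now, rqs[i].deadline)) {
            rqs[i].timed_out = true;
        } else if (!next_set || toy_time_after(*next, rqs[i].deadline)) {
            *next = rqs[i].deadline;
            next_set = true;
        }
    }
    return next_set;
}
```

The design point is that the cost of a timer is paid once per queue, and the per-request cost is only a deadline field and a list link.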
blk_rq_timed_out does the following:
static void blk_rq_timed_out(struct request *req)
{
struct request_queue *q = req->q;
enum blk_eh_timer_return ret;
ret = q->rq_timed_out_fn(req);--- in our case this calls scsi_times_out
switch (ret) {
case BLK_EH_HANDLED:
/* Can we use req->errors here? */
__blk_complete_request(req);
break;
case BLK_EH_RESET_TIMER:
blk_add_timer(req);
blk_clear_rq_complete(req);
break;
case BLK_EH_NOT_HANDLED:
/*
* LLD handles this for now but in the future
* we can send a request msg to abort the command
* and we can move more of the generic scsi eh code to
* the blk layer.
*/
break;
default:
printk(KERN_ERR "block: bad eh return: %d\n", ret);
break;
}
}
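scsi_times_out has two main outcomes. It first calls scsi_abort_command to abort the command; if the abort cannot be started, it calls scsi_eh_scmd_add to add the command to the host's eh_cmd_q list and then wakes the error-handling thread.
scmd_eh_abort_handler-->
scsi_try_to_abort_cmd-->
There are two kinds of commands. One is a normal command, which may return an error or never return and simply time out. The other is an error-recovery command, which can itself fail or time out. For the first kind, the error-handling function is:
scsi_abort_command---- after a timeout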
Scsi_Host:
enum scsi_host_state {
SHOST_CREATED = 1,
SHOST_RUNNING,
SHOST_CANCEL,
SHOST_DEL,
SHOST_RECOVERY,
SHOST_CANCEL_RECOVERY,
SHOST_DEL_RECOVERY,
};
You can think of a host as the SAS chip on the motherboard, such as a 2008 or 3008. Its state matters a great deal; the two states to watch are running and recovery. While a host is in recovery, no command can be sent to any disk it manages, so I/O to every disk under that host is blocked. As far as I can tell, an error on any single command puts the host into recovery via scsi_host_set_state(shost, SHOST_RECOVERY). I think the kernel is rather heavy-handed here: we often see a single disk time out, and the kernel then blocks all I/O under the whole host.
If all scmds either complete or fail, the number of in-flight scmds
becomes equal to the number of failed scmds - i.e. shost->host_busy ==
shost->host_failed. This wakes up SCSI EH thread. So, once woken up,
SCSI EH thread can expect that all in-flight commands have failed and
are linked on shost->eh_cmd_q.
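The wake-up condition quoted above can be modelled with two counters. This is a toy sketch with hypothetical toy_ names; the real scsi_eh_wakeup also considers host_eh_scheduled and runs under the host lock.

```c
#include <stdbool.h>

/* Hypothetical, minimal stand-in for the relevant Scsi_Host fields. */
struct toy_host {
    int host_busy;   /* commands handed to the driver, failed ones included */
    int host_failed; /* commands waiting on eh_cmd_q */
    bool recovery;   /* models SHOST_RECOVERY */
    bool eh_woken;   /* set when the EH thread would be woken */
};

/* Wake the EH thread only when every in-flight command has failed. */
static void toy_eh_wakeup(struct toy_host *h)
{
    if (h->host_busy == h->host_failed)
        h->eh_woken = true;
}

/* A command is handed to error handling: the host enters recovery. */
static void toy_cmd_failed(struct toy_host *h)
{
    h->recovery = true;
    h->host_failed++;
    toy_eh_wakeup(h);
}

/* A command completes normally while others may be waiting for EH. */
static void toy_cmd_completed(struct toy_host *h)
{
    h->host_busy--;
    if (h->recovery && h->host_failed > 0)
        toy_eh_wakeup(h);
}
```

With three commands in flight and one failure, the thread is woken only after the other two complete. Since a host in recovery accepts no new commands, this also shows why one slow disk stalls I/O for every disk behind the same host.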
A LUN is represented by the mid layer's scsi_device structure and a node by the mid layer's scsi_target structure; a channel has no structure of its own. A hard disk additionally has a scsi_disk abstraction, and an optical drive has the analogous scsi_cd structure.
A system may also contain several SCSI controller chips, that is, several SCSI hosts, for example the common setup where a server attaches storage through a JBOD. Locating each LUN then needs an addressing scheme, and the topology makes the scheme obvious: host_id:channel_id:node_id:lun_id. How these IDs are generated is not discussed here, but the numbers at each level are enough to pinpoint a single LUN.
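As a small illustration of this addressing scheme, the sketch below parses the bracketed address that appears in the listing at the top of these notes, such as [5:0:36:0]. The struct and parse_hctl are hypothetical helpers written for this note, not part of any library.

```c
#include <stdio.h>

/* One fully qualified SCSI address: host, channel, target (node), LUN. */
struct hctl {
    unsigned host;
    unsigned channel;
    unsigned target;
    unsigned lun;
};

/*
 * Parse "H:C:T:L" or "[H:C:T:L]".
 * Returns 0 on success and -1 if the text does not start with four
 * colon-separated numbers; *out is only written on success.
 */
static int parse_hctl(const char *s, struct hctl *out)
{
    struct hctl tmp;

    if (*s == '[')
        s++;
    if (sscanf(s, "%u:%u:%u:%u",
               &tmp.host, &tmp.channel, &tmp.target, &tmp.lun) != 4)
        return -1;
    *out = tmp;
    return 0;
}
```

Parsing "[5:0:36:0]" yields host 5, channel 0, target 36, LUN 0, which is one of the SEAGATE disks behind host5 in the listing.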
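For requests placed on the error-handling done queue: if the device state is good and the command's retry count is below the allowed count, the command is put back on the block request queue and processed again; otherwise the second callback runs directly, completing the SCSI subsystem's handling of the block request. That concludes the SCSI subsystem's error handling of a SCSI command.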
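The requeue-or-finish decision for a command on the done queue can be sketched as below. The toy_ names are hypothetical, and the real code applies further checks (for example, some commands are flagged as not retryable), so this covers only the two conditions named in the text.

```c
#include <stdbool.h>

enum toy_eh_verdict {
    TOY_REQUEUE, /* put the command back on the request queue */
    TOY_FINISH   /* run the completion callback and give up */
};

struct toy_cmd {
    bool device_online; /* device state is good */
    int retries;        /* attempts made so far */
    int allowed;        /* maximum retries, 5 in the dump above */
};

/* Decide the fate of one command taken off the done queue. */
static enum toy_eh_verdict toy_flush_one(struct toy_cmd *c)
{
    /* The retry counter is consumed only when the device is usable. */
    if (c->device_online && ++c->retries <= c->allowed)
        return TOY_REQUEUE;
    return TOY_FINISH;
}
```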
static void scsi_softirq_done(struct request *rq)
{
struct scsi_cmnd *cmd = rq->special;
unsigned long wait_for = (cmd->allowed + 1) * rq->timeout;
int disposition;
INIT_LIST_HEAD(&cmd->eh_entry);
atomic_inc(&cmd->device->iodone_cnt);
if (cmd->result)
atomic_inc(&cmd->device->ioerr_cnt);
disposition = scsi_decide_disposition(cmd);
if (disposition != SUCCESS &&
time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
sdev_printk(KERN_ERR, cmd->device,
"timing out command, waited %lus\n",
wait_for/HZ);
disposition = SUCCESS;
}
scsi_log_completion(cmd, disposition);
switch (disposition) {
case SUCCESS:
scsi_finish_command(cmd);
break;
case NEEDS_RETRY:
scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY);
break;
case ADD_TO_MLQUEUE:
scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
break;
default:
if (!scsi_eh_scmd_add(cmd, 0))
scsi_finish_command(cmd);
}
}
sd stands for disk (a handy mnemonic is scsi_disk; the module is sd_mod), sr for optical drives, st for tape, and sg for generic. A filesystem reading files from a disk goes through sd, whereas the sg driver lets user space issue SCSI commands directly without a filesystem. For example, in one crash dump most commands were REQ_TYPE_FS, but one came from dfs accessing the disk directly through ioctl and had type REQ_TYPE_BLOCK_PC. A LUN may correspond to an sd or sr device, or to a cascaded phy port. The Linux SCSI layer appears to deal only in SCSI commands and does not implement the full SCSI protocol; you can think of it as a control layer that builds, executes, and returns protocol-conforming commands.
sd, sr, and the like each instantiate a scsi_driver object:
struct scsi_driver {
struct module *owner;
struct device_driver gendrv;
void (*rescan)(struct device *);
int (*init_command)(struct scsi_cmnd *);
void (*uninit_command)(struct scsi_cmnd *);
int (*done)(struct scsi_cmnd *);
int (*eh_action)(struct scsi_cmnd *, int);
int (*scsi_mq_reserved1)(struct scsi_cmnd *);
void (*scsi_mq_reserved2)(struct scsi_cmnd *);
void (*rh_reserved)(void);
};
For sd, for instance, the instance looks like this:
static struct scsi_driver sd_template = {
.owner = THIS_MODULE,
.gendrv = {
.name = "sd",
.probe = sd_probe,
.remove = sd_remove,
.shutdown = sd_shutdown,
.pm = &sd_pm_ops,
},
.rescan = sd_rescan,
.init_command = sd_init_command,
.uninit_command = sd_uninit_command,
.done = sd_done,----------------------- softirq callback
.eh_action = sd_eh_action,
};
On a normal completion:
0xffffffffc0273860 : sd_done+0x0/0x350 [sd_mod]
0xffffffff8147527d : scsi_finish_command+0xcd/0x140 [kernel]
0xffffffff8147f7b2 : scsi_softirq_done+0x142/0x190 [kernel]--- this is req->q->softirq_done_fn
0xffffffff8130ec66 : blk_done_softirq+0x96/0xc0 [kernel]------ the softirq that handles I/O completion
0xffffffff810960ed : __do_softirq+0xfd/0x290 [kernel]
0xffffffff816cf45c : call_softirq+0x1c/0x30 [kernel]
0xffffffff8102d465 : do_softirq+0x65/0xa0 [kernel]
0xffffffff81096535 : irq_exit+0x175/0x180 [kernel]
0xffffffff810522b9 : smp_call_function_single_interrupt+0x39/0x40 [kernel]
0xffffffff816ceb77 : call_function_single_interrupt+0x87/0x90 [kernel]
scsi_finish_command is a key function; actions such as clearing the upper-layer request's timer are completed on this path.
0xffffffff81307d10 : blk_finish_request+0x0/0x100 [kernel]
0xffffffff814800f6 : scsi_end_request+0x116/0x1e0 [kernel]
0xffffffff81480388 : scsi_io_completion+0x168/0x6a0 [kernel]
0xffffffff8147528c : scsi_finish_command+0xdc/0x140 [kernel]
0xffffffff8147f7b2 : scsi_softirq_done+0x142/0x190 [kernel]
0xffffffff8130ec66 : blk_done_softirq+0x96/0xc0 [kernel]
0xffffffff810960ed : __do_softirq+0xfd/0x290 [kernel]
0xffffffff816cf45c : call_softirq+0x1c/0x30 [kernel]
0xffffffff8102d465 : do_softirq+0x65/0xa0 [kernel]
0xffffffff81096535 : irq_exit+0x175/0x180 [kernel]
0xffffffff810522b9 : smp_call_function_single_interrupt+0x39/0x40 [kernel]
0xffffffff816ceb77 : call_function_single_interrupt+0x87/0x90 [kernel]
__blk_complete_request:
This is where the CPU that will run the softirq is chosen after the I/O hard interrupt comes back.
Hard-interrupt callback:
0xffffffff8147ec70 : scsi_done+0x0/0x60 [kernel]
0xffffffffc0166fd7 : _scsih_io_done+0x117/0x11a0 [mpt3sas]
0xffffffffc0156ad7 : _base_interrupt+0x247/0xc80 [mpt3sas]
0xffffffff81138c74 : __handle_irq_event_percpu+0x44/0x1c0 [kernel]
0xffffffff81138e22 : handle_irq_event_percpu+0x32/0x80 [kernel]
0xffffffff81138eac : handle_irq_event+0x3c/0x60 [kernel]
0xffffffff8113bbaf : handle_edge_irq+0x7f/0x150 [kernel]
0xffffffff8102d321 : handle_irq+0xe1/0x1c0 [kernel]
0xffffffff816d058d : __irqentry_text_start+0x4d/0xf0 [kernel]
0xffffffff816c4287 : ret_from_intr+0x0/0x15 [kernel]
For SCSI commands used in error recovery, for example those sent by scsi_send_eh_cmnd, scmd->scsi_done is set to scsi_eh_done, whereas normally issued commands generally use scsi_done.
The upper layer receives a data access request from the generic block layer and turns it into a SCSI command, represented by struct scsi_cmnd. The command is then handed down through the queuecommand hook defined in scsi_host_template, which the low-level driver implements. When the command finishes, the upper layer's callback is invoked from softirq context to do the follow-up work for the command and to tell the generic block layer its result.
root 1007 2 0 Feb26 ? 00:00:00 [scsi_eh_0]
root 1019 2 0 Feb26 ? 00:00:00 [scsi_eh_1]
root 1030 2 0 Feb26 ? 00:00:00 [scsi_eh_2]
root 1036 2 0 Feb26 ? 00:00:00 [scsi_eh_3]
root 1046 2 0 Feb26 ? 00:00:00 [scsi_eh_4]
root 1054 2 0 Feb26 ? 00:00:00 [scsi_eh_5]
response
The response to a CDB is called sense data. This response is not produced automatically, however: the sender has to ask for it with a REQUEST SENSE command. For the sender of a request, then, a command finishes in two steps: it was sent successfully, and the disk executed it successfully. The status at the end of the function call only reflects how the local send went, not what the disk actually did; to learn the execution result you have to fetch the sense data explicitly.
The current Linux SCSI implementation follows these two steps with two callbacks: one handles the local result, and the other sends REQUEST SENSE to query the device's execution result before processing continues.
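Once sense data has been fetched, the interesting fields are the sense key and the additional sense code (ASC) with its qualifier (ASCQ). The sketch below decodes them from both standard layouts; toy_parse_sense and its struct are hypothetical helpers written for this note, and the kernel's own sense normalization is more thorough.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_sense {
    uint8_t key;  /* sense key, e.g. 0x3 MEDIUM ERROR */
    uint8_t asc;  /* additional sense code */
    uint8_t ascq; /* additional sense code qualifier */
};

/* Decode fixed (0x70/0x71) or descriptor (0x72/0x73) format sense data. */
static bool toy_parse_sense(const uint8_t *buf, size_t len,
                            struct toy_sense *out)
{
    if (len < 1)
        return false;

    uint8_t code = buf[0] & 0x7f; /* response code without the valid bit */

    if (code == 0x70 || code == 0x71) {
        if (len < 14)
            return false;
        out->key = buf[2] & 0x0f;
        out->asc = buf[12];
        out->ascq = buf[13];
        return true;
    }
    if (code == 0x72 || code == 0x73) {
        if (len < 4)
            return false;
        out->key = buf[1] & 0x0f;
        out->asc = buf[2];
        out->ascq = buf[3];
        return true;
    }
    return false; /* no valid sense data, e.g. an all-zero buffer */
}
```

An all-zero buffer, like the empty sense_buffer in the scsi_cmnd dump earlier, is reported as carrying no valid sense data.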
SCSI Enclosure Services (SES)
References:
Documentation/scsi/scsi_eh.txt
Biao Ge's blog: http://blog.chinaunix.net/uid-14528823-id-4924157.html