Batch-checking whether the URLs in an Excel file are reachable with Node.js
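Recently my manager handed me an Excel spreadsheet and asked me to check whether the URLs in it were valid. As a developer, I balked: clicking through them one by one would give me hand cramps long before I finished. Human progress comes from using tools, I figured, so I would write a program to do the checking and go have a cup of tea while it ran.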
Approach
1. Read the data from the Excel file.
2. Check whether each URL is valid.
3. Write the data back to an Excel file.
4. Review the results.
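Excel format example
| No. | Category | Name | URL | Status |
| --- | --- | --- | --- | --- |
| 1 | Category A | Some site | http://www.xxx.com/ | Not checked |
| 2 | Category A | Some site | http://www.xxx.com/ | Not checked |
| 3 | Category A | Some site | http://www.xxx.com/xxx | Not checked |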
Roll up your sleeves and get to work
Following the approach above, let's put the code together.
Create index.js
    const urlIsSuccess = require('./urlIsSuccess')
    const { readSheets, getSheet, sheetsBuffer, writeFile } = require('./xlsx')

    const entryfilePath = `${__dirname}/test.xlsx`
    const outFilePath = `${__dirname}/out_test.xlsx`
    const sheetIndex = 0     // worksheet to process
    const urlColIndex = 3    // zero-based column that holds the URL
    const statusColIndex = 4 // zero-based column that receives the result

    const sheets = readSheets(entryfilePath)
    const sheet = getSheet(sheets, sheetIndex)

    // A cell counts as a URL if it contains http:// or https://
    const isUrl = url => {
      if (!url) return false
      return /\s*http(s)?:\/\/.+/.test(url)
    }

    // Check every row that has a URL; rows without one (such as the header) pass through unchanged
    const promises = sheet.map(rows => {
      const url = rows[urlColIndex]
      if (isUrl(url)) {
        return urlIsSuccess(url).then(res => {
          rows[statusColIndex] = res
          return rows
        })
      }
      return Promise.resolve(rows)
    })

    // Once every check has finished, write the updated rows to the output file
    Promise.all(promises).then(res => {
      sheets[sheetIndex].data = res
      const buffer = sheetsBuffer(sheets)
      writeFile(buffer, outFilePath)
    }).catch(err => {
      console.log('err=', err)
    })
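One thing to watch in index.js is that `sheet.map` starts a curl process for every row at once, which may be heavy for a large spreadsheet. A minimal sketch of capping the number of checks in flight follows. The helper name `mapWithLimit` and its parameters are assumptions made for illustration, not part of the original code.

```javascript
// Hypothetical helper (not in the original code): run `worker` over `items`
// with at most `limit` calls in flight at any moment, keeping result order.
const mapWithLimit = async (items, limit, worker) => {
  const results = new Array(items.length)
  let next = 0 // index of the next item to hand out

  // Each runner keeps taking the next unprocessed index until none are left.
  const runner = async () => {
    while (next < items.length) {
      const i = next++
      results[i] = await worker(items[i], i)
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length))
  await Promise.all(Array.from({ length: runnerCount }, runner))
  return results
}
```

In index.js this could stand in for the `sheet.map` / `Promise.all` pair, for example `mapWithLimit(sheet, 5, rows => ...)` with the same per-row logic as the worker. The limit of 5 is an arbitrary starting point.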
Working with the Excel data
This requires node-xlsx:
    npm install node-xlsx --save
Create xlsx.js
    const fs = require('fs')
    const xlsx = require('node-xlsx').default

    // Parse the workbook into an array of { name, data } worksheets
    const readSheets = (filePath) => {
      if (!filePath) return []
      return xlsx.parse(filePath)
    }

    // Return the rows of one worksheet, selected by name or by zero-based index
    const getSheet = (sheets, nameOrIndex) => {
      if (nameOrIndex === undefined) return sheets
      if (typeof nameOrIndex === 'string') {
        return sheets.find(sheet => nameOrIndex === sheet.name)?.data
      }
      if (typeof nameOrIndex === 'number') {
        return sheets[nameOrIndex]?.data
      }
      throw new Error(`Invalid nameOrIndex: ${nameOrIndex}`)
    }

    // Serialize the worksheets back into an .xlsx buffer
    const sheetsBuffer = (sheets) => {
      if (!sheets) {
        throw new Error('The sheets argument is required')
      }
      return xlsx.build(sheets)
    }

    // Write the buffer to disk and report the outcome
    const writeFile = (buffer, outFilePath) => {
      fs.writeFile(outFilePath, buffer, function (err) {
        if (err) {
          console.log('Write failed: ' + err)
          return
        }
        console.log('Write complete')
      })
    }

    module.exports = {
      readSheets,
      getSheet,
      sheetsBuffer,
      writeFile
    }
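To make the column indexes easier to follow, here is a small sketch of the data shape these helpers work with: an array of worksheets, each with a `name` and a `data` array of rows. The sample values and the `pickRows` helper are made up for illustration; `pickRows` only mirrors the lookup rule of `getSheet` so the snippet runs without node-xlsx installed.

```javascript
// Sample data in the shape xlsx.parse() returns: one object per worksheet,
// with `data` holding an array of rows and each row an array of cell values.
// The sheet name and cell values are invented for this example.
const sampleSheets = [
  {
    name: 'Sheet1',
    data: [
      ['No.', 'Category', 'Name', 'URL', 'Status'],
      [1, 'Category A', 'Some site', 'http://www.xxx.com/', 'Not checked'],
      [2, 'Category A', 'Some site', 'http://www.xxx.com/xxx', 'Not checked']
    ]
  }
]

// Hypothetical stand-in for getSheet: a string selects by name, a number by index.
const pickRows = (sheets, nameOrIndex) =>
  typeof nameOrIndex === 'string'
    ? sheets.find(s => s.name === nameOrIndex)?.data
    : sheets[nameOrIndex]?.data

const rows = pickRows(sampleSheets, 'Sheet1')
// Skip the header row; column 3 (zero-based) is the URL column.
const urls = rows.slice(1).map(row => row[3])
```

With this layout, `urlColIndex = 3` and `statusColIndex = 4` in index.js point at the fourth and fifth columns of each row.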
Checking whether a URL is valid
Use Node.js's child_process.exec() to run the command-line curl command, then decide whether the URL is valid from the error code or the HTTP status code that comes back.
Create urlIsSuccess.js
    const child_process = require('child_process')

    const urlIsSuccess = (url) => {
      if (!url) {
        return Promise.resolve('')
      }
      // Send a browser User-Agent instead of curl's default one
      const userAgent = '-A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"'
      // -I requests the headers only; the URL is trimmed and quoted so that
      // characters such as & are not interpreted by the shell
      const curl = `curl -I ${userAgent} "${String(url).trim()}"`
      return new Promise((resolve) => {
        child_process.exec(curl, function (err, stdout, stderr) {
          console.log(url, 'done')
          if (err) {
            // curl exited with a non-zero code, e.g. 6 when the host cannot be resolved
            resolve(err.code || 'failed')
            return
          }
          // The first output line is the status line, e.g. "HTTP/1.1 200 OK"
          const firstLine = stdout.split('\n')[0]
          const code = firstLine.match(/\s(\d{3})\s/)
          // Resolve with the numeric status code, or fall back to the raw output
          resolve(code ? +code[1] : (stdout || 'success'))
        })
      })
    }

    module.exports = urlIsSuccess
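The status-line parsing is the part most likely to go wrong, so it can help to pull it out as a pure function and check it without touching the network. The following is a minimal sketch. The name `parseStatusCode` is an assumption, and the sample header text in the comments is illustrative rather than captured from a real site.

```javascript
// Hypothetical helper: extract the HTTP status code from the header text that
// `curl -I` prints. Returns a number, or null when no status line is found.
const parseStatusCode = (headerText) => {
  // Only the first line is inspected, e.g. "HTTP/1.1 200 OK" or "HTTP/2 404".
  const firstLine = String(headerText).split(/\r?\n/)[0]
  const match = firstLine.match(/^HTTP\/\d(?:\.\d)?\s+(\d{3})\b/)
  return match ? Number(match[1]) : null
}
```

Returning `null` instead of the raw output makes the "no status line" case explicit, and the caller can then decide what to write into the status column.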
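Note
Some sites use bot detection. For those, the request headers and similar settings need to be configured, otherwise requests made through curl will fail.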
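As a rough sketch of what configuring the request headers might look like, the snippet below builds a curl argument list with extra `-H` headers and runs it through `child_process.execFile`, which passes arguments to curl directly without a shell. The helper names, the default timeout, and any header values supplied are assumptions; which headers a particular site actually requires is something to find out per site.

```javascript
const { execFile } = require('child_process')

// Hypothetical helper: build the curl arguments for a headers-only request
// with extra request headers. -I, -s, --max-time and -H are standard curl flags.
const buildCurlArgs = (url, headers = {}, timeoutSeconds = 10) => {
  const args = ['-I', '-s', '--max-time', String(timeoutSeconds)]
  for (const [name, value] of Object.entries(headers)) {
    args.push('-H', `${name}: ${value}`)
  }
  args.push(String(url).trim())
  return args
}

// Hypothetical wrapper: resolves with curl's exit code on failure,
// or with the raw header text on success.
const checkWithHeaders = (url, headers) =>
  new Promise(resolve => {
    execFile('curl', buildCurlArgs(url, headers), (err, stdout) => {
      resolve(err ? (err.code || 'failed') : stdout)
    })
  })
```

A call might look like `checkWithHeaders(url, { 'User-Agent': '...', 'Accept-Language': 'en' })`. Because no shell is involved, the URL and header values need no quoting.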