Installing Kaldi on Linux/macOS
Table of contents
- I. About Kaldi
- II. Installation
  - 1. Download the source
  - 2. Read the INSTALL files
    - root: INSTALL
    - tools: INSTALL
  - 3. Build tools
    - Error: "ERROR: cannot verify"
    - Install MKL
    - Install irstlm, kaldi_lm, openblas
  - 4. Build src
- III. Testing
  - Error 1: Bad FST header
  - Error 2: gmm-init-mono: command not found
  - Error 3: arpa2fst: command not found
I. About Kaldi
Kaldi is a toolkit for speech recognition, intended for use by speech recognition researchers and professionals.
- Website: https://www.kaldi-asr.org
- GitHub: https://github.com/kaldi-asr/kaldi
- Available models: https://www.kaldi-asr.org/models.html
- Official documentation: https://www.kaldi-asr.org/doc/
References
- Ubuntu 18.04 Kaldi installation tutorial (a summary of the pitfalls hit during installation)
  https://zhuanlan.zhihu.com/p/148524930
- AssemblyAI / kaldi-install-tutorial
  https://github.com/AssemblyAI/kaldi-install-tutorial/blob/main/setup.sh
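II. Installation
1. Download the source
You can download the source directly from https://github.com/kaldi-asr/kaldi.
Some users report that cloning it this way works better:
git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden
If your network connection is poor, you can download it here instead: https://download.csdn.net/download/lovechris00/87301550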
2. Read the INSTALL files
root: INSTALL
The INSTALL file in the root directory reads:
This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[Option 1 in the following does not apply to native Windows install, see windows/INSTALL or following Option 2]

Option 1 (bash + makefile):

  Steps:
    (1) go to tools/ and follow INSTALL instructions there.
    (2) go to src/ and follow INSTALL instructions there.

Option 2 (cmake):

  Go to cmake/ and follow INSTALL.md instructions there.
  Note, it may not be well tested and some features are missing currently.

tools: INSTALL
The INSTALL file under tools reads:
To check the prerequisites for Kaldi, first run
extras/check_dependencies.sh
and see if there are any system-level installations you need to do. Check the output carefully. There are some things that will make your life a lot easier if you fix them at this stage. If your system default C++ compiler is not supported, you can do the check with another compiler by setting the CXX environment variable, e.g.
CXX=g++-4.8 extras/check_dependencies.sh
Then run
make
which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe.
OpenFst requires a relatively recent C++ compiler with C++11 support, e.g. g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3.
If your system default compiler does not have adequate support for C++11, you can specify a C++11
compliant compiler as a command argument, e.g.
make CXX=g++-4.8
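3. Build tools
From the root directory, enter the tools folder and check the dependencies:
cd tools
# check dependencies
./extras/check_dependencies.sh
If any packages are missing, the script tells you what to install.
On macOS, install them with brew install xxx.
Build:
make -j 4
Running make downloads the third-party packages and unpacks them automatically.
If a package later fails to install (it was never installed, or the downloaded archive has the wrong size), run make again.
Unpack by hand any archive that was not unpacked automatically.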
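Before rerunning make, it can help to know which downloaded archives are actually damaged. The following is a minimal sketch, assuming the archives sit in the current directory as .tar.gz files; `list_bad_archives` is a hypothetical helper name, not part of Kaldi. It relies only on `gzip -t`, which tests compressed-file integrity without unpacking.

```shell
#!/bin/sh
# Hypothetical helper: print every .tar.gz in a directory that fails
# gzip's integrity test (truncated or otherwise corrupt downloads).
list_bad_archives() {
    dir=$1
    for f in "$dir"/*.tar.gz; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if ! gzip -t "$f" 2>/dev/null; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `list_bad_archives .` inside tools lists the archives to delete before running make again, so that they are downloaded afresh.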
Error: "ERROR: cannot verify"
ERROR: cannot verify www.openfst.org's certificate, issued by 'CN=R3,O=Let's Encrypt,C=US':
Issued certificate has expired.
To connect to www.openfst.org insecurely, use `--no-check-certificate'.
ERROR: cannot verify www.openslr.org's certificate, issued by 'CN=R3,O=Let's Encrypt,C=US':
To work around this, edit the Makefile.
Find the rule that mentions www.openfst.org. The original content is:
openfst-$(OPENFST_VERSION).tar.gz:
	if [ -d "$(DOWNLOAD_DIR)" ]; then \
		cp -p "$(DOWNLOAD_DIR)/openfst-$(OPENFST_VERSION).tar.gz" .; \
	else \
		$(WGET) -nv -T 10 -t 1 http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-$(OPENFST_VERSION).tar.gz || \
		$(WGET) -nv -T 10 -t 3 -c https://www.openslr.org/resources/2/openfst-$(OPENFST_VERSION).tar.gz; \
	fi
Add the --no-check-certificate option to each $(WGET) command:
		$(WGET) -nv -T 10 -t 1 http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-$(OPENFST_VERSION).tar.gz --no-check-certificate || \
		$(WGET) -nv -T 10 -t 3 -c https://www.openslr.org/resources/2/openfst-$(OPENFST_VERSION).tar.gz --no-check-certificate; \
Then run make again.
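If several download rules need the same change, the edit can be scripted instead of done by hand. This is a rough sketch, assuming every download line invokes wget through `$(WGET)` as in the rule quoted above; `add_no_check_flag` is a hypothetical name. It places the option directly after `$(WGET)` rather than after the URL, which wget accepts equally.

```shell
#!/bin/sh
# Hypothetical filter: read Makefile text on stdin and write it to stdout
# with --no-check-certificate added after every $(WGET).
# Stripping an existing flag first keeps repeated runs from doubling it.
add_no_check_flag() {
    sed -e 's/ --no-check-certificate//g' \
        -e 's/\$(WGET)/$(WGET) --no-check-certificate/g'
}
```

One way to apply it is `cp Makefile Makefile.bak && add_no_check_flag < Makefile.bak > Makefile`, which keeps the original as a backup.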
Build the third-party packages
make openfst
make cub
make sclite
make sph2pipe
If a later step fails with "you may not have installed OpenFst", the usual cause is that OpenFst was not built successfully at this stage.
Reference: https://blog.csdn.net/weixin_42103947/article/details/119842650
Install MKL
On Linux, install it with:
./extras/install_mkl.sh
On macOS the script fails with:
./extras/install_mkl.sh: This script can be used on Linux only, and your system is Darwin.
Installer packages for Mac and Windows are available for download from Intel:
In that case, download the installer from:
https://software.intel.com/mkl/choose-download
I downloaded the offline installer package; open the app and follow its prompts to install.
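The Linux/macOS split above can be decided up front from `uname -s`, which prints the same kernel name (Linux or Darwin) that the script's error message reports. A small sketch, with `mkl_install_route` as a hypothetical helper and the output strings chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print which MKL installation route applies.
# The kernel name defaults to `uname -s`; passing it explicitly is
# handy for trying out the other branches.
mkl_install_route() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux)  echo "script: ./extras/install_mkl.sh" ;;
        Darwin) echo "manual: download the macOS installer from Intel" ;;
        *)      echo "unsupported: $os" ;;
    esac
}
```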
Install irstlm, kaldi_lm, openblas
sudo ./extras/install_irstlm.sh
sudo ./extras/install_kaldi_lm.sh
sudo ./extras/install_openblas.sh
4. Build src
The src folder sits next to tools. Switch from tools to src and configure:
cd ../src
./configure --shared
To build with CUDA, configure like this instead:
./configure --use-cuda --cudatk-dir=/usr/local/cuda-11.3/
Use nvcc -V to check your CUDA version; the example above uses cuda-11.3.
CUDA is usually installed under /usr/local/cuda-v.xx; set --cudatk-dir to your own CUDA directory.
Otherwise the later make step fails with:
fatal error: cublas_v2.h: No such file or directory
 #include <cublas_v2.h>
Merely adding /usr/local/cuda-11.3/targets/x86_64-linux/lib and /usr/local/cuda-11.3/targets/x86_64-linux/include to environment variables does not fix this error.
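To confirm that the directory you are about to pass to --cudatk-dir really contains the header make complains about, a quick check like the one below may help. It is a sketch under assumptions: `has_cublas_header` is a hypothetical helper, and the two locations it probes are the plain include directory and the targets/x86_64-linux/include path mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if cublas_v2.h exists under the given
# CUDA toolkit directory, in either of two assumed include locations.
has_cublas_header() {
    cuda_dir=${1%/}                      # drop one trailing slash, if any
    [ -f "$cuda_dir/include/cublas_v2.h" ] ||
        [ -f "$cuda_dir/targets/x86_64-linux/include/cublas_v2.h" ]
}
```

For example, `has_cublas_header /usr/local/cuda-11.3/ && ./configure --use-cuda --cudatk-dir=/usr/local/cuda-11.3/` only configures when the header is found.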
make depend -j 8
make -j 8
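The job count 8 is just an example; it can be matched to the machine instead. A minimal sketch, assuming `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems) with `sysctl -n hw.ncpu` as a macOS fallback; `cpu_count` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online CPUs, or 1 if it
# cannot be determined.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
        n=$(sysctl -n hw.ncpu 2>/dev/null) ||
        n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;              # not a plain number: fall back
    esac
    echo "$n"
}
```

It would be used as `make depend -j "$(cpu_count)" && make -j "$(cpu_count)"`.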
III. Testing
From the Kaldi root directory:
cd egs/yesno/s5
./run.sh
Output like the following means the run basically succeeded (Kaldi is installed correctly):
steps/diagnostic/analyze_lats.sh: see stats in exp/mono0a/decode_test_yesno/log/analyze_lattice_depth_stats.log
local/score.sh --cmd utils/run.pl data/test_yesno exp/mono0a/graph_tgpr exp/mono0a/decode_test_yesno
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0
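When the run is scripted, the word error rate can be pulled out of the output rather than read by eye. The sketch below assumes result lines start with the token %WER followed by the numeric rate, as in the line above; `best_wer` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read run output on stdin and print the lowest
# %WER value found; exits non-zero if no %WER line is present.
best_wer() {
    awk '$1 == "%WER" {
             if (!seen || $2 + 0 < best) { best = $2 + 0; seen = 1 }
         }
         END { if (seen) printf "%.2f\n", best; else exit 1 }'
}
```

For example, `./run.sh | tee run.log` followed by `best_wer < run.log` prints the rate, and a failing exit status signals that scoring never produced a result.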
Error 1: Bad FST header
If you see this error:
ERROR: FstHeader::Read: Bad FST header: standard input
add the OpenFst bin directory to your PATH.
You can also do this in egs/yesno/s5/path.sh:
export FST_PATH='/Users/xx/kaldi-trunk/tools/openfst-1.7.2/bin'
export PATH="$PATH:$FST_PATH"
Then run:
source path.sh
./run.sh
Error 2: gmm-init-mono: command not found
run.pl: job failed, log is in exp/mono0a/log/init.log
# gmm-init-mono --shared-phones=data/lang/phones/sets.int "--train-feats=ark,s,cs:apply-cmvn --utt2spk=ark:data/train_yesno/split1/1/utt2spk scp:data/train_yesno/split1/1/cmvn.scp scp:data/train_yesno/split1/1/feats.scp ark:- | add-deltas ark:- ark:- | subset-feats --n=10 ark:- ark:-|" data/lang/topo 39 exp/mono0a/0.mdl exp/mono0a/tree
# Started at Fri Dec 16 20:27:09 CST 2022
#
bash: line 1: gmm-init-mono: command not found
# Accounting: time=0 threads=1
# Ended (code 127) at Fri Dec 16 20:27:09 CST 2022, elapsed time 0 seconds
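gmm-init-mono is evidently a command-line tool that the shell cannot locate.
Searching the Kaldi folder shows that it lives at src/gmmbin/gmm-init-mono, so add the src/gmmbin directory to PATH in your shell startup file
(~/.bash_profile on macOS, ~/.bashrc on Linux):
export GMMBIN_PATH='/Users/XX/XX/XX/kaldi-trunk/src/gmmbin'
export PATH="$PATH:$GMMBIN_PATH"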
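Because a startup file may be sourced many times, it is worth adding each directory only once. A minimal sketch of that idea follows; `add_to_path` is a hypothetical helper and the directory in the usage line is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to PATH unless it is already
# listed, so sourcing the file repeatedly does not grow PATH.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                     # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
    export PATH
}
```

In the startup file this would read `add_to_path /path/to/kaldi-trunk/src/gmmbin`, with the path replaced by your own checkout.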
Error 3: arpa2fst: command not found
local/prepare_lm.sh: line 13: arpa2fst: command not found
The arpa2fst command lives in xx/kaldi-trunk/src/lmbin; add that directory to PATH:
export PATH=$PATH:~/scode/kaldi-trunk/src/lmbin
Then reload the profile and run again:
source ~/.bash_profile
./run.sh
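All three errors come down to a tool not being found on PATH, so they can be checked in one pass before running the recipe. This is a sketch only: `missing_tools` is a hypothetical helper, and in the usage line `fstcompile` stands in for the OpenFst tools as an assumption, since the error above does not name the binary involved.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that cannot be found
# on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools fstcompile gmm-init-mono arpa2fst` from egs/yesno/s5 after `source path.sh` prints nothing when everything is reachable.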
伊織, 2022-12-16 (Fri)