
ROS in Practice (3): Speech Synthesis and Playback under ROS with iFlytek TTS

Published: 2024/1/18


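1. Introduction

This post continues from the previous one. The overall flow is as follows.

The xf_tts node subscribes to the topic /voice/xf_tts_topic, whose message type is std_msgs/String. When a message arrives, the node passes the text to iFlytek's online synthesis code, which writes the synthesized speech to a .wav file. Finally, the node calls the play command through system() to play that .wav file.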
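The last step of this flow hands a command string to system(). As a minimal sketch (not the code used in this post), the command can be derived from the .wav path instead of being written out by hand. The helper names shellQuote and buildPlayCommand are hypothetical; play is the player shipped with SoX.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: quote a string for a POSIX shell by wrapping it in
// single quotes and rewriting every embedded single quote as '\''.
std::string shellQuote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'') {
            out += "'\\''";  // close quote, escaped quote, reopen quote
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}

// Hypothetical helper: build the command line that would be passed to
// system() in order to play one .wav file.
std::string buildPlayCommand(const std::string& wavPath) {
    assert(!wavPath.empty());
    return "play " + shellQuote(wavPath);
}
```

Calling system(buildPlayCommand(path).c_str()) would then play the file, and a path containing spaces would not be split by the shell.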

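2. Procedure

This assumes ROS is already installed and that the relevant paths and environment are configured. If not, see my earlier post:
https://blog.csdn.net/weixin_40522162/article/details/79244089
Open a terminal and cd into the source directory of your ROS workspace, i.e. ~/catkin_ws/src. First create a package of your own: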

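catkin_create_pkg voice_system roscpp rospy std_msgs

Then cd into the samples directory of the iFlytek SDK you downloaded, copy the sample source and the SDK headers into the package, and rename the sample to a .cpp file:

cp tts_sample.c ~/catkin_ws/src/voice_system/src/
cd ../../include
cp * ~/catkin_ws/src/voice_system/include/
cd ~/catkin_ws/src/voice_system/src
mv tts_sample.c xf_tts.cpp
vim xf_tts.cpp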

The unmodified tts_sample.c looks like this:

/*
 * Text To Speech (TTS) converts arbitrary text into continuous, natural
 * speech in real time. It is an efficient and convenient way to deliver
 * spoken information to anyone, anytime, anywhere, and fits the information
 * age's needs for massive data, dynamic updates, and personalized queries.
 */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

#include "qtts.h"
#include "msp_cmn.h"
#include "msp_errors.h"

/* WAV audio header layout */
typedef struct _wave_pcm_hdr
{
    char      riff[4];           // = "RIFF"
    int       size_8;            // = FileSize - 8
    char      wave[4];           // = "WAVE"
    char      fmt[4];            // = "fmt "
    int       fmt_size;          // = size of the next structure : 16

    short int format_tag;        // = PCM : 1
    short int channels;          // = number of channels : 1
    int       samples_per_sec;   // = sample rate : 8000 | 6000 | 11025 | 16000
    int       avg_bytes_per_sec; // = bytes per second : samples_per_sec * bits_per_sample / 8
    short int block_align;       // = bytes per sample : wBitsPerSample / 8
    short int bits_per_sample;   // = quantization bits : 8 | 16

    char      data[4];           // = "data"
    int       data_size;         // = length of the raw audio data : FileSize - 44
} wave_pcm_hdr;

/* Default WAV header values */
wave_pcm_hdr default_wav_hdr =
{
    { 'R', 'I', 'F', 'F' },
    0,
    { 'W', 'A', 'V', 'E' },
    { 'f', 'm', 't', ' ' },
    16,
    1,
    1,
    16000,
    32000,
    2,
    16,
    { 'd', 'a', 't', 'a' },
    0
};

/* Synthesize text into a wav file */
int text_to_speech(const char* src_text, const char* des_path, const char* params)
{
    int          ret          = -1;
    FILE*        fp           = NULL;
    const char*  sessionID    = NULL;
    unsigned int audio_len    = 0;
    wave_pcm_hdr wav_hdr      = default_wav_hdr;
    int          synth_status = MSP_TTS_FLAG_STILL_HAVE_DATA;

    if (NULL == src_text || NULL == des_path)
    {
        printf("params is error!\n");
        return ret;
    }
    fp = fopen(des_path, "wb");
    if (NULL == fp)
    {
        printf("open %s error.\n", des_path);
        return ret;
    }
    /* Start the synthesis session */
    sessionID = QTTSSessionBegin(params, &ret);
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSSessionBegin failed, error code: %d.\n", ret);
        fclose(fp);
        return ret;
    }
    ret = QTTSTextPut(sessionID, src_text, (unsigned int)strlen(src_text), NULL);
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSTextPut failed, error code: %d.\n", ret);
        QTTSSessionEnd(sessionID, "TextPutError");
        fclose(fp);
        return ret;
    }
    printf("正在合成 ...\n");
    fwrite(&wav_hdr, sizeof(wav_hdr), 1, fp); // write the wav header; the sample rate is 16000
    while (1)
    {
        /* Fetch the next chunk of synthesized audio */
        const void* data = QTTSAudioGet(sessionID, &audio_len, &synth_status, &ret);
        if (MSP_SUCCESS != ret)
            break;
        if (NULL != data)
        {
            fwrite(data, audio_len, 1, fp);
            wav_hdr.data_size += audio_len; // accumulate data_size
        }
        if (MSP_TTS_FLAG_DATA_END == synth_status)
            break;
        printf(">");
        usleep(150 * 1000); // avoid hogging the CPU
    }
    printf("\n");
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSAudioGet failed, error code: %d.\n", ret);
        QTTSSessionEnd(sessionID, "AudioGetError");
        fclose(fp);
        return ret;
    }
    /* Fix up the size fields in the wav header */
    wav_hdr.size_8 += wav_hdr.data_size + (sizeof(wav_hdr) - 8);

    /* Write the corrected values back into the header; the file is in wav format */
    fseek(fp, 4, 0);
    fwrite(&wav_hdr.size_8, sizeof(wav_hdr.size_8), 1, fp);       // write size_8
    fseek(fp, 40, 0);                                             // seek to where data_size is stored
    fwrite(&wav_hdr.data_size, sizeof(wav_hdr.data_size), 1, fp); // write data_size
    fclose(fp);
    fp = NULL;
    /* Synthesis finished */
    ret = QTTSSessionEnd(sessionID, "Normal");
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSSessionEnd failed, error code: %d.\n", ret);
    }

    return ret;
}

int main(int argc, char* argv[])
{
    int         ret          = MSP_SUCCESS;
    const char* login_params = "appid = 5afcee34, work_dir = ."; // login parameters; the appid is bound to the msc library, do not change it arbitrarily
    /*
     * rdn:           how digits are pronounced
     * volume:        volume of the synthesized audio
     * pitch:         pitch of the synthesized audio
     * speed:         speaking rate of the synthesized audio
     * voice_name:    speaker used for synthesis
     * sample_rate:   sample rate of the synthesized audio
     * text_encoding: encoding of the input text
     */
    const char* session_begin_params = "voice_name = xiaoyan, text_encoding = utf8, sample_rate = 16000, speed = 50, volume = 50, pitch = 50, rdn = 2";
    const char* filename             = "tts_sample.wav"; // name of the synthesized audio file
    const char* text                 = "亲爱的用户，您好，这是一个语音合成示例，感谢您对科大讯飞语音技术的支持！科大讯飞是亚太地区最大的语音上市公司，股票代码：002230"; // text to synthesize

    /* User login */
    ret = MSPLogin(NULL, NULL, login_params); // arguments: user name, password, login parameters; register at http://www.xfyun.cn to obtain a user name and password
    if (MSP_SUCCESS != ret)
    {
        printf("MSPLogin failed, error code: %d.\n", ret);
        goto exit; // login failed, log out and quit
    }
    printf("\n###########################################################################\n");
    printf("## 语音合成（Text To Speech，TTS）技术能够自动将任意文字实时转换为连续的 ##\n");
    printf("## 自然语音，是一种能够在任何时间、任何地点，向任何人提供语音信息服务的 ##\n");
    printf("## 高效便捷手段，非常符合信息时代海量数据、动态更新和个性化查询的需求。 ##\n");
    printf("###########################################################################\n\n");
    /* Synthesize the text */
    printf("开始合成 ...\n");
    ret = text_to_speech(text, filename, session_begin_params);
    if (MSP_SUCCESS != ret)
    {
        printf("text_to_speech failed, error code: %d.\n", ret);
    }
    printf("合成完毕\n");

exit:
    printf("按任意键退出 ...\n");
    getchar();
    MSPLogout(); // log out

    return 0;
}
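The sample writes a placeholder header first, then patches the two size fields at byte offsets 4 and 40 once all audio has arrived. The sketch below reproduces only that arithmetic, using the standard library and no iFlytek calls. WavHeader and makeWavHeader are hypothetical names, and the fixed-width integer types are my substitution for the sample's int and short int so that the 44-byte layout does not depend on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same field order as wave_pcm_hdr in the sample, with fixed-width types.
struct WavHeader {
    char          riff[4];
    std::uint32_t size_8;            // file size minus 8
    char          wave[4];
    char          fmt[4];
    std::uint32_t fmt_size;
    std::uint16_t format_tag;
    std::uint16_t channels;
    std::uint32_t samples_per_sec;
    std::uint32_t avg_bytes_per_sec;
    std::uint16_t block_align;
    std::uint16_t bits_per_sample;
    char          data[4];
    std::uint32_t data_size;         // raw audio bytes that follow the header
};

// These offsets are what the sample's fseek(fp, 4, 0) and fseek(fp, 40, 0) rely on.
static_assert(sizeof(WavHeader) == 44, "PCM wav header must be 44 bytes");
static_assert(offsetof(WavHeader, size_8) == 4, "size_8 lives at offset 4");
static_assert(offsetof(WavHeader, data_size) == 40, "data_size lives at offset 40");

// Build a complete PCM header for dataBytes bytes of audio.
WavHeader makeWavHeader(std::uint32_t sampleRate, std::uint16_t channels,
                        std::uint16_t bitsPerSample, std::uint32_t dataBytes) {
    assert(bitsPerSample % 8 == 0);
    WavHeader h;
    std::memcpy(h.riff, "RIFF", 4);
    std::memcpy(h.wave, "WAVE", 4);
    std::memcpy(h.fmt, "fmt ", 4);
    std::memcpy(h.data, "data", 4);
    h.fmt_size = 16;
    h.format_tag = 1;  // PCM
    h.channels = channels;
    h.samples_per_sec = sampleRate;
    h.block_align = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    h.avg_bytes_per_sec = sampleRate * h.block_align;
    h.bits_per_sample = bitsPerSample;
    h.data_size = dataBytes;
    h.size_8 = dataBytes + static_cast<std::uint32_t>(sizeof(WavHeader)) - 8;
    return h;
}
```

For one second of 16 kHz, mono, 16-bit audio (32000 bytes), size_8 comes out as 32036, which is the value the sample would write back at offset 4.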

After modifying it into a ROS node, the file looks like this:

/*
 * Text To Speech (TTS) converts arbitrary text into continuous, natural
 * speech in real time. It is an efficient and convenient way to deliver
 * spoken information to anyone, anytime, anywhere, and fits the information
 * age's needs for massive data, dynamic updates, and personalized queries.
 */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <iostream>
#include <ros/ros.h>
#include <std_msgs/String.h>
#include "qtts.h"
#include "msp_cmn.h"
#include "msp_errors.h"

const char* fileName = "/home/zc/Music/voice.wav";
const char* playPath = "play /home/zc/Music/voice.wav";

/* WAV audio header layout */
typedef struct _wave_pcm_hdr
{
    char      riff[4];           // = "RIFF"
    int       size_8;            // = FileSize - 8
    char      wave[4];           // = "WAVE"
    char      fmt[4];            // = "fmt "
    int       fmt_size;          // = size of the next structure : 16

    short int format_tag;        // = PCM : 1
    short int channels;          // = number of channels : 1
    int       samples_per_sec;   // = sample rate : 8000 | 6000 | 11025 | 16000
    int       avg_bytes_per_sec; // = bytes per second : samples_per_sec * bits_per_sample / 8
    short int block_align;       // = bytes per sample : wBitsPerSample / 8
    short int bits_per_sample;   // = quantization bits : 8 | 16

    char      data[4];           // = "data"
    int       data_size;         // = length of the raw audio data : FileSize - 44
} wave_pcm_hdr;

/* Default WAV header values */
wave_pcm_hdr default_wav_hdr =
{
    { 'R', 'I', 'F', 'F' },
    0,
    { 'W', 'A', 'V', 'E' },
    { 'f', 'm', 't', ' ' },
    16,
    1,
    1,
    16000,
    32000,
    2,
    16,
    { 'd', 'a', 't', 'a' },
    0
};

/* Synthesize text into a wav file */
int text_to_speech(const char* src_text, const char* des_path, const char* params)
{
    int          ret          = -1;
    FILE*        fp           = NULL;
    const char*  sessionID    = NULL;
    unsigned int audio_len    = 0;
    wave_pcm_hdr wav_hdr      = default_wav_hdr;
    int          synth_status = MSP_TTS_FLAG_STILL_HAVE_DATA;

    if (NULL == src_text || NULL == des_path)
    {
        printf("params is error!\n");
        return ret;
    }
    fp = fopen(des_path, "wb");
    if (NULL == fp)
    {
        printf("open %s error.\n", des_path);
        return ret;
    }
    /* Start the synthesis session */
    sessionID = QTTSSessionBegin(params, &ret);
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSSessionBegin failed, error code: %d.\n", ret);
        fclose(fp);
        return ret;
    }
    ret = QTTSTextPut(sessionID, src_text, (unsigned int)strlen(src_text), NULL);
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSTextPut failed, error code: %d.\n", ret);
        QTTSSessionEnd(sessionID, "TextPutError");
        fclose(fp);
        return ret;
    }
    printf("正在合成 ...\n");
    fwrite(&wav_hdr, sizeof(wav_hdr), 1, fp); // write the wav header; the sample rate is 16000
    while (1)
    {
        /* Fetch the next chunk of synthesized audio */
        const void* data = QTTSAudioGet(sessionID, &audio_len, &synth_status, &ret);
        if (MSP_SUCCESS != ret)
            break;
        if (NULL != data)
        {
            fwrite(data, audio_len, 1, fp);
            wav_hdr.data_size += audio_len; // accumulate data_size
        }
        if (MSP_TTS_FLAG_DATA_END == synth_status)
            break;
        printf(">");
        usleep(15 * 1000); // avoid hogging the CPU
    }
    printf("\n");
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSAudioGet failed, error code: %d.\n", ret);
        QTTSSessionEnd(sessionID, "AudioGetError");
        fclose(fp);
        return ret;
    }
    /* Fix up the size fields in the wav header */
    wav_hdr.size_8 += wav_hdr.data_size + (sizeof(wav_hdr) - 8);

    /* Write the corrected values back into the header; the file is in wav format */
    fseek(fp, 4, 0);
    fwrite(&wav_hdr.size_8, sizeof(wav_hdr.size_8), 1, fp);       // write size_8
    fseek(fp, 40, 0);                                             // seek to where data_size is stored
    fwrite(&wav_hdr.data_size, sizeof(wav_hdr.data_size), 1, fp); // write data_size
    fclose(fp);
    fp = NULL;
    /* Synthesis finished */
    ret = QTTSSessionEnd(sessionID, "Normal");
    if (MSP_SUCCESS != ret)
    {
        printf("QTTSSessionEnd failed, error code: %d.\n", ret);
    }

    return ret;
}

/* Turn the text received on the topic into a wav file */
void makeTextToWav(const char* text, const char* filename)
{
    int         ret          = MSP_SUCCESS;
    const char* login_params = "appid = 5afcee34, work_dir = ."; // login parameters; the appid is bound to the msc library, do not change it arbitrarily
    /*
     * rdn:           how digits are pronounced
     * volume:        volume of the synthesized audio
     * pitch:         pitch of the synthesized audio
     * speed:         speaking rate of the synthesized audio
     * voice_name:    speaker used for synthesis
     * sample_rate:   sample rate of the synthesized audio
     * text_encoding: encoding of the input text
     */
    const char* session_begin_params = "voice_name = xiaowanzi, text_encoding = utf8, sample_rate = 16000, speed = 60, volume = 60, pitch = 50, rdn = 0";
    /*
    const char* filename = "tts_sample.wav"; // name of the synthesized audio file
    const char* text     = "大家好，我叫小倩，我今年15岁,我的车牌号是A123456,今天是2018年5月17号"; // text to synthesize
    */

    /* User login */
    ret = MSPLogin(NULL, NULL, login_params); // arguments: user name, password, login parameters; register at http://www.xfyun.cn to obtain a user name and password
    if (MSP_SUCCESS != ret)
    {
        printf("MSPLogin failed, error code: %d.\n", ret);
    }
    /* Synthesize the text */
    else
    {
        printf("开始合成 ...\n");
        ret = text_to_speech(text, filename, session_begin_params);
        if (MSP_SUCCESS != ret)
        {
            printf("text_to_speech failed, error code: %d.\n", ret);
        }
        printf("合成完毕\n");
    }
    MSPLogout();
}

/* Play the synthesized wav file */
void playWav()
{
    system(playPath);
}

/* Called automatically for each message on the topic: synthesize the text into a wav file, then play it */
void topicCallBack(const std_msgs::String::ConstPtr& msg)
{
    std::cout << "get topic text:" << msg->data.c_str() << std::endl;
    makeTextToWav(msg->data.c_str(), fileName);
    playWav();
}

int main(int argc, char* argv[])
{
    /* Announce startup by voice before entering the ROS loop */
    const char* start = "科大讯飞在线语音合成模块启动成功";
    makeTextToWav(start, fileName);
    playWav();

    ros::init(argc, argv, "xf_tts_node");
    ros::NodeHandle nd;
    ros::Subscriber sub = nd.subscribe("/voice/xf_tts_topic", 3, topicCallBack);
    ros::spin();

    return 0;
}
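The node above changes the speaker, speed, volume and rdn values by editing one long string literal. As a rough sketch of an alternative, the same string can be assembled from named fields. TtsParams and toSessionParams are hypothetical helpers, not part of the iFlytek SDK; the keys and default values are copied from session_begin_params in the node.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for the values that appear in session_begin_params.
struct TtsParams {
    std::string voiceName    = "xiaowanzi";
    std::string textEncoding = "utf8";
    int sampleRate = 16000;
    int speed      = 60;
    int volume     = 60;
    int pitch      = 50;
    int rdn        = 0;
};

// Format the fields in the "key = value, key = value" form used in the post.
std::string toSessionParams(const TtsParams& p) {
    assert(!p.voiceName.empty());
    std::ostringstream out;
    out << "voice_name = " << p.voiceName
        << ", text_encoding = " << p.textEncoding
        << ", sample_rate = " << p.sampleRate
        << ", speed = " << p.speed
        << ", volume = " << p.volume
        << ", pitch = " << p.pitch
        << ", rdn = " << p.rdn;
    return out.str();
}
```

The result of toSessionParams(TtsParams()).c_str() could be passed to text_to_speech in place of the literal, which keeps each setting visible on its own line.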

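The overall approach is this: cut the contents of the original main function and move them into a function of your own, which leaves main empty. Then, following the layout of main in the ROS tutorial nodes, add node initialization, the node handle, the callback, and the subscriber to main. Also add the ROS header and the message-type header at the top of the file. For reference, this is the subscriber node from the ROS wiki tutorial: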

#include "ros/ros.h"
#include "std_msgs/String.h"

/**
 * This tutorial demonstrates simple receipt of messages over the ROS system.
 */
void chatterCallback(const std_msgs::String::ConstPtr& msg)
{
  ROS_INFO("I heard: [%s]", msg->data.c_str());
}

int main(int argc, char **argv)
{
  /**
   * The ros::init() function needs to see argc and argv so that it can perform
   * any ROS arguments and name remapping that were provided at the command line. For programmatic
   * remappings you can use a different version of init() which takes remappings
   * directly, but for most command-line programs, passing argc and argv is the easiest
   * way to do it. The third argument to init() is the name of the node.
   *
   * You must call one of the versions of ros::init() before using any other
   * part of the ROS system.
   */
  ros::init(argc, argv, "listener");

  /**
   * NodeHandle is the main access point to communications with the ROS system.
   * The first NodeHandle constructed will fully initialize this node, and the last
   * NodeHandle destructed will close down the node.
   */
  ros::NodeHandle n;

  /**
   * The subscribe() call is how you tell ROS that you want to receive messages
   * on a given topic. This invokes a call to the ROS
   * master node, which keeps a registry of who is publishing and who
   * is subscribing. Messages are passed to a callback function, here
   * called chatterCallback. subscribe() returns a Subscriber object that you
   * must hold on to until you want to unsubscribe. When all copies of the Subscriber
   * object go out of scope, this callback will automatically be unsubscribed from
   * this topic.
   *
   * The second parameter to the subscribe() function is the size of the message
   * queue. If messages are arriving faster than they are being processed, this
   * is the number of messages that will be buffered up before beginning to throw
   * away the oldest ones.
   */
  ros::Subscriber sub = n.subscribe("chatter", 1000, chatterCallback);

  /**
   * ros::spin() will enter a loop, pumping callbacks. With this version, all
   * callbacks will be called from within this thread (the main one). ros::spin()
   * will exit when Ctrl-C is pressed, or the node is shutdown by the master.
   */
  ros::spin();

  return 0;
}
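The queue-size comment in the tutorial matters for the xf_tts node, which subscribes with a queue size of 3 and blocks inside its callback while synthesizing and playing audio. The following is a small standalone model of the "throw away the oldest" behaviour described in that comment, not ROS code; DropOldestQueue is a hypothetical class written only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model of a bounded message queue that discards the oldest
// entry when a new message arrives and the queue is already full.
class DropOldestQueue {
public:
    explicit DropOldestQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Enqueue a message; returns true if an older message was discarded.
    bool push(const std::string& msg) {
        bool dropped = false;
        if (queue_.size() == capacity_) {
            queue_.pop_front();
            dropped = true;
        }
        queue_.push_back(msg);
        return dropped;
    }

    // Dequeue the oldest remaining message; returns false if the queue is empty.
    bool pop(std::string& out) {
        if (queue_.empty()) {
            return false;
        }
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::string> queue_;
};
```

With a capacity of 3, a fourth sentence that arrives while the first three are still waiting pushes the first one out, so that sentence would never be spoken.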

Finally, add the executable and the libraries it links against to CMakeLists.txt:

add_executable(xf_tts_node src/xf_tts.cpp)
target_link_libraries(xf_tts_node ${catkin_LIBRARIES} -lmsc -lrt -ldl -lpthread)

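Also add include to the include_directories call earlier in the same file, so the copied SDK headers are found:

include_directories(
  include
  ${catkin_INCLUDE_DIRS}
)

That completes the changes.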

3. Test run

Open several terminals.
In the first terminal:

roscore

In the second terminal:

rosrun voice_system xf_tts_node

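You should hear the spoken startup announcement, "科大讯飞在线语音合成模块启动成功" (iFlytek online speech synthesis module started successfully).
At this point you can list the active topics, inspect the one this node subscribes to, and publish a message by hand to have it spoken.
The commands are: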

rostopic list
rostopic info /voice/xf_tts_topic
rostopic pub /voice/xf_tts_topic std_msgs/String "你好"

If you hear "你好" ("hello") spoken, the node works and the modified program is correct.
