桂花蒸

# Sixth Lichuang Electric Competition#Intelligent Voice Minion

 
Overview

* 1. Introduction to project functions

An intelligent voice Minion built around the Qiyingtailun offline voice module. Paired with an MP3-TF-16P module, it plays offline audio such as children's music and fairy tales, imitates fire truck, police car, and ambulance sirens, and also works as a colored lantern and night light.

Sample voice content for this project:

1. Hello and good night

2. Minion’s magical laughter

3. Imitate fire trucks, police cars, and ambulances

4. Children’s poetry recitation

5. Idiom stories

6. Fairy tales

7. Children’s songs

8. Children’s songs

9. Play specific songs

Other functions:

RGB lights, night lights, eye movements, audio playback, eye LED rhythm

 

* 2. Project attributes

This is an original project, published here for the first time.

 

* 3. Open source license

GPL3.0

 

* 4. Hardware part


Offline speech recognition uses Qiyingtailun's CI-C22GS02S module. To play offline resources from a TF card, it is paired with an MP3-TF-16P MP3 module, which the voice module controls by sending commands over serial port 1.

This project went through four hardware revisions:

  1. The first version verified the basic functions, but the LED rhythm function did not work properly. It used two speakers: one for the voice module's response prompts and one for the MP3 module's audio. A voltage doubler circuit on the MP3 module's speaker output drove the audio-rhythm LEDs. However, because the LEDs were wired in series, the voltage was insufficient, and the effect was almost invisible once the board was installed inside the casing.
  2. The second version replaced the capacitors in the LED rhythm voltage doubler with electrolytic capacitors, but the effect was still poor.
  3. The third version significantly improved both software and hardware:
    1. The MP3 module now plays the response prompts (broadcast audio) directly, so the voice module's speaker was removed, reducing two speakers to one. The prompts also sound much better than when played by the voice module. A further benefit: if only the prompts change and the command words stay the same, there is no need to recompile and burn the firmware; only the corresponding audio files on the MP3 module need updating.
    2. The LED rhythm voltage doubler was changed to a voltage quadrupler, and the rhythm function now works properly (see the demonstration video).
    3. Added physical eye movement as feedback for voice commands, driven by a small electromagnet. The initial design had a small bug, which was worked around by cutting traces and adding jumper wires. The fourth version fixed this circuit issue.
    4. Added a TP4056 lithium battery charging circuit.
    5. Added an RGB LED strip: three LEDs on the back shine outward, and the others light the interior. The RGB code reuses part of the color_light component from the official SDK, with the HSV-related code removed, because the HSV code pulls in the math library, which makes the firmware too large and causes the build to fail.
    6. Added police light, colored light, and night light functions.
    7. Added a power toggle switch.
    8. To implement the functions above, all available GPIOs of the voice module are used, including RX0/TX0/RX1/TX1/MCLK and three PWM channels.
  4. The fourth version made the following changes:
    a. Fixed the electromagnet circuit.
    b. Changed the socket type of the RGB LEDs.
    c. Increased the number of RGB LEDs.
    d. The night light now turns on the RGB light strip and the eye LEDs at the same time.
    e. Added a jumper on the MP3 BUSY pin. This pin must be disconnected when burning; otherwise it interferes with the serial port and the firmware download fails.
    f. Saying good night automatically turns on the night light.

     To keep the PCB area the same as in the third version, a very small number of components were mounted on the back of the board in the fourth version.
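The third version dropped the SDK's HSV code because it pulled in the math library. As an illustration only, and not something this project uses, an HSV-to-RGB conversion can be written with integer arithmetic alone, which avoids that dependency. The function name `hsv_to_rgb_int` and the value ranges below are assumptions made for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper (not from the SDK): integer-only HSV to RGB.
 * h is the hue in degrees (wrapped to 0..359);
 * s and v are saturation and value in 0..255. */
void hsv_to_rgb_int(uint16_t h, uint8_t s, uint8_t v,
                    uint8_t *r, uint8_t *g, uint8_t *b)
{
    if (s == 0) {                 /* no saturation: a shade of gray */
        *r = *g = *b = v;
        return;
    }

    h %= 360;
    unsigned region = h / 60u;                 /* 60-degree sector, 0..5 */
    unsigned frac   = (h % 60u) * 255u / 60u;  /* position in sector, 0..250 */

    /* Three intermediate levels, computed without floating point. */
    uint8_t p = (uint8_t)(v * (255u - s) / 255u);
    uint8_t q = (uint8_t)(v * (255u - s * frac / 255u) / 255u);
    uint8_t t = (uint8_t)(v * (255u - s * (255u - frac) / 255u) / 255u);

    switch (region) {
    case 0:  *r = v; *g = t; *b = p; break;   /* red to yellow */
    case 1:  *r = q; *g = v; *b = p; break;   /* yellow to green */
    case 2:  *r = p; *g = v; *b = t; break;   /* green to cyan */
    case 3:  *r = p; *g = q; *b = v; break;   /* cyan to blue */
    case 4:  *r = t; *g = p; *b = v; break;   /* blue to magenta */
    default: *r = v; *g = p; *b = q; break;   /* magenta to red */
    }
}
```

The results are slightly coarser than a floating-point version, which is usually acceptable for LED color effects.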

 

 

* 5. Software part


Offline resources, including the broadcast audio files, are stored on the MP3 module's TF card, with audio of the same type grouped in one folder. The voice module tells the MP3 module which file to play through serial port commands.

-- == Simple Development Guide == --

Get SDK

1. Register an account on the Qiyingtailun AI platform.

2. Log in to the platform account
Log in

3. As shown in the figure below, select "Development Data" on the left menu bar to enter the main interface of development data;

Entrance

 

4. As shown in the figure below, select "Software and Firmware (SDK Development Kit, Standard Demo Firmware, etc.)";

download

5. Select the SDK version you need to download, as shown in the figure below (this figure takes "CI112X_SDK_V1.2.9" as an example);

GGsbITFWXvYiIkuf109IUKG0gv05K8Jeoxy3AFHp.png

6. As shown in the picture below, click "CI112X_SDK_V1.2.9.zip" to download;

ADq7LRbibaWYpcr2Lg4C2ei9l5qvFCbzBpawaW3g.png

7. As shown in the figure below, wait for the loading to complete and then save the file;

Please note: do not exit or refresh the page while loading, otherwise the process will be terminated!

cDqAKHwJC5daMmSqWv5v2FoOkY1fuNuAHZEk7orm.png

8. Select the specified folder to store the SDK compressed package;

 

Get IDE

Qiyingtailun provides a preconfigured portable ("green") build of Eclipse. The Baidu Netdisk download link is on the "Development Materials" - "Related Tools and Manuals" page.

 

vtdmlxA0sRE8M4gwvkFV1168tiAYpbvCTg8fm94g.png

 

Software development

Unzip the downloaded Eclipse IDE and SDK separately (the folder path must contain only English characters), then import the SDK project into Eclipse to start development.

The key step is to modify the user_msg_deal.c file so that it runs the appropriate function for each cmd_id. The cmd_id is the serial number shown in front of each entry in the "command word list" of the "Language Model": cmd_id 1 is the wake-up word, and the command words start at cmd_id 2.

The voice module firmware runs on FreeRTOS and is written in C.
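The cmd_id dispatch described above can be sketched as a plain switch statement. This is a minimal illustration, not the project's actual code (which is only shown as a screenshot): every ID value, enum name, and the `dispatch_cmd` function are hypothetical, and real cmd_id values come from your own command word list.

```c
#include <stdint.h>

/* Hypothetical cmd_id values; cmd_id 1 is the wake-up word,
 * command words start at 2. */
enum {
    CMD_ID_WAKEUP      = 1,
    CMD_ID_NIGHT_LIGHT = 2,
    CMD_ID_IDIOM_STORY = 3,
    CMD_ID_FAIRY_TALE  = 4
};

/* What the firmware should do in response to a recognized command. */
typedef enum {
    ACTION_NONE,
    ACTION_WAKE_REPLY,
    ACTION_NIGHT_LIGHT_ON,
    ACTION_PLAY_FOLDER
} action_t;

typedef struct {
    action_t action;
    uint8_t  folder;   /* numeric TF card folder, used by ACTION_PLAY_FOLDER */
} cmd_result_t;

/* Map a recognized cmd_id to an action. Unknown IDs are ignored. */
cmd_result_t dispatch_cmd(uint16_t cmd_id)
{
    cmd_result_t res = { ACTION_NONE, 0 };

    switch (cmd_id) {
    case CMD_ID_WAKEUP:
        res.action = ACTION_WAKE_REPLY;
        break;
    case CMD_ID_NIGHT_LIGHT:
        res.action = ACTION_NIGHT_LIGHT_ON;
        break;
    case CMD_ID_IDIOM_STORY:
        res.action = ACTION_PLAY_FOLDER;
        res.folder = 2;            /* assumed folder "02" */
        break;
    case CMD_ID_FAIRY_TALE:
        res.action = ACTION_PLAY_FOLDER;
        res.folder = 3;            /* assumed folder "03" */
        break;
    default:
        break;
    }
    return res;
}
```

Separating the decision from the hardware calls keeps the mapping easy to test on a PC; on the module, the caller would carry out the returned action, for example by sending a serial command to the MP3 module.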

 

The core business logic code is as follows:

8eI9F7tvp0ZNW6SI6awKYNg0994eGSK8B98dClte.png

The play_chengyu() function is wrapped in several layers, but in essence it plays one folder in a loop. The dir parameter is a folder name in numeric form: I put files of the same type into numbered folders such as 01, 02, 03, and 04, which is what the MP3 module requires for file access.

#define DIR_CYCLING_CMD         0x17
mp3_send_cmd(DIR_CYCLING_CMD, 0x00, dir);

The function to send commands from the serial port is as follows:

void mp3_send_cmd(uint8_t cmd, uint8_t high_arg, uint8_t low_arg) {
    uint8_t i;
    uint16_t checksum;

    /* Fill in the command byte and its two argument bytes. */
    mp3_cmd_buf[3] = cmd;
    mp3_cmd_buf[5] = high_arg;
    mp3_cmd_buf[6] = low_arg;

    /* Store the 16-bit checksum, high byte first. */
    checksum = mp3_checksum();
    mp3_cmd_buf[7] = (uint8_t) ((checksum >> 8) & 0xFF);
    mp3_cmd_buf[8] = (uint8_t) (checksum & 0xFF);

    /* Send the 10-byte command frame over UART1. */
    for (i = 0; i < 10; i++) {
        UartPollingSenddata(UART1, mp3_cmd_buf[i]);
    }
}

For the checksum calculation, refer to the debugging section of the MP3 module's manual.
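As a rough sketch of what that checksum looks like, the code below assumes the 10-byte frame layout commonly documented for this family of MP3 modules (start byte 0x7E, version 0xFF, length 0x06, command, feedback flag, two parameter bytes, two checksum bytes, end byte 0xEF), with the checksum being the two's complement of the sum of bytes 1 to 6. The function names `mp3_checksum_of` and `mp3_build_frame` are hypothetical and are not the author's `mp3_checksum()`; verify the layout against your module's manual.

```c
#include <stdint.h>
#include <stddef.h>

#define MP3_FRAME_LEN 10

/* Assumed layout: 7E FF 06 CMD FEEDBACK PARAM_H PARAM_L CHK_H CHK_L EF.
 * The checksum is 0 minus the sum of bytes 1..6 (version through PARAM_L). */
uint16_t mp3_checksum_of(const uint8_t frame[MP3_FRAME_LEN])
{
    uint16_t sum = 0;
    for (size_t i = 1; i <= 6; i++) {
        sum = (uint16_t)(sum + frame[i]);
    }
    return (uint16_t)(0u - sum);
}

/* Build a complete command frame with feedback disabled. */
void mp3_build_frame(uint8_t frame[MP3_FRAME_LEN],
                     uint8_t cmd, uint8_t high_arg, uint8_t low_arg)
{
    frame[0] = 0x7E;      /* start byte */
    frame[1] = 0xFF;      /* version */
    frame[2] = 0x06;      /* payload length */
    frame[3] = cmd;
    frame[4] = 0x00;      /* no feedback requested */
    frame[5] = high_arg;
    frame[6] = low_arg;

    uint16_t checksum = mp3_checksum_of(frame);
    frame[7] = (uint8_t)(checksum >> 8);
    frame[8] = (uint8_t)(checksum & 0xFF);
    frame[9] = 0xEF;      /* end byte */
}
```

Under these assumptions, looping folder 02 with command 0x17 produces the frame 7E FF 06 17 00 00 02 FE E2 EF.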

 

For more information on voice module SDK development, refer to the Qiyingtailun AI platform documentation.

 

Create a language model

For details, refer to the relevant sections of the Beginner's Guide in the Document Center.

First open the Qiyingtailun AI platform, select "Language Model" in the left menu bar, and click Create to enter the main interface for language model production.

ul0VgEfHUsx4BgQ1Gl6w3Lmhvos7XhDqNVBh3fS0.png

Language model production

  • ① Project name: Enter a name for the language model (choose a meaningful one so it is easy to find later);
  • ② Chip model: Select the corresponding chip model (if you do not know the chip model, see the Hardware Selection Guide);
  • ③ Product type: Select the product the language model is used for. There are currently more than 100 product types; click the "Search More" button in the drop-down list to find a matching product, or select "Other" if none fits;
  • ④ Language type: Select the language of the language model. Chinese, English, Chinese-English, and Japanese are currently available;
  • ⑤ Acoustic model type: After you select the chip model and language type, the available acoustic models appear in the drop-down list automatically; choose according to your needs;
  • ⑥ Upload command word production file: Fill the command words for the language model into a file in the required format, then click this button to upload it; the platform uses the file to create the language model;
  • ⑦ Download sample: Provides a template for the command word production file. A download link for the sample in the selected language is generated automatically;
  • ⑧ Add detail lines: You can also click here to edit the command word list directly.
  • ⑨ Save or discard: After confirming everything is correct, select Save to generate the file.
    kOXa0x0DVZBCSHoKXZIZbuJ9cPiYL3IVcPqkUYua.png
    ITgnb5vTJ2zcHBlgZjcwAxGdnFx3T6t22ajcHC6L.png
    Examples of command words:

    The Chinese sample is as follows:

    Chinese sample

    The English sample is as follows:

    English sample

    Japanese sample is as follows:

    Japanese sample

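    After saving, you can download the acoustic model and the language model. The acoustic model only needs to be downloaded once and merged into the firmware. If the command words of the same product change later, just create a new language model, then download it and update the language model again; the acoustic model can stay unchanged.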

 

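Broadcast audio synthesis

Click "Broadcast Audio Synthesis" in the left menu to enter this module, as shown in the figure below:

Overview

Follow the steps below (the numbers correspond to the markers in the figure) to reach the main interface for customizing broadcast audio:

  • ① Select "Broadcast Audio Synthesis" on the left side of the menu bar.
  • ② Click Create to enter the main interface.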

interface

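Broadcast audio synthesis interface

  • ① Speech synthesis project name: Enter the project name;
  • ② Language type: Chinese and English broadcast audio are currently supported;
  • ③ Voice category: Four categories are currently available: adult male, adult female, boy, and girl;
  • ④ Speech rate: 20 levels, default 10; the higher the level, the faster the speech;
  • ⑤ Synthesized voice: After choosing the voice category, select the specific voice;
  • ⑥ Download sample: After you choose the language type, a download link for the sample spreadsheet is provided;
  • ⑦ Volume: 20 levels, default 10; the higher the level, the louder the volume;
  • ⑧ Upload speech synthesis file: Fill the entries to be synthesized into a file in the required format, then click this button to upload it; the platform uses the file to produce the broadcast audio;
  • ⑨ Listen to sample: Preview a sample of the selected voice;
  • ⑩ Filling instructions: View the relevant instructions at any time;
  • ⑪ Save or discard: After confirming everything is correct, select Save to generate the firmware.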

Production process

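Filling instructions

  1. This window converts text in batches into the broadcast audio required by the SDK.
  2. Select the parameters as needed, then click the listen button to preview.
  3. Voices marked "Recommended" are the recommended speakers.
  4. Speech rate: 0 (fastest) to 20 (slowest); recommended value 10.
  5. Volume: 0 (minimum) to 10 (maximum); recommended value 10.

Notes

  1. In the uploaded Excel file, the first column is the audio number, the second column is the audio name, and the third column is the text to be synthesized.
  2. The audio name should not be too long and must not contain spaces; the text to be synthesized should not exceed forty characters.
  3. Only Excel files can be uploaded at present; download the Excel template from "Sample".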
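Before uploading, the two hard rules above (no spaces in the audio name, at most forty characters of text) can be checked automatically. The sketch below is only an illustration: the function names are hypothetical, the text is assumed to be UTF-8, a "character" is counted as one Unicode code point, and only the ASCII space is checked. The platform may count differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TEXT_CHARS 40   /* limit stated in the notes above */

/* Count UTF-8 code points by skipping continuation bytes (10xxxxxx). */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}

/* An audio name must be non-empty and contain no ASCII space. */
bool audio_name_ok(const char *name)
{
    if (*name == '\0') {
        return false;
    }
    for (; *name != '\0'; name++) {
        if (*name == ' ') {
            return false;
        }
    }
    return true;
}

/* The text to synthesize must have between 1 and 40 characters. */
bool synth_text_ok(const char *text)
{
    size_t len = utf8_length(text);
    return len >= 1 && len <= MAX_TEXT_CHARS;
}
```

No length limit is enforced on the audio name because the notes only say it should not be too long, without giving a number.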

 

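The template for the broadcast audio file to upload can be obtained through "Download sample" in the form creation interface. Following the template's format, fill in the broadcast sentences and their content, then save and upload the file.

The Chinese sample is shown in the figure below:

Chinese

The English sample is shown in the figure below: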

English

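Submitting the form

To synthesize broadcast audio, first create a new form and fill in the corresponding fields, as shown in the figure below:

Fill out the form

After filling it in, click "Upload your file" and upload the Excel file prepared according to the specification.

form submission

After the upload completes, click the "Save" button in the upper-left corner;

Customized firmware save

Wait for the platform to load the file;

load

Please note: do not exit or refresh the page while loading, otherwise the production process will be terminated!

Downloading broadcast audio

After the broadcast audio has been synthesized successfully, select "Download speech synthesis file" to get the generated broadcast audio.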

Firmware download

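Modifying the language model configuration file

1. Open the configuration file [60000]{cmd_info}.xls in the CmdWordStructure directory of the downloaded language model folder.

2. In the <0>cmd sheet, change the "Broadcast Audio 1 ID" (播报音1ID) column so that it increments from 0. If you need a welcome message at power-on, change the broadcast audio ID in the <welcome> row to a number that does not duplicate any ID above it in the sheet; in this example it is changed to 68. Change the broadcast audio IDs of <Inactive> and <beep> to a large number that will never be used as a filename prefix, such as 1000.

ymNu6BPjFyAWUbi4EMSoe3CvA0hThuXNhept4FJj.png

9WeZuXn6ACkD7EvBZh8R9xTkbFmiaAOO5KVrJd3d.png

3. In the <1>wake sheet, change the first row of "Broadcast Audio 1 ID" to 0, so that the broadcast audio ID of the wake-up word "小黄人" (Minion) is 0, consistent with the first sheet. If you need a welcome message at power-on, set the broadcast audio ID in the <welcome> row to the same ID used in the <welcome> row of the <0>cmd sheet; it is 68 above, so it is 68 here as well. Change the broadcast audio IDs of <Inactive> and <beep> to a large number that will never be used as a filename prefix, such as 1000.

P1TKGiON3zkUAyiR6iuQRn99PRWc3hDYWDJ1zirk.png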

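Renaming the broadcast audio files

Unzip the broadcast audio archive synthesized and downloaded above. The filename prefixes inside should correspond to the broadcast audio IDs in the <0>cmd sheet above.

9o8y3G1FapjayDGCocnkYtEpXXtwu2XedCLEuSxq.png 

If you set a broadcast audio ID for the welcome message above, you can add the welcome message to the synthesis list in the broadcast audio synthesis step. Alternatively, you can use a piece of music as the power-on welcome. The key is to change its filename prefix to the broadcast audio ID set in the <welcome> row. In the example above the welcome ID is 68, so the welcome file should be named "[68]开机欢迎.wav" or "[68]开机欢迎.mp3" (开机欢迎 means "power-on welcome").

 

Note: the steps above merge the broadcast audio into the firmware. This project, however, plays all broadcast audio through the MP3 module, so all broadcast audio files are stored on the TF card, and the firmware sends commands that tell the MP3 module which file to play. Even so, firmware synthesis still requires the corresponding broadcast audio files. Since the voice module never actually plays them, any file can be used as a substitute, but they cannot be missing: the firmware will not run properly without them. For the serial command that plays a specified folder, see the code above.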
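The naming rule, a broadcast audio ID in square brackets followed by the file name, can be expressed as a small formatting helper. This is a sketch for illustration; `make_prompt_filename` is a hypothetical name, not a platform or SDK function.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write "[<id>]<name>.<ext>" into out.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
int make_prompt_filename(char *out, size_t out_size,
                         unsigned id, const char *name, const char *ext)
{
    int n = snprintf(out, out_size, "[%u]%s.%s", id, name, ext);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;    /* output would have been truncated */
    }
    return 0;
}
```

For example, calling it with id 68, name "welcome", and ext "wav" yields "[68]welcome.wav", matching the prefix set in the <welcome> row.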
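One way to organize prompts stored on the TF card is a lookup table from cmd_id to a folder and file number. The sketch below is an assumption-laden illustration rather than the project's code: the table contents, the names `prompt_entry_t` and `find_prompt`, and the command value 0x0F (documented for this family of MP3 modules as "play file N in folder M") should all be checked against your own card layout and the module manual.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed command: play a given file inside a given numbered folder. */
#define PLAY_FOLDER_FILE_CMD 0x0F

typedef struct {
    uint16_t cmd_id;   /* recognized command or wake-up word ID */
    uint8_t  folder;   /* numeric folder on the TF card, e.g. 1 for "01" */
    uint8_t  file;     /* file number inside that folder */
} prompt_entry_t;

/* Hypothetical mapping; fill it in from your own command word list. */
static const prompt_entry_t prompt_table[] = {
    { 1, 1, 1 },   /* wake-up word reply */
    { 2, 1, 2 },   /* reply for command 2 */
    { 3, 1, 3 },   /* reply for command 3 */
};

/* Return the table entry for cmd_id, or NULL if none is defined. */
const prompt_entry_t *find_prompt(uint16_t cmd_id)
{
    size_t count = sizeof prompt_table / sizeof prompt_table[0];
    for (size_t i = 0; i < count; i++) {
        if (prompt_table[i].cmd_id == cmd_id) {
            return &prompt_table[i];
        }
    }
    return NULL;
}
```

With an entry p found this way, the reply could be played with a call such as mp3_send_cmd(PLAY_FOLDER_FILE_CMD, p->folder, p->file). Because the table lives in firmware but the audio lives on the card, changing a prompt's wording only requires replacing the file on the TF card.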

 

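Replacing the acoustic model, language model, and broadcast audio files

  • Folders inside the language model archive

CmdWordStructure: contains the {cmd_info} spreadsheet, the configuration file for parameters such as user entry confidence;

GfstCmd: contains the command word model file generated by the platform;

GfstWake: contains the wake-up word model file generated by the platform;

  • Replacing the language model files

SDK language model folder: CI112X_SDK_ASR_Offline_V1.x.x\sample\internal\sample_110x\firmware

Take the files [0]asr_chinese_SE292_CI1122_normal.dat and [1]asr_chinese_SE292_CI1122_normal.dat from the GfstCmd and GfstWake folders of the downloaded language model, and use them to replace the contents of the asr folder in the SDK language model folder.

Copy the configuration file [60000]{cmd_info}.xls from the CmdWordStructure directory of the downloaded language model into user_file\cmd_info under the SDK language model folder, replacing the original file. (The filename must start with "[60000]"; it can be renamed to something like "[60000](小黄人{cmd_info}.xls".)

  • Replacing the acoustic model files

SDK acoustic model folder: CI112X_SDK_V1.x.x\CI112X_SDK\sample\internal\sample_1122\firmware\dnn

Copy the contents of the downloaded acoustic model into the SDK acoustic model folder, replacing the SDK's original files.

  • Replacing the broadcast audio files

Folder: CI112X_SDK_ASR_Offline_V1.x.x\sample\internal\sample_110x\firmware\voice

Put the newly generated broadcast audio files into the directory above.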

 

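Firmware synthesis and burning

Before burning, connect the USB-to-serial adapter's ground (GND) and serial pins (TXD, RXD) to the corresponding pins of the module. Note that the adapter's RXD and TXD connect to the module's UART0_TX and UART0_RX respectively.

 

1. Open "合成分区bin文件.bat" (the batch file that merges the partition bin files);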

Partition synthesis

2. As shown in the figure below, you are prompted to select the audio format when merging partitions. New users should select "adpcm". After making the selection, press Enter. When loading completes, the window closes automatically;

broadcast

3. Open "package upgrade.bat";

Packaging tool

4. Select the chip model of the development board you purchased (this is only required the first time; on later runs, go straight to step 5);

Chip selection

5. After confirming the chip model, click the "Firmware Packaging" button to enter the packaging interface;

Firmware packaging options

6. Fill in the firmware upgrade information:

  • Fill in the software and hardware related information in the version information area.

  • Select or fill in the bin file path of each partition.

  • Click "Refresh Address" and click "Package Firmware".

  • If a pop-up window prompts an address conflict, adjust the size of each partition and perform the previous step again.

Upgrade interface

  • A pop-up window saying "Firmware has been generated" indicates that packaging succeeded, as shown below:

success

 

Firmware burning

Step 1: Open "package upgrade.bat";

Packaging tool

Step 2: Select the chip model of the development board you purchased (this is only required the first time; on later runs, go straight to step 3);

Chip selection

Step 3: After confirming the chip model, click the "Firmware Upgrade" button to enter the upgrade interface;

Firmware upgrade button

Step 4: Firmware upgrade

  • Select or fill in the firmware path.
  • Tick the serial port connected to the device to be upgraded.
  • Other options: force update of all partitions, authentication files, encryption.
  • Switch the module to upgrade mode (short the PG and EN pins).
  • Restart the device or power it on again to start the upgrade.
  • Wait for the upgrade to complete. When the progress bar reaches 100%, the update has succeeded and the device boots into the new firmware automatically. If a power-on announcement is configured, you will hear it.
  • Burning the firmware for the first time takes a long time. Later burns only update the parts of the firmware that changed.
  • If power is cut accidentally during burning, the device may not work properly after the next burn. In that case, tick "Force update all partitions".

PA57aenRAaWMHK74OKf8omQ4COmH79M4c7qzKvsl.png

 

 

* 6. BOM list

Hecc72LljUBgW1VXWfe0No1Q1KLafczzfzhB9TAY.png

 

* 7. Contest LOGO verification


PCB bare board and logo verification.

bHblHw4hsTKzOvv5q4Hg73oL4LQnOViRjWreNjR0.png

Soldered Boards (4th Edition)

re5DhwameVMnCkFC3xLw6k3naxl6YB4qT8diHZTM.png Oh1tSiuJcXeRNsnMesV0t4sJrLAdXQJ1wBA67O3I.png

 Internal and external pictures of Minion after installation 

I5uce0sqFY6NzFUa71rGg7SfERSP32UVNbzLqoqr.png jt1vO1MpefOtbtmmaFnxyOmuwT94iuReegbIXoTc.png IhZALeurTu1ZcUenmxFmTjWpEj7y3xqZ5VeXfNti.png

 

* 8. Demonstration video


The demonstration video is available at the Bilibili link below; the file is too large to upload here.

https://www.bilibili.com/video/BV1V34y1X7jH/

 
