Machine Learning - EEWorld Reference Design Center

This solution provides machine-learning-based recognition of spoken English digits, together with a graphical development platform for quickly building human-machine interfaces. The recognition results can then be presented on high-quality human-machine interaction screens built with the Nuvoton development platform.

Voice control of electronic devices is a clear trend. Its advantages are that devices can be controlled hands-free and operated in environments where buttons are inconvenient. This solution uses Google TensorFlow as the deep learning development environment for speech recognition and implements the recognition function on the NuMaker-PFM-M487 platform. It uses the Keyword Spotting sample program to implement an offline, real-time speech recognition system. A complete deep learning speech recognition system requires two platforms, as shown in Figure 1-1. The first is the PC platform, where TensorFlow and Python are used to write the deep learning code and train the model. Because this solution uses supervised learning (Note 1), the system must be given a large amount of labeled training data; the features extracted from that data are then used to train a deep neural network (DNN) model, and the model is revised repeatedly until it reaches an optimized state. The second is the NuMaker-PFM-M487 platform, which takes the deep learning model and the training results (model parameters) built on the PC and runs them as a real-time speech recognition system.
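The supervised training loop on the PC side can be illustrated with a minimal sketch. It is not the solution's TensorFlow code: a single-layer softmax classifier in plain Python stands in for the DNN, and synthetic two-dimensional points stand in for features extracted from speech. All function names and hyperparameters here are hypothetical. What it shares with the real pipeline is the loop itself: predict, compare with the label, adjust the parameters, repeat.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear score per class: w . x + b."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train(samples, labels, num_classes, epochs=100, lr=0.1):
    """Supervised loop: predict, compare with the label, adjust parameters."""
    dim = len(samples[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    history = []  # mean cross-entropy loss per epoch
    for _ in range(epochs):
        total_loss = 0.0
        for x, y in zip(samples, labels):
            probs = softmax(class_scores(weights, biases, x))
            total_loss -= math.log(probs[y] + 1e-12)
            for c in range(num_classes):
                # Gradient of the cross-entropy loss w.r.t. the score of class c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(dim):
                    weights[c][i] -= lr * grad * x[i]
                biases[c] -= lr * grad
        history.append(total_loss / len(samples))
    return weights, biases, history


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)


def make_dataset(per_class=30, seed=0):
    """Synthetic labeled data: three clusters standing in for three keywords."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    samples, labels = [], []
    for _ in range(per_class):
        for label, (cx, cy) in enumerate(centers):
            samples.append([rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)])
            labels.append(label)
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_dataset()
    weights, biases, history = train(samples, labels, num_classes=3)
    correct = sum(predict(weights, biases, x) == y
                  for x, y in zip(samples, labels))
    print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}, "
          f"training accuracy: {correct / len(samples):.2%}")
```

The learned weights and biases play the role of the "training results (model parameters)" that are carried over to the embedded platform. In the real solution they would come from a trained TensorFlow model rather than from this toy classifier.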

Figure 1-1 Speech recognition system flow chart

The keywords recognized by this solution are the ten English digits: One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Zero. The NuMaker-PFM-M487 development board is paired with the M487 emWin GUI development platform to present the recognition results: when the user says "One" into the microphone, the keyword "One" is displayed on the LCD panel.
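As an illustration of the last step, turning a model's ten output scores into the keyword shown on the LCD, here is a minimal sketch. It assumes the model emits one raw score per keyword in the same order as the list above and that low-confidence results are suppressed. The output order, the `decode_keyword` helper, and the 0.6 threshold are all hypothetical and not taken from the Nuvoton sample program.

```python
import math

# Keyword order follows the list in the text; the order of the real
# model's outputs is an assumption made for this sketch.
KEYWORDS = ["One", "Two", "Three", "Four", "Five",
            "Six", "Seven", "Eight", "Nine", "Zero"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_keyword(scores, threshold=0.6):
    """Map ten raw scores to (keyword, probability).

    Returns (None, probability) when the best class falls below the
    hypothetical confidence threshold, so nothing would be displayed.
    """
    if len(scores) != len(KEYWORDS):
        raise ValueError("expected one score per keyword")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return KEYWORDS[best], probs[best]


if __name__ == "__main__":
    confident = [8.0] + [0.0] * 9  # strong score for the first class
    uncertain = [0.0] * 10         # all classes equally likely
    print(decode_keyword(confident))
    print(decode_keyword(uncertain))
```

A threshold of this kind is one common way to avoid displaying a keyword when the input is silence or an unknown word; whether the actual sample program does so is not stated in the source.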

Note 1: Supervised learning: every training sample has a corresponding correct answer. The user labels the data in advance, so the machine is told the correct answer for each sample during training.

Related IC/Platform

* Note: Nuvoton and NuMicro are trademarks of Nuvoton Technology Corp., and other trademarks and copyrights involved in this article belong to their original owners.