Design of non-specific human speech recognition system based on ARM processor

0 Preface

With the widespread adoption of advanced technology in the military field, weapons and equipment are becoming increasingly sophisticated and precise. Traditional military training is limited by long training times, high costs, and restricted training space, so it often fails to achieve the expected results and cannot meet the needs of modern military training. Simulation training emerged to address these problems.

To further improve training results, this paper uses intelligent voice interaction chips to design a teaching and playback system for a simulation trainer. The teaching system demonstrates the standard operating procedure and the corresponding instrument responses to the operator, which shortens training time and improves results. The playback system records each operator's spoken commands (the operation "passwords"), sound intensity, actions, timing, and the resulting instrument responses during training, and replays the session afterwards so that operators can correct their mistakes promptly. The teaching system can be regarded as playback of a standard training session. The system does not require virtual reality technology and can be implemented on a small embedded system.

1 System principle

The simulation trainer consists of a measurement and control computer and multiple slave devices, as shown in Figure 1. Only one slave device is described here. The hardware consists mainly of the measurement and control computer, an Arduino mega2560 controller, a voice recognition unit, a sound intensity detection unit, a voice synthesis unit, a panel control unit, and an instrument panel. The panel control unit is relatively complex and contains a variety of control circuits. During simulation training, the slave device carries out the entire training process under the control of the Arduino mega2560 controller; in the teaching and playback system, it reproduces the instrument responses of the recorded training operations. The specific circuit design is not described here.

The voice recognition unit identifies the operator's spoken command (the operation password). The sound intensity detection unit measures sound intensity, which serves as the basis for deciding which slave device's operator issued the command. The Arduino mega2560 controller monitors the state of every component on the instrument panel to identify the operator's actions, which completes the record of the training session. The response of each instrument to a given operation is prepared in advance, so it does not need to be recorded. During playback, the measurement and control computer reproduces the recorded session by commanding the Arduino mega2560 controller of the corresponding slave device according to the recorded data.

2 Unit system design

2.1 Speech recognition unit design

Speech recognition technology is developing rapidly. According to the type of speaker it targets, it can be divided into speaker-dependent (specific person) and speaker-independent (non-specific person) recognition. Speaker-dependent recognition targets one particular person. Speaker-independent recognition targets most users; it generally requires collecting and training on recordings of many speakers in order to reach a high recognition rate.

The LD3320 speech recognition chip used in this paper is based on Speaker-Independent Automatic Speech Recognition (SI-ASR) technology. The chip integrates high-precision A/D and D/A interfaces and needs no external FLASH or RAM, so it supports voice recognition, voice control, and human-machine dialogue as a true single-chip solution. In addition, the list of keywords to be recognized can be edited dynamically. The speech recognition process is shown in Figure 2.

The speech recognition unit uses an ATmega168 as its MCU. The MCU controls the LD3320 in all speech recognition tasks and uploads the recognition result to the Arduino mega2560 controller through the serial port. All operations on the LD3320 are performed by reading and writing its registers, which can be done in two ways: standard parallel mode or serial SPI mode. This design uses the parallel mode, in which the data port of the LD3320 is connected to the I/O port of the MCU. The hardware connection is shown in Figure 3.

Speech recognition works in interrupt mode. Its workflow is divided into initialization, writing the keywords, starting recognition, and responding to the interrupt. The MCU program is written in the Arduino IDE [5]. After debugging, the program is flashed to the MCU over the serial port; it then controls the LD3320 to perform speech recognition and uploads the result to the Arduino mega2560 controller. The software flow is shown in Figure 4.
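
The four workflow phases can be pictured as a small state machine. The sketch below is a host-side simulation in Python, not the firmware running on the ATmega168: the class `RecognizerSketch`, its method names, and the idea of re-running the whole cycle after each interrupt are assumptions made for illustration, and no real LD3320 register is touched.

```python
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()
    INITIALIZED = auto()
    KEYWORDS_LOADED = auto()
    LISTENING = auto()


class RecognizerSketch:
    """Hypothetical stand-in for the MCU-side control flow (no chip access)."""

    def __init__(self):
        self.phase = Phase.IDLE
        self.keywords = {}   # keyword index -> spoken command text
        self.uploaded = []   # results "sent" to the main controller

    def initialize(self):
        self.keywords = {}
        self.phase = Phase.INITIALIZED

    def write_keywords(self, keywords):
        if self.phase is not Phase.INITIALIZED:
            raise RuntimeError("initialize before writing keywords")
        self.keywords = dict(enumerate(keywords))
        self.phase = Phase.KEYWORDS_LOADED

    def start_recognition(self):
        if self.phase is not Phase.KEYWORDS_LOADED:
            raise RuntimeError("write keywords before starting recognition")
        self.phase = Phase.LISTENING

    def on_interrupt(self, keyword_index):
        """Handle a recognition interrupt; keyword_index is None if nothing matched."""
        if self.phase is not Phase.LISTENING:
            return None
        result = self.keywords.get(keyword_index)
        if result is not None:
            self.uploaded.append(result)  # stands for the serial upload
        # Assumption: the cycle restarts from initialization for the next utterance.
        current = list(self.keywords.values())
        self.initialize()
        self.write_keywords(current)
        self.start_recognition()
        return result


recognizer = RecognizerSketch()
recognizer.initialize()
recognizer.write_keywords(["power on", "power off"])
recognizer.start_recognition()
first = recognizer.on_interrupt(1)      # a keyword matched
second = recognizer.on_interrupt(None)  # nothing matched
```

Because the keyword table is an ordinary dictionary rebuilt on every cycle, the sketch also mirrors the point that the keyword list can be edited dynamically between recognitions.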

2.2 Sound intensity detection unit design

During speech recognition, the system must determine which slave device's operator issued a spoken command. The sound intensity detection circuit is designed for this purpose. The circuit only needs to compare relative sound intensity; it does not need to measure the absolute sound level, so the required detection accuracy is low.

A capacitive microphone converts the sound into an electrical signal, and an NE5532 amplifier stage raises this weak audio signal to a usable amplitude. The amplified signal passes through an AC-to-DC RMS conversion circuit, is amplified again, and is finally sampled by the A/D converter of the Arduino mega2560 controller. Figure 5 shows the schematic of the sound intensity detection unit: terminal D1 connects to an A/D input of the Arduino mega2560 controller, and terminal INT1 connects to its external interrupt 1. When the sound signal exceeds the preset threshold, the transistor turns on and pulls INT1 from high to low, which triggers an external interrupt. The controller responds to the interrupt and performs A/D sampling. The samples are mean-filtered and stored, and the sound intensity data is uploaded when the measurement and control computer queries it.
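
The filtering and comparison steps can be sketched in a few lines. This is a Python illustration under stated assumptions rather than the controller firmware: the sample values are invented, the function names are hypothetical, and the rule "the slave with the highest filtered intensity is the speaker" is one plausible reading of how relative intensity is used.

```python
from statistics import mean


def mean_filter(samples):
    """Average the burst of A/D samples taken after the threshold interrupt."""
    if not samples:
        raise ValueError("no samples collected")
    return mean(samples)


def loudest_slave(intensity_by_slave):
    """Return the id of the slave whose filtered intensity is highest (assumed rule)."""
    return max(intensity_by_slave, key=intensity_by_slave.get)


# Invented 10-bit A/D readings from three slave devices for one utterance.
bursts = {1: [512, 530, 521], 2: [803, 790, 812], 3: [410, 402, 415]}
filtered = {slave: mean_filter(samples) for slave, samples in bursts.items()}
speaker = loudest_slave(filtered)
```

Only the ordering of the filtered values matters here, which is why a circuit with low absolute accuracy is sufficient.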

2.3 Speech synthesis unit design

Text-to-speech (TTS) technology is a development trend in intelligent human-computer dialogue. A TTS-based voice system can synthesize speech on demand for any query without prior recording, which greatly reduces system maintenance work. With this technology, the speech output of a voice chip can be controlled by an MCU or a PC [4].

This paper uses the SYN6658 Chinese speech synthesis chip. The SYN6658 receives the text to be synthesized through a UART or SPI interface and converts it to speech [6]. In this design, the controller is connected to the SYN6658 through the UART interface and sends control commands and text over serial communication. The SYN6658 synthesizes the received text into a speech signal, which is amplified by an LM386 power amplifier and played through the speaker, as shown in Figure 6.

The SYN6658 speech synthesis circuit follows the typical application circuit given in the chip's hardware data sheet [5] and is not described here. The power amplifier stage uses National Semiconductor's LM386 audio power amplifier.

Before synthesis, the chip is initialized. Initialization includes speaker selection, the digit processing strategy, and adjustment of speech rate, intonation, and volume.

Because the system must simulate several people speaking, each slave device is configured with a different speaker, intonation, and speech rate so that the voices are easy to tell apart. After initialization, the unit waits for a speech synthesis command from the measurement and control computer. After receiving a command, the chip returns a 1-byte status to the host, from which the host can determine the chip's current working state. The speech synthesis flow chart is shown in Figure 7.
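
To make the command-and-status exchange concrete, here is a rough Python sketch of framing text for a serial speech synthesis chip and interpreting the returned status byte. Everything numeric in it is an assumption to be checked against the SYN6658 data sheet: the header byte, the two-byte length field, the command and encoding codes, and the status values are placeholders chosen for illustration, and the function names are hypothetical.

```python
FRAME_HEADER = 0xFD    # assumed start-of-frame byte
CMD_SYNTHESIZE = 0x01  # assumed "synthesize text" command code
ENCODING_GBK = 0x01    # assumed text-encoding parameter

# Assumed meanings of the 1-byte status; confirm against the data sheet.
STATUS_MEANING = {
    0x41: "command accepted",
    0x45: "command rejected",
    0x4E: "busy",
    0x4F: "idle",
}


def build_synthesis_frame(text):
    """Build header + 2-byte big-endian length + command + encoding + text."""
    payload = bytes([CMD_SYNTHESIZE, ENCODING_GBK]) + text.encode("gbk")
    if len(payload) > 0xFFFF:
        raise ValueError("text too long for a single frame")
    return bytes([FRAME_HEADER]) + len(payload).to_bytes(2, "big") + payload


def describe_status(status_byte):
    """Translate the returned status byte into a readable state."""
    return STATUS_MEANING.get(status_byte, "unknown status")


frame = build_synthesis_frame("OK")
state = describe_status(0x41)
```

On real hardware the frame would be written to the UART and the status byte read back before the next command is sent.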

3 System software design

The software design of the teaching and playback system includes the software of the measurement and control computer and the software of the Arduino mega2560 controller.

The measurement and control computer is the control core of the whole system, and its software is written in C#. In the teaching and playback system, its main task is to record operation data so that the training session can be played back accurately. The recorded data includes each slave operator's spoken command, operating action, the times at which the command and the action occurred, and the instrument response corresponding to each operation. To simplify recording, every event is assigned a code in advance and only the code is recorded, which greatly improves program efficiency. The recorded codes are stored in a dedicated structure.
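
A record of this kind might look like the following. This is a Python stand-in for illustration only, not the C# structure used by the authors: the field names, the `EventKind` categories, and the example codes are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class EventKind(IntEnum):
    """Hypothetical event categories; the real code table is not given."""
    SPOKEN_COMMAND = 1
    PANEL_ACTION = 2
    INSTRUMENT_RESPONSE = 3


@dataclass(frozen=True)
class EventRecord:
    tick: int           # time stamp in recording ticks since the start of training
    slave_id: int       # which slave device produced the event
    kind: EventKind     # category of the event
    code: int           # pre-assigned event code (only the code is stored)
    intensity: int = 0  # relative sound intensity, used for spoken commands


# Invented example: operator 1 speaks command 7, then performs panel action 12.
log = [
    EventRecord(tick=40, slave_id=1, kind=EventKind.SPOKEN_COMMAND, code=7, intensity=801),
    EventRecord(tick=46, slave_id=1, kind=EventKind.PANEL_ACTION, code=12),
]
```

Storing small integer codes instead of descriptive text keeps each record compact and makes comparison during playback a simple integer test.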

During training, the measurement and control computer polls the slave device (lower computer) every 50 ms and records the data it returns, so data is recorded in units of 50 ms. A timer controls the timing. During playback, the current time is compared with the recorded time; when the two match, the measurement and control computer commands the slave device to execute the event, which completes playback of that event.
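
The record-then-replay timing can be sketched with a simulated clock. The Python below is a simplified illustration, not the authors' C# program: a loop counter stands in for the 50 ms timer, and `record`, `play_back`, and the example event codes are hypothetical.

```python
TICK_MS = 50  # polling and recording interval stated in the text


def record(poll, ticks):
    """Poll once per tick and keep (tick, code) whenever an event is reported."""
    log = []
    for tick in range(ticks):
        code = poll(tick)  # returns None when nothing happened in this tick
        if code is not None:
            log.append((tick, code))
    return log


def play_back(log, ticks, execute):
    """Advance tick by tick and fire each event once its recorded tick is reached."""
    pending = sorted(log)
    index = 0
    for tick in range(ticks):
        # "<=" rather than "==" so that a late event is never skipped.
        while index < len(pending) and pending[index][0] <= tick:
            execute(tick, pending[index][1])
            index += 1


# Invented session: event code 7 in tick 3 and event code 12 in tick 10.
events = {3: 7, 10: 12}
session = record(events.get, ticks=20)

replayed = []
play_back(session, ticks=20, execute=lambda tick, code: replayed.append((tick * TICK_MS, code)))
```

In this run the two events are replayed at 150 ms and 500 ms, the same offsets at which they were recorded.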

The Arduino mega2560 controller receives and executes the control commands of the measurement and control computer, reads the speech recognition results, collects and processes the sound intensity data, and controls the speech synthesis unit. The Arduino mega2560 controller receives commands through a serial port interrupt.

A command is executed, and its result returned, only if it has been received correctly. If the measurement and control computer does not receive the result within the time limit, an error has occurred and it must resend the command. The data receiving flow chart is shown in Figure 8.
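
The resend rule amounts to a bounded retry loop on the host side. The following Python sketch assumes a `receive` callable that returns `None` when the time limit expires; `send_with_retry`, `FlakyLink`, and the limit of three attempts are illustrative choices, not details from the original system.

```python
def send_with_retry(send, receive, command, max_attempts=3):
    """Send a command and resend it whenever no reply arrives in time."""
    for attempt in range(1, max_attempts + 1):
        send(command)
        reply = receive()  # None means the time limit expired
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no reply after {max_attempts} attempts")


class FlakyLink:
    """Simulated serial link that loses the first few replies."""

    def __init__(self, drop_first):
        self.drop = drop_first
        self.sent = []

    def send(self, command):
        self.sent.append(command)

    def receive(self):
        if self.drop > 0:
            self.drop -= 1
            return None
        return b"OK:" + self.sent[-1]


link = FlakyLink(drop_first=1)
reply, attempts = send_with_retry(link.send, link.receive, b"READ")
```

Capping the number of attempts keeps the host from blocking forever if a slave device is disconnected.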

4 Summary

This paper uses intelligent voice chips to design a teaching and playback system for a simulation trainer. The system does not need virtual reality technology and runs under the control of an MCU alone. It can also be implemented on small portable devices and has good application prospects.
