Design of a voice-controlled mouse based on speech recognition

Abstract: A voice-controlled mouse cursor application based on speech recognition technology is implemented. It can move the mouse cursor to any position on the screen, which lets disabled users operate a computer by voice alone, without a mouse or keyboard. The delay defect that arises when the cursor is controlled by voice is analyzed, and a targeted improvement is made.


1 Overview

Computer speech technology has made great progress after years of development. Some products and projects already give people the opportunity to interact with computers by voice. For example, IBM's ViaVoice series software and Microsoft's new Office products have practical voice functions that support dictation and text input, and some application systems based on speech technology have appeared.

The development of speech technology, especially speech recognition, makes it possible to control a computer by voice. This is of great significance to the many disabled people who cannot easily use a traditional mouse and keyboard. It is also useful in situations where operating a computer by hand is inconvenient but unavoidable, such as using an electronic map while driving. At present, to truly control a computer with a graphical interface by voice, the user must be able to enter data and control the cursor by voice just as with a keyboard or mouse. An effective voice-controlled cursor program, one that can position the cursor anywhere on the screen and simulate mouse actions such as clicking and double-clicking, is therefore a meaningful tool for realizing voice control of the computer.

This article uses the free speech recognition engine of Microsoft's Speech SDK 5.1 together with mouse simulation techniques to implement a voice-controlled mouse application in Delphi 7.0. The application controls the screen cursor by voice: it starts and stops cursor movement, performs clicks, and so on. The delay problem in the voice-controlled mouse program is analyzed, and an improved method is proposed and implemented.

2 Types of cursor control based on speech recognition

There are currently two modes of voice-controlled mouse: target-oriented cursor control and direction-oriented cursor control [n]. In the former, the user names a specific target or position, such as an icon, a menu, or a screen area, and then gives the command to execute, such as "click". This approach works within a single application, but as the number of targets grows the user has to memorize many target names, and targets with the same name may occur, so the error rate increases. Direction-oriented cursor control is divided into two types: non-continuous and continuous. In the non-continuous case, the user specifies the direction and the distance together; for example, the command "left 8 cm" moves the cursor 8 cm to the left. In the continuous case, the user first states a direction such as "left", and the cursor moves left until the user says "stop".
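The difference between the two direction-oriented modes can be sketched in a few lines of Python (chosen here only because it runs without the Windows API; the program described in this article is written in Delphi). The function names, the command spellings, the tick-based timing, and the pixels-per-centimetre value are assumptions made for illustration and are not taken from the original program.

```python
# Unit vectors for the four spoken directions (screen y grows downward).
DIRECTIONS = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def discrete_move(command, px_per_cm=38):
    """Non-continuous control: 'left 8 cm' -> one-off displacement in pixels.

    px_per_cm is an assumed screen resolution (roughly 96 dpi).
    """
    direction, amount, unit = command.split()
    if unit != "cm":
        raise ValueError("only centimetres are handled in this sketch")
    dx, dy = DIRECTIONS[direction]
    distance = round(float(amount) * px_per_cm)
    return dx * distance, dy * distance


def continuous_move(commands_by_tick, ticks, step_px=2):
    """Continuous control: move one step per tick until 'stop' is heard.

    commands_by_tick maps a tick index to the command recognized at that tick.
    Returns the total displacement in pixels.
    """
    x = y = 0
    velocity = (0, 0)
    for tick in range(ticks):
        command = commands_by_tick.get(tick)
        if command == "stop":
            velocity = (0, 0)
        elif command in DIRECTIONS:
            velocity = DIRECTIONS[command]
        x += velocity[0] * step_px
        y += velocity[1] * step_px
    return x, y


print(discrete_move("left 8 cm"))                            # distance fixed by the command
print(continuous_move({0: "left", 100: "stop"}, ticks=150))  # distance fixed by timing
```

In the non-continuous mode the distance is part of the command, while in the continuous mode it depends entirely on when "stop" is recognized, which is why recognition delay matters for the continuous mode.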

The voice-controlled cursor discussed in this paper uses continuous, direction-oriented control. This kind of control is consistent with everyday mouse usage habits, so users find it more comfortable.

3 Implementation

The voice control in this program is based on the speech recognition engine of Microsoft's Speech SDK 5.1 and its API. The SDK is a free development kit and can be used to develop software with Chinese speech functions. A speech recognition engine usually has two working modes. One is command and control mode, in which the engine recognizes short voice commands that trigger the corresponding program actions. The other is continuous dictation, in which the engine must recognize continuous speech. Dictation is more complicated to implement than command and control, because it has to analyze context and distinguish words with the same or similar pronunciation, whereas command and control requires no context analysis. This article uses command and control mode, because a voice-controlled mouse only needs to recognize a limited number of short commands such as "left", "right", and "stop". Figure 1 is a block diagram of the voice-controlled mouse program.

Figure 1 Structure of the voice-controlled mouse program


The application consists of two parts. The first part, the main program of the voice control application, calls the speech recognition engine to recognize the user's voice commands.

This part of the program completes the following tasks:

1 Import the grammar file (in XML format, defining the voice commands of interest), initialize the speech recognition engine interface, and activate the engine.

2 Receive the recognition result from the speech recognition engine and call the corresponding mouse control routine according to that result.

The syntax rules for the direction commands and the mouse event commands are defined in the following grammar file:

[Grammar file listing (XML): markup not preserved. Surviving command phrases: "Double click", "shut down".]

When the program is running, each time a command defined in the grammar file is successfully recognized, the program's inRecgnition response function is called. In it the program can query the recognition result, which is returned as a value from 1 to 8. Depending on the return value, the program calls the mouse control routine to move the cursor in a given direction or to simulate a click event.
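The mapping from a recognition result to a mouse action can be illustrated with a small dispatch table. The following is a Python sketch rather than the Delphi code of the program. Codes 1 to 4 follow the direction parameter Dr used in the movement code of this article; the assignment of codes 5 to 8, the dictionary names, and the state layout are assumptions made for this example.

```python
# Direction codes 1-4 follow the movement code in this article (parameter Dr).
# The assignment of 5-8 is an assumption made for this sketch.
MOVE_COMMANDS = {1: "right", 2: "left", 3: "down", 4: "up"}
EVENT_COMMANDS = {5: "stop", 6: "click", 7: "double click", 8: "close"}


def dispatch(rule_id, state):
    """Update the controller state for one recognized command and
    return a short description of the action taken."""
    if rule_id in MOVE_COMMANDS:
        state["direction"] = rule_id   # would be picked up by the movement thread
        state["stopped"] = False
        return "move " + MOVE_COMMANDS[rule_id]
    if rule_id in EVENT_COMMANDS:
        if EVENT_COMMANDS[rule_id] == "stop":
            state["stopped"] = True
        return EVENT_COMMANDS[rule_id]
    raise ValueError(f"unexpected recognition result: {rule_id}")


state = {"direction": None, "stopped": True}
for rule_id in (1, 4, 5, 6):           # "right", "up", "stop", "click"
    print(rule_id, "->", dispatch(rule_id, state))
```

Keeping the table separate from the handler means a new command only needs a new table entry and a matching grammar rule.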

The second part, the mouse control program, is called to simulate mouse events such as moving the cursor or clicking. It mainly uses Windows API functions to simulate mouse events and thereby control the mouse cursor.

Mouse movement is simulated by calling the Windows API function SetCursorPos(x1, y1) in a loop; changing x1 and y1 inside the loop moves the cursor in any direction. The movement must run in its own thread, otherwise the cursor cannot be stopped or turned at any time while it is moving. Here is the movement control code in the movement thread:

for I := 1 to 500 do
begin
  if bstop = 1 then Break;   // stop the cursor when the stop command is received
  case Dr of                 // Dr selects the direction of cursor travel
    1: x1 := x1 + n1;        // move right
    2: x1 := x1 - n1;        // move left
    3: y1 := y1 + n1;        // move down
    4: y1 := y1 - n1;        // move up
  end;
  SetCursorPos(x1, y1);      // place the cursor at the new position
  Sleep(StepDelayMs);        // assumed pacing delay; StepDelayMs is a hypothetical name, not from the source
end;
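Why the movement loop needs its own thread can be seen in a small, platform-independent Python sketch. It moves a simulated cursor position instead of calling SetCursorPos, and the class name, the step size, and the timing values are assumptions for illustration only.

```python
import threading
import time


class CursorMover:
    """Moves a simulated cursor in a worker thread until told to stop."""

    STEPS = {"right": (1, 0), "left": (-1, 0), "down": (0, 1), "up": (0, -1)}

    def __init__(self, x=0, y=0, step_px=2, interval_s=0.001, max_steps=500):
        self.x, self.y = x, y
        self.step_px = step_px
        self.interval_s = interval_s
        self.max_steps = max_steps
        self._stop_event = threading.Event()
        self._thread = None

    def start(self, direction):
        """Begin moving in the given direction, replacing any earlier move."""
        self.stop()
        self._stop_event.clear()
        self._thread = threading.Thread(
            target=self._run, args=(direction,), daemon=True
        )
        self._thread.start()

    def stop(self):
        """Signal the worker to stop and wait until it has finished."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def _run(self, direction):
        dx, dy = self.STEPS[direction]
        for _ in range(self.max_steps):
            if self._stop_event.is_set():   # plays the role of the bstop flag
                break
            self.x += dx * self.step_px
            self.y += dy * self.step_px
            time.sleep(self.interval_s)     # pacing between steps


mover = CursorMover()
mover.start("right")      # returns at once; the caller stays free to listen
time.sleep(0.05)          # ... the next voice command arrives a little later
mover.stop()
print("cursor stopped at", (mover.x, mover.y))
```

Because start returns immediately, the code that receives recognition results remains responsive and can call stop or start with a new direction while the cursor is still moving.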




The other kind of mouse control simulates mouse events such as click and double-click, mainly through the mouse_event function. The following code simulates a left mouse click:

Windows.mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0);  // left button pressed
Windows.mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0);    // left button released

Figure 2 shows an ideal example of operating the voice-controlled mouse with this program. The cursor starts at the lower left of the screen, and the target rectangle is in the upper right area of the screen. First, the user speaks the "right" command into the microphone, and the cursor moves to the right. When the cursor is directly below the target, the user says "up", and the cursor turns and moves upward (the user may also say "stop" first to halt the cursor). When the cursor reaches the target rectangle, the user says "stop" and the cursor stops. Finally the user says "click", and the program simulates a left mouse click, which is equivalent to clicking the target rectangle.

Figure 2 An example of a voice-controlled mouse operation


4 Defects of the voice-controlled mouse

A few simple voice commands can move the mouse to any position on the screen and simulate mouse events such as click and double-click, but there are still defects in actual use.

For large targets, this form of voice control works without problems. If the target area is small, however, the task becomes harder for the user. For example, when the user says "stop" as the cursor reaches the rectangle, the cursor keeps moving for a short time before it stops, and by then it may have passed the target rectangle. This is caused by the delay in speech recognition control. There is always a process between the moment the user begins to issue a voice command and the moment the command is executed. First, the user needs time to speak the command, and fast and slow speakers take different amounts of time to say the same command. In addition, the speech recognition engine needs time to recognize the command. The voice-controlled mouse therefore always has a delay, and between the start of the voice command and the execution of the action the cursor accumulates a position error: ΔS = V × Δt (ΔS is the position error, V is the cursor movement speed, and Δt is the delay caused by speaking and recognition).
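As a quick numerical illustration of ΔS = V × Δt, the following Python sketch evaluates the position error for a few delays. The speed and the delay values are assumed for the example and are not measurements from this article.

```python
def position_error(speed_px_per_s, delay_s):
    """Position error dS = V * dt, in pixels."""
    return speed_px_per_s * delay_s


SPEED = 200.0  # assumed cursor speed in pixels per second
for delay in (0.5, 0.75, 1.0):  # assumed speaking-plus-recognition delays in seconds
    print(f"delay {delay:.2f} s -> overshoot {position_error(SPEED, delay):.0f} px")
```

With these assumed numbers the overshoot is already larger than a typical small button, which matches the observation that only small targets are hard to hit.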

Sear et al. studied a virtual mouse mechanism to solve the delay error problem: a virtual cursor is displayed ahead of the real cursor in the direction of movement, the user issues the voice command when the virtual cursor reaches the target, and the real cursor has just reached the target when the command is executed. Their test results were not ideal, however. Because each person's habits and speaking speed differ, and the same person speaks at different speeds in different states, the delay Δt is not constant, so the position error ΔS is not the same every time, and a leading virtual cursor at a fixed distance does not achieve good results.
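The limitation of a fixed lead distance can be made concrete with a small Python sketch. It follows one reading of the mechanism described above, in which the real cursor ends up V × Δt minus the lead distance away from the target; the function name and all numbers are assumptions for illustration.

```python
def residual_error(speed_px_per_s, delay_s, lead_px):
    """Signed stopping error of the real cursor relative to the target.

    The user speaks when the virtual cursor, lead_px ahead of the real one,
    reaches the target; the real cursor then travels speed * delay further.
    Positive means overshoot, negative means the cursor stopped short.
    """
    return speed_px_per_s * delay_s - lead_px


SPEED = 200.0          # assumed cursor speed, pixels per second
LEAD = SPEED * 0.75    # lead distance tuned for an assumed 0.75 s delay
for delay in (0.5, 0.75, 1.0):
    print(f"delay {delay:.2f} s -> error {residual_error(SPEED, delay, LEAD):+.0f} px")
```

The error vanishes only for the delay the lead was tuned for; a shorter or longer delay leaves the cursor short of or past the target.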

5 An improved method

The accuracy of position control depends on the size of the target, the speed of movement, and the delay. Of these, the speed is the factor that can be changed, so this paper improves position control accuracy through speed control. With the improvement, the position control error ΔS of the voice-controlled mouse is significantly reduced, and position control accuracy improves when the target is small.

The improvement adds speed control of the cursor to the program. For a small target, the cursor first moves at the normal speed V1. As it approaches the target, the user gives the voice command "slow", which reduces the moving speed to V2 (V2 = 1/3 V1 in the actual design). When the cursor reaches the target, the user says "stop" to stop the movement. Figure 3 shows how the cursor speed changes during this process. The delay does not change, but because the cursor speed V is much lower, it follows from ΔS = V × Δt that the position error ΔS is greatly reduced as well.
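The effect of the "slow" command on the position error can be checked with a short Python sketch based on ΔS = V × Δt and V2 = V1 / 3. The speed and delay values are assumed, and the second function rests on the further assumption that "slow" is subject to the same delay as "stop".

```python
def overshoot(v1, delay_s, use_slow, ratio=3):
    """Distance travelled past the target after the user says "stop".

    v1 is the normal speed; with use_slow the cursor is assumed to be moving
    at V2 = V1 / ratio by the time the target is reached.
    """
    speed_at_target = v1 / ratio if use_slow else v1
    return speed_at_target * delay_s


def min_slow_distance(v1, delay_s):
    """How far before the target "slow" must be spoken so that the lower
    speed has taken effect on arrival (the cursor covers V1 * dt meanwhile)."""
    return v1 * delay_s


V1, DELAY = 300.0, 0.5   # assumed speed (px/s) and delay (s)
print("stop only :", overshoot(V1, DELAY, use_slow=False), "px")
print("slow, stop:", overshoot(V1, DELAY, use_slow=True), "px")
print("say 'slow' at least", min_slow_distance(V1, DELAY), "px before the target")
```

Under this model the error shrinks by the same factor as the speed, at the cost of a longer approach and one extra command.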

Figure 3 Cursor speed over time under speed-controllable voice control


For a relatively large target, the position control accuracy is already sufficient, so the user can say "stop" directly without first using the "slow" command.


6 Conclusion

This paper studies the application of a voice-controlled mouse, implements basic voice control of the mouse, and analyzes and reduces the position control error caused by the delay in voice control. The work is of positive significance for developing tools that let disabled people who have difficulty with a mouse and keyboard interact with computers.

Tests have shown that controlling mouse movement, clicks, and so on with this program is sufficient for browsing the web, opening and closing programs, and other computer operations. However, because the user must issue voice commands repeatedly while controlling the mouse and must watch the cursor position closely, the user is prone to fatigue.

Further research will address the comfort and efficiency of voice-controlled mouse use.
