Activity monitoring for seniors living alone
Installation and Commissioning
Seheiah [14] itself is relatively easy to install. You just need to download the Python scripts and install the packages stated in the how-to using apt-get. Seheiah is a daemon implemented in Python 2.7 comprising four concurrent threads (database, behavior monitoring, alarm cascade, speech recognition); it is configured in the central configuration file, seheiah.cfg. The database is set up via
sqlite3 <name.db> < <path_to_seheiah>/helpers/activity_log.sql
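If you would rather create the database from Python than from the sqlite3 command-line client, the same step can be sketched with the standard sqlite3 module. The schema string below is a hypothetical stand-in: the real table definitions live in helpers/activity_log.sql and are not reproduced here, and the function names are illustrative only.

```python
import sqlite3

# Hypothetical stand-in for helpers/activity_log.sql; the schema that
# ships with Seheiah may look different.
EXAMPLE_SCHEMA = """
CREATE TABLE IF NOT EXISTS activity_log (
    id INTEGER PRIMARY KEY,
    starttime INTEGER NOT NULL,
    duration INTEGER NOT NULL
);
"""


def init_database(db_path, schema_sql):
    """Create (or open) the SQLite file and run every statement in schema_sql."""
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(schema_sql)
        conn.commit()
    finally:
        conn.close()


def list_tables(db_path):
    """Return the names of all tables in the database, sorted alphabetically."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

To load the real schema, read the contents of activity_log.sql into a string and pass it as schema_sql; list_tables then offers a quick check that the tables were actually created.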
The flow sensor requires some manual work: A series resistor has to be added. A tutorial can be found online [15], and an Arduino sketch is available under <path_to_seheiah>/helpers/flowmeter.c.
A udev rule, like the one shown in Listing 4, ensures that the Arduino consistently appears under the same device name instead of switching between, say, /dev/ttyUSB0 and /dev/ttyUSB1 from one run to the next. The lsusb command provides the necessary parameters after the microcontroller board is connected to the Rasp Pi. The executing user should be a member of the plugdev group so that data sent by the Arduino can be read easily later (you need to be root for this: adduser <username> plugdev).
Listing 4
udev Rules
#/etc/udev/rules.d/70-microcontrollers.rules
#arduino uno
SUBSYSTEMS=="usb", KERNEL=="<ttyACM[0-9]*>", ATTRS{idVendor}=="<2341>", ATTRS{idProduct}=="<0001>", SYMLINK+="<sensors/arduino_%s{serial}>", MODE="660", GROUP="plugdev"
#seeeduino
SUBSYSTEMS=="usb", KERNEL=="<ttyUSB[0-9]*>", ATTRS{idVendor}=="<0403>", ATTRS{idProduct}=="<6001>", SYMLINK+="<sensors/arduino_%s{serial}>", MODE="660", GROUP="plugdev"
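With a SYMLINK value like the one in Listing 4, udev should create a link such as /dev/sensors/arduino_<serial> that points to whichever ttyACM or ttyUSB node the kernel assigned. The following sketch lists those links and their current targets; the function name and the default directory are assumptions derived from the listing, not part of Seheiah.

```python
import glob
import os


def find_sensor_links(link_dir="/dev/sensors", prefix="arduino_"):
    """Map each symlink name under link_dir to the device node it points to."""
    links = {}
    for path in sorted(glob.glob(os.path.join(link_dir, prefix + "*"))):
        # Only report real symlinks; regular files are ignored.
        if os.path.islink(path):
            links[os.path.basename(path)] = os.path.realpath(path)
    return links
```

An empty result after plugging in the board suggests that the rule did not match, in which case the vendor and product IDs reported by lsusb are the first thing to compare against the rule file.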
Speech Recognition
The voice recognition software requires the most effort. Seheiah uses PocketSphinx [16] with a custom acoustic model generated for the individual user. The advantage of this setup is that recognition is optimized directly for the future user, so slurred pronunciation or dialects do not cause any problems. The disadvantage is that the acoustic model must be trained intensively. Once that's done, four commands ("alarm off," "bye bye," "help," and "test") are all it takes to control Seheiah. These commands must be preceded by the "Seheiah" trigger to avoid false positives. It would be tragic, for example, if Grandma came home from a long trip, phoned her loved ones to report back, said "bye bye" at the end of the call, and then had a nasty fall.
For voice control of Seheiah, you need Sphinxbase [17] and SphinxTrain [18] on top of PocketSphinx. A number of dependencies can be resolved by issuing the
apt-get install cython python-gst0.10 python-gst0.10-dev gstreamer-tools gstreamer0.10-plugins-base libpulse-dev gstreamer0.10-pulseaudio
command. PocketSphinx is called in Seheiah via a GStreamer pipeline [19]. The line
export GST_PLUGIN_PATH=/usr/local/lib/gstreamer-0.10
in ~/.profile ensures that the matching plugin is found later without much ado.
Furthermore, a few adjustments are needed to save yourself some surprises down the line. In the Sphinxbase directory, you need to delete the python/sphinxbase.c file, along with the python/pocketsphinx.c file in the PocketSphinx folder. These files are buggy and are regenerated during the Cython installation process.
In the gstpocketsphinx.c and gstvader.c files below /pocketsphinx-0.8/src/gst-plugin, you will want to change the sample rate [rate=(int)] from 8000 to 16000. A rate of 8,000 Hz is only intended for voice recognition via phone. You can then install Sphinxbase, PocketSphinx, and SphinxTrain by running ./configure, make clean all, and then, with root privileges, make install for each.
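The sample rate change can also be scripted instead of edited by hand. The sketch below works on the file contents as a string; the exact spelling of the rate entry in the C sources is an assumption, so the pattern tolerates optional whitespace around the equals sign and the (int) cast. Checking the returned replacement count before writing anything back is a sensible precaution.

```python
import re

# Assumption: the caps string in the C sources contains something like
# "rate = (int) 8000"; whitespace around "=" and "(int)" is tolerated.
RATE_PATTERN = re.compile(r"(rate\s*=\s*\(int\)\s*)8000\b")


def patch_sample_rate(source_text, new_rate=16000):
    """Return (patched_text, number_of_replacements)."""
    return RATE_PATTERN.subn(
        lambda match: match.group(1) + str(new_rate), source_text
    )
```

A count of zero means the pattern did not match the file, and the sources should be inspected manually rather than assumed to be patched.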
To help train the acoustic model, Seheiah comes with a language model and some configuration files (<path_to_seheiah>/acoustic_model/). At the moment, these are customized German models. If your senior is a native English speaker, take a look at the PocketSphinx wiki [20] [21] to build your own model and customize the final_result(self, hyp, uttid) method in gstSphinxCli.py.
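How final_result evaluates the recognized text is not shown in this article, but the trigger rule described above can be sketched as a small helper that a customized final_result might call with its hyp argument. The function name, the constants, and the English command strings are assumptions for illustration.

```python
TRIGGER = "SEHEIAH"
COMMANDS = ("ALARM OFF", "BYE BYE", "HELP", "TEST")


def parse_command(hyp):
    """Return the recognized command, or None if the trigger is missing."""
    if not hyp:
        return None
    words = hyp.upper().split()
    # A command only counts if the trigger word comes first.
    if not words or words[0] != TRIGGER:
        return None
    candidate = " ".join(words[1:])
    return candidate if candidate in COMMANDS else None
```

With this rule, a casual "bye bye" at the end of a phone call returns None, whereas "Seheiah bye bye" returns the command.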
The biggest chore before going live is recording enough raw material. For a small vocabulary, the CMU Sphinx developers recommend five hours of audio per speaker. In our lab, viable results were obtained after 50 repetitions of each command; reaching the recommended five hours would mean repeating each command 400 to 500 times.
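The repetition count follows from simple arithmetic. The sketch below assumes five-second clips (the -d 5 option of the arecord call used in this article) and eight distinct utterances, that is, the four commands recorded with and without the trigger; the utterance count is an assumption, not a figure from the Seheiah documentation.

```python
import math


def repetitions_needed(target_hours, clip_seconds, distinct_utterances):
    """Repetitions per utterance needed to fill target_hours with fixed-length clips."""
    total_clips = math.ceil(target_hours * 3600 / clip_seconds)
    return int(math.ceil(total_clips / float(distinct_utterances)))


# 5 hours = 18,000 seconds = 3,600 clips of 5 seconds each;
# spread over 8 utterances, that is 450 repetitions apiece.
print(repetitions_needed(5, 5, 8))  # -> 450
```

Under these assumptions, the result lands in the middle of the 400 to 500 range; with fewer distinct utterances, the count per utterance rises accordingly.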
The commands are stored in the wav/ subdirectory using
arecord -r 16000 -D hw:1,0 -d 5 -f S16_LE -c 1 <filename#>.wav
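Recording several hundred clips by hand quickly becomes tedious, so a small wrapper around the arecord call can help. The sketch below reuses the options from the command above; the numbered file naming scheme and the function names are hypothetical and need to be aligned with the names expected by the training files.

```python
import subprocess


def arecord_command(wav_path, device="hw:1,0", seconds=5, rate=16000):
    """Build the arecord argument list for a single mono 16-bit clip."""
    return [
        "arecord",
        "-r", str(rate),
        "-D", device,
        "-d", str(seconds),
        "-f", "S16_LE",
        "-c", "1",
        wav_path,
    ]


def record_series(stem, count, run=subprocess.check_call):
    """Record count clips named wav/<stem>_0001.wav, wav/<stem>_0002.wav, ..."""
    paths = []
    for index in range(1, count + 1):
        path = "wav/%s_%04d.wav" % (stem, index)
        run(arecord_command(path))  # blocks until the clip is complete
        paths.append(path)
    return paths
```

Passing a different callable as run, for example a function that only prints the command, allows a dry run without touching the sound card.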
File names of the form <filename#>.wav are intended for testing purposes. Three instances exist, complete with all commands and the trigger (Seheiah+command). The file names for the individual commands are shown in Table 1.
Table 1
File Names and Associated Commands (a)
File Name | Command |
---|---|
 | SEHEIAH ALARM OFF |
 | OFF |
 | SEHEIAH BYE BYE |
 | SEHEIAH HELP |
 | HELP |
 | SEHEIAH TEST |
(a) For illustrative purposes; the Seheiah acoustic and language models presently are only in German.
PocketSphinx did have difficulty with the OFF command in our lab, so it should be practiced intensively. You can specify the number of files individually. The files to modify are 7646_test.fileids, 7646_test.transcription, 7646_train.fileids, and 7646_train.transcription in the etc directory of the acoustic model. Care must be taken to ensure that the entry in the nth row of <file>.fileids matches the entry in the nth row of <file>.transcription to prevent nonsensical behavior of the voice recognition system.
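A row-by-row check catches such mismatches before a long training run. The sketch below assumes the usual SphinxTrain layout, in which each transcription line ends with the utterance ID in parentheses and that ID equals the base name of the corresponding .fileids entry; the function name and the sample file names are illustrative.

```python
import os
import re

# Utterance ID in parentheses at the end of a transcription line.
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def find_mismatches(fileids_lines, transcription_lines):
    """Return (row_number, file_id, utterance_id) for every row that differs."""
    ids = [line.strip() for line in fileids_lines if line.strip()]
    trans = [line.strip() for line in transcription_lines if line.strip()]
    problems = []
    for row in range(max(len(ids), len(trans))):
        file_id = os.path.basename(ids[row]) if row < len(ids) else None
        match = UTT_ID.search(trans[row]) if row < len(trans) else None
        utt_id = match.group(1) if match else None
        if file_id is None or file_id != utt_id:
            problems.append((row + 1, file_id, utt_id))
    return problems
```

The lines can be read straight from the files, for example with open("etc/7646_train.fileids").readlines(); an empty list means the two files line up row for row.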
Once the files exist, the training process can be initiated in the <path_to_seheiah>/acoustic_model directory by issuing the sphinxtrain run command. At the end of each training session, a test is carried out that uses the test files to identify the recognition rate (Figure 4). During training, it is advisable to give the commands from different positions in the room and optimize the recording level.
With ALSA, the Rasp Pi complained during speech recognition that the recording stream could not be processed quickly enough. The problem was solved by switching to PulseAudio, although a surprising number of configuration steps were needed before the voice recognition system worked smoothly [19]. This setup required an /etc/asound.conf file (Listing 5).
Listing 5
/etc/asound.conf
#/etc/asound.conf
pcm.pulse {
    type pulse
}
ctl.pulse {
    type pulse
}
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}
In addition, the value of DISALLOW_MODULE_LOADING in /etc/default/pulseaudio had to be set to 0, and pulse had to be specified as the default driver instead of alsa in /etc/libao.conf. Other changes related to /etc/pulse/daemon.conf are shown in Listing 6. Finally, the executing user still needs to join the pulse-access group by running adduser <username> pulse-access.
Listing 6
/etc/pulse/daemon.conf
#/etc/pulse/daemon.conf
daemonize = yes
high-priority = yes
nice-level = 5
exit-idle-time = -1
resample-method = src-sinc-medium-quality
default-sample-format = s16le
default-sample-rate = 48000
default-sample-channels = 2
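Because daemon.conf ships with many commented-out defaults, it is easy to overlook a setting. The following sketch compares the active key = value lines of a configuration text against the values from Listing 6; the parser is deliberately simple, skips lines that start with # or ;, and is an illustration rather than a full implementation of the PulseAudio file format.

```python
# Values taken from Listing 6.
EXPECTED = {
    "daemonize": "yes",
    "high-priority": "yes",
    "nice-level": "5",
    "exit-idle-time": "-1",
    "resample-method": "src-sinc-medium-quality",
    "default-sample-format": "s16le",
    "default-sample-rate": "48000",
    "default-sample-channels": "2",
}


def parse_daemon_conf(text):
    """Collect active 'key = value' settings, skipping comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def differing_settings(text, expected=EXPECTED):
    """Return {key: actual_value_or_None} for every expected key that differs."""
    actual = parse_daemon_conf(text)
    return dict(
        (key, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    )
```

Calling differing_settings(open("/etc/pulse/daemon.conf").read()) returns an empty dictionary once all values from the listing are active.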