Activity monitoring for seniors living alone

Installation and Commissioning

Seheiah [14] itself is relatively easy to install: you just need to download the Python scripts and install the packages listed in the how-to using apt-get. Seheiah is a daemon implemented in Python 2.7 comprising four concurrent threads (database, behavior monitoring, alarm cascade, speech recognition); it is configured in the central configuration file, seheiah.cfg. The database is set up via

sqlite3 <name.db> < <path_to_seheiah>/helpers/activity_log.sql
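The daemon's database thread then writes sensor events into this database. The following sketch illustrates the idea; the actual schema lives in helpers/activity_log.sql, so the table and column names here are assumptions for illustration only.

```python
# Sketch: how a database thread might log sensor events.
# The table layout below is assumed; the real schema is defined
# in <path_to_seheiah>/helpers/activity_log.sql.
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # use <name.db> in practice
conn.execute("""CREATE TABLE IF NOT EXISTS activity_log (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    sensor TEXT NOT NULL,
                    value REAL,
                    timestamp REAL NOT NULL)""")

def log_activity(sensor, value):
    """Record one sensor reading with the current Unix timestamp."""
    conn.execute(
        "INSERT INTO activity_log (sensor, value, timestamp) VALUES (?, ?, ?)",
        (sensor, value, time.time()))
    conn.commit()

log_activity("flowmeter", 0.8)
rows = conn.execute("SELECT sensor, value FROM activity_log").fetchall()
```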

The flow sensor requires some manual attention that involves adding a series resistor. A tutorial can be found online [15], and an Arduino sketch is available under <path_to_seheiah>/helpers/flowmeter.c.

A udev rule, like the one shown in Listing 4, ensures that the Arduino always appears under the same device path, instead of being identified as /dev/ttyUSB0 one time and /dev/ttyUSB1 the next. The lsusb command provides the necessary parameters after the microcontroller board is connected to the Rasp Pi. The executing user should be a member of the plugdev group so that the data sent by the Arduino can be read later (you need to be root for this: adduser <username> plugdev).
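On the Rasp Pi side, the daemon reads the readings the Arduino sends over the stable symlink created by the udev rule. The "FLOW:<pulses>" line format below is an assumption for illustration; check flowmeter.c for the actual protocol.

```python
# Sketch: parsing flow meter readings arriving over the serial port.
# The "FLOW:<pulses>" line format is assumed for illustration;
# the real protocol is defined in helpers/flowmeter.c.
def parse_flow_line(line):
    """Return the pulse count from a line such as 'FLOW:42', or None."""
    line = line.strip()
    if not line.startswith("FLOW:"):
        return None
    try:
        return int(line.split(":", 1)[1])
    except ValueError:
        return None

# In the daemon, the lines would come from the udev symlink, e.g.
# opened with pyserial: serial.Serial("/dev/sensors/arduino_<serial>")
```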

Listing 4

udev Rules

#/etc/udev/rules.d/70-microcontrollers.rules
#arduino uno
SUBSYSTEMS=="usb", KERNEL=="<ttyACM[0-9]*>", ATTRS{idVendor}=="<2341>", ATTRS{idProduct}=="<0001>", SYMLINK+="<sensors/arduino_%s{serial}>", MODE="660", GROUP="plugdev"
#seeeduino
SUBSYSTEMS=="usb", KERNEL=="<ttyUSB[0-9]*>", ATTRS{idVendor}=="<0403>", ATTRS{idProduct}=="<6001>", SYMLINK+="<sensors/arduino_%s{serial}>", MODE="660", GROUP="plugdev"

Speech Recognition

The voice recognition software requires the most effort. PocketSphinx [16] is used for Seheiah, and a private acoustic model is generated. The advantage of this setup is that it is directly optimized for the future user, and slurred pronunciation or dialects do not cause any problems. The disadvantage is that the acoustic model must be trained intensively. Once that's done, four commands ("alarm off," "bye bye," "help," and "test") are all it takes to control Seheiah. These commands must be preceded by the "Seheiah" trigger to avoid false positives. It would be tragic, for example, if Grandma came home from a long trip, phoned her loved ones to report back, said "bye bye" at the end of the call, and then had a nasty fall.
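The trigger logic described above can be sketched in a few lines of Python. The function and command names here are illustrative, not Seheiah's actual API: a recognizer hypothesis is only acted on if it starts with the trigger word.

```python
# Sketch of the trigger logic: a hypothesis from the recognizer is
# only acted on if it begins with the "SEHEIAH" trigger word.
# Function name and command list are illustrative assumptions.
COMMANDS = ("ALARM OFF", "BYE BYE", "HELP", "TEST")

def extract_command(hypothesis):
    """Return the recognized command if prefixed by the trigger, else None."""
    words = hypothesis.strip().upper()
    if not words.startswith("SEHEIAH "):
        return None  # e.g., a "bye bye" said on the phone is ignored
    command = words[len("SEHEIAH "):].strip()
    return command if command in COMMANDS else None
```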

For voice control of Seheiah, you need Sphinxbase [17] and SphinxTrain [18] on top of PocketSphinx. A number of dependencies can be resolved by issuing the

apt-get install cython python-gst0.10 python-gst0.10-dev gstreamer-tools gstreamer0.10-plugins-base libpulse-dev gstreamer0.10-pulseaudio

command. PocketSphinx is called in Seheiah via a GStreamer pipeline [19]. The export line

GST_PLUGIN_PATH=/usr/local/lib/gstreamer-0.10

in ~/.profile ensures that the matching plugin is found later without much ado.

Furthermore, a few adjustments are still needed to spare yourself surprises down the line. In the Sphinxbase directory, you need to delete the python/sphinxbase.c file, along with the python/pocketsphinx.c file in the PocketSphinx folder. These files are buggy and are regenerated during the Cython installation process.

In the gstpocketsphinx.c and gstvader.c files below /pocketsphinx-0.8/src/gst-plugin, you will want to change the sample rate [rate=(int)] from 8000 to 16000. A rate of 8,000Hz is only intended for voice recognition via phone. You can then install Sphinxbase, PocketSphinx, and SphinxTrain by running ./configure, make clean all, and then, with root privileges, make install for each.
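Rather than editing both source files by hand, the substitution can be scripted. The helper below is a sketch: it rewrites the rate field in a GStreamer caps string; the sample caps string is an assumption for illustration.

```python
# Sketch: bumping rate=(int)8000 to rate=(int)16000 in the GStreamer
# plugin sources, instead of editing gstpocketsphinx.c and gstvader.c
# by hand. The caps string below is an illustrative example.
import re

def bump_sample_rate(source, old=8000, new=16000):
    """Replace rate=(int)<old> with rate=(int)<new> in a source string."""
    return re.sub(r"rate=\(int\)%d" % old, "rate=(int)%d" % new, source)

caps = "audio/x-raw-int, rate=(int)8000, channels=(int)1"
fixed = bump_sample_rate(caps)
```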

To help train the acoustic model, Seheiah comes with a language model and some configuration files (<path_to_seheiah>/acoustic_model/). At the moment, these are customized German models. If your senior is a native English speaker, take a look at the PocketSphinx wiki [20] [21] to build your own model and customize the gstSphinxCli.py method final_result(self, hyp, uttid).

The biggest chore before going live is recording enough raw material. The Sphinx developers recommend five hours of audio per speaker for a small vocabulary. In our lab, viable results were obtained after 50 repetitions of each command; to reach the five-hour training set recommended by the CMU Sphinx developers, each command needs to be repeated 400 to 500 times.

The commands are stored in the wav/ subdirectory using

arecord -r 16000 -D hw:1,0 -d 5 -f S16_LE -c 1 <filename#>.wav
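A short loop can generate the numbered arecord calls for a recording session, so the hash mark in <filename#>.wav becomes a consecutive number. The loop bounds and wav/ prefix below follow the layout described above; treat this as an illustrative helper, not Seheiah code.

```python
# Sketch: generating numbered arecord command lines for a recording
# session, so <filename#>.wav becomes filename1.wav, filename2.wav, ...
# (helper name and loop bounds are illustrative assumptions).
def arecord_commands(name, count, device="hw:1,0"):
    """Yield one arecord command line per numbered WAV file."""
    template = "arecord -r 16000 -D %s -d 5 -f S16_LE -c 1 wav/%s%d.wav"
    for i in range(1, count + 1):
        yield template % (device, name, i)

cmds = list(arecord_commands("help", 3))
```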

In file names of the form <filename#>.wav, the hash mark stands for a consecutive number; files of this form are also used for testing purposes. Three test instances exist, complete with all commands and the trigger (Seheiah + command). The file names for the individual commands are shown in Table 1.

Table 1

File Names and Associated Commands (a)

File Name        Command
alarm_off#.wav   SEHEIAH ALARM OFF
off#.wav         OFF
bye#.wav         SEHEIAH BYE BYE
help#.wav        SEHEIAH HELP
ohhelp#.wav      HELP
test#.wav        SEHEIAH TEST

(a) For illustration purposes; the Seheiah acoustic and language models are presently available only in German.

In our lab, PocketSphinx had difficulty with the OFF command, so it should be practiced intensively. The number of recordings per command can be specified individually; the files to modify are 7646_test.fileids, 7646_test.transcription, 7646_train.fileids, and 7646_train.transcription in the etc directory of the acoustic model. Take care that the entry in the nth row of <file>.fileids matches the nth row of <file>.transcription, to prevent nonsensical behavior by the voice recognition system.
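The row-by-row correspondence can be verified mechanically before starting a training run. The checker below is a sketch: it assumes the SphinxTrain convention that each transcription line repeats the file ID in parentheses at the end, as in "<s> SEHEIAH HELP </s> (help1)".

```python
# Sketch: checking that the nth line of a .fileids list matches the
# nth line of the corresponding .transcription file. Assumes the
# SphinxTrain convention "<s> WORDS </s> (file_id)"; illustrative
# helper, not part of Seheiah itself.
def check_alignment(fileids, transcriptions):
    """Return the 1-based line numbers where the file IDs disagree."""
    mismatches = []
    for n, (fid, trans) in enumerate(zip(fileids, transcriptions), 1):
        # the file ID is repeated in parentheses at the end of each line
        if "(%s)" % fid.strip() not in trans:
            mismatches.append(n)
    return mismatches
```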

Once the files exist, the training process can be initiated in the <path_to_seheiah>/acoustic_model directory by issuing the sphinxtrain run command. At the end of each training session, a test is carried out that uses the test files to identify the recognition rate (Figure 4). During training, it is advisable to give the commands from different positions in the room and optimize the recording level.

Figure 4: Twenty iterations per command still lead to a high failure rate; 50 training sets per command are better, and 500 deliver the best results.

During speech recognition with ALSA, the Rasp Pi complained that the recording stream could not be processed quickly enough. The problem was solved by switching to PulseAudio, which involved a surprising number of configuration steps before the voice recognition system worked smoothly [19]. This setup required an /etc/asound.conf file (Listing 5).

Listing 5

/etc/asound.conf

#/etc/asound.conf
pcm.pulse {
    type pulse
}
ctl.pulse {
    type pulse
}
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}

Specifically, the value of DISALLOW_MODULE_LOADING in /etc/default/pulseaudio had to be set to 0. And, in /etc/libao.conf, pulse had to be specified as the default driver instead of alsa. Other changes related to /etc/pulse/daemon.conf are shown in Listing 6. Finally, the executing user still needs to join the pulse-access group by running adduser <username> pulse-access.

Listing 6

/etc/pulse/daemon.conf

#/etc/pulse/daemon.conf
daemonize = yes
high-priority = yes
nice-level = 5
exit-idle-time = -1
resample-method = src-sinc-medium-quality
default-sample-format = s16le
default-sample-rate = 48000
default-sample-channels = 2
