blog/_posts/2017-09-03-kalliope-assista...

title: Kalliope, a voice-controlled personal assistant (running on qemu)
updated: 2017-09-16 23:00
tags: qemu, ssh, assistant, voice, parabola, debian, kalliope
description: How to set up Kalliope, a voice-controlled personal assistant, if you don't run an officially supported distro like Debian

Hello again,

For some time I have wanted to try out a personal voice assistant that is 100% free software, in working condition and simple to customize. Unfortunately there aren't many that satisfy all of those conditions.

There is, however, an interesting project called Kalliope which apparently meets all the requirements I was looking for (or comes very close anyway :) ). Here's a video posted by one of the authors.

Introduction

There are a few aspects of Kalliope that I've grasped in a few hours of usage and that I would like to share. First of all, basic configuration is done through YAML files, where you describe the expected inputs and outputs for each case. Inputs are voice commands, while outputs can be any combination of speech, shell scripts or ad-hoc Python scripts. You will find examples once you install a specific language "starter".

According to the official documentation there are just a handful of supported GNU/Linux distros. Unfortunately no Arch Linux derivative (including Parabola) is among them. In fact, after trying both the manual installation and an AUR package without good results, I decided to switch to one of the officially supported distros: Debian (at least for the moment).
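As a rough sketch of what one input/output pair looks like, the snippet below writes a single-entry brain file. The layout (`signals` with an `order`, `neurons` with `say`) follows the Kalliope documentation as I understand it and may differ between versions. The file name `example_brain.yml`, the entry name and the sentences are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a one-entry Kalliope brain file.
# BRAIN_FILE is a hypothetical path; the language starter ships its own brain file.
BRAIN_FILE="${BRAIN_FILE:-example_brain.yml}"

# Quoted 'EOF' keeps the shell from expanding anything inside the YAML.
cat > "$BRAIN_FILE" <<'EOF'
---
  # Input: the spoken order. Output: a spoken reply.
  - name: "say-hello"
    signals:
      - order: "say hello"
    neurons:
      - say:
          message: "Hello, sir"
EOF

echo "wrote $BRAIN_FILE"
```

Compare the result with the brain file in the starter before relying on it, since the starter is the authoritative example for your installed version.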

Alternative installation

Now, since I don't have any working installation of Debian 8 (Jessie), I attempted to run it inside QEMU. To do this I used the qvm script I wrote a while ago. Once I had Debian up and running I tried the microphone test. It turned out that there was no sound card available in the virtual machine. Adding a couple of options to the qvm script did the trick: now I could record and listen to my voice through QEMU.

If you are on Parabola or Arch Linux, please make sure to install qemu instead of qemu-headless, otherwise audio will be unusable. Apparently the ALSA audio driver is not included in the qemu-headless package. See $ qemu-system-x86_64 -audio-help for more information.

According to the official documentation, on Debian 8 the contrib and non-free repositories should be enabled before installing the dependencies. Fortunately, only one of the dependencies is non-free: libttspico-utils. I simply removed it from the installation command and left the sources.list file as-is:
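For reference, here is a minimal sketch of the kind of options involved. It is not the actual content of qvm: the disk image name and memory size are placeholders, and the choice of the `hda` sound card with the ALSA backend is an assumption. It relies on `-soundhw` and the `QEMU_AUDIO_DRV` environment variable, which QEMU releases of this period accept; newer releases use `-audiodev` and `-device` instead. The script only assembles and prints the command.

```shell
#!/bin/sh
# Sketch: assemble (but do not run) a QEMU command line with sound enabled.
# DISK_IMAGE is a hypothetical file name; adjust it to your own image.
DISK_IMAGE="${DISK_IMAGE:-debian8.qcow2}"

# QEMU_AUDIO_DRV=alsa selects the host audio backend.
# -soundhw hda exposes an Intel HD Audio card to the guest.
QEMU_CMD="env QEMU_AUDIO_DRV=alsa qemu-system-x86_64 -enable-kvm -m 1024 -soundhw hda -drive file=${DISK_IMAGE},format=qcow2"

# Print the command for review instead of launching the VM.
echo "$QEMU_CMD"
```

Once the printed command looks right for your setup, it can be started with `eval "$QEMU_CMD"`.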

# apt-get update
# apt-get dist-upgrade
# apt-get install git python-dev libsmpeg0 flac dialog libffi-dev libssl-dev portaudio19-dev build-essential sox libatlas3-base mplayer libav-tools

After that, I continued with the official documentation and followed Method 1 - User install using the PIP package. Apparently every other component is free software, but this still needs to be verified thoroughly. Because I removed libttspico-utils, which is an offline text-to-speech (TTS) engine, I had to use another one. All of the alternatives except espeak are online services, so I chose espeak. Here you will find the installation instructions for espeak. For speech-to-text (STT) I used Google's engine because, although cloud-based, it is much more precise than the offline STT engines at the moment.

Finally I downloaded the English starter and tested it out. Everything worked.
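Before pointing Kalliope at espeak, it can help to check that espeak itself works offline. The following is a small sketch, assuming espeak has been installed from the Debian repositories (as root: `apt-get install espeak`). The output file name and the test sentence are arbitrary. It writes a WAV file rather than playing the sound, so it does not depend on the sound card.

```shell
#!/bin/sh
# Sketch: check that espeak can synthesize speech without a network connection.
# OUT_WAV is a hypothetical output file name.
OUT_WAV="${OUT_WAV:-espeak_test.wav}"

if command -v espeak >/dev/null 2>&1; then
    # -v selects the voice, -w writes a WAV file instead of playing it.
    if espeak -v en -w "$OUT_WAV" "Kalliope is ready" && [ -s "$OUT_WAV" ]; then
        ESPEAK_STATUS="ok"
    else
        ESPEAK_STATUS="failed"
    fi
else
    ESPEAK_STATUS="missing"
fi

echo "espeak: $ESPEAK_STATUS"
```

If the status is `ok`, the resulting file can be played inside the virtual machine (for example with `aplay`) to confirm that guest audio output works as well.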

{% include image.html file="0.png" alt="kalliope 0" caption="Kalliope is waiting for the trigger word" %}

{% include image.html file="1.png" alt="kalliope 1" caption="Kalliope has listened and executes a command" %}

SSH

After testing all of this with the -x option of qvm, I made another "discovery" by running qvm with:

$ ./qvm -n
$ ./qvm -a

i.e. connecting via SSH only. Apparently audio is still routed in this mode. If you take this into account, together with one of the [previous posts]({{ site.baseurl }}/notes/qemu-ssh-tunnel.html), maybe you could somehow have one instance of Kalliope running in QEMU with an unlimited (!?) number of remote audio I/O devices...

Cheers!