Asterisk Voicemail Transcription via IBM Cloud Speech-to-Text API

Update June 2022: I updated this guide to use the new IBM speech-to-text model. They have replaced the old “en-US_NarrowbandModel” model with a new “en-US_Telephony” model. Simply replace the old model name in your API request with the new model name. The old model is deprecated and scheduled to be removed from service on September 15, 2022.

Updated February 2020: I updated this guide to reflect setup on a new Asterisk server running FreePBX 14 and Asterisk 13. Our current Asterisk server runs on a small Vultr Cloud Compute server (2 CPU, 4GB RAM, 80GB SSD) for $20/month. Vultr gives you the option to upload a custom ISO (e.g. the latest FreePBX distro) or to choose an ISO from their library (e.g. a recent FreePBX distro). Pro Tip: After you set up your server, don’t forget to remove the ISO (aka CD image) from your server configuration so it does not keep booting to the ISO after each reboot.

Introduction

This guide briefly explains how to configure Asterisk PBX to send each voicemail as an email with the message attached as an MP3 and a text transcription generated by the IBM Cloud Speech-to-Text API. IBM provides 500 minutes of Speech-to-Text for free per month, then charges $0.02 for each additional minute.
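
As a rough worked example of the pricing above, the monthly charge can be estimated in whole cents. This is only a sketch: the 500 free minutes and the $0.02 rate are the figures quoted in this guide and may have changed, and the function name is made up for illustration.

```shell
# Hypothetical helper: estimate the monthly IBM Speech-to-Text charge in cents.
# Assumes the figures quoted in this guide: 500 free minutes, then 2 cents per minute.
stt_monthly_cost_cents() {
    minutes=$1
    free_minutes=500
    cents_per_minute=2
    if [ "$minutes" -le "$free_minutes" ]; then
        echo 0
    else
        echo $(( (minutes - free_minutes) * cents_per_minute ))
    fi
}

# Example: 750 minutes of voicemail in one month prints 500 (i.e. $5.00)
stt_monthly_cost_cents 750
```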

I was using this Asterisk Transcriptions with Google (backup link) script, but moved to the IBM service because Google discontinued V1 of their Speech Recognition API and appears to charge for V2. I did not use AWS speech-to-text because their API does not return the transcription immediately; you have to submit a job and keep polling for results.

Sending MP3-formatted Voicemail Attachments

Go to the article Asterisk setup voicemail to send email with mp3 attachment and follow Nicolas Bernaerts’ instructions. You must successfully implement sending emails with MP3 attachments via his custom script (/usr/sbin/sendmailmp3) before you proceed.

Get IBM Cloud Credentials for the Speech-to-Text service

Go to the IBM Speech to Text page for current rates. Click “Get Started Free” to sign up. Once signed up, go to your IBM Resource List and open your Speech to Text service to view your IBM credentials. Make note of your API KEY and your API URL.

Sending Transcription with Voicemail Attachments

  1. We will use the “curl” command to send your voicemail file to IBM and retrieve the transcription results.  If the command is not already installed, install it now.
    # Debian or Ubuntu OS
    apt-get install curl
    # Redhat or CentOS
    yum install curl
  2. Test IBM Cloud Speech-to-Text API by running the following command.  Replace API_PASSWORD and API_URL with your IBM Cloud Credentials.  Specify the full path to an existing recording from your Asterisk mailboxes (/var/spool/asterisk/voicemail/default/) or your Asterisk custom recordings (/var/lib/asterisk/sounds/custom/).

    curl -X POST -u apikey:API_PASSWORD --header "Content-Type: audio/wav" --data-binary @/var/lib/asterisk/sounds/en/cdir-transferring-further-assistance.wav "API_URL/v1/recognize?model=en-US_Telephony&smart_formatting=true"

    The example above should produce a result similar to the following:

    {
      "results": [
        {
          "alternatives": [
            {
              "confidence": 0.9182463884353638,
              "transcript": "we are now transferring you out of the company directory please hold on for further assistance "
            }
          ],
          "final": true
        }
      ],
      "result_index": 0
    }
  3. Go to the article Asterisk Voicemail with Speech Recognition using Google API (backup link) and download/install Nicolas Bernaerts’ updated sendmail script. This will replace your existing WAV2MP3 script with a new script that can fetch transcriptions.

    Note: You do NOT need to install the “sox” or “flac” packages they mention. Asterisk records voicemails in wav audio format. In the guide above, the files had to be converted to “.flac” format for Google, but IBM can use the original “.wav” audio file.
  4. Replace the following lines:
    # convert wav file to flac compatible for Google speech recognition
    sox stream.part3.wav -r 16000 -b 16 -c 1 audio.flac vad reverse vad reverse lowpass -2 2500
    # call Google Voice Recognition sending flac file as POST
    curl --data-binary @audio.flac --header 'Content-type: audio/x-flac; rate=16000' 'https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=0&lang='$LANGUAGE'&maxresults=1' 1>audio.txt
    # extract the transcript and confidence results
    FILETOOBIG=`cat audio.txt | grep "<HTML>"`
    TRANSCRIPT=`cat audio.txt | cut -d"," -f3 | sed 's/^.*utterance\":\"\(.*\)\"$/\1/g'`
    CONFIDENCE=`cat audio.txt | cut -d"," -f4 | sed 's/^.*confidence\":0.\([0-9][0-9]\).*$/\1/g'`


    With these new lines:

    CURL_OPTS=""
    API_PASSWORD="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    API_URL="https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"

    # Send WAV to IBM Cloud Speech to Text API. Must use the "Telephony" (8 kHz) model since the WAV is sampled at 8 kHz.
    curl -s $CURL_OPTS -k -u apikey:$API_PASSWORD -X POST \
    --limit-rate 40000 \
    --header "Content-Type: audio/wav" \
    --data-binary @stream.part3.wav \
    "${API_URL}/v1/recognize?model=en-US_Telephony&smart_formatting=true" 1>audio.txt

    # Extract transcript results from JSON response
    TRANSCRIPT=`cat audio.txt | grep transcript | sed 's#^.*"transcript": "##g' | sed 's# "$##g'`

  5. Optional – If you use a proxy server, you may need to specify curl options similar to those shown below.
    CURL_OPTS="-x squid.example.org:3128"
  6. Optional Tip – I added the following option to my CURL_OPTS on an older server that was receiving an SSL error. Forcing TLSv1 resolved the issue.
    CURL_OPTS="--tlsv1 -x squid.example.org:3128"
  7. You may want to remove the extra lines related to FILETOOBIG and CONFIDENCE since we do not use these with IBM Cloud.
  8. Call into your Asterisk PBX and leave a message.  Asterisk should now send you an email with a transcription!
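
Longer messages may come back from IBM as several "results" entries, each with its own "transcript" line. The sketch below joins them into a single line, assuming the response is pretty-printed with one "transcript" key per line as in the sample output shown earlier. The function name extract_transcript is hypothetical and is not part of Nicolas Bernaerts’ script.

```shell
# Hypothetical helper: join every "transcript" value in a saved IBM response
# into one line. Assumes pretty-printed JSON with one "transcript" per line.
extract_transcript() {
    # $1 = path to the JSON response saved by curl (audio.txt in the script)
    grep '"transcript"' "$1" \
        | sed -e 's#^.*"transcript": *"##' -e 's# *",\{0,1\} *$##' \
        | tr '\n' ' ' \
        | sed -e 's# *$##'
}

# Possible use inside the sendmail script:
# TRANSCRIPT=$(extract_transcript audio.txt)
```

Plain grep and sed are used here only to avoid adding a dependency; a real JSON parser would be more robust against escaped quotes inside a transcript.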

Debian Wheezy + Postfix Cluebringer (Policyd v2) + IPv6

I enabled Postfix Cluebringer (policyd v2) on Debian Wheezy so that I could place restrictions on customer outbound mail traffic.  I disabled all modules except core and quota.  I configured two quota rules (X messages/day and X MB/day).  Mail was flowing.  The quota status was being tracked in SQL.  Everything seemed to be working fine.

Several customers were unable to send messages.  The affected customers were connecting to our SMTP servers via IPv6.  I can now confirm that postfix-cluebringer v2.0.10 does *NOT* include support for IPv6.  The following message was logged to the cbpolicyd.log file each time a customer tried to send a message via SMTP over IPv6:

[2014/03/09-17:49:19 - 4009] [CBPOLICYD] ERROR: Protocol data validation error, required parameter 'client_address' was not found or invalid format

I confirmed that policyd v2.0.14 does NOT support IPv6 either, so don’t waste time trying to upgrade v2.0.10 to v2.0.14.

I manually upgraded to the v2.1.x pre-release (policyd v2.1.x-201310261831), which DOES support IPv6.  The SQL schema had MANY minor changes between v2.0.x and v2.1.x.  I compared schemas and determined that the SQL upgrade guide was incomplete.  Rather than trying to deal with upgrade issues, I created a new database and manually recreated our quota records/limits.  If you can’t afford to lose your customer quota status/history (see quotas_tracking table), be sure to back up that data and restore those records after the upgrade.  Good luck.
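
To gauge how often the IPv6 problem occurs before (or after) an upgrade, the validation error quoted above can be counted in the policyd log. This is a minimal sketch: the function name is hypothetical and the log path in the usage comment is an assumption, so substitute wherever your cbpolicyd.log actually lives.

```shell
# Hypothetical helper: count cbpolicyd "client_address" validation errors in a log file.
count_client_address_errors() {
    # $1 = path to the cbpolicyd log
    grep -c 'Protocol data validation error.*client_address' "$1"
}

# Example usage (log location is an assumption; check your own setup):
# count_client_address_errors /var/log/cbpolicyd/cbpolicyd.log
```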

Moving a Windows 2000 (win2k) Physical Computer to a Xen 3.0.3 DomU running on CentOS 5.x or RHEL5

This article describes how to successfully migrate a physical host running Windows 2000 to a Xen 3.0.3 DomU. The Xen Dom0 is running Linux CentOS 5.2 on Intel Xeon CPUs with VT extensions. Other online discussions and examples led me to believe this would NOT work unless I was using Xen 3.0.4 or later. The process was fairly simple and works with Windows 2000 Professional as well as Windows 2000 Server variants.

IMPORTANT: I assume you can already create and view other DomU virtual hosts on your Xen server. I also assume you are using a system that supports Intel-VT extensions and that those extensions are properly enabled. Keep in mind that the CPU, the motherboard, and the BIOS all need to support VT extensions. You may also need to enable VT extensions within your BIOS.

First, prep your server as per Microsoft KB 314082. You will save the entire block of registry changes into a “.reg” file and merge it with your own registry so that Win2K will recognize your Xen IDE adapter. You will extract the IDE related driver files that they list (Atapi.sys, Intelide.sys, Pciide.sys, and Pciidex.sys) into “system32\drivers\”. I installed a barebones Win2K PRO test DomU before spending a lot of time attempting a migration, so I simply copied the listed driver files from that working DomU to the physical system we were about to migrate. I only copied files that did not already exist. I did not overwrite any existing files.

Second, you will want to copy each hard disk in your WIN2K system to an image of equal size. Our WIN2K system had a 20GB hard disk, which we cloned to a 20GB image file on the Xen server. If your WIN2K system has large disks and a lot of unnecessary disk space (ie: 500GB, of which 450GB is free), consider using some sort of tool to migrate the system to a smaller disk (ie: 100GB, of which 50GB would be free) before proceeding. Otherwise, the disk image you copy to your Xen server will be wasting a lot of space!

Our system had a single hard disk, so we only had to create one image using the command below. Repeat this for each hard disk in your system. In our example below, the hard disk from the original system was presented as “/dev/hde”, and the image was saved to “/vservers/win2k-example/win2k.img”. Adjust your paths accordingly. Copying our 20GB disk to an image took just over 10 minutes.

[sourcecode language=”bash”]
time dd if=/dev/hde of=/vservers/win2k-example/win2k.img bs=4k
[/sourcecode]
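
Once dd finishes, it may be worth confirming that the image really matches the source disk before retiring the physical machine. The sketch below is one way to do that with cmp. The function name is hypothetical, and it assumes the source disk is not mounted or otherwise changing while the comparison runs.

```shell
# Hypothetical helper: confirm that an image file is byte-for-byte identical to its source.
verify_image() {
    src=$1
    img=$2
    if cmp -s "$src" "$img"; then
        echo "OK: $img matches $src"
    else
        echo "MISMATCH: $img differs from $src" >&2
        return 1
    fi
}

# Example usage with the paths from the dd command above:
# verify_image /dev/hde /vservers/win2k-example/win2k.img
```

The comparison reads both copies in full, so expect it to take at least as long as the original dd run.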

Lastly, we will create the file containing settings for this DomU. Our file was located at “/etc/xen/win2k-example” and had the following contents. I’m not sure that all of these settings are relevant, but this working config should give you a good starting point.

[sourcecode language=”bash”]
import os, re
arch = os.uname()[4]
if re.search('64', arch):
    arch_libdir = 'lib64'
else:
    arch_libdir = 'lib'

kernel = "/usr/lib/xen/boot/hvmloader"
builder='hvm'

memory = 512
shadow_memory = 8
name = "win2k-example"
vif = [ 'type=ioemu, bridge=xenbr0' ]

vcpus=1
disk = [ "file:/vservers/win2k-example/win2k.img,hda,w", "phy:/dev/hda,hdc:cdrom,r" ]

vnc = 1
vncunused = 1
vncconsole=1

boot="dc"
#boot="c"

acpi = 1
apic = 1
device_model = '/usr/' + arch_libdir + '/xen/bin/qemu-dm'
stdvga=0
serial='pty'

usbdevice='tablet'
[/sourcecode]

Now we can boot our new Windows 2000 DomU virtual host. Be prepared to reconfigure TCP/IP settings for your “new” network card and to resolve other driver issues after the migration.

[sourcecode language=”bash”]
xm create win2k-example
[/sourcecode]

 

2008/09/14 – Jason Klein

Configuring Sipura SPA-3000 as trunk within Asterisk VoIP PBX Server

This article describes how I successfully configured the Sipura SPA-3000 (fw 2.0.13) for use as a single line inbound/outbound trunk within Asterisk at Home (asterisk 1.2.1). Unlike the other examples I found, this configuration is fairly simple and does NOT require configuration of special extensions, etc. This configuration should be fairly secure, but any suggestions and/or feedback are very welcome!

When incoming calls are received by the SPA-3000, they are forwarded to the Asterisk PBX with CALLER ID information and can be routed like any other POTS trunk (ie: as per Incoming Calls config and/or Inbound Routing config by CID). When outgoing calls are placed through the SPA-3000, this device dials the number and connects the call. The person making the call WILL hear the DTMF tones (aka touch tones) that are dialed by the SPA-3000 just before the call is connected. I have not been able to find a way of preventing this (yet).

Configuring Trunk within Asterisk PBX using AMP

Login to AMP (Asterisk Management Portal). Navigate to Setup, Trunks, and choose “Add SIP Trunk”.

General Settings

[sourcecode language=”bash”]
Outbound Caller ID: (leave blank – cannot be used by POTS line)
Maximum Channels: 1 (required – see note below)
[/sourcecode]

NOTE: Each SPA-3000 supports a single channel. You need to set up multiple trunks for multiple SPA-3000 devices.

Outgoing Dial Rules

[sourcecode language=”bash”]
Dial Rules:
1+NXXNXXXXXX ; prefix 10 digit dialing with “1”
1NXXNXXXXXX ; allow all 11 digit dialing as-is
NXXXXXX ; allow all 7 digit dialing as-is
[/sourcecode]
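
To make the dial rules above concrete, here is a small sketch of how a rule of the form “prefix+pattern” could be applied to a dialed number. It is an illustration, not AMP’s actual code: the function names are hypothetical, and it assumes the usual Asterisk pattern letters (N = 2-9, X = 0-9).

```shell
# Translate an Asterisk-style pattern (N = 2-9, X = 0-9) into a regular expression.
pattern_to_regex() {
    printf '%s' "$1" | sed -e 's/N/[2-9]/g' -e 's/X/[0-9]/g'
}

# Hypothetical helper: apply one trunk dial rule to a number.
# Prints the number that would be dialed, or returns 1 if the rule does not match.
apply_dial_rule() {
    rule=$1
    number=$2
    case $rule in
        *+*) prefix=${rule%%"+"*}; pattern=${rule#*"+"} ;;  # "1+NXX..." adds a prefix
        *)   prefix=""; pattern=$rule ;;                    # plain pattern, dial as-is
    esac
    if printf '%s\n' "$number" | grep -Eq "^$(pattern_to_regex "$pattern")\$"; then
        printf '%s%s\n' "$prefix" "$number"
    else
        return 1
    fi
}

# Example: a 10 digit number gets "1" prepended
apply_dial_rule "1+NXXNXXXXXX" 4175551234
```

With the rules listed above, a 10 digit number is prefixed with “1”, while 11 digit and 7 digit numbers pass through unchanged.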

Outgoing Settings

[sourcecode language=”bash”]
Trunk Name: pstn_spa01

Peer Details:
auth=md5
context=from-pstn
dtmfmode=inband
fromuser=asterisk
host=10.10.10.21 ; IP address of SPA device
insecure=very
nat=yes ; omit if no NAT exists between PBX and SPA
port=5061
secret=012345678901
type=peer
username=asterisk
[/sourcecode]

Incoming Settings

[sourcecode language=”bash”]
User Context: spa01

User Details:
allow=ulaw
context=from-pstn
disallow=all
dtmfmode=inband
host=10.10.10.21 ; IP address of SPA device
insecure=very
nat=yes ; omit if no NAT exists between PBX and SPA
secret=KzBTALezmG1a
type=friend
[/sourcecode]

Registration

[sourcecode language=”bash”]
Register String: ; omit – not necessary to register w/ SPA device?
[/sourcecode]

Configuring Outbound Routing within Asterisk PBX using AMP

Login to AMP (Asterisk Management Portal). Navigate to Setup, Outbound Routing, and choose “Add Route”.

Add Route

[sourcecode language=”bash”]
Route Name: ; user preference, avoid special characters here?
pstnspa1

Dial Patterns: ; dial 5 plus 11 digit, 10 digit, and 7 digit numbers
; omit each “5|” to use trunk without dialing prefix
5|1NXXNXXXXXX ; accept 5 + 11 digit dialing
5|NXXNXXXXXX ; accept 5 + 10 digit dialing
5|NXXXXXX ; accept 5 + 7 digit dialing

Trunk Sequence: ; add each available SPA-3000 trunk
SIP/pstn_spa01
SIP/pstn_spa02
SIP/pstn_spa03
[/sourcecode]
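
The “5|” in the dial patterns above marks a dialing prefix that is matched and then stripped before the call is handed to the trunk. The sketch below illustrates that behavior under the same pattern assumptions (N = 2-9, X = 0-9). The function name is hypothetical and this is not AMP’s own implementation.

```shell
# Hypothetical helper: match a number against a route pattern and strip the
# dialing prefix that appears before the "|".
# Prints the number passed on to the trunk, or returns 1 if there is no match.
apply_route_pattern() {
    rule=$1
    number=$2
    case $rule in
        *"|"*) prefix=${rule%%"|"*}; pattern=${rule#*"|"} ;;
        *)     prefix=""; pattern=$rule ;;
    esac
    # Translate the Asterisk-style pattern letters into a regular expression
    regex=$(printf '%s' "$pattern" | sed -e 's/N/[2-9]/g' -e 's/X/[0-9]/g')
    if printf '%s\n' "$number" | grep -Eq "^${prefix}${regex}\$"; then
        printf '%s\n' "${number#"$prefix"}"
    else
        return 1
    fi
}

# Example: dialing 5 + 417-555-1234 hands 4175551234 to the trunk
apply_route_pattern "5|NXXNXXXXXX" 54175551234
```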

Configuring the Sipura SPA-3000

The following example only illustrates changes to default settings. Start by performing a factory reset of your SPA-3000. Connect a handset to the PHONE jack on the SPA-3000 and dial “****” to access the configuration menu, then dial “73738#” (aka “RESET#”) to perform a factory reset.

Login to the web interface of your SPA-3000, click “Admin”, then click “Advanced”. Configuration changes for each tab/page are shown below.

SYSTEM

[sourcecode language=”bash”]
USER PASSWORD: secretpwd ; secures the SPA web interface
; username ‘user’ or ‘admin’?

DHCP: no ; recommend static ip address
STATIC IP: 10.10.10.21
NETMASK: 255.255.255.240
GATEWAY: 10.10.10.30

HOSTNAME: voip-spa1 ; optional
DOMAIN: example.net ; optional
PRIMARY DNS: 10.10.10.2 ; optional
SECONDARY DNS: 10.10.10.3 ; optional
PRI NTP: ntp1.example.net ; optional
SEC NTP: ntp2.example.net ; optional
[/sourcecode]
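
When assigning a static address, a quick sanity check is that the SPA, the gateway, and the PBX all fall inside the same subnet for the chosen netmask. The following is a minimal sketch of that check using shell arithmetic. The function names are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: succeed if two addresses share a subnet for the given netmask.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Example with the values above: SPA address, gateway, netmask
same_subnet 10.10.10.21 10.10.10.30 255.255.255.240 && echo "gateway is reachable on-link"
```

With a 255.255.255.240 netmask, 10.10.10.21, 10.10.10.24 and 10.10.10.30 all fall in the same block, whereas the DNS servers at 10.10.10.2 and 10.10.10.3 do not and are reached through the gateway.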

SIP

[sourcecode language=”bash”]
RTP Packet Size: 0.020 ; improves sound quality (was 0.030)?
[/sourcecode]

REGIONAL

[sourcecode language=”bash”]
TIME ZONE: GMT-05:00 ; Central Daylight Time (Central Standard Time is GMT-06:00)
[/sourcecode]

PSTN LINE

[sourcecode language=”bash”]
NAT Mapping Enable: yes ; only change if NAT exists between PBX and SPA
NAT Keep Alive Enable: yes ; only change if NAT exists between PBX and SPA

PROXY: 10.10.10.24 ; IP address of Asterisk PBX
USE OUTBOUND PROXY: yes
REGISTER: no
REGISTER EXPIRES: 3600
MAKE CALL W/O REG: yes
ANSW CALL W/O REG: yes

DISPLAY NAME: ; leave blank
USER ID: 3501 ; optional?
PASSWORD: ; leave blank

DTMF Process INFO: Yes ; default value
DTMF Process AVT: No ; resolve issues with DTMF
DTMF Tx Method: Auto ; default value

DIAL PLAN 8: (S0<:s@10.10.10.24:5060>)
; forwards incoming PSTN calls to PBX
; resolve issues with DTMF

VOIP-TO-PSTN GW ENABLE: yes
VOIP CALL AUTH METHOD: http digest
ONE STAGE DIALING: yes
LINE1 VOIP CALLER DP: none
VOIP CALLER DEFAULT DP: none
LINE1 FALLBACK DP: none

VOIP USER 1 AUTH ID: asterisk
VOIP USER 1 DP: none
VOIP USER 1 PASSWORD: 012345678901

PSTN-TO-VOIP GW ENABLE: yes
PSTN CALL AUTH METHOD: none
PSTN RING THRU LINE 1: no ; incoming calls do not ring LINE1
PSTN CID FOR VOIP CID: yes
PSTN CALLER DEFAULT DP: 8

PSTN ANSWER DELAY: 5 ; answer incoming PSTN call in X sec
; need to allow time for CALLER ID
; if no CID, you can safely set to 0
; was set to 16
[/sourcecode]

Note regarding FAX transmissions

We have not been able to successfully receive fax transmissions using this configuration, but not for lack of trying. We were also attempting to use a Digium TDM card to accept faxes for a while, with mixed results. We finally concluded that faxing capabilities of Asterisk were not reliable enough for production. Rather than moving to an Asterisk Fax solution, we moved from our older *NIX fax server to an online fax provider who accepts our faxes and forwards them as PDF images.

2006/10/15 – Jason Klein

Configuring MEGARAID driver in CentOS4 kernel to support Dell PERC2/SC and PERC2/DC

This article describes how I successfully installed CentOS4 on a Dell PowerEdge with a PERC2/SC (or PERC2/DC) hardware SCSI RAID controller. The kernel bundled with CentOS4 uses a newer megaraid_mbox module that no longer supports these older Dell PERC-2 (ie: LSI Megaraid 467) controllers. First, you build a “megaraid.ko” module for each kernel you will use (ie: UP/uniprocessor, SMP/multiprocessor). Then you install CentOS4 using the UP module. After installation, you configure CentOS4 to use the SMP module.

Building the MEGARAID.ko Kernel Module

You may be able to skip this section by using the files I built. If these files do not work, you will need to build your own kernel modules. You will need to extract each file with bunzip2. [megaraid.ko] [megaraid.ko.smp]

Temporarily set up CentOS4 on another computer. You should only use RPMs from the CD, so that the module you build is compatible with the installation kernel provided on the CD. In addition to the bare bones installation, you will need the following RPMs:

[sourcecode language=”bash”]
gcc-3.4.4-2.i386.rpm
cpp-3.4.4-2.i386.rpm
glibc-devel-2.3.4-2.13.i386.rpm
glibc-headers-2.3.4-2.13.i386.rpm
glibc-kernheaders-2.4-9.1.98.EL.i386.rpm

kernel-devel-2.6.9-22.EL.i686.rpm

kernel-smp-2.6.9-22.EL.i686.rpm
kernel-smp-devel-2.6.9-22.EL.i686.rpm
[/sourcecode]

Compile the UP (uniprocessor) “megaraid.ko” module first. You will download and extract the source for this module (local copy of megaraid.tar.bz2). Then you will compile the module for use with the 2.6.9-22.EL kernel (the version provided by the kernel-devel RPM listed above). Afterwards, you must format a floppy disk and copy the new file to floppy (for use later).

[sourcecode language=”bash”]
cd /opt
wget http://www.tuxyturvy.com/files/megaraid2.tar.bz2

mkdir /usr/src/megaraid
cd /usr/src/megaraid
tar jxvfp /opt/megaraid2.tar.bz2

make -C /lib/modules/2.6.9-22.EL/build SUBDIRS=/usr/src/megaraid modules

fdformat /dev/floppy
mkdosfs /dev/floppy

mkdir /mnt/floppy
mount /dev/floppy /mnt/floppy

cp megaraid.ko /mnt/floppy/
umount /mnt/floppy
[/sourcecode]

Compile the SMP (multiprocessor) “megaraid.ko” module next. You will compile the module for use with the 2.6.9-22.ELsmp kernel (the version provided by the kernel-smp-devel RPM listed above). You should also copy this module to floppy (under a separate file name).

[sourcecode language=”bash”]
mkdir /usr/src/megaraid-smp
cd /usr/src/megaraid-smp
tar jxvfp /opt/megaraid2.tar.bz2

make -C /lib/modules/2.6.9-22.ELsmp/build SUBDIRS=/usr/src/megaraid-smp modules

cp megaraid.ko megaraid.ko.smp

fdformat /dev/floppy
mkdosfs /dev/floppy

mkdir /mnt/floppy
mount /dev/floppy /mnt/floppy

cp megaraid.ko.smp /mnt/floppy
umount /mnt/floppy
[/sourcecode]
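
Because the UP and SMP modules share a file name until one is renamed, it is easy to mix them up. A module records the kernel release it was built for as the first field of its “vermagic” string, so a quick comparison can catch a mismatch before you reach the installer. The sketch below assumes modinfo on your build machine supports the -F option. The function name and the KVER variable are hypothetical.

```shell
# Hypothetical helper: check that a vermagic string starts with the expected kernel release.
vermagic_matches() {
    vermagic=$1          # e.g. output of: modinfo -F vermagic megaraid.ko
    kernel_release=$2    # e.g. the release used in the make command
    # The first space-separated field of vermagic is the kernel release.
    [ "${vermagic%% *}" = "$kernel_release" ]
}

# Example usage on the build machine (KVER is whatever release you passed to make):
# KVER=<kernel release used in the make command>
# vermagic_matches "$(modinfo -F vermagic /usr/src/megaraid/megaraid.ko)" "$KVER" \
#     && echo "module matches $KVER"
```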

Manually Loading the MEGARAID.ko Kernel Module

Now you are ready to begin installation of CentOS4 on your Dell PowerEdge system with the unsupported PERC2/SC or PERC2/DC raid controller. When you first boot the CD, type “linux noprobe” to avoid automatic hardware detection. Otherwise, the system will automatically load the (wrong) “megaraid_mbox” module.

[sourcecode language=”bash”]
linux noprobe
[/sourcecode]

When you choose CDROM install, you may receive an error telling you “no cdrom found”. You will need to load the module for the CDROM’s SCSI (or IDE) controller. In our case, we loaded driver “aic7xxx” for our SCSI controller. You should also load your network driver while you can. In our case, we loaded the “3c59x” driver.

Now we will manually load the “megaraid.ko” driver from our floppy disk. Follow the instructions below, starting with CTRL-ALT-F2 to change to a shell prompt.

[sourcecode language=”bash”]
CTRL-ALT-F2
mkdir /mnt/floppy
mount /dev/fd0 /mnt/floppy
insmod /mnt/floppy/megaraid.ko
CTRL-ALT-F1
[/sourcecode]

You can now continue with your installation. The OS will be able to see and partition the volume(s) configured on your PERC2 SCSI raid controller. You must manually install the module BEFORE you press the reboot button at the end of the installation!

Manually Installing the MEGARAID.ko Module

First we will install the UP (uniprocessor) module for use with the 2.6.9-22.EL kernel. We start by switching to a shell console, then copy the “megaraid.ko” module from our floppy to the hard disk. Afterwards, we update modprobe.conf and build a new initial ramdisk image for this kernel.

[sourcecode language=”bash”]
CTRL-ALT-F2
cp /mnt/floppy/megaraid.ko /mnt/sysimage/lib/modules/2.6.9-22.EL/kernel/drivers/scsi/megaraid.ko
chroot /mnt/sysimage
cp /etc/modprobe.conf /etc/modprobe.conf.orig
vi /etc/modprobe.conf
diff /etc/modprobe.conf /etc/modprobe.conf.orig
4c4
< alias scsi_hostadapter1 megaraid
---
> alias scsi_hostadapter1 megaraid_mbox

cd /boot
/sbin/mkinitrd initrd-2.6.9-22.EL.img.megaraid 2.6.9-22.EL
cp initrd-2.6.9-22.EL.img initrd-2.6.9-22.EL.img.orig
cp initrd-2.6.9-22.EL.img.megaraid initrd-2.6.9-22.EL.img
exit
CTRL-ALT-F1
[/sourcecode]
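
If you would rather not edit modprobe.conf by hand in vi, the same one-line change can be scripted. The following is a sketch under the assumption that the alias line looks like the one in the diff above. The function name is hypothetical, and it keeps a “.orig” backup just as the manual steps do.

```shell
# Hypothetical helper: point scsi_hostadapter aliases at "megaraid" instead of "megaraid_mbox".
use_legacy_megaraid() {
    conf=$1
    # Keep a backup of the original file, then rewrite the alias line in place
    cp "$conf" "$conf.orig" || return 1
    sed -e 's/^\(alias[[:space:]]\{1,\}scsi_hostadapter[0-9]*[[:space:]]\{1,\}\)megaraid_mbox[[:space:]]*$/\1megaraid/' \
        "$conf.orig" > "$conf"
}

# Example usage from inside the chroot:
# use_legacy_megaraid /etc/modprobe.conf
# diff /etc/modprobe.conf /etc/modprobe.conf.orig
```

Run it only once per installation, since a second run would overwrite the “.orig” backup with the already modified file.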

Then we will install the SMP (multiprocessor) module for use with the 2.6.9-22.ELsmp kernel using a similar method.

[sourcecode language=”bash”]
CTRL-ALT-F2
cp /mnt/floppy/megaraid.ko.smp /mnt/sysimage/lib/modules/2.6.9-22.ELsmp/kernel/drivers/scsi/megaraid.ko
chroot /mnt/sysimage
cp /etc/modprobe.conf /etc/modprobe.conf.orig
vi /etc/modprobe.conf
diff /etc/modprobe.conf /etc/modprobe.conf.orig
4c4
< alias scsi_hostadapter1 megaraid
---
> alias scsi_hostadapter1 megaraid_mbox

cd /boot
/sbin/mkinitrd initrd-2.6.9-22.ELsmp.img.megaraid 2.6.9-22.ELsmp
cp initrd-2.6.9-22.ELsmp.img initrd-2.6.9-22.ELsmp.img.orig
cp initrd-2.6.9-22.ELsmp.img.megaraid initrd-2.6.9-22.ELsmp.img
exit
CTRL-ALT-F1
[/sourcecode]

Now the UP and SMP kernels have been configured with the new (old) “megaraid.ko” module! You should be able to boot either of these kernels with support for your PERC2/SC or PERC2/DC controller!

2006/05/18 – Jason Klein