MAJOR ANNOUNCEMENT: Asterisk Speech Recognition Magic Button

If you have already heard about the Magic Button, read on because the feature set has expanded by a factor of five. This post is to inform the community about the upcoming community release of the Magic Button software, as well as the upcoming availability of the new LumenVox Community Speech Licensing Server.

The formal announcement, live demonstration (it is always interesting demonstrating speech applications on a speaker phone with a microphone held up to the speaker) and details will be presented in Vegas in a week and a half. If you are with the press and are interested in attending the event to see the Magic Button in action or would like to schedule an interview, please email me directly at ethan.schroeder at schmoozecom dott com.

For an audio (MP3) introduction to the Magic Button, visit http://schmoozecom.blogspot.com/

The Magic Button is an extensive speech recognition call management tool built on Asterisk and LumenVox speech recognition technology. It allows users to control their PBX experience with nothing more than their voice.

Welcome to a brand new way to communicate. With the touch of the Magic Button, you can use your voice to control calls, set your away status, send and receive messages, call and page groups of people in your organization, and even ask what the date, time or weather is!

You can initiate conversations with people in your organization, your customers or any telephone number. Just say “call John Smith”, “call extension one zero zero one”, “call extension one thousand one”, “call 5-5-5-1-2-3-4”, “call one-eight-hundred-5-5-5-1-2-3-4”, “Intercom John Smith”, or “Intercom Extension 1-0-0-1”.

When you are on the phone with someone else, you can say things like “transfer to John Smith”, “transfer to extension 1-0-0-1”, “transfer to John Smith’s voicemail” or “park call”. Call parking allows you to put a caller on hold and pick that caller up from any phone in your organization. After parking a call, you will be told the parking slot the caller is in. If you forgot the parking slot, you can retrieve a list of parked calls by saying “list parked calls”. If you or someone else parked a caller in slot seventy-one, you or anyone else in your organization could say “retrieve call seven one” or “retrieve call seventy one” and be immediately connected to that caller.
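For "seven one" and "seventy one" (or "one zero zero one" and "one thousand one") to reach the same parking slot or extension, the recognized words have to be normalized to one digit string somewhere along the way. Here is a minimal Python sketch of that idea. It is not the Magic Button's actual code: the function name spoken_to_digits and the word tables are hypothetical, and it only handles digit-by-digit and simple cardinal forms, not mixed ones like "one-eight-hundred-5-5-5-1-2-3-4".

```python
# Hypothetical helper: turn a recognized number phrase into a digit string.
DIGITS = {"zero": 0, "oh": 0, "one": 1, "two": 2, "three": 3, "four": 4,
          "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def spoken_to_digits(phrase: str) -> str:
    """Normalize 'seven one' and 'seventy one' alike to '71'."""
    words = phrase.lower().replace("-", " ").split()
    if not words:
        raise ValueError("empty phrase")

    # Case 1: every token is a single digit word or a literal digit,
    # so the caller spelled the number out digit by digit.
    if all(w in DIGITS or w.isdigit() for w in words):
        return "".join(str(DIGITS[w]) if w in DIGITS else w for w in words)

    # Case 2: treat the phrase as a cardinal number ("one thousand one").
    total, current = 0, 0
    for w in words:
        if w in DIGITS:
            current += DIGITS[w]
        elif w in TEENS:
            current += TEENS[w]
        elif w in TENS:
            current += TENS[w]
        elif w == "hundred":
            current = max(current, 1) * 100
        elif w == "thousand":
            total += max(current, 1) * 1000
            current = 0
        elif w == "and":
            continue
        else:
            raise ValueError(f"unrecognized word: {w!r}")
    return str(total + current)
```

In a real deployment most of this work would more likely live in the speech grammar itself; the sketch only shows why both ways of speaking a number can end up at the same slot.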

In addition, you can interact with call groups (ring groups), page groups, and voicemail blast groups. You can say “what groups can I page?”, “what groups can I call?”, or “what voicemail blast groups are there?”. Depending on the groups set up on your system, you can say things like “page all”, “page warehouse”, “call group sales”, or “leave a message for group sales”.

Checking your voicemail has never been easier now that there is the Magic Button. Just say “check messages” and you can log in and check your messages with your voice. You can listen to, move, and delete messages, and even fast-forward and rewind them by saying “fast forward” and “rewind”. Additionally, you can set up your voicemail by saying “record my name”, “record my greeting”, or “record my temporary greeting”. Finally, you can leave messages for other people by saying things like “leave a message for John Smith.”

Controlling your phone calls is easy with the Magic Button. You can tell the Magic Button your home or mobile telephone number and instantly forward your calls when you are away. Say “set my home phone number to 5-5-5-1-2-3-4” or “set my mobile number to 5-5-5-1-2-3-4”. You will then be able to say “forward my calls to my home phone” or “forward my calls to my cell phone”. Additionally, you can forward your calls to any extension or external phone number by saying “forward my calls to extension one-thousand-one” or “forward my calls to 5-5-5-1-2-3-4”. To turn off call forwarding, just say “disable call forwarding”. Don’t want to be disturbed? Just say “enable do not disturb” and your phone won’t ring. When you are ready to take calls again, say “disable do not disturb.”

Information is always just a finger press away with the Magic Button. Say “What time is it?”, “What is the date?”, or “What is the weather like?”

Out of the office? No problem. With the Magic Button, you can assign a special inbound DID phone number or hidden IVR option to give employees access to all Magic Button functionality on the road. Users log in with their extension and voicemail password (using their voice); from then on they are treated as their extension and have full access to all Magic Button functionality.

You can control your away status and even find out where other people in your organization are. If you are going to lunch, just push the Magic Button and say “I’m at lunch”. The Magic Button will prompt you if you would like to set a return time and record a temporary away message. Likewise, you can say “I’m away”, “I’m in a meeting” or “I’m out of town”. When you are back, just say “I’m back”. Asking “where is John Smith?” will tell you if John Smith has set an away status. If John Smith is away, you will be prompted with the option to be notified when John Smith is back. As soon as John Smith is back you will be paged and notified that John Smith is back and have the option of connecting to John Smith.

Finally, the ability to import your Microsoft Exchange contacts is complete and in beta testing. Just say “import my contacts” and now you can call people in your contact list! It is tested and working on Exchange 2003 and Exchange 2007 (and should work on Exchange 2000 as well).

One of the biggest hurdles to bringing speech recognition capabilities to small and medium businesses has traditionally been the cost of speech recognition ports on a phone system. To ensure non-blocking service, you need to purchase enough speech ports to cover the maximum possible number of concurrent users. This would easily add 25% to the total cost of the solution to the customer. Initial cost is an issue no more! Through an arrangement between FreePBX, Schmooze Communications, Digium, and LumenVox, we are announcing a Community Speech Licensing Server. By pooling together hundreds (and eventually thousands) of licenses, we are able to provide speech recognition capabilities on an unlimited, non-blocking port model. Rich applications such as the Magic Button, the voice-enabled company directory and voice-enabled auto-attendants (IVRs) are included in the low monthly per-PBX fee for access to the Speech Licensing Server.

How low? How about $5-$50/month low, depending on the size of the PBX? Now we’re talking speech recognition that is affordable to any PBX customer.

Your commitment to the Community Speech Licensing Server also helps the FreePBX project, as Philippe has partnered with us to bring you this and future speech-driven applications that are easy to configure via the FreePBX interface.

More details and an audio introduction are available on the blog at http://schmoozecom.blogspot.com

Asterisk Voice Recognition Company Directory

Along with requests for the Asterisk Voice Recognition "Magic Button", I have had numerous requests over the last couple days for the Asterisk Voice Recognition Company Directory (AVRCD). As promised before, I am releasing the source for the company directory. Drop to the bottom for links to a demo and all the source files for the Company Directory or keep reading for details.

This isn’t your Daddy’s company directory. My goal in this project was to not only create a voice-enabled company directory for Asterisk, but to also extend the functionality of the company directory in the process. It took over 200 hours and utilizes some of the most advanced web-based technology available for an interactive interface.

The AVRCD is tree-based with advanced web-based drag-and-drop technology that lets you drag the extensions you want in the company directory to a custom tree that you create. You can create sales, customer service, marketing, accounting, and whatever other "department" folders you want and have employees specifically in that department folder. When navigating the tree through voice commands, you could for example, say "support", then "Ethan Schroeder" and be connected to me. You can configure each extension to playback confirmations by either pointing an extension to the user’s voicemail name recording or system recording. If there is no audio file selected as a confirmation, the system falls back on text-to-speech using Flite.
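To make the tree idea concrete, here is a small Python sketch of how a spoken path such as "support", then "Ethan Schroeder" could be resolved, and how the confirmation prompt could fall back to text-to-speech. This is an illustration only, not the AVRCD source: the DirectoryNode class, the navigate and confirmation helpers, and the extension number 1001 are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DirectoryNode:
    """A folder (no extension) or a person/entry (with an extension)."""
    name: str
    extension: Optional[str] = None   # None means this node is a folder
    recording: Optional[str] = None   # path to a name recording, if any
    children: Dict[str, "DirectoryNode"] = field(default_factory=dict)

    def add(self, child: "DirectoryNode") -> "DirectoryNode":
        # Keys are lower-cased so lookups ignore capitalization.
        self.children[child.name.lower()] = child
        return child


def navigate(root: DirectoryNode, utterances: List[str]) -> DirectoryNode:
    """Walk the tree one recognized utterance at a time."""
    node = root
    for spoken in utterances:
        key = spoken.strip().lower()
        if key not in node.children:
            raise KeyError(f"{spoken!r} not found under {node.name!r}")
        node = node.children[key]
    return node


def confirmation(node: DirectoryNode) -> Tuple[str, str]:
    """Prefer a recorded name; otherwise fall back to text-to-speech."""
    if node.recording:
        return ("audio", node.recording)
    return ("tts", node.name)


# Example tree: one department folder with one (hypothetical) extension.
root = DirectoryNode("company")
support = root.add(DirectoryNode("Support"))
support.add(DirectoryNode("Ethan Schroeder", extension="1001"))

target = navigate(root, ["support", "Ethan Schroeder"])
```

Because each folder keeps its own children, the same person can be added under several folders without any special handling.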

The AVRCD also has custom entries support. These could be entries for ring groups, off-site IAX or SIP users that don’t have an extension on the main PBX, etc. You can then drag these custom extensions to the tree, just like the built-in extensions. These custom extensions also support custom dial strings, so if you do have a remote PBX, you can have it dial that person over the IAX trunk through the custom dial string.

The company directory, once set up, automatically configures itself to your existing PBX users. All you have to do is drag them to where you want them, and you can drag the same person to multiple directory folders. It sets up "pronunciations" for you based on their names. These pronunciations are editable by you on a per-extension and per-tree-item basis. So if you are getting complaints that "Michael Smith" is getting calls meant for "Michelle Smith", you can phonetically help the speech engine by writing custom pronunciations for each name. For example, "Mishell Smith" and "Mikall Smith".

Now, all of this said, the company directory is definitely in beta form. It’s using cutting-edge JavaScript/Ajax/UI libraries, and I’m going to be honest with you: I hadn’t worked in client-side application development for years and had never written OO JavaScript, though I was once a Java developer. I just squashed some of the last UI bugs I could find, but there may be more. In addition, it doesn’t support voice error handling. This basically means that it isn’t threshold-aware at the voice recognition level: if the application "isn’t quite sure" what was spoken, it’s probably going to make a choice anyway. Ideally, this is handled by defining acceptable confidence thresholds, so that if a recognized pattern falls below the threshold the caller is prompted with options. But it’s free, so you are encouraged to use it as you see fit.
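For anyone who wants to add that missing error handling, the logic could look roughly like the Python sketch below. Everything here is an assumption made for illustration: the decide function, the 0-to-1 confidence scale and the two threshold values are invented, and the real speech engine may report scores on a different scale.

```python
from typing import List, Tuple

# Assumed thresholds on an assumed 0.0-1.0 confidence scale.
ACCEPT_THRESHOLD = 0.80   # at or above: act without asking
REJECT_THRESHOLD = 0.45   # below: treat as not understood


def decide(nbest: List[Tuple[str, float]],
           accept: float = ACCEPT_THRESHOLD,
           reject: float = REJECT_THRESHOLD) -> Tuple[str, List[str]]:
    """Decide what to do with a list of (candidate, confidence) pairs.

    Returns one of:
      ("accept", [best])      - confident enough to dial straight away
      ("prompt", [options])   - unsure, so offer the caller a short list
      ("retry", [])           - nothing usable, ask the caller to repeat
    """
    ranked = sorted(nbest, key=lambda pair: pair[1], reverse=True)
    if not ranked or ranked[0][1] < reject:
        return ("retry", [])
    if ranked[0][1] >= accept:
        return ("accept", [ranked[0][0]])
    # In between: offer up to three candidates that cleared the lower bar.
    options = [name for name, score in ranked if score >= reject][:3]
    return ("prompt", options)
```

With this in place, a close call between "Michael Smith" and "Michelle Smith" would produce a prompt listing both names rather than a silent guess.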

I’m too busy working on material for the Open Telephony Training Seminar to set up an entire demonstration PBX for everyone to try it out, but I have posted a live demo of the interface. All the source is here, which includes PHP, JavaScript, dial plans, and AGI. You’ll need to install and set up LumenVox and Flite. LumenVox has a Linux/Asterisk developer version available for around $50, or you could attend the Training at the end of February to get a free LumenVox license and the Asterisk Voice Recognition "Magic Button".

Lastly, as with the Asterisk Voice Recognition "Magic Button", this project deserves to be a FreePBX module, so that it can be more tightly integrated with the configuration of extensions. In addition, this is a voice-only directory. There is no support for DTMF entry of "the first three letters of the first or last name". This functionality is definitely on the to-do list, but it won’t be worthwhile to implement until the FreePBX integration occurs. FreePBX needs your support to make things like this a reality, so consider attending the training at the end of the month. The training is going to reach capacity, so hurry up and register. If you miss this one, stay tuned for announcements in the coming month on subsequent training.

Demo

Source

Microsoft® Response Point™ PBX, Asterisk® and Beyond

Philippe asked me to post some great stuff I’ve been working on recently. So my name is Ethan Schroeder, and in case you are wondering, I’m one of the organizers of the Open Telephony Training Seminar coming up in a few weeks.

With the "analysts", integrators and Microsoft fan-boys (and girls) going crazy over the Microsoft® Response Point™ PBX, I wondered what all the fuss was about. It has some interesting UC/UM (Unified Communications/Unified Messaging) stuff that links into the Microsoft family of products. Microsoft’s new Office Communications Server 2007 really seems to shine in the UC/UM realm. Then I saw a video on the Response Point "magic button." A voice recognition button for a PBX? It appeared that Microsoft really did something here.

I wondered to myself if Asterisk could do it. A while back I utilized the LumenVox speech recognition software for Linux/Asterisk to build a speech enabled company directory for Asterisk. That was pretty cool, but a magic button would be a Killer App.

From the start of a dream to actual implementation was quite an experience. Creating a button that works when you are not in a call for some functionality (call initiation) and works while you are IN a call for call control turned out to be a difficult task, but I managed to make it work.

The result is a magic button that, when pushed, plays a fun tone and lets me speak to my phone system in wondrous ways:


  • "call John Smith"
    or "dial John Smith" – dials by name (John Smith, John) or extension number (Four-thousand-one/4-zero-zero-1/4’oo’1); you can even speak the digits of a 7-, 10-, or 11-digit phone number.

  • "Transfer to John Smith"
    – transfers a call to a name or extension.

  • "Transfer to John Smith’s Voicemail"
    or "Transfer voicemail John Smith" – transfers the call directly into John Smith’s voicemail.

  • "Park call"
    or "Park caller" – parks the call and announces the parking slot.

  • "Retrieve calls"
    – queries Asterisk for all the parked calls and gives the user their options using the Flite text-to-speech engine (which I’ll soon be switching over to the incredibly cool Cepstral engine with their new "Asterisk Allison" voice).

  • "Retrieve call [parking slot]"
    – retrieves a specific parked call.

Needless to say, that’s only the beginning!
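If you are curious how phrases like these could be turned into PBX actions once the speech engine has produced text, here is a rough Python sketch. It is not the AGI code I will be handing out: the action names and the parse_command function are hypothetical, and in practice the matching would normally be done by the recognizer's grammar rather than by regular expressions over a transcript.

```python
import re
from typing import Optional, Tuple

# Ordered from most to least specific so that "transfer to X's voicemail"
# is tried before the plain "transfer to X" pattern.
COMMANDS = [
    (re.compile(r"^transfer (?:to )?(?P<target>.+?)(?:'s|’s) voicemail$"),
     "transfer_voicemail"),
    (re.compile(r"^transfer voicemail (?P<target>.+)$"), "transfer_voicemail"),
    (re.compile(r"^transfer to (?P<target>.+)$"), "transfer"),
    (re.compile(r"^(?:call|dial) (?P<target>.+)$"), "call"),
    (re.compile(r"^park (?:call|caller)$"), "park"),
    (re.compile(r"^retrieve calls$"), "list_parked"),
    (re.compile(r"^retrieve call (?P<target>.+)$"), "retrieve"),
]


def parse_command(utterance: str) -> Optional[Tuple[str, Optional[str]]]:
    """Map recognized text to (action, target); None if nothing matches."""
    # Lower-case and collapse whitespace before matching.
    text = " ".join(utterance.lower().split())
    for pattern, action in COMMANDS:
        match = pattern.match(text)
        if match:
            # Patterns without a named group simply yield no target.
            return action, match.groupdict().get("target")
    return None
```

The returned action name would then be handed to whatever dial plan or AGI logic actually places, transfers or parks the call.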

Want to see this in action, and even take home the technology? I’m demonstrating it all to participants of the FreePBX training when I present in South Carolina on February 27-29, 2008. I’ll also be giving away all the parts to make the magic button work: the AGI, LumenVox configs, Asterisk dial plans and an Asterisk 1.2.x patch required for the parking feature, back-ported from 1.4. All you need is a LumenVox starter kit, and it looks like you’ll get that too. To quote Gerd Graumann, Director of Business Development at LumenVox: "FreePBX provides a user-friendly and full-featured wrapper to Asterisk, and we are pleased that LumenVox speech recognition capabilities are embedded and will be part of this new training course." This is an exclusive for attendees. All of this will coincide with my talk about Microsoft, the potential threat, and the future of Asterisk with Microsoft on the horizon of the PBX market (it’s not as bad as you think).

A magic button isn’t the only goodie you’ll get, either. Philippe has some other unreleased modules he plans on bringing along that help make your life easier for various tasks. We’ve been working with Philippe for quite a while on custom projects for customers, and I’m still really excited about what he is releasing. And then there are the goodies from the other vendors, but we are not going to go there. The value of the class is in the knowledge we will bring you. You don’t have to go to such a training just to get a free phone, chances at free cards, T-shirts or a one-time opportunity to purchase TDM phone cards at prices you will never see again. The fact that you may get some of that is a nice bonus…

So if you haven’t already, take a peek at the training and see if you can attend. Get all the info on the training site. As Philippe has already mentioned, we’re almost full, and if you want to stay at the hotel where the event is being held, you’d better be one of the next one or two registrations because, as far as I recall, that’s about the number of rooms we have left before you’ll have to stay at a different hotel!

I’m looking forward to seeing you and showing you this really cool speech technology!

Ethan