Why do it?
Alexa gets a lot of use in our house, and it is very apparent to me that the future is not a touch screen or a mouse, but voice. You can learn the basics of creating an Alexa skill by watching videos, but actually building one is a great way to understand the ins and outs of the process and what the backend systems (like AWS Lambda) are capable of.
First you need a problem
To get started, you need a problem to solve. Once you have the problem, you’ll need to think about the solution before you write a line of code. What will your skill do? You need to define the requirements. For my skill, I wanted to ask Alexa to “park my cloud” and have her stop all EC2 instances or RDS databases in my environment.
Building a solution one word at a time
Now that I’ve defined the problem and have an idea for the requirements of the solution, it’s time to start building the skill. The first thing you’ll notice is that the Alexa skill portal is not part of the standard AWS portal. You need to go to developer.amazon.com/Alexa, create a developer account, and sign in there. Once inside, there is a lot of good information and videos on creating Alexa skills that are worth reviewing. Click the “Create Skill” button to get started. In my example, I’m building a custom skill.
The process for building a skill is broken into four major sections: Build, Test, Launch, and Measure. In each one you’ll have a number of things to complete before moving on to the next section. The major areas of each section are broken down on the left-hand side of the console. On the initial dashboard you’re also presented with the “Skill builder checklist” on the right as a visual reminder of what you need to do before moving on.
This is the first area you’ll work on in the Build phase of your Alexa skill. This is setting up how your users will interact with your skill.
Invocation sets up how your users will launch your skill. For simplicity’s sake, this is often just the name of the skill. The common patterns will be “Alexa, ask [my skill] [some request],” or “Alexa, launch [my skill].” You’ll want to make sure the invocation for your skill sounds natural to a native speaker.
I think of intents as the “functions” or “methods” for my Alexa skill. There are a number of built-in intents that should always be included (Cancel, Help, Stop) as well as your custom intents that will compose the main functionality of your skill. Here my intent is called “park” since that will have the logic for parking my AWS systems. The name here will only be exposed to your own code, so it isn’t necessarily important what it is.
Utterances are the phrases you define for how people will use your skill. You’ll want to focus on natural language and normal patterns of speech for native speakers in your target audience. I would recommend doing some research and speaking to a diverse group of people to get a good cross section of utterances for your skill. More is better.
Amazon also provides the option to use slots (variables) in your utterances. This allows your skill to do things that are dynamic in nature. When you create a variable in an utterance you also need to create a slot and give it a slot type. This is like providing a type to a variable in a programming language (Number, String, etc.) and will allow Amazon to understand what to expect when hearing the utterance. In our simple example, we don’t need any slots.
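To make the pieces above concrete, here is a rough sketch of how the invocation, intents, and utterances fit together in the interaction model's JSON, written as a Python dictionary. The invocation name "cloud parker" and the sample utterances other than "park my cloud" are assumptions made up for illustration, not the values from my published skill.

```python
import json

# Sketch of an interaction model for a "park" skill.
# The invocation name and most sample utterances are illustrative assumptions.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "cloud parker",
            "intents": [
                # Built-in intents that should always be included.
                {"name": "AMAZON.CancelIntent", "samples": []},
                {"name": "AMAZON.HelpIntent", "samples": []},
                {"name": "AMAZON.StopIntent", "samples": []},
                # Custom intent: no slots are needed for this simple skill.
                {
                    "name": "park",
                    "slots": [],
                    "samples": [
                        "park my cloud",
                        "stop my instances",
                        "shut everything down",
                    ],
                },
            ],
            "types": [],
        }
    }
}

model_json = json.dumps(interaction_model, indent=2)
print(model_json)
```

The console's JSON editor in the Build phase shows the same kind of structure, so a sketch like this can be a handy way to review your utterances in one place.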
Interfaces let you connect your skill with other services to provide audio, display, or video options. These aren’t needed for a simple skill, so you can skip them.
Here’s where you’ll connect your Alexa skill to the endpoint you want to handle the logic for your skill. The easiest setup is to use AWS Lambda. There are lots of example Lambda blueprints using different programming languages and doing different things. Use those to get started because the JSON response formatting can be difficult otherwise. If you don’t have an Alexa skill id here, you’ll need to Save and Build your skill first. Then a skill id will be generated, and you can use it when configuring your Lambda triggers.
AWS Account Lambda
Assuming you already have an AWS account, you’ll want to deploy a new Lambda from a blueprint that looks somewhat similar to what you’re trying to accomplish with your skill (deployed in US-East-1). Even if nothing matches well, pick any one of them, as they have the JSON return formatting set up so you can use it in your code. This will save you a lot of time and effort. Take a look at the information here and here for more about how to set up and deploy Lambda for Alexa skills. You’ll want to configure your Alexa skill as the trigger for the Lambda in the configuration, and here’s where you’ll copy in your skill id from the developer console “Endpoints” area of the Build phase.
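For a sense of what that JSON return formatting involves, here is a minimal sketch of a response builder. The function name build_alexa_response and the sample text are hypothetical; the blueprints ship their own helpers, and this is only an approximation of the general shape a plain-text response takes.

```python
import json


def build_alexa_response(title, speech_text, reprompt_text=None, end_session=True):
    """Return a response dict in the general shape the Alexa service expects.

    Hypothetical helper for illustration, not code taken from a blueprint.
    """
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech_text},
        "card": {"type": "Simple", "title": title, "content": speech_text},
        "shouldEndSession": end_session,
    }
    # A reprompt only makes sense when the session is left open for a reply.
    if reprompt_text is not None:
        response["reprompt"] = {
            "outputSpeech": {"type": "PlainText", "text": reprompt_text}
        }
    return {"version": "1.0", "sessionAttributes": {}, "response": response}


reply = build_alexa_response("Park", "Your cloud is parked.")
print(json.dumps(reply, indent=2))
```

Keeping the formatting in one helper means each intent only has to decide what to say, not how to wrap it.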
While the actual coding of the Lambda isn’t the purpose of the article, I will include a couple of highlights that are worth mentioning. Below, see the part of the code from the AWS template that would block the Lambda from being run by any Alexa skill other than my own. While the chances of this are rare, there’s no reason for my Lambda to be open to everyone. Here’s what that code looks like in Python:
if event['session']['application']['applicationId'] != "amzn1.ask.skill.000000000000000000000000":
    raise ValueError("Invalid Application ID")
Quite simply, if the Alexa application id passed in the session doesn’t match my known Alexa skill id, then raise an error. The other piece of advice I’d give about the Lambda is to create a separate method for each intent to keep the logic separated and easy to follow. Make sure you remove any response language from your code that is left over from the original blueprint. If your responses are inconsistent, Amazon will fail your skill (I had this happen multiple times because I borrowed from the “Color Picker” Lambda blueprint and had some generic responses left in the code). Also, you’ll want to handle your Cancel, Help, and Stop requests correctly. Lastly, as is best practice in all code, add copious logging to CloudWatch so you can diagnose issues. Note the ARN of your Lambda function, as you’ll need it for configuring the endpoints in the developer portal.
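As a rough illustration of that advice, here is one way the handler could be laid out, with one function per intent, the application id check, and logging. This is a sketch and not my actual Lambda: the handler names, the spoken responses, and the speak helper are all assumptions, and handle_park is only a placeholder for the real parking logic.

```python
import logging

# In Lambda, output from the standard logging module ends up in CloudWatch Logs.
logger = logging.getLogger()
logger.setLevel(logging.INFO)

EXPECTED_SKILL_ID = "amzn1.ask.skill.000000000000000000000000"


def speak(text, end_session=True):
    """Hypothetical helper that wraps plain text in an Alexa response."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_park():
    # Placeholder: the real skill would stop the EC2 and RDS resources here.
    return speak("Parking your cloud now.")


def handle_help():
    return speak("You can say, park my cloud.", end_session=False)


def handle_stop():
    return speak("Goodbye.")


# One function per intent keeps the logic separated and easy to follow.
INTENT_HANDLERS = {
    "park": handle_park,
    "AMAZON.HelpIntent": handle_help,
    "AMAZON.CancelIntent": handle_stop,
    "AMAZON.StopIntent": handle_stop,
}


def lambda_handler(event, context):
    # Reject calls from any skill other than the expected one.
    if event["session"]["application"]["applicationId"] != EXPECTED_SKILL_ID:
        raise ValueError("Invalid Application ID")

    request = event["request"]
    logger.info("Request type: %s", request["type"])

    if request["type"] == "LaunchRequest":
        return handle_help()
    if request["type"] == "IntentRequest":
        intent_name = request["intent"]["name"]
        logger.info("Intent: %s", intent_name)
        handler = INTENT_HANDLERS.get(intent_name)
        if handler is None:
            raise ValueError("Invalid intent: " + intent_name)
        return handler()
    # SessionEndedRequest and anything else: return an empty response.
    return {"version": "1.0", "response": {}}
```

A dictionary of handlers is just one option; a chain of if statements works too, as long as each intent's logic lives in its own function.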
Once your Lambda is deployed in AWS, you can go back into the developer portal and begin testing the skill. First, put your Lambda function ARN into the endpoint configuration for your skill. Next, click over to the Test phase at the top and choose “Alexa Simulator.” You can try recording your voice on your computer microphone or typing in the request. I recommend you do both to get a sense of how Alexa will interpret what you say and respond. Note that I’ve found the actual Alexa is better at natural language processing than the test options using a microphone on my laptop. When you do a test, the console will show you the JSON input and output. You can take this INPUT pane and copy that information to build a Lambda test script on your Lambda function. If you need to do a lot of work on your Lambda, it’s a lot easier to test from there than to flip back and forth. Pay special attention to your utterances. You’ll learn quickly that your proposed utterances weren’t as natural as you thought. Make updates to the utterances and Lambda as needed and keep testing.
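If you would rather build a test event by hand than copy it from the simulator, a sketch like the following can generate one. The make_intent_event helper and the placeholder session, user, and request ids are made up for illustration, and the real JSON input from the simulator carries more fields than shown here.

```python
import json


def make_intent_event(skill_id, intent_name):
    """Build a trimmed-down IntentRequest event for local or Lambda testing.

    Hypothetical helper; the ids below are placeholders, not real values.
    """
    return {
        "version": "1.0",
        "session": {
            "new": True,
            "sessionId": "amzn1.echo-api.session.example",
            "application": {"applicationId": skill_id},
            "user": {"userId": "amzn1.ask.account.example"},
        },
        "request": {
            "type": "IntentRequest",
            "requestId": "amzn1.echo-api.request.example",
            "locale": "en-US",
            "intent": {"name": intent_name, "slots": {}},
        },
    }


test_event = make_intent_event("amzn1.ask.skill.000000000000000000000000", "park")
print(json.dumps(test_event, indent=2))
```

The printed JSON can be pasted into a test event on the Lambda function, or the dictionary can be passed straight to your handler when running the code on your own machine.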
Now you wait. Amazon seems to have a number of automated processes that catch glaring issues, but you will likely end up with some back and forth between yourself and an Amazon employee regarding some part of your skill that needs to be updated. It took about a week to get my final approval and my skill posted.
Creating your own simple Alexa skill is a fun and easy way to get some experience creating applications that respond to voice and understand what’s possible on the platform. Good luck!
-Coin Graham, Senior Cloud Consultant