Creating a conversational app from scratch is like building a visual app pixel-by-pixel. Sinking time and effort into low-level details like rectangles and coordinates is no way to build an engaging visual application, and the same argument holds for a conversational one. SayKit takes care of the details, freeing you to take care of the experience.
Conversational apps should involve more than just answering questions. The SayKit SDK manages the state of a conversation between the user and the app, enabling you to build voice experiences that go deeper.
The ability to define your own actions and detect their parameters is the foundation of any voice-based experience. For example, in the utterance “I’m looking for peanut butter”, intent recognition allows you to assign “I’m looking for” to an action in the app (in this case, a product search), and entity extraction separates out “peanut butter” so you can use it to specialize that action (perform a search for “peanut butter”).
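The split between intent and entity can be sketched in a few lines. This is a simplified illustration of the concept, not the SayKit API; the pattern table and function names are our own.

```python
import re

# Illustrative only: map utterance patterns to intents, and capture the
# entity text that specializes the action.
INTENT_PATTERNS = [
    ("product_search", re.compile(r"i'?m looking for (?P<query>.+)", re.IGNORECASE)),
    ("add_to_cart",    re.compile(r"add (?P<query>.+) to my cart", re.IGNORECASE)),
]

def recognize(utterance):
    """Return (intent, entities) for an utterance, or (None, {}) if no match."""
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(utterance.strip())
        if match:
            return intent, match.groupdict()
    return None, {}

intent, entities = recognize("I'm looking for peanut butter")
print(intent, entities)  # product_search {'query': 'peanut butter'}
```

A real recognizer is statistical rather than regex-based, but the contract is the same: one utterance in, one intent plus its entities out.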
The ability to speak to an app without using your hands opens up many powerful applications that would otherwise be impractical.
Dialogue Management allows an application to chain a sequence of interactions together. This is most commonly used to clarify spoken input, but it can also be used to simplify complex interactions.
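The clarification case can be sketched as a tiny dialogue manager that chains turns together, asking a follow-up question when the first request is missing a required entity. All names here are hypothetical; this is a conceptual sketch, not SayKit code.

```python
# Illustrative sketch: a dialogue that spans multiple turns, prompting the
# user for clarification until it has what it needs to act.
class ProductSearchDialogue:
    def __init__(self):
        self.query = None  # the entity we still need

    def handle(self, utterance):
        """Consume one user turn and return the app's next prompt or result."""
        if self.query is None:
            self.query = utterance or None
            if self.query is None:
                return "What product are you looking for?"  # clarification turn
        return f"Searching for {self.query}..."

dialogue = ProductSearchDialogue()
print(dialogue.handle(""))               # asks for clarification
print(dialogue.handle("peanut butter"))  # proceeds with the search
```

The same chaining structure also breaks a complex interaction (say, a multi-field checkout) into a sequence of simple spoken exchanges.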
Syncing the conversational state of the app with the GUI, without creating two separate application flows (one conversational, one visual), can be challenging. Since visual information is presented instantaneously while audio information unfolds over time, keeping the two modalities in sync requires constant bookkeeping. Moreover, specifying code and content separately for visual and audio presentation is repetitive, which makes the software harder to maintain. SayKit helps smooth over both of these problems.
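One way to picture the repetition problem is to define content once and derive both presentations from it, rather than maintaining parallel visual and audio flows. Again, this is a hypothetical sketch under our own names, not how SayKit itself is structured.

```python
# Illustrative sketch: a single content model that renders to both modalities.
class ProductResult:
    def __init__(self, name, price):
        self.name = name
        self.price = price

    def visual_text(self):
        # Instantaneous presentation: suitable for a label or table cell.
        return f"{self.name}: ${self.price:.2f}"

    def spoken_text(self):
        # Sequential presentation: phrased for text-to-speech, read over time.
        return f"{self.name}, priced at {self.price:.2f} dollars"

item = ProductResult("Peanut Butter", 4.99)
print(item.visual_text())  # Peanut Butter: $4.99
print(item.spoken_text())  # Peanut Butter, priced at 4.99 dollars
```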
When it comes down to it, creating the voice experience for an app is simply creating its user interface (UI). When you build a visual app, the graphical elements are defined in code or as assets stored on the device; they are not loaded from a server. No mobile or web framework, for example, would ask the developer to contact a server to determine whether a user touch was a single or a double tap. However, that is not the case with voice-based frameworks.
Mirroring application state or structure on a server quickly becomes problematic, resulting in either an unmanageable application or a UI limited to shallow interactions. SayKit lets a developer sidestep this problem by keeping application state local.
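What local state buys you can be sketched as follows: because the conversation history lives in an on-device object, a context-dependent turn like “repeat that” is resolved without a server round trip. The class and method names below are ours, invented for illustration.

```python
# Illustrative sketch: conversation state kept on the device, so each turn is
# interpreted locally; only data lookups would ever need to reach a server.
class LocalConversationState:
    def __init__(self):
        self.history = []  # prior turns, held in local memory

    def interpret(self, intent, entities):
        """Resolve a turn locally, using history for context."""
        self.history.append((intent, entities))
        if intent == "repeat_last":
            # Resolved from local history: no server mirror of app state needed.
            return self.history[-2] if len(self.history) > 1 else None
        return (intent, entities)

state = LocalConversationState()
state.interpret("product_search", {"query": "peanut butter"})
print(state.interpret("repeat_last", {}))
```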
Get access to the SayKit SDK and subscribe to our updates.