Here's where Siri, Apple's new anthropomorphic iPhone femme fatale, with her HAL 9000-like ability to understand and act on natural-language commands, likely portends the future of mobile collaboration. As I write in this InformationWeek report reviewing the iPhone 4S, Siri is already an amazing achievement, able to complete many common tasks, from sending text messages to performing location-aware Web searches. While Apple's ad makes it all look a little more magical than it is in practice, have no doubt that Siri, with its combination of state-of-the-art voice recognition and cloud-based, AI natural-language parsing, is a technical tour de force. My colleague Jonathan Feldman agrees: "Mark my words: Siri will be as big or bigger than the iPad. It's the beginning of actually useful natural language processing and associated automation."
"Automation" is the keyword. Yes, she "understands what you say" and "knows what you mean," but so far, Siri's repertoire is limited. At the risk of looking a gift horse in the mouth, Siri controls only apps embedded in iOS: phone, messages, notes, contacts, calendar, Web search. But she could, and undoubtedly will, do so much more. So far, and I definitely expect this to change in the next year, Siri is wearing a burqa in that she exposes no API. Thus, iOS apps, like the aforementioned mobile collaboration clients, can't tap into Siri's language-parsing, device-controlling wizardry. Although some apps, like the ingenious Siri Tunes, have figured out how to use Siri's ability to send text messages as a user interface to their back-end Web services, these are still just clever hacks working around an inherently closed system. What's needed is a full API open to third-party app developers, and we think that's something Apple will release, eventually. While the company hasn't made a statement to that effect, Siri is just too compelling to wall off; it would be like denying third-party apps the use of the camera.
What might a voice-activated collaboration client do? Siri's current ability to make calendar entries, send text messages, and take dictation hints at the possibilities. For example, the standard way of sharing comments is Facebook's wall metaphor--a comment stream threaded beneath an anchor topic. In the context of enterprise collaboration, the topic is likely to be a PowerPoint deck or meeting agenda. While it's possible, although rarely pleasant, to read heavily formatted content like a slide deck on a smartphone, typing a comment is onerous, even with a client optimized for the smartphone's small display. Wouldn't it be nice to dictate your thoughts instead?
Of course, this speech-to-text example just hints at what innovative developers might do with a cloud-based speech-recognition engine. Siri already understands context, in that prior requests inform subsequent answers. Ask "Find me the nearest Mexican restaurant," and Siri replies with a list based on your current location. Follow up with "No, make that pizza," and Siri remembers both the context (restaurants) and location. Imagine if this same logical power could be applied to any application. Say you're a sales rep and your manager has shared a spreadsheet with regional sales estimates. If you have updated figures for your territory, instead of hunting and pecking changes on the tiny touchscreen keyboard, wouldn't it be nice to say, "Siri, change the sales estimate for the Northwest region from 750,000 to 900,000" and have the update applied, along with a comment field indicating who made the change? Similarly, when reviewing a project manager's task schedule on the road from your phone, wouldn't it be nice to update it with a simple voice command? "Siri, change the completion date for software pilot testing to Feb. 9."
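To make the spreadsheet scenario concrete, here is a rough sketch of what a collaboration app might do once it receives the transcribed command as plain text. Siri exposes no such API today, so everything here is hypothetical: the `EstimateUpdate` type, the `parse_update` and `apply_update` functions, the dictionary standing in for the shared spreadsheet, and the single regular expression, which handles only this one phrasing. A real natural-language engine would do far more.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class EstimateUpdate:
    """Structured form of a spoken 'change the sales estimate' request (hypothetical)."""
    region: str
    old_value: int
    new_value: int


# Assumption: the speech engine hands the app a plain-text transcript.
# This pattern covers only one phrasing of the command.
_PATTERN = re.compile(
    r"change the sales estimate for the (?P<region>[\w ]+?) region "
    r"from (?P<old>[\d,]+) to (?P<new>[\d,]+)",
    re.IGNORECASE,
)


def parse_update(transcript: str) -> Optional[EstimateUpdate]:
    """Turn a transcript into a structured update, or None if it doesn't match."""
    match = _PATTERN.search(transcript)
    if match is None:
        return None
    return EstimateUpdate(
        region=match.group("region").strip().title(),
        old_value=int(match.group("old").replace(",", "")),
        new_value=int(match.group("new").replace(",", "")),
    )


def apply_update(sheet: dict, update: EstimateUpdate, user: str) -> bool:
    """Apply the update only if the spoken 'from' value matches the sheet.

    On success, also records a comment noting who made the change.
    """
    row = sheet.get(update.region)
    if row is None or row["estimate"] != update.old_value:
        return False
    row["estimate"] = update.new_value
    row["comment"] = (
        f"Changed from {update.old_value:,} to {update.new_value:,} by {user}"
    )
    return True


if __name__ == "__main__":
    # A dictionary stands in for the manager's shared spreadsheet.
    sheet = {"Northwest": {"estimate": 750000, "comment": ""}}
    spoken = ("Siri, change the sales estimate for the Northwest region "
              "from 750,000 to 900,000")
    update = parse_update(spoken)
    if update is not None and apply_update(sheet, update, user="sales.rep"):
        print(sheet["Northwest"])
```

Checking the spoken "from" value against the sheet before writing is a deliberate choice in this sketch: it gives the app a cheap way to catch a misheard region or number before it overwrites a colleague's figure.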
Natural-language control of computer systems is not new; it's been a staple of science fiction since Star Trek. But Siri, with its merging of client-side language processing and server-side meaning interpretation, has raised the bar on what's possible. While talking to a laptop, with its expansive keyboard, never made much sense, talking to your phone couldn't be more natural. Instead of having conversations with friends or colleagues, let's just have a conversation with the device itself. Siri ushers in an era in which speech recognition lets devices do more than take dictation. They can actually engage in conversation: We tell the device what we want, react to its response, and modify our request, using speech as a software UI.
The future of smartphone collaboration lies in vocal, not tactile, interaction. Siri blazes the trail.