IBM's Watson: 'Think' Of The Implications

"Think" has been a watchword of IBM for a very long time. Until recently, that simple word applied only to IBM employees and the customers that would embrace the principles behind it. With the success of IBM's natural language computer system Watson in last week's "Jeopardy!" matches, "think" can now be applied to one of its computer systems, as well. However, it's the future of Watson that we would now take into account.

David Hill

March 1, 2011

12 Min Read

"Think" has been a watchword of IBM for a very long time. Until recently, that simple word applied only to IBM employees and the customers that would embrace the principles behind it. With the success of IBM's natural language computer system Watson in last week's "Jeopardy!" matches, "think" can now be applied to one of its computer systems, as well. However, it's the future of Watson that we would now take into account.

IBM recognizes that Watson could open numerous commercial opportunities, so it should come as no surprise that, in the wake of "Jeopardy!," the company publicly announced the first of them: an agreement with Nuance Communications to sell Watson-based products to health-care providers. Nuance offers voice recognition software used by doctors to dictate medical records. Now doctors will be able to use Watson as an assistant by orally describing symptoms and getting diagnostic information in return. Note that Nuance can communicate via multiple platforms, including automated call centers, mobile phones and auto dashboards. Watson will make available to the doctor reference materials, prior cases and patients' latest medical records.

As a speaker on an IBM webinar about the future of Watson pointed out, the system will help with differential diagnoses by providing a systematic method for identifying unknowns. Watson uses evidence-based knowledge and develops a level of confidence in each answer.

On "Jeopardy!" Watson could return only one response. In this real-world scanario, Watson can return more than one answer. This is good because there may be no one precise answer. Moreover, the need for more evidence, such as a test, may be indicated, and that can help improve the diagnosis. The agreement with Nuance gets IBM's Dr. Watson ball rolling and would seem to be a good thing.

Understand that Watson is a huge, monolithic system, spreading 2,880 cores over some 90 blade servers, with 15TB of memory and scores of terabytes of storage. So, in one sense, Watson might be considered the new mainframe. As such, a full-blown Watson is unlikely to fit on your desktop in the near future. However, given the dramatic improvements in network communications, Watson might very well be the core system of a software-as-a-service (SaaS) cloud that can be accessed over the Internet by any device, including mobile devices such as cell phones. Technically, the latency of a cloud should not be a problem--see Google as a reference.

That said, the first iteration of Watson was not designed for a large number of concurrent users. Basically, it presumably sat idle for long periods and then became very active for under 3 seconds (the maximum time allowed to respond successfully to a "Jeopardy!" question). In theory, Watson should be able to handle a number of concurrent users, but the question is, how many? That is a question for IBM to answer, but figuring out how to get more out of software with more hardware is something the company has had decades of experience with.
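For a rough sense of the concurrency question, Little's law (requests in flight = throughput x time per request) gives a back-of-envelope estimate. Every number below is an assumption for illustration, not an IBM figure.

```python
# Back-of-envelope capacity estimate using Little's law: L = X * W,
# where L is concurrent requests in the system, X is throughput and
# W is the average time spent per request. All numbers are assumptions.
avg_response_seconds = 3.0        # "Jeopardy!"-style worst case per question
questions_per_second = 50.0       # assumed sustainable throughput of the cluster

concurrent_requests = questions_per_second * avg_response_seconds
print(f"Concurrent requests in flight: {concurrent_requests:.0f}")

# If each signed-on user asks a question only once a minute on average,
# the same cluster could serve far more users than requests in flight.
seconds_between_questions_per_user = 60.0
supported_users = questions_per_second * seconds_between_questions_per_user
print(f"Casual users supported: {supported_users:.0f}")
```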

Remember also that the constraints that applied to Watson on "Jeopardy!" can be relaxed, when necessary, in the real world. For example, an under-3-second response time may not be necessary in some cases, especially if Watson has to perform complex analysis. Also, a single precise answer may not always be necessary, as in the case of the differential diagnoses discussed above. What does all this lead to? Let's explore some key implications of a natural language computer system.

The concept of "encyclopedic" knowledge is typically short-hand for all knowledge (in the sense of information), but it is really not true. Watson could ingest an encyclopedia as an afternoon snack, but would never have access to all information, such as the proprietary information contained within an enterprise. However, there are a couple of ways that a Watson-like information repository could be extended to increase its knowledge.

The first is domain-specific information that would add depth to Watson's responses. The "Jeopardy!"-playing system was broad, as the game required, but not deep in any one domain. Another approach would be to add the ability to use unstructured information, such as medical images. To date, Watson has dealt primarily with text, which is semi-structured, content-aware information in that it can be indexed and searched. Understanding truly unstructured information would require adding pattern recognition to Watson's existing capabilities.

Humans communicate naturally with other humans verbally, which is a synchronous means of exchanging information, with the exception of asynchronous messages like voice mail. Yes, human communications are enriched by visual cues such as facial expressions, but that is not a primary form of information sharing, at least most of the time. Technology has extended our communications to asynchronous channels through touch (writing) and vision (reading). Now the way that people tend to communicate with computers is through touch, whether keyboard, mouse or touch screen. Yes, speech recognition has matured greatly, but only for specific application domains with limited natural language capabilities. A natural-language-understanding computer system changes the paradigm from one of touch to one of speech. Touch will not go away, but natural language capabilities could mean that speech becomes a common means of interacting with computers.

But first we have to understand what we are doing when we perform those actions. When we give commands to a computer, we are asking it to behave or act the way that we want or prefer. For example, when I strike the letter "a" on my keyboard while writing this document, I am commanding my computer to place the letter "a" at the cursor's position in the document and to keep it there unless I decide to make a change. When you ask a natural language computer system a question in natural language, you are commanding it to give you an answer to the best of its ability.

One goal of computer speech recognition is to improve ease of use. For example, a graphical user interface is easier to use than a command line, but it is still not overly simple. Do you really understand the menu choices and how to do something? A natural language interface system with the proper knowledge base could answer questions ("How do I...?") or even essentially replace the GUI, letting users simply do the task rather than hunt for the necessary element in a GUI menu to make something happen. In fact, natural language could be critical for extracting the maximum benefit from devices like smartphones, where looking at a tiny screen or texting simple messages simply doesn't provide the interaction variety, richness and speed that speech can.

Note that speech can also speed up the ability to make errors or to be unintelligible. That subtly changes the paradigm of communicating with the computer from "commanding" or "telling" to "asking," because the computer can now talk back: "I don't understand what you are saying," or, "Do you really want to transfer a billion dollars from your savings account to your checking account?" The computer can be enabled to respond based on its understanding rather than simply take things at face value.
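A minimal sketch of that "talking back" behavior might look like the following; the confidence and dollar thresholds are invented for illustration, and a real dialogue system would be far more sophisticated.

```python
# Illustrative only: a dialogue system that asks before acting when it is
# unsure of what was said, or when the requested action is high-stakes.
def respond(understood_text, confidence, amount_usd=0.0):
    if confidence < 0.6:                # assumed recognition threshold
        return "I don't understand what you are saying. Could you repeat that?"
    if amount_usd >= 1_000_000:         # assumed risk threshold for confirmation
        return (f"Do you really want to transfer ${amount_usd:,.0f} "
                "from savings to checking?")
    return f"OK: {understood_text}"

print(respond("transfer a billion dollars to checking", 0.95,
              amount_usd=1_000_000_000))
print(respond("(garbled audio)", 0.35))
```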

All in all, we believe speech will eventually become king in computer interactions, and that leads to other implications.
 
It is easy to imagine how a precise answering system like Watson could become a key Internet application that might possibly upstage search engine vendors like Google and Microsoft. While that would be an important application, it only scratches the surface. Let's say that you want to have a natural language computer system at your beck and call. If IBM (or some other vendor to which IBM would be willing to license the use of Watson's DeepQA technology) wants to make that capability available to you, it is likely to do so through a SaaS-over-the-Internet offering.

OK, imagine this: Let's say that you have a good high-speed Internet connection and want to put your natural language interface (NLI) to work. Well, I am in my word processor, so, NLI, can you tell me how I can do something in this program that I don't know how to do? Or, skipping that, can you do something for me, such as setting the margins and printing the resulting document? Now, to begin with, the NLI program running beneath Watson knows nothing about these processes, but I'll bet a specialized knowledge base could be created and either made available to the SaaS vendor or offered to you for use on your desktop/laptop.
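As a sketch of the idea, and only that, the snippet below maps a natural language request onto the same actions a word processor's GUI menus would expose. The phrases and functions are hypothetical; a real NLI would use statistical language understanding rather than keyword matching.

```python
# Hypothetical sketch: mapping natural-language requests to the same actions
# a GUI menu would expose. Keyword matching stands in for real language
# understanding; it only illustrates the shift in interface.
def set_margins(size_inches=1.0):
    print(f"Margins set to {size_inches} inch(es)")

def print_document():
    print("Document sent to printer")

INTENTS = {
    "set the margins": set_margins,
    "print": print_document,
}

def handle_request(utterance):
    matched = [action for phrase, action in INTENTS.items()
               if phrase in utterance.lower()]
    if not matched:
        # The system can explain how instead of acting, or ask for a rephrase.
        print("I can explain how to do that, or you can rephrase the request.")
    for action in matched:
        action()

handle_request("Please set the margins and then print the document")
```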

But that only starts things. Let's say that you are in your browser. Why type in a URL? Why not simply ask the NLI to take you to a Web site--say, Amazon? Then, while there, you could initiate a search by asking, "Are there any new books by so-and-so?" That's innocuous, you say. OK, let's extend the model to other search areas. Perhaps you say, "Watson, tell me if there are any flights between Boston and Las Vegas on such-and-such a date." You can ask questions about flights, fares and times until you get what you want. Couldn't you simply go to Expedia and do the same thing? Sure, but maybe not as easily. Well, can't Expedia offer an NLI? Sure, but you don't want to switch NLIs.

Why not? One reason is that an NLI might be personalized for individual users to include information such as credit card numbers and account passwords. In some cases, an NLI can even eliminate the need for an app. In other cases, the app (say, Expedia) may offer some other things, such as cancellation protection and deals that are available only through it. However, an app will have to learn to play nice with the NLI or risk not being taken into account when the NLI goes to work. Note that this is even more true when a cell phone is used and the NLI serves as the communication vehicle between human callers and the distributed digital world. In this scenario, all apps must go through the NLI. For these and other reasons, the NLI system is a killer app because it has the potential to become an app of all other apps.
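Here is a toy sketch of that "app of all other apps" role: the NLI holds the user's profile and routes each request to whichever app has registered to handle it. The app names and the registration interface are purely illustrative, not any vendor's actual API.

```python
# Toy sketch of an NLI acting as the front door to other apps. The apps and
# registration scheme are illustrative assumptions, not real product APIs.
class NaturalLanguageInterface:
    def __init__(self, user_profile):
        self.user_profile = user_profile   # e.g., stored payment details
        self.apps = {}                     # keyword -> handler registered by apps

    def register(self, keyword, handler):
        """Apps that want to be considered must register with the NLI."""
        self.apps[keyword] = handler

    def ask(self, request):
        for keyword, handler in self.apps.items():
            if keyword in request.lower():
                return handler(request, self.user_profile)
        return "No registered app can handle that request."

def flight_search(request, profile):
    return f"Searching flights for {profile['name']}: {request}"

def book_search(request, profile):
    return f"Searching books: {request}"

nli = NaturalLanguageInterface({"name": "Pat", "card": "on file"})
nli.register("flight", flight_search)
nli.register("book", book_search)
print(nli.ask("Tell me if there are any flights between Boston and Las Vegas"))
```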

Of course, computer systems can and do fail. Crashes are well-known, disasters occur unpredictably, and software is anything but perfect. In fact, software is known for, as a friend of mine once said, its unmarketable special features--more popularly known as bugs. That simply means that it does not carry out its tasks in the way that the programmer thought that it should. Like any system, the Watson system can possibly crash and may well have some latent bugs within it, but that is not the main reason it is fallible.

A natural language computer system is fallible because it cannot meet the dictionary definition of infallible: absolutely trustworthy or sure. Why? Because it may not have all the necessary evidence to make a decision with confidence. A more important reason, though, is that it may not have enough analytical horsepower to make the correct decision, even though it thinks that it does. On "Jeopardy!" Watson's top three answers and its level of confidence in each were displayed. Although things moved quickly during the game, in several instances Watson seemed to have 97 percent confidence in responses that would have been dead wrong.

Why not add more analytics to the system to improve those responses? Besides the law of diminishing returns, the world of natural language is inherently probabilistic, uncertain and ambiguous. We have to accept the fact that world-class computers will always be fallible, just like people. Anthropomorphically, natural language computer systems are only human. Still, that creates an issue. Say that a doctor fails to take the advice of the computer, the patient dies, and the computer's recommendation is shown to have been correct. Conversely, say the doctor takes the advice of the computer, the patient dies, and the computer is shown to have reached an incorrect recommendation by ignoring and not presenting evidence it thought was irrelevant. Can anyone be charged with malpractice?

Now, I hate to raise disturbing legal issues, but they have already come up and cannot be ignored. Since I have no legal background, I can only express my opinion. The doctor is likely to be held responsible in both cases as the human involved. Although the computer conceivably could be deemed "responsible" in the second instance, the computer system's owner could claim that the system is known to be fallible. Moreover, the doctor theoretically could have asked more questions that might have led to a different set of recommendations. In any event, the benefits of a Dr. Watson far outweigh the risks. This issue of risk has to be faced squarely and up front so that physicians feel comfortable using a Dr. Watson as an assistant while recognizing that, like any human assistant, it is fallible.

Of course, Watson's victory in the Jeopardy! Grand Challenge clearly demonstrated that the long-standing natural language problem had been solved. That was very impressive, but where does Watson go from here? IBM clearly has one answer in the Dr. Watson system that will come via its agreement with Nuance Communications. The company also noted opportunities for Watson-like solutions in areas including finance, technology help desk services and government/education, and it probably has a number of things in the works that have not advanced to the stage where they can be discussed publicly.

Still, there are a number of implications that IBM, other IT vendors and users need to think about, and the ones discussed here are only a starting point. The need to add more depth to Watson's knowledge base is one. How full natural language capabilities will change the way we interact with computers is another, and an important one, because a natural language interface has the potential to be placed in a position of control, which would make it a killer app. Even so, a natural language computer system will not be infallible, and recognizing that fact is essential to minimizing negative consequences. Though the Jeopardy! matches were impressive, Watson may have a much greater impact upon the world of information technology than simply serving as a vehicle for quickly delivering precise answers to complex questions.

IBM is a client of David Hill and the Mesabi Group.
