What's The Greatest Web Software Ever Written?
What are the 12 most important programs we've seen since the modern Internet began with the launch of the Mosaic browser in 1993? Check out our list, and see if you agree.
May 5, 2007
When it comes to story assignments, I get all the plums. Last summer, I was asked to wax eloquent on the greatest software ever written. No problem there--just pluck 12 worthy candidates out of the hundreds of thousands of software programs written since the dawn of computing history.
This time it's the greatest Web software ever written. This one's a bit trickier, mainly because the history of the Web is so much shorter. Rummaging around in general-purpose computing's dustbin, filled with 66 years of bull's-eyes, near misses, and clunkers, was hard enough, but at least it allowed for some historical perspective. The modern Internet, which started with the launch of the Mosaic browser on the World Wide Web in 1993, is a relative toddler. Looking at all the youthful endeavors on the Web and deciding which are the best is a little like scanning a class of unruly kids and deciding which ones will become great poets, engineers, and musicians.
The safest thing to do is to start with the Web itself. As it was first implemented at the CERN particle accelerator site in Switzerland in 1990, the Web was a software program loaded on a server. As far as I know, Tim Berners-Lee didn't travel the world, Johnny Appleseed style, loading his software on every Internet server. Rather, he showed how, by following a few specifications and patterns of interoperability, we could get the Web to work everywhere.
Berners-Lee ruthlessly simplified the relationship between server and client, insisting on a few simple standards that would allow widespread information sharing. But when the Web emerged in 1991, it resembled nothing so much as a throwback, a re-enactment of the classic IBM mainframe architecture--powerful servers dictating screens to thousands of dumb terminals in the form of Web browsers. Users' interactions with Internet servers were likewise straitjacketed.
So, before we could move forward in computing on the Internet, it was necessary to fall back. With its statelessness--no user context brought to server requests--and other restrictions, the Web erected serious barriers to sophisticated computing. But software emerged that worked around the Web's limitations and exploited its inherent advantages: simplicity, low cost, widespread reach. These are the criteria I used to determine the Web breakthroughs, the software that showed how the Web could really be used.
If we're looking for great Web software, why not start with Mosaic? It qualifies as a brilliant synthesis of what went before, bringing new utility to the millions of users coming onto the Web in 1993. But, alas, Mosaic was No. 6 on my list of greatest software ever written; no sense repeating myself.
SIMPLE IS AS SIMPLE DOES
The simplest example of what I'm talking about is Hotmail. Written in a combination of Perl and C, it wasn't necessarily sophisticated software. In fact, e-mail on the Web at first was downright clunky. "When I first heard about Hotmail, I thought it was a stupid idea," says Eric Allman, chief science officer of Sendmail Inc. and author of the open source Sendmail program, which powers about one-third of Internet e-mail transfers.
Hotmail launched July 4, 1996, signifying freedom from ISPs
There would be things you couldn't do with e-mail on the Web that you could do with on-premises e-mail systems, such as change the name of an e-mail account or screen out spam. But Sabeer Bhatia, a recent Stanford grad, focused instead on what a Web mail system could do, using a browser window and its underlying network to offer free e-mail to millions.

Hotmail had one characteristic that marks excellent Web software. "The user interface was drop-dead simple," says Allman. Users didn't need to fill in incoming and outgoing POP server TCP/IP address numbers or jump through other hoops, as they would with the e-mail client Eudora. Millions took Hotmail up on the offer. Seventeen months after Hotmail was created, Bhatia sold it to Microsoft for more than $400 million.
In a similar vein, America Online launched a free service called Instant Messenger, and a new way of communicating was born. Instant messaging had existed previously on networked Unix servers as a way for programmers to stay up to date on a project's progress. Quantum Link, then an online service for Commodore 64 and 128 PCs, offered a network service called Online Messages. Quantum Link became AOL, OLM became IM, and the rest, as they say, is Web history.

THE FERRARI FACTOR
Simplicity is a hallmark of great Web software. Take the 100,000 lines of Perl code that went into building Craigslist, the online classified ads system.
I knew very little about the site when I wanted to sell my 1989 Toyota Camry, yet I found it easy to post a small text ad to Craigslist without requiring authorization from anyone. When no offers came in, I figured it was Craig's fault. Then I saw how many sellers were displaying pictures of their vehicles with their ads. I waited until my neighbor, Alfonso, had his red Ferrari out of the garage, parked my rusty Camry next to it, snapped a picture, and posted it late the same night. The Ferrari looked great. Before I could finish shutting down my computer, the phone was ringing with offers. It pays to be able to post your own content.
Craigslist is dull but extremely effective
Craigslist looks duller than a page of newspaper classified ads, with all its simple text headings. But like the classifieds, people who know nothing about the Web can use it. Craigslist was one of the first sites where thousands of users could upload their own content. It also shields users from pests and nonserious buyers by routing offers through Craigslist mailboxes rather than users' own e-mail addresses.

The site has stickiness: The average Craigslist visitor goes through 20 pages before exiting, according to Alexa Web Information Service, a traffic measurer owned by Amazon.com. Craigslist averages 20 million new ads a month and 60 million discussion forum posts. Alexa ranks www.craigslist.org as the 40th most popular site on the Web. Its founder, San Franciscan Craig Newmark, explains the design this way: "We know how to keep things simple. ... And I have no design skills."
Craigslist generated so much traffic that employers and recruitment agencies in Los Angeles and San Francisco asked Craigslist to charge them fees as a way of discouraging posers and potential spammers. Craig obliged, charging $75 per employer or recruiter for job postings in San Francisco and $25 in L.A. and five other cities. For the same reason, Craigslist charges $10 per brokered housing listing in New York City, at the request of New York real estate brokers.
Through a series of missteps, and contrary to Craig's wishes, eBay owns 25% of Craigslist. Still, Internet giants ogle the site and note the revenue it leaves on the table. Craigslist posts ads from 450 cities; it charges in only seven. Each of the big Web companies--eBay, Google, Microsoft, Yahoo--has started a classified ad system. Yet Craigslist marches on unimpeded, pulling in an estimated $22 million to $23 million in annual revenue--roughly $1 million per employee. Craigslist's business model has become impossible for competitors to dislodge.

THE SEARCHERS
Great Web software isn't only about capturing traffic. A defining characteristic is its ability to bring innovation, a new function, often expressed as a new service, to millions of users.
Search has done that, and many people think of Google as the prime innovator. But there are key attributes of search that Google didn't invent: collecting an index of the whole Web, accessing it in parallel fashion, and delivering results with a speedy response. All those are associated with Google, and fairly so. But they began with Digital Equipment Corp.'s AltaVista.

There were already several search engines--Excite, Infoseek, Lycos--when AltaVista launched in December 1995, recalls Louis Monier, who was a search pioneer at Digital's Palo Alto, Calif., research lab and now works as a researcher for Google. In the early days, all the search engines struggled against the Web's nearly insuperable barriers: its size--no one knew for sure how large it was--and capturing an index of the Web's content that was still relevant by the time the index was completed.
Early search engines activated Web crawlers to visit URLs, capture headers and headlines of pages, and organize the information on central servers. Monier points out that a crawler had to wait several seconds for a response after querying a site. At best, search engines could compile information on tens of thousands of sites in a day, and by the time 12 to 14 days had passed, the information would start to get stale. Pages would change after a crawl was initiated, and new pages the crawler had missed would come into existence. "A million pages was their limit," says Monier. By December 1995, Digital's researchers knew the Web had advanced far beyond a million pages.
Monier and the AltaVista team came up with a multithreaded Web crawler, Scooter, that ran on advanced 64-bit Unix servers before 64-bit operating systems were generally available. Scooter could ping a site and, without waiting for the answer, ping more sites, tracking each call and response as a separate thread. While other crawlers proceeded one site at a time, Scooter called a thousand sites at a time and gathered the results. Scooter also collected full pages, not just tops of pages. AltaVista's index of Web site pages was the first that covered the entire Web, Monier says.
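For the code-minded, the trick Scooter pulled off looks something like the sketch below, written in modern TypeScript with today's fetch API rather than anything Digital actually shipped: issue a whole batch of requests at once, let them resolve in any order, and keep whatever comes back. The URLs and batch handling are purely illustrative.

```typescript
// Illustrative sketch of concurrent crawling in the style Scooter pioneered:
// issue a batch of requests at once and collect whatever answers arrive,
// rather than waiting on each site in turn. (Modern TypeScript and fetch,
// not Digital's original 64-bit Unix implementation.)

async function crawlBatch(urls: string[]): Promise<Map<string, string>> {
  const pages = new Map<string, string>();

  // Fire every request without awaiting the previous one; each pending
  // fetch plays the role of one of Scooter's threads.
  const results = await Promise.allSettled(
    urls.map(async (url) => {
      const response = await fetch(url);
      return { url, body: await response.text() }; // full page, not just the top
    })
  );

  for (const result of results) {
    if (result.status === "fulfilled") {
      pages.set(result.value.url, result.value.body);
    }
    // Rejected fetches (timeouts, dead sites) are simply skipped.
  }
  return pages;
}

// Usage: crawlBatch(["https://example.com", "https://example.org"]).then(console.log);
```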
Scooter's first crawl revealed a total of 16 million pages, a surprising figure. A second crawl two months later found 25 million. The Web was growing fast and AltaVista could demonstrate it. There are now 114 million active sites on the Web, each with dozens to thousands of pages, according to Netcraft's April survey.
In 1996, AltaVista's massive Web index was distributed across 20 of Digital's latest Alpha servers, implementing search in a parallel fashion by subdividing the index across the cluster. Digital, being a hardware company, trumpeted AltaVista for its demonstration of Alpha hardware performance, recalls Paul Cormier, head of the Digital research group that ran the AltaVista project. Cormier is now executive VP of engineering at Red Hat. But Digital employees and academic researchers realized a new research tool had been thrust into their hands, an open channel to all the information on the Web. Search was no longer hit or miss; researchers built up a faith in AltaVista that had never been tied to a search engine before.

Monier saw to it that Scooter kept crawling and the index stayed as up to date as possible. "We tried to keep to subsecond response time," recalls Monier, because when it crept above a second, use of the engine dropped off as if the service were inadequate. He and his team were surprised at how sensitive the public seemed to be to response times, a lesson Google has learned well.
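The parallel-query side can be sketched the same way, with the caveat that the shard endpoints, scoring, and merge rule below are my invention, not Digital's design: ask every piece of the subdivided index at once, so total latency is roughly that of the slowest shard rather than the sum of all of them.

```typescript
// Hypothetical sketch of querying an index split across several servers in
// parallel, as AltaVista did across its Alpha cluster. The shard URLs and the
// scoring/merge rule here are illustrative, not AltaVista's actual design.

interface Hit { url: string; score: number; }

async function search(query: string, shards: string[]): Promise<Hit[]> {
  // Ask every shard at the same time; the response arrives when the slowest
  // shard answers, which is how subsecond response stays within reach.
  const perShard = await Promise.all(
    shards.map(async (shard) => {
      const res = await fetch(`${shard}/search?q=${encodeURIComponent(query)}`);
      return (await res.json()) as Hit[];
    })
  );

  // Merge the partial result lists and return the best-scoring hits overall.
  return perShard.flat().sort((a, b) => b.score - a.score).slice(0, 10);
}
```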
Digital never quite knew what to do with its software prodigy; it spun out AltaVista as a subsidiary, then pulled it back in. When Compaq acquired Digital, AltaVista went along, still languishing in a company that didn't know what to do with a software asset.
For three years, AltaVista blazed a meteoric trail in search during a crucial period of Web development. It established search as a tool for both casual and scientific users and brought millions of new users to the Internet. AltaVista still exists at www.altavista.com.
I'm not ignoring Google. Google capitalized on AltaVista's experience by adding the page-rank system and the advertising-based business model. Page rank was indeed breakthrough Web software, but Google's page-rank system was No. 11 on last year's Greatest Software list. No sense ... you know.

NOT SO WELL KNOWN
There's another rich contributor to the Web's function, but it's even less well known today than AltaVista. It's the XMLHttpRequest object.

Say what?
The XMLHttpRequest object first appeared in March 1999 as part of Microsoft's Internet Explorer 5.0 browser. The browser window before XMLHttpRequest was a static display, like a dumb terminal window. The only thing it could show was what the server sent it as an HTML page. As millions of users interacted with an Internet server, most of them were looking at the same handful of pages, none of them tailored to an individual user's requests as they are today.
XMLHttpRequest changed that. Initially an ActiveX control, it provided a way to open a background communications channel between the browser and the server over which data could be passed. Before XMLHttpRequest, about the only way a user could get different data was to request a different page.
With Internet Explorer 6.0, in August 2001, XMLHttpRequest became more readily available as a general-purpose API instead of an ActiveX control. It followed Web standards, sought to move data between a server and client as XML or dynamic HTML using the HTTP protocol, and used only JavaScript (or Microsoft's compatible equivalent, JScript) for programming inside the browser window.
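A minimal, hypothetical use of the object shows why it mattered: fetch fresh data over a background channel and update part of the page in place, with no full page reload. The endpoint and element ID below are placeholders, not part of any real site.

```typescript
// Minimal sketch of the background channel XMLHttpRequest opened up:
// pull fresh data from the server and update one part of the page without
// requesting a whole new HTML document. The "/latest-data" endpoint and
// "news-panel" element are placeholders for illustration only.

function refreshPanel(): void {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", "/latest-data", true); // true = asynchronous

  xhr.onreadystatechange = () => {
    if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200) {
      const panel = document.getElementById("news-panel");
      if (panel) {
        panel.textContent = xhr.responseText; // update in place, no page reload
      }
    }
  };

  xhr.send();
}
```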
The pattern invoked by the API became the cornerstone of Google Maps' ability to respond to an individual end user's request for map data. The customizations that make a MyYahoo account seem tailored to an individual spring from that pattern. "It's the secret sauce of Web 2.0," says Pete LePage, Internet Explorer's senior product manager. Google, Zimbra, and dozens of smaller players are rushing to use the Request object, the heart of the development technique now known as Ajax, to build online applications that compete with Microsoft's. "Microsoft probably hasn't received the credit it deserves for having invented XMLHttpRequest," says Scott Dietzen, president of Zimbra, an archrival. The World Wide Web Consortium is working on a standard that formalizes it.

How many examples of great Web software does that make? Four. Got to keep moving.
SIMPLE CONCEPT, COMPLEX CODE
The Web punishes complexity and rewards simplicity. But it doesn't reward only simple software; it rewards simple concepts wrapped up in complex software programs.
Exhibit A: On Sept. 3, 1995, computer programmer Pierre Omidyar established AuctionWeb on a personal site he ran to see if used goods could be sold online. What would eventually become eBay brought buyers and sellers together like no auction house before it. EBay not only provided great software that allowed people to sell their goods online, but it also extended APIs to third-party software makers to produce tools for managing large sets of goods online, such as inventory management or listing design tools. Other third parties provided tools for buyers to search for goods on eBay listings.
The accessibility of eBay auctions has been one of the strongest drivers of new users to Web commerce. Two billion items a year pass through eBay, which will generate an expected $7.2 billion in revenue this year.

Exhibit B: Launched in 1995, Amazon.com expanded the e-commerce capability of the Web by building a million-volume online bookstore and popularizing the shopping cart and checkout transaction processes. It then capitalized on its e-commerce system by extending it to other retailers. Borders, CDNow, and Virgin Megastores are powered by Amazon's e-commerce system, and hundreds of additional retailers link into it via Amazon's e-commerce APIs. Amazon didn't just bring shoppers; it expanded shopping as a standard Web activity.
One Amazon enhancement in particular, known as affinity marketing, used the power of the computer to consult purchaser data stored in a database, sort through what buyers of similar items had purchased recently, then present additional choices to the customer before he or she finished shopping. The technique, a proven sales generator, has been copied at other innovative online businesses, such as Netflix.
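The core of the idea fits in a few lines. The toy TypeScript sketch below simply counts how often other items show up in the same orders as the item being viewed and suggests the most frequent co-purchases; the data model and cutoff are invented, and Amazon's production system is of course far more elaborate.

```typescript
// Toy sketch of the "buyers of this item also bought..." idea: count how often
// other items appear in the same orders as the item being viewed, then suggest
// the most frequent co-purchases. Data and limits are invented for illustration.

type Order = string[]; // item ids purchased together in one order

function alsoBought(item: string, orders: Order[], limit = 3): string[] {
  const counts = new Map<string, number>();

  for (const order of orders) {
    if (!order.includes(item)) continue;
    for (const other of order) {
      if (other === item) continue;
      counts.set(other, (counts.get(other) ?? 0) + 1);
    }
  }

  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // most frequently co-purchased first
    .slice(0, limit)
    .map(([id]) => id);
}

// alsoBought("book-123", orders) might return ["book-456", "cd-789", "dvd-321"]
```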
Please note: Both eBay and Amazon are involved in ongoing patent litigation over aspects of their respective business models. Another tenet of great Web software: It's not always clear who invented it.

VIRTUAL COMMUNITIES
In 1993, the year Mosaic brought the World Wide Web within reach of millions, Howard Rheingold wrote The Virtual Community (Secker & Warburg, 1994) about his experiences with the Well, originally named the Whole Earth 'Lectronic Link by co-founder Stewart Brand. The Well, launched in 1985, was a follow-up project to Brand's Whole Earth Catalog.
The Well existed as a dial-up virtual community hosted on servers in Sausalito, Calif. It served the Bay Area, where participants dialed in to forums, discussion groups, and other forms of electronic communication. Rheingold found it so addictive that his little daughter used to say, "Daddy is saying 'holy moly' to his computer again," and the rest of the family knew he was talking to his friends at the Well.
World Of Warcraft: 8.5 million fans
How could something launched five years before Berners-Lee even described the World Wide Web be considered great Web software? Because it so clearly rolled up the innovations of bulletin boards, discussion forums, and newsgroups into a broader form of online community. By the time the Web came along, the Well had provided the model and proved the viability of such communities, and they proliferated in the new environment, using a variety of social software systems.
When Jim Gray, the highly regarded Microsoft researcher, disappeared while sailing off the coast of San Francisco in January, a spontaneous community formed around the task of capturing and inspecting satellite data to track him down. The effort failed, but the idea that such a search could be mounted on the Web with large numbers of volunteers coordinated around different tasks would have been impossible without a much earlier example of such a virtual community: the Well.
Speaking of virtual communities, much has been made of a three-dimensional virtual world known as Second Life. For my money, more significant examples of 3-D virtual reality can be found in massively multiplayer online games, which offer playful, real-time activity where the actions of one participant affect another. The possibilities in terms of training people to complete complicated team-oriented tasks seem obvious. The godfather of this genre is Blizzard Entertainment's World Of Warcraft. Introduced in 2004, World Of Warcraft's 8.5 million rabid fans--3.5 million in China alone--put Second Life's 6 million registered residents in the shade.
Here's my list of the greatest Web software so far, in alphabetical order: AltaVista search, Amazon, AOL Instant Messenger, Craigslist, eBay, Hotmail, XMLHttpRequest, the Well, and World Of Warcraft. Not bad, but that's only nine. Three more to go. Must soldier on.

WISDOM OF CROWDS
Another form of collaborative knowledge-building that exploits the power of networking and is freely available to a large audience is the wiki. The best-known, most widely available wiki is Wikipedia.

Launched Jan. 15, 2001, Wikipedia sits atop a MySQL open source database system. Its software must be able to handle URL redirection and scale to millions of users. It uses the content management features contained in an open source wiki building system, MediaWiki, written in PHP by Lee Daniel Crocker and customized for Wikipedia. It's the 37th most popular site on the Web, according to Alexa, with 27,000 contributors in 2005 (the last count available).
Wikipedia is dogged by accountability issues. In 2005, John Seigenthaler, founding editorial director of USA Today, was identified in an entry as a suspect in the assassination of President John Kennedy. He's not; someone had changed the entry as a joke. It was eventually corrected. But if changes can be made anonymously, can a handful of editors be relied on to catch all incorrect and malicious edits?
Wikipedia might solve this problem, I think, by requiring contributors to submit short biographies of themselves and connect them to their share of an entry, and by invoking a collective intelligence that calls on readers to comment on the writer. As author Eric Raymond says of open source software development, where multiple reviewers of code improve its quality, "given enough eyeballs, all bugs are shallow."
That's an example of what New Yorker staff writer James Surowiecki calls, in his book of the same name, The Wisdom Of Crowds (Doubleday, 2004). The logic goes like this: In certain instances, the responses of crowd members, tallied as a whole, yield the right answer more frequently than individual responses from smart members of the crowd. As a network capable of eliciting responses from millions of participants, the Web would seem particularly suited to the wisdom of crowds.
An automated attempt to implement the wisdom of crowds is Digg. Registered members post links to content they find interesting from other Web sites, then vote on that content by clicking on the "Digg It" button--or not. Popular content goes to the top of the list; unpopular content drops off. To have a story endorsed by the readers of Digg has become a kind of online currency--if it ranks high on Digg's list, it's guaranteed a lot of readers from other sites, which translates to traffic for the originating site.

Kevin Rose launched the third version of the news-oriented Digg site on June 26, 2006, and two months later it was the 24th most visited site on the Web; now it's the 92nd. With success have come attempts to exploit its voting process, with services paying registered Digg users to vote for their designated stories. Some users comply, but in general, placement of the paid content on the site has been voted down by other Digg users, demonstrating once again the wisdom of crowds.

BUILD IT AND THEY WILL COME
I now have 11 choices. Here's my list, in prioritized, descending order:
12. AOL Instant Messenger
11. Digg
10. Hotmail
9. World Of Warcraft
8. Wikipedia
7. XMLHttpRequest object
6. Amazon.com
5. eBay
4. The Well
3. Craigslist
2. AltaVista
My last choice is also first, my No. 1 pick for greatest Web software developed so far.
Berners-Lee's enforced step backward to a simpler platform brought about new concepts and new opportunities. The platform was based on asynchronous communications, where one system delivered a message to another when it was possible, rather than both needing to be available when the delivery was initiated. The user sessions on the platform were stateless; servers using HTTP could rapidly serve information pages, without worrying about carrying information about the user forward from one visit to the next or even one page to the next.

Before Craigslist, Hotmail, or the other user-intensive sites could be developed, software was required that could cope with the need to serve millions of HTML pages in quick succession and still draw on background databases and other resources. It would need to bridge the new HTTP protocol to many of the back-end systems necessary for Web operations.
Enter the Apache Web Server. "There wasn't an established Web server when it first came out," recalls Brian Behlendorf, co-founder with Roy Fielding of the Apache Group. Most early Web site managers used NCSA HTTPd, the early Web server developed by Robert McCool at the National Center for Supercomputing Applications (the source of the Mosaic browser). But the HTTPd server didn't scale smoothly as traffic mounted, it couldn't easily manage more than one Web site at a time, and it needed more APIs--a lot more--to interface with back-end systems. "The first Web sites were having common problems with the amounts of traffic," recalls Behlendorf.
The original NCSA HTTPd was improved by a virtual community of Web managers that became known as the Apache Group. In its second version, the Apache Group tore apart the server and rebuilt it as a series of modules, which became the Apache Web Server 2.0. (The latest version is Apache 2.2.4.) The new design let different contributors work on different parts of the server without holding up one another. The server advanced quickly.
In 1998, IBM announced it was dropping its own Web server development and contributing to the Apache Group. IBM said it would include Apache with its WebSphere middleware. The move had the effect of winning acceptance for open source code in business and drawing attention to Apache as it was pitted against its chief rival, Microsoft's Internet Information Server.
The Apache Group was one of the first open source projects to develop a product that competed successfully with commercial code. Apache's market share has fallen off its peak of just under 70% of active sites on the Web, but Apache still powers 66.9 million Web sites to Microsoft IIS's 35.3 million, according to Netcraft's April report. Support for Apache comes from a variety of commercial sources, including IBM and Covalent Technologies.

Apache was a volunteer project in which skilled developers exchanged ideas, parceled up work, eliminated bugs, and committed finished code to a central code management system--Behlendorf hosted the original contributions on his own computer. Apache addressed the user scalability problem and moved on to develop a tight linkage with PHP, the scripting language that would become dominant on the Web, tying disparate elements of sites together and supplying the small applications that tied databases to Web pages.
The quick access to data meant pages could be refreshed with the latest information or tailored to the individual with specialized data. Apache was linked tightly to an early open source database, MySQL, a system originally designed for fast reading and serving of data rather than heavy-duty transaction processing, a property suited to the new Web.
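The resulting pattern, a stateless request that pulls current data and renders a fresh page every time, can be sketched in a few lines. The example below uses Node's built-in http module and an in-memory array standing in for the Web server and the MySQL table; it illustrates the shape of the LAMP-era dynamic page, not Apache's or PHP's actual code.

```typescript
// Sketch of the stateless dynamic-page pattern Apache, PHP, and MySQL made
// routine: each request pulls current data and renders a fresh HTML page.
// The in-memory array below stands in for a database table; the listings
// themselves are invented for illustration.

import { createServer } from "node:http";

const listings = [
  { title: "1989 Toyota Camry", price: 1200 },
  { title: "Garage space, Ferrari not included", price: 300 },
];

createServer((_req, res) => {
  // No session state is carried between requests; every page is built anew
  // from whatever the data store holds at that moment.
  const rows = listings
    .map((l) => `<li>${l.title}: $${l.price}</li>`)
    .join("");
  res.writeHead(200, { "Content-Type": "text/html" });
  res.end(`<html><body><ul>${rows}</ul></body></html>`);
}).listen(8080);
```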
In its example of timeliness, innovative technology, volunteer development, and ability to match commercial competitors, the Apache Web Server set a standard that many have sought to emulate but few have matched. Most of all, it enabled "the network effect," where the concerns and passions of one person connected with those of others and resulted in the construction of new Web sites, swift communications, and virtual communities by an endlessly diverse set of builders.
With Berners-Lee's fateful step backward, the World Wide Web returned computing to a simpler platform and opened the door to a series of rapid steps forward, not just for a single interest group but for the world. The best software on the Web capitalizes on those possibilities. The Apache Web Server and its compatriots are the harbingers of a new age. We can only dimly perceive the outline of that age, but many have started to think it will be characterized by more open standards, more freely available software like Apache, and more human intelligence finding its free expression and outlet in communities on the Web.