By Nick Gall
You should see my office. People tell me it looks like a bomb went off in it. Papers are everywhere, vendor briefing kits sit in stacks around the walls and manuals are strewn across my desk. The information I need is within arm's reach, but I have no way of finding it other than laboriously combing through every piece of paper.
This is why I love getting information digitally. I'll take e-mail over a letter or a fax any day. I love receiving presentations in PowerPoint and white papers in Word. Of course, my file directories, e-mail inbox and folders now resemble my office in electronic form--stacks of electronic documents in no particular order, which I must sift through to find a needed piece of information.
But the electronic sifting is done by a search engine using Boolean logic--and that makes all the difference in the world. What would be an impossible job with paper becomes possible with digital documents. Lately, though, I've noticed that searching has become an ordeal. Either I get too many hits because my search terms are necessarily generic, or I get no hits because the document is zipped or I misspelled a term. And don't even get me started on searching the Web--that is quickly becoming an exercise in futility.
All of this frustration of "information everywhere and not a drop to drink" is simply my own personal experience of a phenomenon plaguing every facet of IT--woefully inadequate catalog services. They go by many names: catalog services, naming and directory services, repository services, registry services. What they all have in common is the goal of mapping from a name, role, concept or category to one or more corresponding entities. I give a file system (a catalog) a file name and get back a file handle. Or I give it a directory name and get back a list of the file names in the directory. I give a registry (a catalog) the name of a component (e.g., "spelling checker") and get back one or more spelling checkers installed on my system. It is my belief that catalog services are fundamental to the design of interoperable distributed systems and that the lack of adequate catalog services is the No. 1 impediment to interoperable distributed systems. The information is at our fingertips; we simply lack the ability to get it when and where we need it.
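To make the mapping concrete, here is a minimal sketch in Python of a catalog in the sense described above: a key (a name or category) goes in, and zero or more matching entities come out. The class name `Catalog`, its methods and the registered product and file names are all hypothetical, chosen only for illustration; no real registry or directory service works exactly this way.

```python
from collections import defaultdict


class Catalog:
    """Toy catalog: maps a name or category to one or more entities."""

    def __init__(self):
        # Each key holds a list, because one name may match many entities.
        self._entries = defaultdict(list)

    def register(self, key, entity):
        # Normalize keys so that lookups ignore case and stray spaces.
        self._entries[key.strip().lower()].append(entity)

    def lookup(self, key):
        # Return a copy of the matches; an unknown key yields an empty list.
        return list(self._entries.get(key.strip().lower(), []))


# Hypothetical entries: two installed spelling checkers and one directory.
registry = Catalog()
registry.register("spelling checker", "AcmeSpell 2.0")
registry.register("spelling checker", "WordWise Lite")
registry.register("/briefings", "vendor_kit.ppt")
registry.register("/briefings", "white_paper.doc")

print(registry.lookup("Spelling Checker"))  # both spelling checkers
print(registry.lookup("/briefings"))        # the file names in the directory
print(registry.lookup("grammar checker"))   # nothing registered: empty list
```

Even this toy version shows where the trouble starts: the lookup succeeds only when the caller's key matches the registered key after normalization, which is the same brittleness that makes a misspelled search term return no hits.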
Signs of Strain
The signs of strain caused by inadequate catalog services are visible everywhere. For example, the biggest obstacle in building enterprisewide data warehouses is the lack of complete and coherent metadata. Establishing a metadata repository--a catalog of what data is stored where and how to reconcile inconsistent formats in which the data is stored--is the most difficult part of establishing an enterprisewide data warehouse.
The same sort of catalog issue is rearing its ugly head on the operational side of IT. Integrating APEs (application packages for the enterprise) is mostly about reconciling different representations of the enterprise stored in each APE's proprietary repository (a catalog). Tools for mapping between one repository and another are just now appearing.
Microsoft's Windows Registry, a catalog of system-configuration information, is an example of catalog inadequacy. Although the registry was a great leap forward from the ad hoc and totally unmanageable .INI file system of Windows 3.1, it illustrates how much more is needed to create a true distributed-system configuration catalog. In fact, initiatives as diverse as Microsoft's ZAW (Zero Administration for Windows) and DNA (Distributed interNet Applications Architecture) are dependent on the company's ADS (Active Directory Services)--a next-generation registry, naming and directory service.
Perhaps the most obvious example of the crippling effect of inadequate catalog services is the Web. The full-text search engines just aren't cutting it anymore, and the existing catalogs (Yahoo, for example) are woefully deficient.