FEATURE STORY
The Paperless Office
by Michael Hurwicz
The paperless office. We've talked about it for years, and most of us are no nearer to it now than when we started talking.
One reason: Most of the heavy-duty, expensive document imaging solutions have been based on Unix, while Windows and NetWare have been the standards for office LANs.
We've been told that our puny Windows clients and NetWare servers couldn't handle production-level document imaging. Nevertheless, a few intrepid vendors took up the gauntlet and built document imaging products based on NetWare Loadable Modules (NLMs) and Windows. We tested three such products--Compulink Management Center's LaserFiche Windows/NLM, PaperWise's ImageWise and GroupStore from Imagery Software (a subsidiary of Eastman Kodak).
We wanted to test two others: Simplify Development (Mailroom for Windows and ShareScan) and Lanier Worldwide (IMSONLINE). Both chose not to participate.
The three products share the same basic functionality: scanning, indexing and retrieving. However, they differ in specific features, ease of installation, configuration and use, and reliability and performance.
LaserFiche and ImageWise are neck-and-neck, with ImageWise being a bit less expensive. Yet they differ significantly, most notably in their support for the three basic ways of indexing, searching for and retrieving documents.
All three use templates--a collection of index fields associated with a document. Users fill index fields on templates with keywords that identify the particular document. Only ImageWise supports automated creation of keywords. Only GroupStore lets you reuse index field definitions from one template in another.
Only LaserFiche offers full-text indexing--translating document images into text via optical character recognition (OCR), making every word a keyword. LaserFiche is also unique in letting you locate documents by browsing a graphical tree of folders and files, much as you would with the Windows File Manager.
LaserFiche's graphical browsing and full-text indexing make it especially well suited to general office filing, while ImageWise's automation for template-based indexing makes it faster and easier for processing masses of paper. GroupStore, despite many good features, including reusable index fields, is not as mature as the other two products.
ImageWise's separate scanning, indexing and retrieval modules also make it very easy to set up an efficient document imaging assembly line. Batches of documents are dynamically queued and assigned to indexing stations as the operator at each station requests the next batch. Neither LaserFiche nor GroupStore does that.
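The pull-style queuing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PaperWise's implementation; the class name, method names and station labels are all hypothetical.

```python
from collections import deque


class BatchQueue:
    """Hypothetical shared queue of scanned batches awaiting indexing."""

    def __init__(self):
        self._pending = deque()   # batches scanned but not yet indexed
        self.assigned = {}        # batch id -> station that took it

    def add_batch(self, batch_id):
        """Called by the scanning station when a batch is complete."""
        self._pending.append(batch_id)

    def request_next(self, station):
        """Hand the oldest waiting batch to the station that asked for it."""
        if not self._pending:
            return None           # nothing waiting right now
        batch_id = self._pending.popleft()
        self.assigned[batch_id] = station
        return batch_id
```

Because each station pulls work only when its operator asks for it, a fast indexer simply ends up with more batches than a slow one, and nobody has to divide the work in advance.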
ImageWise also offers the best template-based retrieval. It was the fastest retriever in the bunch, once index information had been entered. More important for most users is the "query by thesaurus" feature that lets you enter keywords by selecting from a pick list consisting of either the last 10 keywords entered for that field or any 10 keywords you select beforehand. In contrast, LaserFiche just keeps the most recent keyword as a default for the next search. GroupStore doesn't even do that--although it does "remember" the template type of the most recent search. Query by thesaurus can often nearly eliminate keystrokes.
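To make the "last 10 keywords" behavior concrete, here is a minimal Python sketch of a per-field pick list. It is an assumption about how such a list could work, not a description of ImageWise's internals; the PickList class and its methods are invented for illustration.

```python
from collections import deque


class PickList:
    """Hypothetical pick list holding the most recent keywords for one field."""

    def __init__(self, size=10):
        # A bounded deque silently drops the oldest entry once it is full.
        self._recent = deque(maxlen=size)

    def remember(self, keyword):
        # A repeated keyword moves to the front instead of being stored twice.
        if keyword in self._recent:
            self._recent.remove(keyword)
        self._recent.appendleft(keyword)

    def choices(self):
        """Keywords to offer the user, most recent first."""
        return list(self._recent)
```

A LaserFiche-style "most recent keyword as default" would correspond to a list of size 1.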
PaperWise offers four companion products for ImageWise. Paper-Route provides workflow functions, including document routing, task assignment and flow reporting. HyperDrive provides hard disk-based caching to speed up retrievals from an optical jukebox. DataWise provides Computer Output to Laser Disk (COLD), a function that indexes documents, such as reports or invoices, that were created directly on the computer. DisplayWise provides integration with DOS applications and terminal emulators, so that information on the screen can be used as index values to retrieve images.
We had several days' worth of trouble configuring Kofax Image Products' KF-920 Software Document Processor, which ImageWise requires. Once that worked, we had minor problems--some places where the user interface was nonintuitive or error checking was weak--but no show stoppers.
In general, ImageWise's retrieval functions are somewhat limited. For instance, each database can have only one set of index fields, rather than multiple templates and folders, which the other products offer. You can define only 10 indexes for a database. Although this may keep users from creating unwieldy, slow indexing systems, it might also keep ImageWise from handling some complex filing tasks.
There are no Boolean searches, just an implied "AND" between index fields. Finally, there is no graphical tree browsing interface.
LaserFiche consists of just two modules: the Windows client program and the NLM. You scan, view images, OCR, fill in index fields and retrieve with the client module. The NLM does file management, full text indexing, all searches and security administration. There is also a short menu of functions available at the file server console, including several monitoring functions, enabling and disabling logins and indexing, and indexing all text documents. LaserFiche really stood out in the installation department, because it required almost no tinkering compared with the other two products.
Index-based retrieval was still available as an alternative, though. In fact, LaserFiche's index-based retrieval was superior, because it has Boolean searches, which make it easy to get exactly the documents you want in a single search operation. It also has fuzzy searching, which finds instances that are close to the specified keyword instead of requiring an exact match. This feature is especially good if you're going to do full-text searches on OCRed documents, because OCR typically inserts numerous mistakes. Fuzzy search could keep you from having to spell check all your OCRed text.
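As a rough illustration of why fuzzy matching helps with OCR errors, the sketch below uses the similarity matching in Python's standard difflib module. We don't know what matching algorithm LaserFiche uses, so this is a stand-in technique; the function name and the 0.8 cutoff are assumptions.

```python
import difflib


def fuzzy_find(keyword, indexed_words, cutoff=0.8):
    """Return indexed words that are close to the keyword, best match first.

    The cutoff (0 to 1) is an assumed tuning knob: higher values demand
    closer matches.
    """
    words = list(indexed_words)
    if not words:
        return []
    return difflib.get_close_matches(keyword, words, n=len(words), cutoff=cutoff)
```

With an index containing both "invoice" and the OCR misread "lnvoice", a search for "invoice" would return both, while an unrelated word such as "scanner" falls below the cutoff.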
Compulink also sells Template Wizard, a product that lets you fill in templates from databases and reuse parts of one template in another. We did not test this, however.
If you import multiple text documents, you can full-text index all of them in a single operation, initiated at the file server console. Bulk indexing, combined with fuzzy search, makes LaserFiche a good choice for managing nonimage documents like customer service records or reports. Importing and full-text indexing are about as automated and convenient as could be (unless, perhaps, they could be ongoing or regularly recurring, instead of manually initiated).
LaserFiche also has application-based security. With other programs, users can access image files through Windows File Manager or DOS, and potentially delete or alter them. With LaserFiche, users don't even need to log in to the file server. LaserFiche determines access to folders, files and permitted operations. Files and folders are invisible if you don't have appropriate access rights. ImageWise cannot offer this type of security, because it doesn't have its own NLM through which to implement it. GroupStore could do it, but doesn't.
GroupStore consists of three network services plus a shared scanning program (GroupScan) and a client module (Imagery for Windows). The latter two are Dynamic Link Library-based (DLL). The network services include the Mass Storage Service (MSS), the Image Management Service (IMS) and the Document Management Service (DMS). All three are NLMs that run only on NetWare 4.x. MSS and DMS also include DLL-based administration programs that run at client stations.
MSS provides hierarchical storage management. IMS provides services for applications, such as fax servers, that use GroupStore functions. DMS provides an electronic document filing system. We tested DMS and MSS. We couldn't test IMS because only one product, Biscom's Faxcom fax server, was IMS-compatible at the time of testing, and both Imagery and Biscom declined to provide a fax server.
On another occasion, we were unable either to access or delete a particular document. At the same time, we were unable to create a folder. A DMS error message said the database might be corrupt and we should exit and reload the database. That didn't help, though. The ultimate solution was to wipe out the existing database and reinstall DMS. We were unable to duplicate the problems.
If you use GroupScan--the only way to create multiple documents in a single scan--the procedure for indexing the documents requires you to retrieve and save each document with Imagery for Windows. That's unnecessarily time-consuming, especially since retrieval functions can be somewhat clumsy.
In addition, GroupStore doesn't have full-text indexing. Text files created via the integrated OCR do not remain associated with the image files. Nor can you index or retrieve an image file using words in the text file. (LaserFiche, in contrast, does keep an association between the image and the text file.) On the other hand, DMS Administrator makes creating templates easier with reusable index fields for templates.
Also on the positive side, GroupScan is the most efficient way we've seen to create multiple documents with a single scan, because it can separate documents based either on a page count or by detecting a blank sheet that acts as a document separator. With the other two products, you have to separate documents manually, either explicitly (with LaserFiche) or by applying appropriate indexing information (with ImageWise).
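The two separation rules are simple enough to sketch in Python. This is a minimal illustration, not GroupScan's code: pages are treated as opaque items, and the is_blank test is a hypothetical callback standing in for whatever blank-sheet detection the scanner software performs.

```python
def split_by_page_count(pages, pages_per_doc):
    """Start a new document every pages_per_doc pages."""
    return [pages[i:i + pages_per_doc]
            for i in range(0, len(pages), pages_per_doc)]


def split_at_blank_sheets(pages, is_blank):
    """Start a new document at each blank separator sheet.

    Separator sheets are dropped, and consecutive blanks do not
    produce empty documents.
    """
    documents, current = [], []
    for page in pages:
        if is_blank(page):
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents
```

The page-count rule suits uniform forms, while the blank-sheet rule handles documents of varying length at the cost of inserting separator sheets before scanning.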
Clunky Retrieval/Cool Display
Retrieval functions are generally a bit primitive and clunky. There is no graphical directory tree. There's also no Boolean searching, just an implied "and" between fields specified. For instance, if you want to look at magazine articles that have either "Imaging" or "Scanners" in the "subject" field, you'll have to do two separate searches.
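The difference between an implied "and" and a true Boolean "or" is easy to show with a small Python sketch. The record layout and both function names here are hypothetical; the second function simply automates the two separate searches a user would otherwise run by hand.

```python
def search_all(documents, **criteria):
    """Return documents matching every field given (an implied AND)."""
    return [doc for doc in documents
            if all(doc.get(field) == value for field, value in criteria.items())]


def search_either(documents, field, values):
    """Emulate OR on one field: run one AND search per value, then merge."""
    results = []
    for value in values:
        for doc in search_all(documents, **{field: value}):
            if doc not in results:   # avoid listing a document twice
                results.append(doc)
    return results
```

A product with Boolean searching does the equivalent of search_either in a single operation. Without it, every "or" costs the user one extra search.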
Retrieval can also be slowed by disappearing pick lists. For instance, if you want to look at all documents with "Imaging" in the subject field, you do your search, select a document and view it. When you close that document and go to open another one, you find the pick list has disappeared. You have to perform the same search all over again!
On the other hand, some of Imagery for Windows' display functions are really good. For instance, it can display up to four pages at a time per document, and you can drag and drop pages within a document or among documents. You can delete, add and shuffle pages around.
Neither of the other products can do this. ImageWise doesn't even track documents as such. Whatever pages are returned by a particular search can be considered a document for the purposes of that search. So, although you can display as many pages as you want and delete any currently displayed page, you can move or shuffle pages only by changing index values. For instance, you might define each multipage invoice using an "invoice number" field. All pages with the same number in the "invoice number" field belong to the same invoice--about as nongraphical an approach as we can imagine.
LaserFiche is somewhat closer to Imagery in its approach to manipulating pages within an existing document. Using its batch processing function, LaserFiche lets you move pages within a document or to a new document but not among existing documents. You can also delete pages. LaserFiche represents only one page at a time as an image. It represents others by numbers. (We reviewed LaserFiche version 2.3. The next version of LaserFiche is supposed to have a feature similar to Imagery's.)
Neither of the other vendors provides hierarchical storage management. Then again, MSS will work with LaserFiche and ImageWise, too. Its only requirement is storage on a NetWare volume, including magnetic and optical media (but usually not tape drives). Another potential advantage is the fact that GroupStore is designed as a multivendor platform. Time will tell how many third-party software vendors use IMS, DMS and MSS services. Finally, Imagery uses NetWare Directory Services.
LaserFiche NLM/Windows, $7,995 (five users). Compulink Management Center, (310) 212-5465; fax (310) 212-5064; imaging@ix.netcom.com
GroupStore, $9,995 (50 users, single-server license for DMS, MSS and IMS); Imagery for Windows, $1,679 (five users); MassStorage Service, $5,995 (unlimited users, single server); GroupScan, $1,995 (one user). Imagery Software, (617) 275-7700; fax (617) 280-9710.
Our primary imaging workstation was a 50-MHz 486 EISA PC, a Compaq SystemPro/XL with 1 GB of SCSI hard disk and 16 MB of memory. The scanner, attached to the Compaq via an Adaptec 154CF SCSI host adapter, was a Fujitsu M3097Gm.
We performed two comparative tests, for scanning and retrieving. We noted server processor utilization, total elapsed time and network traffic measured in kilobytes--both total and peak--for each operation. Indexing proved resistant to a quantitative approach, since the user interfaces and procedures were so different for the three programs.
Scanning: Scanning 60 pages took between two and nearly four minutes and raised the overall processor utilization level during that whole time. It also caused occasional sharp spikes.
The most important scanning measures are usually total time elapsed and typical processor utilization. Scanning takes so long that it can't put any huge load on the network. Imagery's GroupScan was the performance champion for scanning, although it beat Compulink by less than 15 percent, and typical server utilizations were quite comparable. Amazingly, GroupScan also put only about half the load on the network that the other two did and yet was competitive in speed--faster than ImageWise and only about 12 percent slower than LaserFiche. GroupScan is one part of Imagery's GroupStore that is definitely ready for prime time.
Note that all tests were done without an accelerator card. Besides speeding all three products up, such a card would tend to reduce the differences in scan times.
Retrieval: For the retrieval test, we retrieved one image from a database of 100 images, each of which was indexed with a single index value. This operation took only a few seconds and resulted in a single flurry of processor activity.
For retrieving, elapsed time is usually the primary concern, but both server and network loading can be significant issues. In this area, PaperWise was most efficient, with the shortest elapsed time and the lowest loads both on the server and the network. Compulink came in second for all measurements and Imagery, third.
Conclusions: What do these tests show about the dreaded ability of imaging applications to overload networks? When retrieving documents, we saw peak network traffic from one workstation of about 40 to 80 KB per second. Even at the lower rate, a mere nine workstations simultaneously retrieving images would bring 10 Mbps Ethernet to nearly 30 percent utilization. Of course, users don't sit there and retrieve images continually. Nevertheless, LAN managers should monitor traffic for these applications, as the potential for overload is definitely there.
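The utilization figure follows from simple arithmetic, which the short Python sketch below reproduces. It assumes 1 KB is 1,000 bytes and ignores protocol overhead and collisions, so treat the result as a rough estimate. The function name is ours.

```python
def ethernet_utilization(kb_per_second, workstations, link_mbps=10):
    """Fraction of link capacity used by simultaneous retrievals.

    Assumes 1 KB = 1,000 bytes and no protocol overhead.
    """
    bits_per_second = kb_per_second * 1000 * 8 * workstations
    return bits_per_second / (link_mbps * 1_000_000)


low = ethernet_utilization(40, 9)    # 9 x 40 KB/s = 2.88 Mbps, about 29 percent
high = ethernet_utilization(80, 9)   # 9 x 80 KB/s = 5.76 Mbps, about 58 percent
```

At the higher peak rate, the same nine workstations would take up well over half of the segment's nominal capacity.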
The surprise for us, however, was high server processor utilization. True, our server was only a 60-MHz Pentium. Nevertheless, we didn't expect processor utilizations pushing up to 70 percent and even beyond 90 percent for a single workstation (compared with a baseline of 1 to 4 percent with no imaging activity).
Our best guess as to what accounts for differences in elapsed times: PaperWise uses the KIPP (KF-920) software from Kofax. The software may be optimized for the Kofax image processing boards, not for straight SCSI. The other two, whose times are similar, use driver software from Pixel Translations. Differences in both CPU utilization and amount of data transferred (which roughly follow one another) may relate to the complexity of the database tables maintained at the server to track scan jobs. For instance, only Compulink maintains a hierarchical tree structure, into which scans must be placed. Some data must be exchanged to indicate where this scan belongs in the tree. PaperWise has some built-in complexities, too. For instance, it must check for "scan flags" used for automated indexing. With Imagery, in version 1.0, there may be less for the client and server to talk about.
The numbers for document retrieval seem to relate directly to the complexity of the database NLM. PaperWise uses only Btrieve, which is quick and efficient. Compulink has its own NLM, which implements a complex relational database. Imagery uses NetWare SQL--a notorious resource hog.