WORKSHOPS

Extending The File System With EDMS

by Russ Edelman

The introduction of Windows95 has served as a catalyst for reevaluating the need for Electronic Document Management Systems (EDMS). This reevaluation was driven by several technological advancements introduced by Windows95. These advancements, when combined with new functionality offered by today's business applications, seem to challenge the core functionality provided by document management. However, when examined in more depth, the robust functionality of today's EDMS technologies is easily recognizable. In fact, Windows95 complements EDMS rather than competes with it.

An EDMS is fundamentally a relational database that is tightly integrated with your business applications. It serves as an extension to the operating system by allowing files to have extended properties, or what is commonly referred to as a profile. Profiles describe file contents and can be searched at incredible speed. By employing this profile metaphor, you can get detailed descriptions of the file, as well as fielded information and security.

Windows95 Extended File Name

With the introduction of Windows95, the archaic 8.3 limitation is replaced with a very reasonable 255-character limit. This applies to all files as well as directories--now called folders. One would ask, "Why is an EDMS needed if you have 255 characters to name a file?" When compared in greater detail, document management systems provide a wealth of functionality beyond the long file names of Windows95. This includes extended comment sections, fielded information, comprehensive document-level security and automatically updated access statistics for the file. DOCS Open from PC DOCS and Saros' Mezzanine, among other products, employ this architecture using standard SQL technology. In comparison, Novell's SoftSolutions (to be released as part of GroupWise XTD) employs a functionally consistent architecture, albeit with a proprietary database. These products maintain this additional information in the profile associated with the document.

Fielded information consists of any pertinent document data, including the author's name, the document type or custom-defined fields. Documents can therefore be categorized easily for future searching. For example, a user looking for all documents whose author is "Justin Marshall" and whose department (an example of a custom-defined field) is "Fitness" could issue a simple search request and immediately receive a hit list of all documents meeting these criteria.
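A fielded search like the one above can be pictured as a query against a profile table. The following is a minimal sketch in Python using the standard sqlite3 module. The table name, column names and sample rows are assumptions made for illustration; they do not reflect the schema of any product named in this article.

```python
import sqlite3

# Hypothetical profile table: one row per managed document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile ("
    " doc_id INTEGER PRIMARY KEY,"
    " title TEXT, author TEXT, doc_type TEXT, department TEXT)"
)
# An index on the searched fields lets the database avoid scanning every row.
conn.execute("CREATE INDEX idx_author_dept ON profile (author, department)")

conn.executemany(
    "INSERT INTO profile (title, author, doc_type, department) VALUES (?, ?, ?, ?)",
    [
        ("Spring class schedule", "Justin Marshall", "Memo", "Fitness"),
        ("Equipment budget", "Justin Marshall", "Spreadsheet", "Finance"),
        ("Trainer handbook", "Dana Lee", "Manual", "Fitness"),
    ],
)


def search_profiles(conn, **fields):
    """Return (doc_id, title) for every profile matching all given fields."""
    allowed = {"title", "author", "doc_type", "department"}
    unknown = set(fields) - allowed
    if unknown:
        raise ValueError(f"unknown profile fields: {sorted(unknown)}")
    sql = "SELECT doc_id, title FROM profile"
    if fields:
        # Field names come from the allowed set; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in fields)
    sql += " ORDER BY doc_id"
    return conn.execute(sql, list(fields.values())).fetchall()


hit_list = search_profiles(conn, author="Justin Marshall", department="Fitness")
```

Here the custom-defined "department" field is simply another column, which is why it can be combined freely with the author field in one request.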

Both DOCS Open and SoftSolutions allow the appearance of the hit list to be customized so that it is presented in the format users want. Profile fields in an EDMS are also used as variables in building the path for the document's actual file system location. This is completely transparent to the user, and it allows files to be distributed across a well-designed directory structure.
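To make the idea of profile fields as path variables concrete, here is a small Python sketch. The directory layout, the field names and the use of a zero-padded document number as the file name are all assumptions chosen for illustration, not the scheme of any particular EDMS.

```python
from pathlib import PurePosixPath


def build_storage_path(root, profile, doc_number):
    """Build a storage location from profile fields.

    Hypothetical layout: <root>/<department>/<author>/<doc_number>.<ext>
    """

    def clean(value):
        # Keep letters and digits only, upper-cased, so the value is a safe
        # directory name.
        return "".join(ch for ch in value.upper() if ch.isalnum())

    department = clean(profile["department"])
    author = clean(profile["author"])
    file_name = f"{doc_number:08d}.{profile['extension'].lower()}"
    return PurePosixPath(root) / department / author / file_name


profile = {"author": "Justin Marshall", "department": "Fitness", "extension": "DOC"}
path = build_storage_path("/docs", profile, 1234)
```

Because the location is derived from the profile, the user never needs to see or type it; the profile alone is enough to find the file.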

An automatic file naming process is also incorporated into most EDMS products. This generates the file system name for the document, which is transparent to the user. In comparison, Windows95 does not offer automatic file naming logic. Some business applications have implemented rudimentary file naming logic as well as fielded information. However, these do not provide the level of uniformity or functionality offered by an EDMS. Windows95 allows some basic search criteria to be specified but does not support fielded information searches. This is understandable, since Windows95 does not allow custom fields to be defined. Business applications allow for searching, albeit in a very limited capacity and, again, without the luxury of uniform data. It is important to stress that the exclusion of these features should not be considered a deficiency in either Windows95 or business applications. It simply is functionality that falls beyond the boundaries of each respective technology.

Windows95 offers very little file-level security, and business applications are normally limited to password protection at the file level. In comparison, documents managed in an EDMS can be secured by selecting security attributes for the document, which invoke the native file security functions of the underlying file system.

DOCS Open and SoftSolutions actually offer processes that can enhance NOS security further. This is done by removing all NOS rights at the file level. When a file is accessed, rights are temporarily granted by the EDMS process and then removed after the file is used. From a network administration perspective, this is the ideal environment to gain all of the database benefits for your EDMS. Additionally, EDMS products automatically track key statistics, such as who edited the file last and when. They can also track information such as who has viewed or printed the document.
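The grant-then-revoke pattern described above can be sketched in Python as follows. The FileRights class is an in-memory stand-in for whatever rights table a NOS maintains, and every name here is hypothetical; a real product would call the network operating system's own security functions instead.

```python
from contextlib import contextmanager


class FileRights:
    """In-memory stand-in for the file-level rights held by a NOS."""

    def __init__(self):
        self._rights = {}  # (user, path) -> set of rights

    def grant(self, user, path, right):
        self._rights.setdefault((user, path), set()).add(right)

    def revoke(self, user, path, right):
        self._rights.get((user, path), set()).discard(right)

    def has(self, user, path, right):
        return right in self._rights.get((user, path), set())


@contextmanager
def temporary_access(rights, user, path, right="read"):
    """Grant a right only for the duration of the with-block."""
    rights.grant(user, path, right)
    try:
        yield
    finally:
        # Always revoke, even if the work inside the block fails.
        rights.revoke(user, path, right)


rights = FileRights()
doc_path = "/docs/FITNESS/00001234.doc"

before = rights.has("jmarshall", doc_path, "read")
with temporary_access(rights, "jmarshall", doc_path):
    during = rights.has("jmarshall", doc_path, "read")
after = rights.has("jmarshall", doc_path, "read")
```

Outside the with-block the user holds no rights to the file at all, so the only route to the document is through the process that grants them.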

Advanced Search Capabilities

Having the ability to define a profile that accurately reflects the contents of a document is very significant. However, if the information in the profile cannot be searched with immediate responses, the value of the EDMS is significantly diminished. Most EDMS vendors utilize relational database platforms that use indexed files. This allows all information in the profile to be stored in a high-performance database for near-instantaneous access. Additionally, EDMS vendors usually provide complete full-text indexing capabilities for the document contents (as distinct from the profile information). Depending on the technologies employed, full-text searching can include advanced functionality like phrase and proximity searching, relevancy ranking, synonym searches and a host of other features. By combining profile and full-text searching capabilities, users of EDMS products take a quantum leap toward immediate access to dispersed information.
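As a rough illustration of what a full-text index does, the Python sketch below builds a positional inverted index and uses it for phrase searching. It is a teaching example under simplifying assumptions (lower-cased alphanumeric tokens, everything held in memory) and is not the engine used by any vendor mentioned here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-cased alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to {doc_id: [positions at which the word occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(position)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of documents containing the phrase's words in sequence."""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    for doc_id, starts in index.get(words[0], {}).items():
        for start in starts:
            # Each following word must appear at the next consecutive position.
            if all(
                start + offset in index.get(word, {}).get(doc_id, [])
                for offset, word in enumerate(words[1:], start=1)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


documents = {
    "memo-1": "This was a rush job for the client.",
    "memo-2": "The job was not a rush.",
    "memo-3": "Nothing urgent here.",
}
index = build_index(documents)
rush_job_hits = phrase_search(index, "rush job")
```

The index is built once, and each query then consults only the entries for the words in the phrase. A proximity search would relax the position test from "next position" to "within N positions".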

Windows95, in comparison, provides rudimentary searching capabilities. By employing the "Find File/Folder" option within Windows95, users can search for a file or folder with minimal criteria. These criteria include date, time, file size and several other parameters. While this does provide limited functionality, it is truly in a different league when compared to the capabilities of an EDMS. Windows95 file/folder searching is based on sequential searches executed against the directory entry table (DET). There is no database employed, and because there is no file indexing, the sequential search is extremely slow. Regarding full-text capabilities, Windows95 can search for text within documents. If this criterion is specified, it employs a similar sequential technique, which also does not use any index. Furthermore, Windows95 only provides searching capabilities for one data volume at a time (logical drive letters, in most cases). This means that the user would have to issue a separate search on each volume, starting at the root, in order to find a file that may reside on one of many volumes.

Many of today's business application suites (like Microsoft Office95, Corel Office and Lotus SmartSuite) also offer rudimentary searching capabilities. Microsoft and Lotus employ similar techniques: both allow searching, but the search is performed sequentially, as in Windows95. Corel Office, specifically WordPerfect, offers a "QuickFinder" technology, which allows indexes to be built for searching. On closer examination, however, QuickFinder is not designed to serve as a network-based search engine for hundreds or thousands of users.

Access Across the LAN and WAN

Most document management systems are designed to provide seamless search capabilities across LANs and WANs. This makes a significant difference when searching for files that may reside on any volume or even across multiple servers in a LAN or WAN environment. Employing client/server database technologies introduces a layer of abstraction that separates the profile information from the actual file. This minimizes the impact on network bandwidth, because only the profile database requests and responses are passed across the network. When a request is made from the client, it is processed locally by each server process. The results are then delivered back to the client. Once the profile is identified, the file can be retrieved easily, thus minimizing excess network traffic. Most EDMS products take care of all security handshaking and file transfers when searching for files in a multiserver environment.

For example, consider a law firm with five offices around the country that uses DOCS Open and a frame relay network for WAN connectivity. By using DOCS Open, users in the New York office can search effortlessly for all documents created by "Aaron Jacob" with the words "this was a rush job" in the actual text of the document. Once the search is executed, a "client" request is sent to the other four sites and each site, in turn, provides a "server" response that indicates potential matches. The files appearing on the hit list can be located in any of the five sites. Once the appropriate document is identified, the user simply selects the file and DOCS Open delivers the document to the user. If the selected file is in Florida, for example, DOCS Open transparently logs the user into the Florida office, transfers the file, disconnects from the Florida server and then updates all logs to indicate that the file is in use by a user in New York.
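The request/response pattern in this example can be sketched in a few lines of Python. The SiteServer class, the record layout and the sample documents are all hypothetical, and the "network" is just a function call; the point is only that each site evaluates the query locally and sends back small hit records rather than the files themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    site: str
    doc_id: int
    title: str


class SiteServer:
    """Stand-in for the server process running at one office."""

    def __init__(self, name, documents):
        self.name = name
        self._documents = documents  # list of dicts held at this site

    def search(self, author, text):
        """Evaluate the query locally and return only small hit records."""
        return [
            Hit(self.name, doc["doc_id"], doc["title"])
            for doc in self._documents
            if doc["author"] == author and text.lower() in doc["body"].lower()
        ]


def federated_search(sites, author, text):
    """Send the same request to every site and merge the responses."""
    hits = []
    for site in sites:
        hits.extend(site.search(author, text))
    return sorted(hits, key=lambda hit: (hit.site, hit.doc_id))


new_york = SiteServer("New York", [
    {"doc_id": 1, "title": "Engagement letter", "author": "Aaron Jacob",
     "body": "Standard terms apply."},
])
florida = SiteServer("Florida", [
    {"doc_id": 7, "title": "Filing summary", "author": "Aaron Jacob",
     "body": "Note: this was a rush job."},
    {"doc_id": 8, "title": "Billing note", "author": "Dana Lee",
     "body": "This was a rush job too."},
])

hits = federated_search([new_york, florida], "Aaron Jacob", "this was a rush job")
```

Only after the user picks an entry from the merged hit list does a file need to cross the WAN, which is where the bandwidth saving comes from.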

Network/Workstation File Synchronization

Windows95 introduced a feature called the Briefcase, which enables files that reside in two locations to be synchronized easily. It's ideal if you need to continuously transfer files from a network volume to a laptop. Specific files can be dragged into the Briefcase, and when changes are made, a synchronization process can be executed to ensure consistency between the two copies. The Briefcase synchronization process updates the older file with the newer version of the file. This can be bidirectional, so files on the network could update the laptop and vice versa.
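The "newer copy replaces the older copy" rule can be expressed in a short Python sketch. This is an approximation of the behavior described above based only on file modification times; it is not the Briefcase's actual algorithm, and the file names are made up for the demonstration.

```python
import os
import shutil
import tempfile


def synchronize(path_a, path_b):
    """Copy the newer file over the older one.

    Returns the path that was updated, or None if both copies have the
    same modification time.
    """
    mtime_a = os.path.getmtime(path_a)
    mtime_b = os.path.getmtime(path_b)
    if mtime_a == mtime_b:
        return None
    newer, older = (path_a, path_b) if mtime_a > mtime_b else (path_b, path_a)
    shutil.copy2(newer, older)  # copy2 also carries the modification time across
    return older


# Demonstration with two throwaway files standing in for the two locations.
with tempfile.TemporaryDirectory() as workdir:
    network_copy = os.path.join(workdir, "network.txt")
    laptop_copy = os.path.join(workdir, "laptop.txt")
    with open(network_copy, "w") as handle:
        handle.write("old draft")
    with open(laptop_copy, "w") as handle:
        handle.write("edited on the road")
    # Force distinct timestamps so the laptop copy is clearly the newer one.
    os.utime(network_copy, (1_000_000, 1_000_000))
    os.utime(laptop_copy, (2_000_000, 2_000_000))

    updated = synchronize(network_copy, laptop_copy)
    with open(network_copy) as handle:
        network_text = handle.read()
    second_pass = synchronize(network_copy, laptop_copy)
```

Note that a rule based purely on timestamps cannot tell when both copies were edited independently; in this sketch the later edit silently wins.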

In comparison, EDMS products introduce a different method, which allows users to search for files and then check them out. The EDMS database is automatically updated to indicate who checked the file out and when it will be returned. This is a very important feature from a network administration perspective, since it provides complete control over the removal of files along with a detailed audit trail. With check-in/check-out functionality, other users are prevented from updating the file while it is checked out. SoftSolutions has named a portion of this process "Portable Mode," while DOCS Open calls its process "Mobile Mode." When the file is returned, a synchronization process is initiated, updating the file and then updating the EDMS databases.
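A check-in/check-out registry of the kind described above might look like the following Python sketch. The class, method names and audit record format are assumptions for illustration and do not describe how Portable Mode or Mobile Mode is implemented.

```python
from datetime import date


class CheckOutError(Exception):
    """Raised when a check-out or check-in request cannot be honored."""


class DocumentRegistry:
    """Tracks who holds each document and keeps an audit trail."""

    def __init__(self):
        self._checked_out = {}  # doc_id -> (user, due date)
        self.audit_trail = []   # (action, doc_id, user) in order of occurrence

    def check_out(self, doc_id, user, due):
        if doc_id in self._checked_out:
            holder, _ = self._checked_out[doc_id]
            raise CheckOutError(f"{doc_id} is already checked out by {holder}")
        self._checked_out[doc_id] = (user, due)
        self.audit_trail.append(("check-out", doc_id, user))

    def check_in(self, doc_id, user):
        holder, _ = self._checked_out.get(doc_id, (None, None))
        if holder != user:
            raise CheckOutError(f"{doc_id} is not checked out by {user}")
        del self._checked_out[doc_id]
        self.audit_trail.append(("check-in", doc_id, user))

    def can_edit(self, doc_id, user):
        """A document is editable if nobody holds it or the user holds it."""
        holder, _ = self._checked_out.get(doc_id, (None, None))
        return holder is None or holder == user


registry = DocumentRegistry()
registry.check_out("DOC-1234", "jmarshall", date(1996, 7, 15))
locked_for_others = not registry.can_edit("DOC-1234", "ajacob")

try:
    registry.check_out("DOC-1234", "ajacob", date(1996, 7, 20))
    second_attempt = "allowed"
except CheckOutError:
    second_attempt = "refused"

registry.check_in("DOC-1234", "jmarshall")
```

Unlike the timestamp rule used for simple synchronization, the lock prevents conflicting edits from being made in the first place, and the audit trail records every removal and return.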

Closely related to synchronization is "version control." In many cases, it is important to maintain all versions of a file as it is revised. Windows95 and most business applications don't provide this functionality. In comparison, most EDMS products offer extensive version control capabilities. This allows a true audit trail and revision sequence to be maintained for a document.
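In its simplest form, version control means that saving never overwrites: each save appends a new numbered version. The Python sketch below shows that idea only; the class and its methods are hypothetical, and real products store versions on disk and in the profile database rather than in memory.

```python
class VersionedDocument:
    """Keeps every saved revision of a document in order."""

    def __init__(self):
        self._versions = []  # list of (version number, author, content)

    def save(self, author, content):
        """Store a new version and return its number (starting at 1)."""
        number = len(self._versions) + 1
        self._versions.append((number, author, content))
        return number

    def get(self, number=None):
        """Return the content of a given version, or the latest if omitted."""
        if not self._versions:
            raise LookupError("no versions saved yet")
        if number is None:
            return self._versions[-1][2]
        if not 1 <= number <= len(self._versions):
            raise LookupError(f"no such version: {number}")
        return self._versions[number - 1][2]

    def history(self):
        """Return (version number, author) pairs, oldest first."""
        return [(number, author) for number, author, _ in self._versions]


contract = VersionedDocument()
contract.save("Aaron Jacob", "First draft")
latest_number = contract.save("Justin Marshall", "Revised draft")
```

Earlier versions stay retrievable by number, and the history gives the sequence of who changed the document and in what order.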

Resource Utilization

As with all other technologies, EDMS products consume resources. These fall into two categories--the client and the server. The client portion of most EDMS products generally consumes 10 MB to 25 MB of disk capacity. RAM is also affected, since each EDMS product consumes a reasonable amount of memory. The server processes may be able to run on a server that is not dedicated as an EDMS database server. However, implementations that exceed approximately 50 users may find that performance degrades considerably as the EDMS processes are fully exercised. The recommended solution to this problem normally is to introduce dedicated database servers. Additionally, full-text indexing machines must be taken into account in the overall cost. EDMS vendors have only recently incorporated the full-text searching processes into their primary database platforms.

Russ Edelman is vice president of systems services at ICM. He can be reached at redelman@icmus.com.

Updated July 8, 1996



