WORKSHOPS
Extending The File System With EDMS
by Russ Edelman
The introduction of Windows95 has served as a catalyst in reevaluating the
need for Electronic Document Management Systems (EDMS). This reevaluation
was driven by several technological advancements introduced
by Windows95. These advancements, when combined with new functionality offered
by today's business applications, seem to challenge the core functionality
provided by document management. However, when examined in more depth, the
robust functionality of today's EDMS technologies is easily recognizable.
In fact, Windows95 complements EDMS rather than competes with it.
An EDMS is fundamentally a relational database that is tightly integrated
with your business applications. It serves as an extension to the operating
system by allowing files to have extended properties, or what is commonly
referred to as a profile. Profiles describe file contents and can be searched
at incredible speed. By employing this profile metaphor, you can get detailed
descriptions of the file, as well as fielded information and security.
Windows95 Extended File Name
With the introduction of Windows95,
the archaic 8.3 limitation is replaced with a very reasonable 255-character
limitation. This applies to all files as well as directories--now called
folders. One would ask, "Why is an EDMS needed if you have 255 characters
to name a file?" When examined in greater detail, document management
systems provide a wealth of functionality beyond the 255-character file
names of Windows95. This includes extended comment sections, fielded information,
comprehensive document-level security and automatically updated access statistics
for the file. DOCS Open from PC DOCS and Saros' Mezzanine, among other products,
employ this architecture using standard SQL technology. In comparison, Novell's
SoftSolutions (to be released as part of GroupWise XTD) employs a functionally
consistent architecture, albeit with a proprietary database. These products
maintain this additional information in the profile associated with the
document.
Fielded information consists of any pertinent document data, including the
author's name, the document type or custom-defined fields. Therefore, documents
can be categorized easily for future searching. For example, a user searching
for all documents created by "Justin Marshall" and the department
(an example of a custom-defined field) of "Fitness" could issue
a simple search request and receive a hit list immediately with all documents
meeting these criteria.
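To make the fielded search concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table layout, column names, sample titles and the search_profiles helper are all assumptions made for illustration; they do not reflect the actual schema of DOCS Open, Mezzanine or SoftSolutions.

```python
import sqlite3

# Hypothetical profile table: one row per document, one column per profile field.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile ("
    " doc_id INTEGER PRIMARY KEY,"
    " title TEXT, author TEXT, doc_type TEXT, department TEXT)"
)
# An index on the searched fields lets the database avoid scanning every row.
conn.execute("CREATE INDEX idx_author_dept ON profile (author, department)")
conn.executemany(
    "INSERT INTO profile (title, author, doc_type, department) VALUES (?, ?, ?, ?)",
    [
        ("Treadmill maintenance", "Justin Marshall", "Memo", "Fitness"),
        ("Budget forecast", "Justin Marshall", "Report", "Finance"),
        ("Class schedule", "Aaron Jacob", "Memo", "Fitness"),
    ],
)


def search_profiles(conn, **fields):
    """Return (doc_id, title) rows whose profile matches every given field."""
    allowed = {"author", "doc_type", "department"}
    unknown = set(fields) - allowed
    if unknown:
        raise ValueError(f"unknown profile fields: {sorted(unknown)}")
    sql = "SELECT doc_id, title FROM profile"
    if fields:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in fields)
    sql += " ORDER BY doc_id"
    return conn.execute(sql, tuple(fields.values())).fetchall()


# The search from the example: author plus a custom-defined department field.
hit_list = search_profiles(conn, author="Justin Marshall", department="Fitness")
```

In this sketch the hit list holds the one document that matches both fields. The query touches only the profile table, so no document file has to be opened to produce it.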
Both DOCS Open and SoftSolutions allow a hit list appearance to be modified
so it appears in the desired format for the users. Profile fields in an
EDMS are used as variables in building the path for the document's actual
file system location. This is completely transparent to the user, and it
allows files to be distributed across a well-designed directory structure.
An automatic file naming process is also incorporated into most EDMS products.
This generates the file system name for the document, which is transparent
to the user. In comparison, Windows95 does not offer automatic file naming
logic. Some business applications have implemented rudimentary file naming
logic as well as fielded information. However, these do not provide the
level of uniformity or functionality offered by EDMS. Windows95 allows for
some basic search criteria to be created without allowing for any fielded
information searches. This is understandable, since Windows95 does not allow
custom fields to be defined. Business applications allow for searching,
albeit in a very limited capacity and, again, without having the luxury
of uniform data. It is important to stress that the exclusion of these features
should not be considered a deficiency in either Windows95 or business applications.
It simply is functionality that falls beyond the boundaries of each respective
technology.
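The path building and automatic file naming described above can be sketched as follows. This is only an illustration under stated assumptions: the department/document-type directory layout, the eight-digit document number and the .doc extension are hypothetical choices, not the scheme of any particular EDMS product.

```python
from pathlib import PurePosixPath


def storage_path(root, profile, doc_number):
    """Build a storage location from profile fields plus a generated file name."""

    def as_directory(value):
        # Normalize a profile value into a directory-friendly name.
        return value.strip().lower().replace(" ", "_")

    # Directory levels come from profile fields (assumed layout).
    department = as_directory(profile["department"])
    doc_type = as_directory(profile["doc_type"])
    # The file name is generated from the document number, so the user
    # never has to invent or remember it.
    file_name = f"{doc_number:08d}.doc"
    return PurePosixPath(root) / department / doc_type / file_name


path = storage_path("/docs", {"department": "Fitness", "doc_type": "Memo"}, 1042)
```

Because the location is derived from the profile, the user works only with the profile and never sees the resulting path.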
With regard to file-level security, Windows95 has
very little, and business applications are normally limited to password
protection at the file level. In comparison, documents managed in an EDMS
can be secured by selecting security attributes for the document, which
invoke the native file security functions for the underlying file system.
DOCS Open and SoftSolutions actually offer processes that can enhance NOS
security further. This is done by removing all NOS rights at the file level.
When a file is accessed, rights are temporarily granted by the EDMS process
and then removed after the file is used. From a network administration perspective,
this is the ideal environment to gain all of the database benefits for your
EDMS. Additionally, EDMS products automatically track key statistics, such
as who edited the file last and when. They can also track information such as
who has viewed or printed the document.
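The grant-then-revoke approach to file rights can be sketched as a small Python context manager. This is an analogy only: POSIX permission bits stand in here for NOS rights, and the temporary_access helper is hypothetical. The actual products manage network operating system rights through their own mechanisms, which are not shown.

```python
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_access(path, mode=stat.S_IRUSR | stat.S_IWUSR):
    """Grant rights only while the file is in use, then remove them again."""
    os.chmod(path, mode)  # grant rights for the duration of the access
    try:
        yield path
    finally:
        os.chmod(path, 0)  # revoke all rights, even if the access failed


def current_mode(path):
    """Return just the permission bits of the file."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstration on a scratch file that starts with no rights at all.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
os.chmod(demo_path, 0)

with temporary_access(demo_path) as opened_path:
    mode_during = current_mode(opened_path)
    with open(opened_path, "w") as handle:
        handle.write("edited through the EDMS")

mode_after = current_mode(demo_path)
os.remove(demo_path)  # clean up the scratch file
```

Outside the with block the file carries no rights, which mirrors the idea that users cannot reach managed files except through the EDMS process.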
Advanced Search Capabilities
Having the ability to define a profile
that accurately reflects the contents of a document is very significant.
However, if the information in the profile cannot be searched with immediate
responses, the value of the EDMS is significantly diminished. Most EDMS
vendors utilize relational database platforms that use indexed files. This
allows all information in the profile to be stored in a high-performance
database for instantaneous access. Additionally, EDMS vendors usually provide
complete full-text indexing capabilities for the document contents (in comparison
to the profile information). Depending on the technologies employed, full-text
searching can include advanced functionality like phrase and proximity searching,
relevancy ranking, synonym searches and a host of other features. By combining
profile and full-text searching capabilities, EDMS products take
a quantum leap toward providing immediate access to dispersed information.
Windows95, in comparison, provides rudimentary searching capabilities. By
employing the "Find File/Folder" option within Windows95, users
can search for a file or folder with minimal criteria. These criteria include
date, time, file size and several other parameters. While this does provide
limited functionality, it is truly in a different league when compared to
the capabilities of an EDMS. Windows95 file/folder searching is based on
sequential searches executed against the directory entry table (DET). There
is no database employed, and the sequential nature of the search is extremely
slow since there is no file indexing. Regarding full-text capabilities,
Windows95 can search for text within documents. If this criterion is specified,
it will employ a similar sequential technique, which also does not employ
any index capabilities. Furthermore, Windows95 only provides searching capabilities
for one data volume at a time (logical drive letters, in most cases). This
means that the user would have to issue multiple searches on each volume,
starting at the root, in order to find a file that may reside on one of
many volumes.
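The difference between a sequential search and an indexed search can be illustrated with a short Python sketch. The sample documents and function names are made up for the illustration, and the simple inverted index shown here is a stand-in for full-text indexing in general, not the engine used by any product named in this article.

```python
from collections import defaultdict

# Hypothetical document contents, keyed by file name.
documents = {
    "memo1.doc": "this was a rush job for the client",
    "plan.doc": "quarterly fitness plan",
    "note.doc": "rush the fitness order",
}


def sequential_search(docs, word):
    """Read every document on every query, as an unindexed search must."""
    return sorted(name for name, text in docs.items() if word in text.split())


def build_index(docs):
    """Build an inverted index once: each word maps to the files containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.split():
            index[word].add(name)
    return index


def indexed_search(index, word):
    """Answer a query with a single lookup, without rereading any document."""
    return sorted(index.get(word, ()))


index = build_index(documents)
```

Both functions return the same hit list for a given word. The sequential version does work proportional to the total size of all documents on every query, while the indexed version pays that cost once when the index is built.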
Likewise, many of today's business application suites (like Microsoft
Office95, Corel Office, Lotus SmartSuite) offer rudimentary searching capabilities.
Microsoft and Lotus employ similar techniques: both allow searching,
but the searching is performed in a sequential manner, similar to Windows95.
Corel Office, specifically WordPerfect, offers a "QuickFinder"
technology, which allows indexes to be built for searching. On closer examination,
however, QuickFinder is not designed to serve as a network-based
search engine for hundreds or thousands of users.
Access Across the LAN and WAN
Most document management systems are
designed to provide seamless search capabilities across LANs and WANs. This
significantly impacts searching for files that may reside on any volume
or even across multiple servers in a LAN
or WAN environment. Employing client/server database technologies introduces
a layer of abstraction that separates the profile information from the actual
file. This minimizes the impact on network bandwidth by only passing the
profile database requests and responses across the network. When a request
is made from the client, it is processed locally at each respective server
process. The request results will then be delivered back to the client.
Once the profile is identified, the file can be retrieved easily, thus minimizing
excess network traffic. Most EDMS products take care of all security handshaking
and file transfers when searching for files in a multiserver environment.
For example, consider a law firm with five offices around the country that
uses a frame relay network for WAN connectivity. By using DOCS Open, users
in the New York office can search effortlessly for all documents created
by "Aaron Jacob" with the words "this was a rush job"
in the actual text of the document. Once the search is executed, a "client"
request is sent to the other four sites and each site, in turn, provides
a "server" response that indicates potential matches. The files
appearing on the hit list can be located in any of the five sites. Once
the appropriate document is identified, the user simply selects the file
and DOCS Open delivers the document to the user. If the selected file is
in Florida, for example, DOCS Open transparently logs the user into the
Florida office, transfers the file, disconnects from the Florida server
and then updates all logs to indicate that the file is in use by a user
in New York.
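The request/response pattern in this example can be sketched in Python as follows. Everything here is hypothetical: the SiteServer class, the Hit record and the sample documents are stand-ins invented to show the shape of the exchange, not the DOCS Open protocol. For simplicity the sketch queries every site in the list, including the local one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A small hit-list entry; the document itself stays at its site."""
    site: str
    doc_id: int
    title: str


class SiteServer:
    """Stands in for the EDMS server process at one office (hypothetical)."""

    def __init__(self, site, profiles):
        self.site = site
        self.profiles = profiles  # list of dicts: doc_id, title, author, text

    def search(self, author, phrase):
        # The search runs locally at the site; only hit records travel back.
        return [
            Hit(self.site, p["doc_id"], p["title"])
            for p in self.profiles
            if p["author"] == author and phrase in p["text"]
        ]


def federated_search(servers, author, phrase):
    """Send the same request to every site and merge the responses."""
    hits = []
    for server in servers:
        hits.extend(server.search(author, phrase))
    return sorted(hits, key=lambda hit: (hit.site, hit.doc_id))


new_york = SiteServer("New York", [
    {"doc_id": 1, "title": "Closing memo", "author": "Aaron Jacob",
     "text": "Note that this was a rush job."},
])
florida = SiteServer("Florida", [
    {"doc_id": 7, "title": "Lease draft", "author": "Aaron Jacob",
     "text": "Sorry, this was a rush job and needs review."},
    {"doc_id": 8, "title": "Invoice", "author": "Justin Marshall",
     "text": "this was a rush job"},
])

hits = federated_search([new_york, florida], "Aaron Jacob", "this was a rush job")
```

Only the hit records cross the network during the search. The document is transferred later, and only for the one hit the user selects.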
Network/Workstation File Synchronization
Windows95 introduced a feature
called the Briefcase, which enables files to be easily synchronized when
residing in two locations. It's ideal if you need to continuously transfer
files from a network volume to a laptop. Specific files can be dragged into
the Briefcase, and when changes are made, a synchronization process can
be executed to ensure consistency between those two files. The Briefcase
synchronization process updates the older file with the newer version of
the file. This can be bidirectional, so files on the network could update
the laptop and vice versa.
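The "newer file replaces older file" rule can be sketched in a few lines of Python. This illustrates the rule as described above and is not how the Briefcase is implemented; the synchronize function and the file names are assumptions made for the example.

```python
import os
import shutil
import tempfile


def synchronize(path_a, path_b):
    """Copy the newer file over the older one.

    Returns the path that was updated, or None if both copies already
    carry the same modification time.
    """
    mtime_a = os.path.getmtime(path_a)
    mtime_b = os.path.getmtime(path_b)
    if mtime_a == mtime_b:
        return None
    newer, older = (path_a, path_b) if mtime_a > mtime_b else (path_b, path_a)
    shutil.copy2(newer, older)  # copy2 also carries over the modification time
    return older


with tempfile.TemporaryDirectory() as folder:
    network_copy = os.path.join(folder, "network.txt")
    laptop_copy = os.path.join(folder, "laptop.txt")
    with open(network_copy, "w") as handle:
        handle.write("original text")
    with open(laptop_copy, "w") as handle:
        handle.write("edited on the laptop")
    # Set explicit timestamps so the laptop copy is clearly the newer one.
    os.utime(network_copy, (1000, 1000))
    os.utime(laptop_copy, (2000, 2000))

    updated = synchronize(network_copy, laptop_copy)
    with open(network_copy) as handle:
        network_text = handle.read()
    second_pass = synchronize(network_copy, laptop_copy)
```

The rule works in either direction, since whichever copy is newer wins. Note that it has no record of who changed what, which is the gap that the check-out approach discussed next is meant to close.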
In comparison, EDMS products introduce a different method, which allows
users to search for files and then check them out. The EDMS database automatically
updates to indicate who checked the file out and when it will be returned.
This is a very important feature from a network administration perspective,
since it provides complete control regarding the removal of files along
with a detailed audit trail. With check-in/check-out functionality in place,
other users are prevented from updating the file while it is checked out.
SoftSolutions has named a portion of this process "Portable Mode,"
while DOCS Open calls its process "Mobile Mode." When returning
the file, a synchronization process is initiated, updating the file and
then updating the EDMS databases.
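A minimal sketch of check-in/check-out bookkeeping, written in Python, shows how the lock and the audit trail fit together. The DocumentRegistry class, the user names and the document number are hypothetical; a real EDMS keeps this information in its database rather than in memory.

```python
from datetime import date


class CheckoutError(Exception):
    """Raised when a check-out or check-in request cannot be honored."""


class DocumentRegistry:
    def __init__(self):
        self._checked_out = {}  # doc_id -> (user, expected return date)
        self.audit_trail = []   # (action, doc_id, user) in order of occurrence

    def check_out(self, doc_id, user, due):
        if doc_id in self._checked_out:
            holder, _ = self._checked_out[doc_id]
            raise CheckoutError(f"document {doc_id} is checked out by {holder}")
        self._checked_out[doc_id] = (user, due)
        self.audit_trail.append(("check-out", doc_id, user))

    def check_in(self, doc_id, user):
        holder, _ = self._checked_out.get(doc_id, (None, None))
        if holder != user:
            raise CheckoutError(f"document {doc_id} is not checked out by {user}")
        del self._checked_out[doc_id]
        self.audit_trail.append(("check-in", doc_id, user))


registry = DocumentRegistry()
registry.check_out(1042, "ajacob", date(1996, 7, 15))

# A second user is refused while the document is out.
try:
    registry.check_out(1042, "jmarshall", date(1996, 7, 20))
    blocked = False
except CheckoutError:
    blocked = True

registry.check_in(1042, "ajacob")
```

The second request is refused while the first user holds the document, and every successful action is appended to the audit trail.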
Closely related to synchronization is "version control." In many
cases, it is important to maintain all versions of a file as it is revised.
Windows95 and most business applications don't provide this functionality.
In comparison, most EDMS products offer extensive version control capabilities.
This allows a true audit trail and revision sequence to be maintained for a document.
Resource Utilization
As with all other technologies, EDMS products
consume resources. This can be broken down into two categories--the client
and the server. The client portion of most EDMS products generally can consume
10 MB to 25 MB of disk capacity. Additionally, RAM will be impacted, since
each respective EDMS product consumes a reasonable amount of memory. The
server processes may be able to run on a server that is not dedicated as
an EDMS database server. However, implementations that exceed approximately
50 users may find that performance degrades considerably as the EDMS processes
are fully exercised. The recommended solution for this problem normally
consists of the introduction of dedicated database servers. Additionally,
full-text indexing machines must be taken into account for overall cost.
EDMS vendors have only recently incorporated the full-text searching processes
into their primary database platforms.
Russ Edelman is vice president of systems services at ICM. He can be
reached at redelman@icmus.com.
Updated July 8, 1996