The terms used in Document Management and Workflow Automation systems can sometimes be confusing. Click on a topic in the list to get a quick briefing. We’ve also included a glossary of common industry terms below.
Annotation and Markup Features allow you to add comments to an electronic document in much the same way that you would use highlighters or Post-it notes to draw attention to specific areas of a printed document.
Aperture Card is a standard Hollerith encoded IBM-style punch card that acts as a transport for a 35mm transparency. Typically, aperture cards are used to store blueprints and engineering drawings.
Aperture Card Scanner is a type of scanner that allows aperture cards to be converted into electronic documents.
APRP (Adaptive Pattern Recognition Processing) is one of the most sophisticated technologies currently available in modern text retrieval software. APRP automatically indexes the binary patterns in digital information, creating a pattern-based memory that is optimized for the content of the data. It eliminates the costly labor of manually defining keywords and of sorting and labeling information in database fields. APRP has a high tolerance for input data errors, eliminating the need for OCR cleanup.
CD-ROM (Compact Disc Read-Only Memory) is a circular disk used to store large amounts of electronic data. CD-ROMs can hold up to 680 MB of computer data. The media is low cost and durable, and in large-scale applications can be inexpensively duplicated into thousands of copies. Unlike writable optical disks, which can be written to many times, a CD-ROM is read-only.
Client-Server Based System is a system that stores electronic documents on one computer (the server) while making those documents available to other computers (the clients) via a network.
COD (Computer Originated Document) refers to any document that was originally created on a computer, like a word processing document or a spreadsheet.
COLD (Computer Output to Laser Disk) software allows you to transfer documents from expensive mainframe storage onto an inexpensive, long-term optical disk storage system.
Collection refers to two or more electronic documents containing related information that have been grouped together to facilitate retrieval.
Compression is a process that reduces the number of bytes required to define a document in order to save disk space or transmission time. Compression is achieved by replacing commonly occurring sequences of pixels with shorter codes. Some compression methods, like JPEG, discard some data and seek only to preserve the appearance of the image. Others, like Group-IV, preserve all of the original information.
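As a rough illustration of replacing repeated pixel sequences with shorter codes, the sketch below uses run-length encoding, a simple lossless technique. It is not the Group-IV or JPEG algorithm, and the function names and sample scanline are hypothetical.

```python
def rle_encode(pixels):
    """Collapse runs of identical pixel values into [value, run_length] pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original pixel list."""
    pixels = []
    for value, length in runs:
        pixels.extend([value] * length)
    return pixels


# One hypothetical scanline of a scanned page: 0 = white, 1 = black.
scanline = [0] * 12 + [1] * 3 + [0] * 9
encoded = rle_encode(scanline)
print(encoded)  # [[0, 12], [1, 3], [0, 9]]
```

Because decoding the three pairs reproduces all 24 pixels exactly, this scheme preserves all of the original information, in the same sense that the entry describes for lossless methods.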
Cross-Platform software enables you to share information between computers running different operating systems, such as Macintosh and Windows workstations.
Database is an organized collection of information stored on a computer. With Optix, a database is an organized collection of electronic documents stored on a computer. The database is structured to facilitate the search and retrieval of information contained in the database.
Database Fields are placeholders for discrete bits of information in a database. For example, your last name would be typed into a field for that purpose. The grouped contents of several fields together form a record.
Database Publishing enables you to publish a select group of documents from a large-scale document database to laptops and CD-ROMs, allowing you to create miniature, portable databases.
Database Query Screen is a computer generated form which allows you to search for information contained in the fields of a database. By entering information in pre-defined text fields, you instruct the computer to search the database for documents which contain that information. Some document management systems allow you to customize the query screens to accept information that is applicable to the database you wish to search.
Database Record is a collection of the contents of a related group of database fields.
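The relationship between fields and a record can be sketched in a few lines. The field names and values here are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, asdict


@dataclass
class DocumentRecord:
    # Each attribute plays the role of one database field.
    last_name: str
    document_type: str
    date_archived: str


# The grouped contents of the fields form a single record.
record = DocumentRecord(
    last_name="Smith",
    document_type="Invoice",
    date_archived="1998-03-14",
)
print(asdict(record))
```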
Digital Documents are documents that are stored on a computer. The documents may have been created on a computer, as with word-processing files and spreadsheets, or they may have been converted into digital documents by means of document imaging. Digital documents are also referred to as electronic documents.
Document is a broadly used term that refers to word-processing files, email messages, spreadsheets, database tables, faxes, business forms, images, or any other collection of organized data. Documents are also referred to as ‘records.’
Document Imaging is the process by which print and film documents are fed into a scanner and converted into electronic documents. During the scanning process, documents can be OCRed and indexed to ensure quick retrieval at a later date.
Document Management Systems enable you to store documents electronically. This facilitates the process of retrieving, sharing, tracking, revising, and distributing documents and the information they contain. A complete Electronic Document Management System (EDMS) provides you with all of the software and hardware required to ensure that you maintain control over all your documents, both scanned images and files that were created on a computer, such as spreadsheets, word processing documents, and graphics. A complete EDMS includes document imaging, OCR, text retrieval, workflow, and Computer Output to Laser Disk capabilities.
Document Retrieval is the process by which you can search and ‘retrieve’ an archived document from a database. This is done by entering information in a database query screen.
EDMS is an acronym for Electronic Document Management System.
Electronic Documents are documents that are stored on a computer. The documents may have been created on a computer, as with word-processing files and spreadsheets, or they may have been converted into digital documents by means of document imaging. Electronic documents are also referred to as digital documents.
File Management refers to the methods by which an operating system organizes and manages files. Since the early 1970s, this has consisted primarily of the use of symbolic folders and filenames.
Full-Text Retrieval is a capability that enables you to search for documents stored in a database based on the text contained in the documents. It can be used in conjunction with index-based searching, which relies on a description of the document entered by a scan operator.
Graphical Route Developer Tools enable you to easily create and modify workflow routes by letting you ‘draw’ a workflow route on the screen, in much the same way you would draw a picture with a computerized drawing program. In effect, you draw a map of how you want documents to flow through your organization.
Group-IV is a compression method designed by CCITT for use with Group IV fax machines. This method is optimized for compressing scanned text.
Hyperlinks allow you to ‘link’ any document stored in a database with any other document. You can link a spreadsheet to an image, a database to a graphic, or a word processing file to a site on the World Wide Web. You can then navigate from one related document to another simply by clicking on the hyperlinks.
Index refers to the information contained in an electronic document that enables you to retrieve it from a database. The index can include physical location information (e.g., where the document is stored) and document identification information (e.g., date archived, creator, and contents).
JPEG (Joint Photographic Experts Group) is a standard image compression mechanism. JPEG compression is “lossy,” meaning that the compression scheme sacrifices some image quality in exchange for a reduction in the file’s size.
Life Cycle refers to the period of time between when a document is archived and when it is destroyed.
Magnetic Disk is digital media that uses magnetic particles to store data. Both hard disks and floppy disks are magnetic disks.
Microfilm/Microfiche Scanner is a type of scanner that converts microfilm or microfiche documents into electronic documents.
Network refers to two or more computers that have been linked together to enable them to communicate with each other, exchange information, and share resources.
OCR (Optical Character Recognition) refers to the process by which scanned images are electronically “read” to convert them into editable text. This conversion is performed after scanning, and may output formatted text or text-only files (flat ASCII files). Text generated by OCR is often input into text search databases, allowing retrieval of the original scanned image based on its content.
Optical Disks use tiny optically reflective particles to store data. A laser is used to read the reflective bits and to write data. Unlike a CD-ROM, which is read-only, most optical disk systems are writable.
Optical Disk Jukebox is a piece of hardware that stores multiple optical disks and provides rapid access to them.
Patch Card is a document that contains scanner and indexing instructions in the form of a barcode. Patch Cards can be inserted at specific points in a ‘scan batch’ where you desire new scanner or indexing settings to begin or end. Patch cards can instruct document imaging software to store a document in a specific database, assign the document an incremental sequence number, assign a job name, or record the scan date of a document. Patch cards are also capable of adjusting scanner settings and performing image enhancement operations such as ‘deskew,’ ‘rotate,’ and ‘despeckle’.
RAID (Redundant Array of Inexpensive Disks) is a storage technique that enables you to obtain increased storage reliability and performance by writing data to a connected series of disks referred to as a logical volume. Data reliability is achieved with error correction techniques or data duplication. Disk performance is improved by transferring data to a set of disks in parallel; this technique is known as ‘data striping.’
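The idea behind data striping can be sketched as follows. This is a simplified illustration only: it models disks as in-memory lists, does no redundancy or error correction, and the function names and chunk size are assumptions made for the example.

```python
def stripe(data: bytes, disk_count: int, chunk_size: int):
    """Distribute data across disks in round-robin chunks (striping only)."""
    disks = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % disk_count].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks):
    """Reassemble the data by reading chunks back in round-robin order."""
    chunks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJ"
disks = stripe(data, disk_count=3, chunk_size=2)
print(disks)  # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF']]
```

In a real array the chunks on different disks can be read or written at the same time, which is where the performance gain comes from. With striping alone, losing any one disk loses part of every file, which is why the entry pairs it with error correction or duplication.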
Record Retention Schedule is a form that details the categories of records an organization is required to store. It outlines the length of time different categories of records should be stored, and when they can be deleted.
Resolution refers to the ‘image-sharpness’ of a document, usually measured in dots (or pixels) per inch (dpi). Documents can be scanned at various resolutions depending on your particular needs. The higher the resolution of a document, the greater the image-sharpness, and the larger the file size will be. Resolution also refers to the image-sharpness that printers and monitors are capable of reproducing.
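The link between resolution and file size can be made concrete with a little arithmetic. The sketch below estimates the uncompressed size of a scanned page; the function name is hypothetical, and real image files will differ because of compression, headers, and row padding.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate raw image size: pixels across x pixels down x bits per pixel."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return (pixels * bits_per_pixel + 7) // 8  # round up to whole bytes


# A letter-size page (8.5 x 11 inches) scanned in black and white (1 bit per pixel).
print(uncompressed_size_bytes(8.5, 11, 300, 1))  # 1051875
print(uncompressed_size_bytes(8.5, 11, 600, 1))  # 4207500
```

Doubling the resolution from 300 to 600 dpi doubles the pixel count in both directions, so the raw size grows by a factor of four.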
Retention Period is the length of time documents must be stored and maintained to satisfy business or legal requirements.
Scalable refers to the ability to enlarge or reduce the size of an image. A document management system is said to be ‘scalable’ if its capabilities can be increased to support additional users or platforms.
Scan Batch is a collection of documents that are fed into a scanner for the purpose of being converted into digital or electronic documents.
Scanner Interface Board is a piece of hardware that enables software programs to communicate with various models of scanners.
Scriptable and Recordable Software enables you to automate repetitive computer tasks. You can instruct a ‘script’ to open one program, carry out a task, close that program, open a new program, carry out a new task, and so on until the project is completed. Or, you can ‘record’ a series of steps as you perform them, and save those steps as a single script.
Semantic Network Technology is an underlying technology of sophisticated text retrieval software. It offers you a built-in ‘dictionary’ of 400,000 word meanings and over 1.6 million word relationships. It recognizes phrases like ‘real estate’ and ‘kangaroo court’ as single units of meaning, not individual words. It also recognizes words with multiple meanings such as ‘concrete’. To choose the meaning appropriate for your query, you simply click on the meaning you intend. Semantic Network Technology helps to ensure that you find the documents you are looking for quickly and easily.
SQL (Structured Query Language) is a database access language that originated on mainframes and minicomputers, and which is now popular on PCs.
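A minimal sketch of an SQL query, run here through Python's built-in sqlite3 module against an in-memory database. The table name, column names, and sample rows are hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, last_name TEXT, doc_type TEXT)"
)
conn.executemany(
    "INSERT INTO documents (last_name, doc_type) VALUES (?, ?)",
    [("Smith", "Invoice"), ("Jones", "Claim"), ("Smith", "Claim")],
)

# Retrieve every document whose last_name field matches the search value.
rows = conn.execute(
    "SELECT id, doc_type FROM documents WHERE last_name = ? ORDER BY id",
    ("Smith",),
).fetchall()
print(rows)  # [(1, 'Invoice'), (3, 'Claim')]
conn.close()
```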
Text Retrieval Software enables you to retrieve electronic documents by entering ‘key’ words in a search field. Documents containing the text you entered are retrieved from the text library and presented to you in a list ranked by relevance. However, because the search is based on the content of each document, a document that does not contain the word(s) used in your search will be missed. Using more generic search terms produces more hits, but finding the one document you need in a list of thousands is impractical.
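A keyword search with relevance ranking can be sketched as below. It scores each document by how often the search terms occur, which is only one simple way to rank results and is an assumption of this example; the file names and text are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(library, query):
    """Return (name, score) pairs for matching documents, best match first."""
    terms = tokenize(query)
    results = []
    for name, text in library.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


library = {
    "memo.txt": "The insurance claim was approved. The claim is closed.",
    "letter.txt": "Please review the attached claim form.",
    "notes.txt": "Customer asked about a reimbursement request.",
}
print(search(library, "claim"))  # [('memo.txt', 2), ('letter.txt', 1)]
```

Note that notes.txt is never returned even though it may concern the same matter, because it does not contain the word that was searched for.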
TIFF (Tagged Image File Format) is an industry standard file format developed for the purpose of storing high-resolution bit-mapped, gray-scale, and color images.
TWAIN is a scanning interface standard developed to address the need for consistent, easy integration of scanners with document imaging programs. Software programs that are written to support the TWAIN standard are capable of controlling any TWAIN compliant scanner.
Workflow Software allows businesses to move electronic documents along a user-defined ‘routing’ path, from one workstation to the next, around a local or wide-area network. Once a document arrives at a workstation, the receiver can add notations to it or modify it as needed. An insurance company might use workflow software to route claim forms through its organization. A user at one step might review the forms and add a new document to the electronic ‘package’ before sending it to the next workstation. The next user might add several notations to the forms before sending the package on to the final workstation for approval. The route can be as simple or as complex as a business process requires.
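The insurance example can be sketched as a route of workstation steps applied in order. This is an illustration of the concept only; the class, step names, and file names are hypothetical and do not reflect any particular workflow product.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    documents: list
    notes: list = field(default_factory=list)
    history: list = field(default_factory=list)


def review(package):
    # The first user adds a new document to the package.
    package.documents.append("adjuster_report.pdf")


def annotate(package):
    # The next user adds a notation.
    package.notes.append("Verified policy number")


def approve(package):
    # The final workstation records the approval.
    package.notes.append("Approved")


# The user-defined route: an ordered list of (workstation, action) pairs.
ROUTE = [("review", review), ("annotate", annotate), ("approve", approve)]


def run_route(package, route):
    """Pass the package through each workstation in order."""
    for station, action in route:
        action(package)
        package.history.append(station)
    return package


package = run_route(Package(documents=["claim_form.tif"]), ROUTE)
print(package.history)  # ['review', 'annotate', 'approve']
```

Making the route longer or shorter is a matter of editing the ROUTE list, which mirrors the point that a route can be as simple or as complex as the process requires.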