Wednesday, December 1, 1999

The need for automated classification


By Bain McKay

Over the centuries, classification systems have been used to simplify large sets of data so that processing can be done at the group level rather than at the element level. These systems offer a more productive way of dealing with large volumes of data. There are many types of classification systems, such as job classification for compensation purposes, the Dewey Decimal System for libraries, and class hierarchies in object-oriented programming.

Over the years, the development and management of classification systems have become a science, in which sets of data are classified into successive groups, or hierarchies, by their likeness or affinity. Simply put, classification is much like a multi-level sort: it produces a hierarchy of groups of records with similar characteristics, sorted on attribute data. [Of course, choosing the appropriate classification and categories can often be an art needing a human touch. --DG]
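
The "multi-level sort" comparison above can be made concrete with a short Python sketch. It is only an illustration: the records, the field names (dept, doc_type, title) and the classify helper are all hypothetical, not part of any product mentioned in this article.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records; the field names are illustrative assumptions.
records = [
    {"dept": "Finance", "doc_type": "Invoice", "title": "Q3 supplier invoice"},
    {"dept": "Legal", "doc_type": "Contract", "title": "Office lease"},
    {"dept": "Finance", "doc_type": "Budget", "title": "FY2000 budget"},
    {"dept": "Finance", "doc_type": "Invoice", "title": "Q4 supplier invoice"},
]


def classify(records, keys):
    """Sort on the first key, group equal values, then recurse on the rest."""
    if not keys:
        # Leaf level: just list the record titles in this group.
        return [r["title"] for r in records]
    key = itemgetter(keys[0])
    ordered = sorted(records, key=key)  # groupby needs sorted input
    return {
        value: classify(list(group), keys[1:])
        for value, group in groupby(ordered, key=key)
    }


hierarchy = classify(records, ["dept", "doc_type"])
print(hierarchy)
```

Each key adds one level to the hierarchy, so the two-key call groups first by department and then by document type within each department.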

Classification systems make it easier to understand and process large volumes of data by assigning sets of rules to classification groups. In effect, classification methods are used to leverage what you know by applying your knowledge to multiples (groups) of records rather than to each individual record. "Multiples of records" represent the leverage factor that is central to productivity enhancement -- doing more with less.
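
As a rough sketch of that leverage factor, the Python below attaches one set of rules to each classification group and lets every record in the group inherit it. The group names, rule fields and documents are invented for illustration.

```python
# Hypothetical rule sets, written once per classification group.
rules_by_group = {
    "Invoice": {"owner": "Accounts Payable", "access": "finance-only"},
    "Contract": {"owner": "Legal", "access": "restricted"},
}

# Hypothetical documents, each already assigned to a group.
documents = [
    ("Q3 supplier invoice", "Invoice"),
    ("Q4 supplier invoice", "Invoice"),
    ("Office lease", "Contract"),
    ("Travel expense invoice", "Invoice"),
]


def apply_group_rules(documents, rules_by_group):
    """Give every document the rules of its group; no per-document rules."""
    return {title: rules_by_group[group] for title, group in documents}


handling = apply_group_rules(documents, rules_by_group)

# Leverage factor in this toy example: documents covered per rule set written.
leverage = len(documents) / len(rules_by_group)
print(handling["Office lease"], leverage)
```

Here two rule sets cover four documents. The ratio grows as more records fall into each group, which is the "doing more with less" effect described above.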

A way to tame "infoglut"

Today, with the world suffering from "infoglut" (i.e., too much information), more and more attention is being paid to classification systems. They are seen as a way to compartmentalize and simplify the massive volumes of information coming at us daily from the exponential growth in computing and Internet use, making that information easier to manage and control.

Document management systems like Lotus Domino.Doc, and back-end systems like DOCS Open from PC DOCS, have become popular ways to manage the large volumes of electronic documents, e-mail and components used across corporations to conduct business today. They have greatly improved our ability to organize and find information among the millions of documents required to run the modern organization. However, the sheer volume of documents has placed a significant strain on our manual records management and classification systems, which are no longer able to keep up.

Records classification systems are used to ensure that records are retained and disposed of in a managed and orderly fashion, in keeping with current law and the needs of the organization. In many cases, companies want to dispose of paper documents to save space and reduce costs. The same used to hold true for electronic documents, but the plummeting cost of electronic storage has made the need to dispose of old electronic documents less of an issue. Even so, corporations must take care that they don't throw out documents that, by law, must be retained. If they do, they could find themselves in trouble for destroying legal evidence, and the penalties are severe.
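
A retention rule of this kind can be sketched in a few lines of Python. The categories and retention periods below are made-up placeholders, not legal guidance, and the may_dispose helper is hypothetical. The one design choice worth noting is that an unclassified record is always retained.

```python
from datetime import date

# Illustrative retention periods in years; real values depend on the law
# and on the organization, and are NOT taken from this article.
RETENTION_YEARS = {"Invoice": 7, "Memo": 2}


def may_dispose(category, created, today):
    """Return True only if the category's retention period has fully elapsed."""
    years = RETENTION_YEARS.get(category)
    if years is None:
        return False  # unclassified record: retain rather than risk destroying it
    try:
        cutoff = created.replace(year=created.year + years)
    except ValueError:
        # created was Feb 29 and the target year is not a leap year
        cutoff = created.replace(year=created.year + years, day=28)
    return today >= cutoff


today = date(1999, 12, 1)
print(may_dispose("Memo", date(1997, 1, 15), today))     # True
print(may_dispose("Invoice", date(1997, 1, 15), today))  # False
print(may_dispose("Unknown", date(1990, 1, 1), today))   # False
```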