

Member perspectives on Semio

November, 2001

This article summarizes responses to a recent member query about "real world" experiences with the Semio auto-categorization program. Most people we interviewed are using Semio as a "back-end" tool to bring a semblance of order to large document repositories. In other words, the software performs or facilitates tasks that are traditionally handled at the front end, during the editing process, before a work is published. Examples of such tasks are extracting significant terms from documents, grouping the terms into a subject hierarchy, and assigning terms to documents. Because of the way corporate intranets have developed, companies have under-invested in front-end editorial disciplines and infrastructures. But the piper must be paid, and auto-categorization programs are one way to bring some order to the ensuing chaos.
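
To make the three tasks concrete, here is a minimal sketch in Python. It is an illustration only and does not represent how Semio works internally: the stopword list, the `HIERARCHY` table, and the function names `extract_terms` and `categorize` are all hypothetical, and raw word frequency stands in for whatever term-extraction method a real product uses.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with"}

# Hypothetical hand-built subject hierarchy: top category -> terms under it.
HIERARCHY = {
    "Finance": {"budget", "invoice", "revenue"},
    "Engineering": {"server", "software", "network"},
}


def extract_terms(text, top_n=3):
    """Task 1: return the top_n most frequent non-stopword words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]


def categorize(text, top_n=3):
    """Tasks 2 and 3: map extracted terms to categories, then tag the document."""
    terms = set(extract_terms(text, top_n))
    # A document is assigned every category that shares at least one term with it.
    return sorted(cat for cat, cat_terms in HIERARCHY.items() if terms & cat_terms)


if __name__ == "__main__":
    doc = "server server server budget budget invoice"
    print(extract_terms(doc, top_n=2))
    print(categorize(doc, top_n=2))
```

In this sketch the hierarchy is written by hand, which is where editorial effort enters: someone has to decide the top categories and which terms belong under each.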

Most of the respondents are using Semio with corporate intranets. A key issue is compatibility with other software, especially search engines. (For our experience in integrating metadata with the Inktomi search engine, see "The best of both worlds.") Another major concern is "hidden" costs, such as the staff time required to create an initial list of top categories and to "tweak" the program parameters for greater accuracy. The article includes member comments on cost, compatibility, and accuracy.

Created on December 10, 2001 | Updated on May 28, 2004