Piloting Semio classification software
One of the participants in this month's "Introduction to Business Taxonomies" course and roundtable reported on a pilot classification project conducted by her company. The goal of the pilot was to find out whether the Semio and The Brain programs would be a cost-effective way to improve navigation on the corporate intranet. This article summarizes her findings.
The importance of testing
The test is an investment. The vendor will probably charge $10,000 to $20,000 for use of the software during the pilot period and for customizing it to work with your content. You also need to factor in the cost of selecting content, editing the program's output, and testing the taxonomy with users. If you decide to implement the vendor's solution, the pilot gives you a head start. If you decide not to buy, you have avoided spending several hundred thousand dollars on a system that won't meet your needs.
The company could not find any vendor that could perform all three functions, so it settled on a combination of Semio for categorization and The Brain for visualization. (Semio and The Brain are strategic partners.) The combination requires the client to supply a taxonomy that broadly describes its content.
1. Identify high level categories and "concepts." The client developed a top level taxonomy that described the content selected for the pilot. A "concept" is a noun phrase -- e.g. "constraint management" -- that describes a segment of a category -- e.g. "Information management."
2. Crawl data and extract noun phrases ("concepts"). The Semio program examined ("crawled") the test documents and identified important phrases.
3. Assign noun phrases to documents (preliminary). The Semio program assigned the phrases identified in Step 2 to the test documents.
4. Apply the client's taxonomy. The Semio program associated terms in the client's taxonomy with noun phrases extracted from the test documents.
5. Edit the taxonomy. Using statistics provided by the Semio program, the client's staff reviewed and edited the program's output.
6. Rebuild maps and reapply taxonomies. Using the edited categories, the Semio program reconstructed its taxonomy and re-assigned terms to documents.
7. Create the visual taxonomy. Text from the Semio-created taxonomy was imported into The Brain for display.
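To make the flow of steps 2 through 4 more concrete, here is a minimal sketch in Python. It is not Semio's algorithm, which the article does not describe. It assumes a crude stand-in for noun-phrase extraction (adjacent word pairs containing no stopwords) and a hypothetical rule that a phrase counts as a "concept" only if it appears in at least two documents. The function names, the stopword list, and the sample documents are all invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny stopword list, used only to discard obviously weak phrases.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with", "by"}


def extract_phrases(text):
    """Return adjacent word pairs with no stopwords (a rough proxy for noun phrases)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [f"{a} {b}" for a, b in zip(words, words[1:])
            if a not in STOPWORDS and b not in STOPWORDS]


def crawl(documents, min_docs=2):
    """Steps 2-3: extract phrases, keep those found in at least min_docs documents,
    and record which of the kept phrases each document contains."""
    per_doc = {doc_id: set(extract_phrases(text)) for doc_id, text in documents.items()}
    doc_freq = Counter(p for phrases in per_doc.values() for p in phrases)
    concepts = {p for p, n in doc_freq.items() if n >= min_docs}
    return {doc_id: phrases & concepts for doc_id, phrases in per_doc.items()}


def apply_taxonomy(doc_concepts, taxonomy):
    """Step 4: file a document under a category when it contains one of that category's concepts."""
    result = defaultdict(set)
    for doc_id, phrases in doc_concepts.items():
        for category, concepts in taxonomy.items():
            if phrases & set(concepts):
                result[category].add(doc_id)
    return dict(result)


# Invented sample content and a one-category client taxonomy.
documents = {
    "doc1": "Constraint management is part of information management.",
    "doc2": "A guide to constraint management for project teams.",
    "doc3": "Travel expense policy for staff.",
}
taxonomy = {"Information management": ["constraint management"]}

doc_concepts = crawl(documents)
print(apply_taxonomy(doc_concepts, taxonomy))
```

In this sketch, steps 5 and 6 correspond to a person editing the taxonomy dictionary (for example, adding or moving a concept) and calling apply_taxonomy again. Documents that match no category, such as doc3 here, are the ones an editor would need to review by hand.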
Pilot content and staff
An information professional and a technical expert selected the content, created the top level categories, edited the Semio output, designed the user tests, and analyzed the test results.
Testing and results
For five of the nine tasks, the Semio/Brain visual map yielded the best (shortest) times. The average time that testers took to complete a task ranged from a low of 1.5 minutes to a high of 11 minutes. Seventy percent of the testers found it difficult to locate information using the existing intranet navigation tools. Semio did not place every document in the right category, so a full implementation after the pilot would require additional time to refine the taxonomy.
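A tally such as "best times on five of nine tasks" can be produced from per-task averages with a few lines of Python. The sketch below assumes the averages are stored as a dictionary keyed by task and then by navigation method. The function name, the method labels, and the numbers are placeholders invented for illustration; they are not the pilot's measurements.

```python
def fastest_method_counts(times):
    """Count, for each method, the tasks on which it had the shortest average time.
    A tie credits every tied method."""
    counts = {}
    for by_method in times.values():
        best = min(by_method.values())
        for method, minutes in by_method.items():
            counts.setdefault(method, 0)
            if minutes == best:
                counts[method] += 1
    return counts


# Placeholder averages in minutes; NOT data from the pilot.
example = {
    "task_1": {"method_a": 2.0, "method_b": 3.5},
    "task_2": {"method_a": 4.0, "method_b": 1.5},
    "task_3": {"method_a": 6.0, "method_b": 6.5},
}
print(fastest_method_counts(example))
```

Counting wins per task is a coarse measure, so it is worth reading alongside the time spread for each task.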
Strengths and weaknesses
On the other hand, the pilot identified the following weaknesses:
2. Significant time is required up front to define the domains (information spaces) to be "crawled" and to develop a top level taxonomy.
3. Developing a taxonomy for a large or diverse information space usually works best when the work is divided into several taxonomies which may later be combined.
Created on May 23, 2001 | Updated on November 19, 2012