Best of both worlds: September, 2001

What if we could combine:
What if we could do it by leveraging our existing content infrastructure using a customized search engine and a meta tag program? This article describes a recent pilot project designed to reduce the hidden costs of Internet information retrieval by customizing the search function and integrating it with a browse function.

Pilot objectives

Google allows web sites to use its search free of charge, but webmasters can't customize it. And, because it's not installed on our server, it can only find documents on our public site. Documents on the password-protected Society site are excluded from the search results.

Now go to the Index and browse the A - Z list to find "best practices." Select "B" at the top of the page, then click on "best practices" in the left frame. A list of documents for that term will be displayed in the right frame.

The Index is powered by a relational database (Knowledge Base), also used in our teaching Lab. When you click on a term, you're using the database's built-in search engine and special web pages that have been designed to display database values.

Database search vs. advanced search

The importance of complete and accurate metadata

This kind of descriptive information, called "metadata," consists of two parts -- a label and a value. In a database, metadata resides in fields. In a web page or Microsoft Office document, metadata resides in meta tags. The example below illustrates how some of the field values in the Knowledge Base record for the "Knowledge Transfer" article were incorporated into the corresponding web page as meta tags during our pilot project.

Best of both worlds

Browsing, using an A - Z index or list of topics, solves these problems by presenting a list of choices within a circumscribed, standardized universe. Because good indexes require human input, they are considered to be expensive to create and maintain.
But, if you consider the amount of time a good index can save as well as the efficiencies possible with categorization technology, the cost becomes more reasonable.

How do you get the best of both worlds -- searching and browsing -- without inconveniencing users and making a large investment in software and personnel? You do it in the same way that you build a house. Instead of private wells, septic tanks, and generators for each dwelling, you connect to public utilities. Instead of building windows from scratch, you install pre-fabricated modules. And you use technology to reduce costs and improve quality.

Taxonomy: an information "public utility"

We began building our taxonomy in 1998 because we needed to save time in research and publishing. By the summer of 2001, we had created Knowledge Base records for all the articles in the Montague Institute Review. Each article record in the Knowledge Base includes descriptive attributes (e.g. author, title, publication date) along with links to taxonomy records (e.g. vocabulary terms and categories). Each taxonomy record includes a definition (if available) along with links to broader terms, narrower terms, related terms, and articles.

Enhancing content creation
By incorporating much of the indexing work into the content creation process and using pre-defined categories and terms from the taxonomy, we reduced the cost of maintaining the index browse function and set the stage for customizing the search engine.

Using technology

But first we had to get the metadata (certain field labels and field values) from our Knowledge Base into each HTML page on our Web site. To make this easier, we tested a software tool called the Watchfire Metadata Management System, previously known as Metabot (another option is HiSoftware's Metadata Server). Metabot works like a spreadsheet, allowing you to add metadata tags and make global changes to the metadata values in all the documents on your web site. An example of a global value would be "publisher" (the "publisher" meta tag value would be "Montague Institute" for all our documents). It also speeds up the process of adding metadata values manually. An example would be the Knowledge Base ID number, which is unique to each article.

Benefits

Now try the Ultraseek browse feature. Click on the "performance improvement" link (between the search box and the list of matching documents). Visitors can see related categories and definitions without having to access the index in a separate operation. On the same screen, they can get a list of all documents containing the search term ("performance improvement") as well as a list of articles recommended by our editors (even though the recommended documents might not contain the actual phrase "performance improvement").

Because the Ultraseek search engine can access metadata in the documents, it displays a more meaningful document summary that includes a short description and the date of publication. But better document summaries aren't the only benefit of Ultraseek's ability to read meta tags. It also improves the advanced search feature. Go to the Ultraseek Search and click on "Advanced Search."
Search the body for the word "taxonomy" and select dates between today's date and 1 January 2001. You should get about fifteen documents -- all published in 2001. Click on the "Sort by date" link, and Ultraseek will sort the results by publication date.

Reducing the hidden costs of information retrieval

Leveraging the human investment

Scaling up

Scaling up the model to serve the entire company means replicating the local system, each specialized collection supported by its own Knowledge Base, customized search engine, and taxonomy. Integrating all of them would be an information "public utility" consisting of a corporate thesaurus and a search engine, configured to search all the collections.

Other alternatives

Conclusion

The return on investment was relatively high because we could:
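
For readers who want to see the meta tag step from "Using technology" in concrete form, here is a minimal Python sketch of turning label/value pairs from a database record into HTML meta tags, with a site-wide "publisher" value applied to every page. It is not how Metabot or the Knowledge Base works internally. The function name `build_meta_tags`, the field names `kb_id` and `description`, and the sample values are assumptions made for illustration only.

```python
from html import escape

# Site-wide values applied to every page. The article's example of a global
# value is a "publisher" tag set to "Montague Institute" for all documents.
GLOBAL_VALUES = {"publisher": "Montague Institute"}


def build_meta_tags(record, global_values=GLOBAL_VALUES):
    """Return one HTML <meta> tag per label/value pair.

    `record` is a plain dict standing in for a Knowledge Base record.
    The field names are illustrative, not the real database schema.
    """
    # Per-document values override global values when labels collide.
    merged = {**global_values, **record}
    tags = []
    for label, value in merged.items():
        if value is None or value == "":
            continue  # skip empty fields rather than emit blank tags
        tags.append(
            '<meta name="{}" content="{}">'.format(
                escape(str(label), quote=True),
                escape(str(value), quote=True),
            )
        )
    return tags


# Hypothetical record: the ID and description are placeholders, not real data.
sample_record = {
    "title": "Knowledge Transfer",
    "kb_id": "0000",
    "description": "Placeholder description",
}

if __name__ == "__main__":
    for tag in build_meta_tags(sample_record):
        print(tag)
```

Keeping the global values separate from the per-article record mirrors the distinction the article draws between values that are the same for every document and values, such as the Knowledge Base ID number, that are unique to each one.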