In this occasional column, Montague Institute Founder Jean Graef comments on one or more of the Digest articles. See also other POV articles.

Where will semantic content come from?

June, 2009

Until recently, information about the Semantic Web has been either visionary and theoretical or highly technical. Now, with the appearance of applications such as Google's "rich snippets," Yahoo's SearchMonkey, and Reuters' OpenCalais, the future is tantalizingly close (see De facto standards for semantic search?). It's time to face two issues that have received little attention: sources of semantic content and the trustworthiness of those sources. By "semantic content" I mean the meaning of words, the metadata that describes them, and the relationships among them.

In my own case, a recent mortgage refinance application illustrates the problem of semantic content, and a recent New York Times article raises questions about trust.

A question of meaning
I live in a condominium managed by a Homeowners Association (HOA) and recently decided to take advantage of low interest rates by refinancing my mortgage. Since I used the same bank that issued the original mortgage, both the loan officer and I expected my application to sail through with no problem. What we didn't count on was that the same economic conditions that generated the low rates also gave rise to more regulations. After three months, the mortgage officer told me that my application was denied because the Homeowners Association did not have a "Fidelity Bond" — apparently a new Fannie Mae rule that went into effect in January 2009.

Now I was at the epicenter of confusion that involved five entities: the local bank that originated the refi application, its regional processing center in another state, the HOA, the HOA's insurance agent, and Fannie Mae (which issued the new reg). Eventually, the insurance agent confirmed what I had learned through a Google search: that a "Fidelity Bond" was the same thing as "Employee Dishonesty" coverage. Both compensate the HOA if one of its Board members or employees embezzles money or commits fraud. At my suggestion, the agent simply reissued the insurance certificate required by the bank, substituting the phrase "Fidelity Bond" for "Employee Dishonesty," for which we did have coverage.

Aside from the time spent by each of the parties, the incident stretched a process that should have taken 6 weeks to almost 6 months. Even worse, it could have meant that the refi was no longer financially viable, since interest rates have since risen. Who's to blame? Fannie Mae, for not issuing regulations that used both banking and insurance terminology? The mortgage lender and the insurance agent, for not knowing that the two phrases were synonymous?

Assigning blame might be emotionally satisfying and politically expedient, but it doesn't solve the problem. What if the semantic infrastructure were in place to recognize when a concept (e.g. "Fidelity Bond") would impact other domains (e.g. banking, insurance, HOAs), suggest synonyms or related concepts, and alert someone to revise the documentation? But who would pay to create and update such a system? In our decentralized economy, resources are usually forthcoming only if the major players (e.g. banks and insurance companies) are convinced that something will be good for their bottom line, or if an entrepreneur sees an opportunity to make money marketing a new semantic service. It's unlikely that one condo owner's refi problem is going to have an impact.
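To make the idea concrete, here is a minimal sketch of the lookup such an infrastructure would need at its core: a cross-domain thesaurus that, given a term from one domain, returns what other domains call the same thing. The class name, the concept identifier, and the assignment of "Fidelity Bond" to banking and "Employee Dishonesty" to insurance are assumptions made for illustration; a real system would more likely be built on shared vocabularies than on a hand-written dictionary.

```python
class CrossDomainThesaurus:
    """Hypothetical lookup table: one concept, one label per domain."""

    def __init__(self):
        # Maps a lowercased term to the {domain: label} dict of its concept.
        self._labels_by_term = {}

    def add_concept(self, concept_id, labels):
        """Register a concept; `labels` maps a domain name to its term."""
        # concept_id is kept only for readability in this sketch.
        for term in labels.values():
            self._labels_by_term[term.lower()] = dict(labels)

    def equivalents(self, term):
        """Return {domain: label} for other domains' names for `term`."""
        labels = self._labels_by_term.get(term.lower(), {})
        return {
            domain: label
            for domain, label in labels.items()
            if label.lower() != term.lower()
        }


thesaurus = CrossDomainThesaurus()
thesaurus.add_concept(
    "insider-theft-coverage",  # illustrative identifier, not a standard one
    {"banking": "Fidelity Bond", "insurance": "Employee Dishonesty"},
)

# A regulation drafted in banking terms could be checked before release:
print(thesaurus.equivalents("Fidelity Bond"))
# prints {'insurance': 'Employee Dishonesty'}
```

The code is the easy part. Someone still has to decide that the two phrases mean the same thing, record that decision, and keep it current, which is exactly the content and funding question raised above.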

What's trustworthy?
Our capitalistic economy has some built-in checks and balances that help to ensure the reliability of published content, particularly if the parties involved are evenly matched and newsworthy. So, it's likely that a semantic infrastructure created through a joint effort of banks, insurance companies, and government agencies will be fairly trustworthy. But in other domains, who will replace print publishers as arbiters of the truth when their ad-supported business model is disintegrating? Now that Google and Yahoo are using semantic standards such as RDFa and microformats to link directly to reviews, events, products, and people, how do we know that the information displayed is trustworthy?

One experiment that bears watching is the Associated Press's recent announcement that it will publish content created by nonprofit organizations (see A.P. in Deal to Deliver Nonprofits’ Journalism). At first blush, this looks like a good deal for everyone, giving newspapers an economical way to supplement their shrinking editorial resources and nonprofits access to a broader audience.

But nonprofits exist because people with money want to advance an agenda. Sometimes their agenda is socially responsible, but often it's a way for a special interest group to accomplish objectives at the expense of the public good. The A.P. deal will work for information consumers only if there is full disclosure about the contributing nonprofits and their backers, and only if readers have easy access to multiple points of view. Since word games are a key tool of interest groups (think "climate change" vs. "global warming"), a multi-domain thesaurus is essential. Semantic Web technology provides the method, but who will provide the mandate and the funding to make it happen?

As long as the Semantic Web was only a gleam in the computer scientist's eye, we didn't need to think about content and trust. Now that de facto standards are in use and viable applications are emerging, it's time to face the music. The current economic slump is not only accelerating the demise of those institutions that provide content oversight and create metadata, but is also giving rise to new regulations that, without a semantic infrastructure, may only throw sand in the gears of commerce. The technology piece of the infrastructure is almost ready; it's the semantic content and editorial oversight that are missing.

For original articles by Jean Graef, see the Montague Institute Review.

Created on June 15, 2009 | Updated on June 16, 2009

© Copyright Jean Graef 2004 - 2006. All rights reserved.