Right now, text in computing is represented as a series of numbers, most commonly those assigned by the Unicode standard. Each number signifies a particular character, and computers can scan these codes very quickly. So when you enter a search term, the machine has no idea what those letters signify. It simply looks for the pattern -- it has no inkling of the concept behind the pattern.
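To make the point concrete, here is a minimal Python sketch of text as numbers and of plain pattern search. The sample documents and the `literal_search` helper are invented for illustration; they are not taken from the article.

```python
# A search term is stored as a sequence of numbers (Unicode code points).
term = "invoice"
code_points = [ord(ch) for ch in term]

# Hypothetical sample documents, invented for this sketch.
documents = [
    "Please send the invoice by Friday.",
    "The bill for copywriting is attached.",
]

def literal_search(term, docs):
    """Return the documents that contain the exact character pattern."""
    return [d for d in docs if term in d]

matches = literal_search(term, documents)

print(code_points)
print(matches)
```

The second document is about the same concept (a bill), but it is not returned, because the search compares character patterns only.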
But in semantic search, every bit of information is defined by potentially dozens of meaningful concepts. When a copywriter invoices for his or her work, for example, the date could be defined in terms of calendar, invoice, billing period, and so on. All these definitions for one piece of information are called 'metadata', or information about information.
Collections of agreed metadata terms for a particular field or task, like medicine or accounting, are called ontologies.
So the computer not only searches for the term, it searches for related metadata that defines types of information in specific ways. In reality, the computer still does not 'understand' a concept in its semantic search -- it continues to look for patterns of letters. But because the concepts behind the search terms are included, it can return results based on concepts as well as text patterns.
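The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the MESH platform's method: the `Record` type, the `ONTOLOGY` mapping, the concept labels and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    text: str
    concepts: set = field(default_factory=set)  # metadata attached to the text

# Hypothetical ontology: maps a search term to agreed concept labels.
ONTOLOGY = {
    "invoice": {"billing"},
    "bill": {"billing"},
    "calendar": {"date"},
}

# Hypothetical records, each tagged with concept metadata.
RECORDS = [
    Record("Invoice dated 20 November 2009", {"billing", "date"}),
    Record("Statement of charges for copywriting", {"billing"}),
    Record("Notes on film editing", {"video"}),
]

def semantic_search(term, records, ontology):
    """Match on the text pattern or on shared concept metadata."""
    term = term.lower()
    wanted = ontology.get(term, set())
    hits = []
    for record in records:
        text_match = term in record.text.lower()
        concept_match = bool(wanted & record.concepts)
        if text_match or concept_match:
            hits.append(record.text)
    return hits

results = semantic_search("invoice", RECORDS, ONTOLOGY)
print(results)
```

The second record never mentions the word "invoice", yet it is returned because it carries the same concept label. The code is still only comparing strings; the apparent understanding comes entirely from the metadata.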
Friday, November 20, 2009
Computers Search For Meaning
European researchers have developed the first semantic search platform that integrates text, video and audio. "The system can 'watch' films, 'listen' to audio and 'read' text to find relevant responses to semantic search terms." The MESH project "represents an emerging paradigm shift in search technology," according to a ScienceDaily article titled "Listen, Watch, Read: Computers Search for Meaning."