Monday, October 11, 2010

Machine Learning Project at Carnegie Mellon


The New York Times published a story titled “Aiming to Learn as We Do, a Machine Teaches Itself.” The article focuses on the Never-Ending Language Learning system (NELL) at Carnegie Mellon University, and it is an interesting glimpse into how far we have come in developing learning systems. Excerpts from the article follow.
With NELL, the researchers built a base of knowledge, seeding each kind of category or relation with 10 to 15 examples that are true. In the category for emotions, for example: “Anger is an emotion.” “Bliss is an emotion.” And about a dozen more.

Then NELL gets to work. Its tools include programs that extract and classify text phrases from the Web, programs that look for patterns and correlations, and programs that learn rules. For example, when the computer system reads the phrase “Pikes Peak,” it studies the structure — two words, each beginning with a capital letter, and the last word is Peak. That structure alone might make it probable that Pikes Peak is a mountain. But NELL also reads in several ways. It will mine for text phrases that surround Pikes Peak and similar noun phrases repeatedly. For example, “I climbed XXX.”
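To make the two kinds of evidence in that passage concrete, here is a minimal Python sketch. It is not NELL's code; the function names `shape_features` and `context_patterns` are made up for illustration, and the real system is far more elaborate. It only shows what "studying the structure" of a phrase and "mining the surrounding text" might look like in the simplest case.

```python
def shape_features(phrase):
    """Surface features of a noun phrase (hypothetical helper, not NELL's API)."""
    words = phrase.split()
    return {
        "num_words": len(words),
        # True when every word starts with a capital letter, as in "Pikes Peak".
        "all_capitalized": all(w[:1].isupper() for w in words),
        "last_word": words[-1].lower() if words else "",
    }


def context_patterns(phrase, sentences):
    """Turn each sentence that mentions the phrase into a pattern such as 'I climbed XXX.'"""
    return [s.replace(phrase, "XXX") for s in sentences if phrase in s]


if __name__ == "__main__":
    print(shape_features("Pikes Peak"))
    print(context_patterns("Pikes Peak", ["I climbed Pikes Peak.", "I ate lunch."]))
```

A classifier could then treat a last word of "peak" and a context like "I climbed XXX." as two independent hints that the phrase names a mountain.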

A helping hand from humans, occasionally, will be part of the answer. For the first six months, NELL ran unassisted. But the research team noticed that while it did well with most categories and relations, its accuracy on about one-fourth of them trailed well behind. Starting in June, the researchers began scanning each category and relation for about five minutes every two weeks. When they find blatant errors, they label and correct them, putting NELL’s learning engine back on track.

When Dr. Mitchell scanned the “baked goods” category recently, he noticed a clear pattern. NELL was at first quite accurate, easily identifying all kinds of pies, breads, cakes and cookies as baked goods. But things went awry after NELL’s noun-phrase classifier decided “Internet cookies” was a baked good. (Its database related to baked goods or the Internet apparently lacked the knowledge to correct the mistake.)

NELL had read the sentence “I deleted my Internet cookies.” So when it read “I deleted my files,” it decided “files” was probably a baked good, too. “It started this whole avalanche of mistakes,” Dr. Mitchell said. He corrected the Internet cookies error and restarted NELL’s bakery education.
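The avalanche described above can be reproduced with a toy bootstrapping loop. This is only a sketch under simplifying assumptions: the function `bootstrap`, the `blocked` parameter (standing in for a human correction), and the four example sentences are all invented for illustration and do not reflect how NELL is actually implemented.

```python
# Made-up example sentences; the last two echo the ones quoted in the article.
SENTENCES = [
    "I baked pies.",
    "I baked cookies.",
    "I deleted my Internet cookies.",
    "I deleted my files.",
]


def bootstrap(seeds, sentences, blocked=frozenset(), rounds=3):
    """Toy loop: known members -> context patterns -> new members."""
    members = set(seeds) - set(blocked)
    for _ in range(rounds):
        # Step 1: learn a pattern from every sentence that mentions a known member.
        patterns = {s.replace(m, "XXX") for s in sentences for m in members if m in s}
        # Step 2: anything that fills the XXX slot of a learned pattern becomes a candidate.
        found = set()
        for s in sentences:
            for p in patterns:
                prefix, _, suffix = p.partition("XXX")
                fits = s.startswith(prefix) and s.endswith(suffix)
                if fits and len(s) > len(prefix) + len(suffix):
                    candidate = s[len(prefix):len(s) - len(suffix)]
                    if candidate not in blocked:
                        found.add(candidate)
        if found <= members:
            break  # nothing new was learned this round
        members |= found
    return members


if __name__ == "__main__":
    # One bad entry teaches the pattern "I deleted my XXX." and pulls in "files".
    print(sorted(bootstrap({"pies", "Internet cookies"}, SENTENCES)))
    # Blocking the bad entry, as a human reviewer might, keeps the category clean.
    print(sorted(bootstrap({"pies", "Internet cookies"}, SENTENCES,
                           blocked={"Internet cookies"})))
```

With the bad entry in place, "files" ends up labeled a baked good; with it blocked, the loop only learns "cookies" from the baking sentences.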

His ideal, Dr. Mitchell said, was a computer system that could learn continuously with no need for human assistance. “We’re not there yet,” he said. “But you and I don’t learn in isolation either.”
