Re: Something altogether different?

To: "Murali Mani" <mmani@c...>
Subject: Re: Something altogether different?
From: "Ken North" <kennorth@s...>
Date: Mon, 25 Apr 2005 20:23:01 -0700
Cc: "'XML Developers List'" <xml-dev@l...>
References: <15725CF6AFE2F34DB8A5B4770B7334EE07206E84@h...> <a06020400be92acb49acf@[192.168.1.101]> <001a01c549d6$09c6ce20$1601a8c0@DURANTE> <Pine.LNX.4.58.0504251642300.3409@c...>

Murali Mani wrote:

> One disadvantage of term-based weighting, or the vector space model, is the
> well-known example cited in Google's original paper (or rather, sales
> pitch??):
>
> A document containing only the words "Bill Clinton [expletive deleted]" was
> considered more important for the query "Bill Clinton" than the actual
> White House page (when Clinton was president).
>
> I believe we can use the vector space model only when the document collection
> is "homogeneous" in some manner and has repetitive words, etc.
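
To make the effect described above concrete, here is a minimal sketch in Python, assuming raw term counts and cosine similarity as the ranking function. Both document texts are made up for illustration (the expletive is replaced by a placeholder word), and this is not a claim about how any particular engine scored pages.

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a raw term-frequency vector from whitespace-separated words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = tf_vector("bill clinton")

# Hypothetical three-word page: the query terms plus one placeholder word.
short_doc = tf_vector("bill clinton rant")

# Hypothetical stand-in for a longer official page that mentions the name once.
official_doc = tf_vector(
    "welcome to the white house the official home page of "
    "president bill clinton and the first family"
)

short_score = cosine(query, short_doc)
official_score = cosine(query, official_doc)

print(f"short page:    {short_score:.3f}")
print(f"official page: {official_score:.3f}")
```

Under these assumptions the three-word page scores higher, because length normalization penalizes every extra term on the longer page that does not appear in the query.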

Google is apparently looking at a noun clustering scheme.
http://news.zdnet.com/2100-9588_22-5605127.html?tag=nl.e539

Norvig highlighted a research paper written by a Google employee last year
regarding a classification engine the company is testing. The technology can
parse a proper noun or compound nouns into several categories in order to
deliver clustered results, for example. For a query on "ATM," or asynchronous
transfer mode, the engine would be able to use the terms "such as" on Web pages
indexed with the term to discover that it can be linked to the expression
"high-speed networks." As a result, a search for high-speed networks might pull
up a cluster on ATM.
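
The "such as" idea in the excerpt above can be sketched roughly as follows. This is only an illustration of the general pattern, not the method from the research paper: the regular expression, the sample sentences, and the names `SUCH_AS`, `terms_for`, and `categories_for` are all assumptions made up for this example. It naively takes the one or two words before "such as" as the category, where a real system would need proper noun-phrase detection.

```python
import re
from collections import defaultdict

# Naive assumption: the category is the one or two words before "such as",
# and the term is the single word right after it.
SUCH_AS = re.compile(
    r"\b((?:[\w-]+\s+)?[\w-]+),?\s+such\s+as\s+([\w-]+)",
    re.IGNORECASE,
)

# Hypothetical page snippets, invented for illustration.
pages = [
    "Carriers deploy high-speed networks such as ATM for backbone traffic.",
    "Banks operate cash machines such as ATM terminals in lobbies.",
]

terms_for = defaultdict(set)       # category -> terms seen after "such as"
categories_for = defaultdict(set)  # term -> categories seen before "such as"

for page in pages:
    for match in SUCH_AS.finditer(page):
        category = match.group(1).lower()
        term = match.group(2)
        terms_for[category].add(term)
        categories_for[term].add(category)

# A search for the category can now surface a cluster for the term.
print(sorted(terms_for["high-speed networks"]))
# An ambiguous term ends up linked to more than one category.
print(sorted(categories_for["ATM"]))
```

With these two snippets, "ATM" ends up linked to both categories, which is the kind of information that clustered results would need.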
