Re: Something altogether different?

Subject: Re: Something altogether different?
From: Murali Mani <mmani@c...>
Date: Mon, 25 Apr 2005 16:48:08 -0400 (EDT)
Cc: "'XML Developers List'" <xml-dev@l...>
In-reply-to: <001a01c549d6$09c6ce20$1601a8c0@DURANTE>
References: <15725CF6AFE2F34DB8A5B4770B7334EE07206E84@h...><a06020400be92acb49acf@[192.168.1.101]> <001a01c549d6$09c6ce20$1601a8c0@DURANTE>


One disadvantage of term-based weighting, or the vector space model, is the
well-known example cited in Google's original paper (or rather, sales
pitch?):

A document containing only the words "Bill Clinton [expletive deleted]" was
considered more important for the query "Bill Clinton" than the actual
White House page (when Clinton was president).
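The effect can be reproduced with a minimal sketch, assuming raw term counts as weights and cosine similarity as the score. The function name and both sample documents are made up for illustration; they are not taken from the paper. The short page wins because length normalization penalizes every extra word on the longer page.

```python
import math
from collections import Counter


def cosine_similarity(query_tokens, doc_tokens):
    """Cosine of the angle between two raw term-count vectors."""
    q = Counter(query_tokens)
    d = Counter(doc_tokens)
    # Only terms present in the query can contribute to the dot product.
    dot = sum(q[t] * d[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


# Hypothetical documents, invented for this example.
query = "bill clinton".split()
short_page = "bill clinton rant".split()
official_page = (
    "the white house office of president bill clinton "
    "welcome to the official site of the president"
).split()

short_score = cosine_similarity(query, short_page)
official_score = cosine_similarity(query, official_page)
print(short_score > official_score)
```

Both pages contain each query word exactly once, so the only difference in score comes from the document norm in the denominator.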

I believe we can use the vector space model only when the document
collection is "homogeneous" in some manner and has repeated words, etc.

Also note: with the vector space model, you have to compute the rank of
documents in real time, given a query.

For other metrics, such as PageRank, the rank of documents can be
pre-computed, and we can use better algorithms based on this property.
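As a rough illustration of the pre-computation point, here is a sketch of PageRank by power iteration over a tiny link graph, followed by a query-time step that only sorts by the stored scores. The graph, the damping factor of 0.85, the iteration count, and all names are assumptions chosen for the example, not details from the thread.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of outlinks."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "whitehouse", none to "rant".
LINKS = {
    "whitehouse": ["news"],
    "news": ["whitehouse"],
    "blog": ["whitehouse"],
    "rant": ["whitehouse"],
}

# Offline step: computed once, independent of any query.
PRECOMPUTED = pagerank(LINKS)


def rank_matches(matching_pages):
    """Query-time step: order pages that matched the query by stored rank."""
    return sorted(matching_pages, key=lambda p: PRECOMPUTED[p], reverse=True)


print(rank_matches(["rant", "whitehouse"]))
```

The expensive iteration runs once per crawl; each query then costs only a lookup and a sort over the matching pages.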

best, murali.

Follow-Ups:
- Re: Something altogether different?
  - From: "Ken North" <kennorth@s...>

References:
- RE: Something altogether different?
  - From: "Bullard, Claude L (Len)" <len.bullard@i...>
- RE: Something altogether different?
  - From: "Steven J. DeRose" <sderose@a...>
- Re: Something altogether different?
  - From: "Ken North" <kennorth@s...>
