

Rick Marshall wrote:

> actually modern compressors don't know about the representation and 
> don't much mind. they actually work on the entropy of the message and 
> the message as a bit stream - i.e. they don't know there are tags, ascii 
> data, binary data, schema etc. there's not room to go into it here but 
> they will compress a message fairly consistently based on the entropy of 
> the message, not the representation. some algorithms are marginally 
> better than others (bzip2 vs gzip, e.g.), but seem to give proportionally 
> similar results.


There have been a number of compressors designed specifically for XML, 
though, that take advantage of knowledge of the structure of XML 
documents to gain some benefit in compression compared to generic 
algorithms. AT&T Labs' XMill is one example:

http://www.research.att.com/sw/tools/xmill/
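
To give a rough feel for what "taking advantage of the structure" can mean, 
here is a minimal sketch of the general idea of separating the tag 
structure from the text values and compressing values of the same tag 
together. It is an illustration only, not XMill's actual format or 
algorithm: the function names, the "#structure" key, and the sample 
document are all made up for this example, and the split as written is 
not reversible (it drops attributes, nesting, and end tags).

```python
import xml.etree.ElementTree as ET
import zlib
from collections import defaultdict


def split_into_containers(xml_text):
    """Separate tag structure from text values, grouping values by tag name.

    Hypothetical helper for illustration; not XMill's real container format.
    """
    root = ET.fromstring(xml_text)
    structure = []                   # tag names in document order
    containers = defaultdict(list)   # tag name -> text values under that tag
    for elem in root.iter():
        structure.append(elem.tag)
        if elem.text and elem.text.strip():
            containers[elem.tag].append(elem.text.strip())
    return structure, dict(containers)


def compress_containers(structure, containers):
    """Deflate the structure stream and each value container independently."""
    streams = {"#structure": "\n".join(structure)}
    streams.update({tag: "\n".join(vals) for tag, vals in containers.items()})
    return {name: zlib.compress(text.encode("utf-8"))
            for name, text in streams.items()}


# Made-up sample document.
doc = ("<log><e><ip>10.0.0.1</ip><code>200</code></e>"
       "<e><ip>10.0.0.2</ip><code>404</code></e></log>")
structure, containers = split_into_containers(doc)
packed = compress_containers(structure, containers)
```

The intuition is that similar values (all the IP addresses, all the status 
codes) sit next to each other in their own stream, which can help a generic 
compressor. Whether that pays off on a given document is something you 
would have to measure.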

Personally, though, I doubt even the best of these justify the added 
cost and complexity compared to a generic algorithm like deflate.
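
If you want to see how the generic algorithms behave on your own data, 
something like the following is enough to get started. It is only a 
sketch: the sample document is synthetic and made up for this example 
(build_sample_xml and compressed_sizes are hypothetical helper names), so 
the numbers it prints say nothing about real-world XML. Substitute your 
own files before drawing conclusions.

```python
import bz2
import zlib


def build_sample_xml(records=500):
    """Build a small, repetitive XML document (synthetic, illustrative data)."""
    rows = [
        f'<item id="{i}"><name>widget-{i % 7}</name>'
        f"<price>{i * 3 % 100}.00</price></item>"
        for i in range(records)
    ]
    return ("<catalog>" + "".join(rows) + "</catalog>").encode("utf-8")


def compressed_sizes(data):
    """Return the byte counts for the raw input, deflate (zlib), and bzip2."""
    return {
        "original": len(data),
        "deflate": len(zlib.compress(data, 9)),
        "bzip2": len(bz2.compress(data, 9)),
    }


sample = build_sample_xml()
sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name:>8}: {size} bytes")
```

Both zlib and bz2 ship with the Python standard library, so there is 
nothing extra to install.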

-- 
Elliotte Rusty Harold  elharo@m...
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN%3D0596007647/cafeaulaitA/ref%3Dnosim
