
  • From: Michael Sokolov <msokolov@s...>
  • To: "Sheila M. Morrissey" <Sheila.Morrissey@i...>, "xml-dev@l..." <xml-dev@l...>
  • Date: Tue, 29 Apr 2014 16:59:41 -0400

Having worked on a number of dictionary products (including the current OED online, v.3, which is now delivered as XML; the previous one was SGML), I can say that each of them has had a completely idiosyncratic data (tagging) structure.  OED, in particular, encodes a great deal of information that isn't represented in most dictionaries.  And I don't think this is due to sheer perversity, or not-invented-here syndrome.  It's just that different dictionaries encode quite different information.  Are you interested in bilingual dictionaries? Etymology? Examples in current usage? Historical usage? Linking to other entries via a thesaurus? And so on.

I would advise a simple scheme that aims for a greatest common divisor: don't plan to model the full complexity of any real existing dictionary in your first version.
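
To make that concrete, here is a rough sketch of what a "greatest common divisor" entry might look like, with a few lines of Python to read it. The element names (dictionary, entry, headword, pos, sense, def) are made up for illustration and are not taken from the OED or any published schema; the point is only that a first version can get by with a headword, a part of speech, and a flat list of senses.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal vocabulary: these element names are illustrative
# assumptions, not drawn from the OED or any existing dictionary schema.
SAMPLE = """\
<dictionary>
  <entry id="e1">
    <headword>set</headword>
    <pos>verb</pos>
    <sense n="1"><def>To put in a specified place.</def></sense>
    <sense n="2"><def>To fix or establish.</def></sense>
  </entry>
</dictionary>
"""

def parse_entries(xml_text):
    """Return one plain dict per <entry> element."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("entry"):
        entries.append({
            "id": entry.get("id"),
            "headword": entry.findtext("headword"),
            "pos": entry.findtext("pos"),
            # Each sense carries a single definition in this sketch.
            "senses": [s.findtext("def") for s in entry.findall("sense")],
        })
    return entries

entries = parse_entries(SAMPLE)
```

Etymology, quotations, cross-references and the like could be added later as optional children of entry or sense, once you know which of them you actually need.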

-Mike


On 04/29/2014 04:07 PM, Sheila M. Morrissey wrote:

Any recommendations/warnings for an XML vocabulary or an ontology for dictionaries (such as OED, or Webster's)?

 

Thanks,

Sheila

 

Sheila M. Morrissey

Senior Researcher

ITHAKA

100 Campus Drive

Suite 100

Princeton NJ 08540

609-986-2221   

sheila.morrissey@i...

 

ITHAKA (www.ithaka.org) is a not-for-profit organization that helps the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways.  We provide innovative services that benefit higher education, including Ithaka S+R, JSTOR, and Portico.

 



