
  • From: Andrew Welch <andrew.j.welch@g...>
  • To: Helena Galhardas <helena.galhardas@i...>
  • Date: Mon, 6 Feb 2012 14:02:33 +0000

> In order to test this library exhaustively, we need to have XML data sets
> that have data quality problems known a priori.
> By data quality problems, we mean: missing values, misspellings, synonyms,
> values out of domain, approximate duplicates, etc.


Government data:  http://data.gov.uk/data

I did a short contract for 'LinkedGov' a while back
(http://linkedgov.org/). Their goal is to make the data clean and
usable, so you might want to get in touch with them.



-- 
Andrew Welch
http://andrewjwelch.com

