> In order to test exhaustively this library, we need to have XML data sets
> that have data quality problems known a priori.
> By data quality problems, we mean: missing values, misspellings, synonyms,
> values out of domain, approximate duplicates, etc.

Government data: http://data.gov.uk/data

I did a short contract for 'LinkedGov' a while back (http://linkedgov.org/).
Their goal is to make the data clean and usable, so you might want to get
in touch with them.

--
Andrew Welch
http://andrewjwelch.com



