Ok, I admit I haven't thought about this in anger, but a problem
shared... etc.
I am doing a matching algorithm to match movie data from different
repositories so that I know when the repositories are referencing the same
movie even though they may hold different metadata.
It's not enough to match solely on title - one reason for that is movies
have subtitles and may go by the subtitle in a different venue.
So let's say that, thanks to xsl:key, I have in a variable $titles all the
movies that have a given title, and in a variable $actors all the movies that
a given actor featured in.
One criterion (out of many) I could use is that if the data from the
respective venues has its title and an actor in common, then they are the
same movie. That's a plain intersect between $titles and $actors, but I want
something stronger than that.
I want the data from the venues to match only if they have at least 2
actors in common.
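
To make the question concrete, here is a rough sketch of the setup and of the kind of test I think I'm after. It assumes XSLT 2.0 and a made-up input shape (movie elements with an id attribute plus title and actor children). The key names, the variable names $this, $weak and $strong, and the matches/match output elements are all invented for illustration. Here $actors holds the movies featuring any of the current movie's actors.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <movie id="..."> with <title> and <actor> children. -->
  <xsl:key name="by-title" match="movie" use="title"/>
  <xsl:key name="by-actor" match="movie" use="actor"/>

  <xsl:template match="/">
    <matches>
      <xsl:apply-templates select="//movie"/>
    </matches>
  </xsl:template>

  <xsl:template match="movie">
    <xsl:variable name="this" select="."/>
    <!-- All movies that share this movie's title. -->
    <xsl:variable name="titles" select="key('by-title', title)"/>
    <!-- All movies that feature at least one of this movie's actors. -->
    <xsl:variable name="actors" select="key('by-actor', actor)"/>

    <!-- Weak criterion: same title and at least one actor in common. -->
    <xsl:variable name="weak"
        select="($titles intersect $actors) except $this"/>

    <!-- Stronger criterion: same title and at least two distinct actors
         in common, counted per candidate movie. -->
    <xsl:variable name="strong"
        select="($titles except $this)
                [count(distinct-values(actor[. = $this/actor])) ge 2]"/>

    <match id="{@id}" weak="{$weak/@id}" strong="{$strong/@id}"/>
  </xsl:template>

</xsl:stylesheet>
```

The counting predicate drops the $actors node-set altogether and compares actor values directly, so I'm not sure it is the best way to do this once the repositories get large.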