

The following Perl script, fed the standard Unicode database file
'UnicodeData.txt' on standard input, produces a file of entity
declarations, 673922 bytes in size, that declares 13789 entities with
canonical names (the Unicode character names, with spaces replaced by
underscores).

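use strict;
use warnings;

# Reads UnicodeData.txt on STDIN; writes one <!ENTITY ...> line per character.
while (<STDIN>)
{
     # Fields are semicolon-separated: code point first, character name second.
     my @fields = split(/;/, $_);
     my $cpoint = $fields[0];
     $cpoint =~ s/^0*//;        # strip leading zeros from the hex code point
     my $name = $fields[1];
     next unless $name;         # skip lines with no name field
     next if ($name =~ /</);    # skip placeholders such as <control>
     $name =~ s/ /_/g;          # spaces are not allowed in XML names
     print "<!ENTITY $name '&#x$cpoint;'>\n";
}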
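
To make the output format concrete, here is a minimal sketch that applies the same transformation to two sample lines in the UnicodeData.txt field layout. The helper name entity_decl and the @sample array are illustrative only and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the entity
# declaration for one UnicodeData.txt line, or undef if the line is skipped.
sub entity_decl {
    my ($line) = @_;
    my @fields = split(/;/, $line);
    my $cpoint = $fields[0];
    return undef unless defined $cpoint;
    $cpoint =~ s/^0*//;              # strip leading zeros
    my $name = $fields[1];
    return undef unless $name;       # no name field
    return undef if $name =~ /</;    # placeholder such as <control>
    $name =~ s/ /_/g;                # spaces -> underscores
    return "<!ENTITY $name '&#x$cpoint;'>";
}

# Sample input lines, assumed to follow the UnicodeData.txt layout.
my @sample = (
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
);

for my $line (@sample) {
    my $decl = entity_decl($line);
    print defined $decl ? "$decl\n" : "(skipped)\n";
}
# prints:
#   <!ENTITY LATIN_CAPITAL_LETTER_A '&#x41;'>
#   (skipped)
```

The full script would typically be run along the lines of "perl entities.pl < UnicodeData.txt > unicode.ent", where both file names are placeholders chosen here for illustration.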

