
  • From: Tony Graham <tgraham@a...>
  • To: xml-dev@l...
  • Date: Wed, 10 Aug 2022 11:27:28 +0100

On 09/08/2022 21:01, Roger L Costello wrote:
> A file has no inherent format.
>
> The format of a file is determined by the programs that use it.
>
> Since file types are not determined by the file system, the "kernel"
> can't tell you the type of file: it doesn't know.

Yet the Unix 'file' [1] command has been doing a pretty good job of it
since 1973. [2]

The 'file' command first uses filesystem tests to determine whether a
file is empty or is a special file, such as a socket or a symbolic link.
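To make the filesystem step concrete, here is a minimal Python sketch of that kind of check. It is not how 'file' itself is implemented; the function name 'filesystem_test' and the returned labels are my own, chosen only for illustration.

```python
import os
import stat


def filesystem_test(path):
    """Return a label for empty or special files, or None for ordinary data.

    Hypothetical helper: the labels are illustrative, not the exact
    strings that the 'file' command prints.
    """
    st = os.lstat(path)  # lstat() does not follow symbolic links
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # regular file with content: content-based tests come next
```

Only when this returns None would a tool need to open the file and look at its bytes.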

Second, it uses 'magic' tests to detect the file type. The 'file'
manpage says:

    The magic tests are used to check for files with data in particular
    fixed formats.

The magic tests use "magic patterns" from a 'magic' file. [3]

The 'magic' file on my Linux system includes 11 patterns that start with
'<?xml'. Most are followed by further tests that try to determine the
type of XML, e.g.:

    0       string        \<?xml\ version="
    >15     string        >\0
    >>19    search/4096   \<svg             SVG Scalable Vector Graphics image
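As a rough illustration of how those three lines nest, here is a Python sketch of my reading of them. It is an approximation, not libmagic's algorithm: the function name 'magic_svg_test' is hypothetical, and I assume 'search/4096' can be approximated by looking for the string within the 4096 bytes that start at offset 19.

```python
from typing import Optional

XML_DECL = b'<?xml version="'  # 15 bytes, so the version value starts at offset 15


def magic_svg_test(data: bytes) -> Optional[str]:
    """Approximate the three nested magic lines for SVG detection."""
    # Level 0: "0 string \<?xml\ version=\"" - literal match at offset 0
    if not data.startswith(XML_DECL):
        return None
    # Level 1: ">15 string >\0" - something greater than NUL at offset 15
    if data[15:16] <= b"\0":
        return None
    # Level 2: ">>19 search/4096 \<svg" - look for "<svg" from offset 19
    # (assumption: the match must lie within the next 4096 bytes)
    if data.find(b"<svg", 19, 19 + 4096) != -1:
        return "SVG Scalable Vector Graphics image"
    return None
```

Each deeper level is tried only if the level above it matched, which is what the leading '>' characters in the magic file express.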

There's even a test for '<?XML', which is reported as 'broken XML document'.

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgraham@a...

[1] https://www.man7.org/linux/man-pages/man1/file.1.html
[2] https://www.man7.org/linux/man-pages/man1/file.1.html#HISTORY
[3] https://man7.org/linux/man-pages/man4/magic.4.html

