- From: Dimitre Novatchev <dnovatchev@g...>
- To: Michael Kay <mike@s...>
- Date: Sun, 16 Jan 2022 18:57:14 -0800
I extended the code to also measure the size of a bare XDocument, XElement and XAttribute. Here we see an even greater difference: an XAttribute is more than 5 times smaller than an XElement:

    using System;
    using System.Xml;
    using System.Xml.Linq;

    static class Program
    {
        static void Main(string[] args)
        {
            // Each reading forces a full collection, so consecutive
            // differences approximate the size of the object just created.
            var mem1 = GC.GetTotalMemory(true);
            var doc = new XmlDocument();
            var mem2 = GC.GetTotalMemory(true);
            var elem = doc.CreateElement("x");
            var mem3 = GC.GetTotalMemory(true);
            var attr = doc.CreateAttribute("y");
            var mem4 = GC.GetTotalMemory(true);
            var xDoc = new XDocument();
            var mem5 = GC.GetTotalMemory(true);
            var xElem = new XElement("x");
            var mem6 = GC.GetTotalMemory(true);
            var xAttr = new XAttribute("y", "1");
            var mem7 = GC.GetTotalMemory(true);

            Console.WriteLine($"XmlDocument: {mem2 - mem1} bytes");
            Console.WriteLine($"XmlElement: {mem3 - mem2} bytes");
            Console.WriteLine($"XmlAttribute: {mem4 - mem3} bytes");
            Console.WriteLine($"XDocument: {mem5 - mem4} bytes");
            Console.WriteLine($"XElement: {mem6 - mem5} bytes");
            Console.WriteLine($"XAttribute: {mem7 - mem6} bytes");

            // Keep the objects reachable so that an optimised build cannot
            // collect them before the readings above are taken.
            GC.KeepAlive(doc);
            GC.KeepAlive(elem);
            GC.KeepAlive(attr);
            GC.KeepAlive(xDoc);
            GC.KeepAlive(xElem);
            GC.KeepAlive(xAttr);
        }
    }
Results:

    XmlDocument: 3176 bytes
    XmlElement: 656 bytes
    XmlAttribute: 152 bytes
    XDocument: 56 bytes
    XElement: 512 bytes
    XAttribute: 96 bytes
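The before/after pattern used above can be wrapped in a small helper so that each object is measured the same way. This is only a sketch: `RetainedSize.Measure` is a hypothetical name, not part of .NET, and the first measurement of a given type may also include one-time initialisation of that type, so the printed figures will vary by runtime and build.

```csharp
using System;
using System.Xml.Linq;

static class RetainedSize
{
    // Hypothetical helper: growth in GC.GetTotalMemory(true) caused by
    // creating one object and keeping it reachable.
    public static long Measure(Func<object> factory)
    {
        long before = GC.GetTotalMemory(true);
        object instance = factory();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(instance); // object must stay alive until the second reading
        return after - before;
    }
}

static class Program
{
    static void Main()
    {
        // Warm-up calls so that one-time type initialisation is not counted.
        RetainedSize.Measure(() => new XElement("warmup"));
        RetainedSize.Measure(() => new XAttribute("warmup", "1"));

        Console.WriteLine($"XElement: {RetainedSize.Measure(() => new XElement("x"))} bytes");
        Console.WriteLine($"XAttribute: {RetainedSize.Measure(() => new XAttribute("y", "1"))} bytes");
    }
}
```

No expected output is shown because the numbers depend on the runtime version and process bitness.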
Cheers, Dimitre
On Sun, Jan 16, 2022 at 12:30 PM Michael Kay <mike@s...> wrote:

I ran a small C# program that shows the sizes of a bare (just created) XmlDocument, XmlElement and XmlAttribute. Here are the results:

an XmlDocument: 3176 bytes
an XmlElement: 656 bytes
an XmlAttribute: 152 bytes.
I hadn't realised that you could get such a high precision memory instrumentation in C#.
With SaxonCS, on the TinyTree, nodes aren't allocated as individual objects, so we need to do bulk allocation and then compute an average.
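To illustrate why per-node cost has to be averaged in such a design, here is a minimal sketch of a tree whose nodes are rows in parallel arrays rather than individual objects. The class `ArrayTree` and its three arrays are hypothetical and chosen only for illustration; they are not the actual SaxonCS TinyTree layout.

```csharp
using System;

// Hypothetical array-backed tree: node i is described by entry i of each array.
sealed class ArrayTree
{
    private byte[] kind = new byte[16];
    private int[] nameCode = new int[16];
    private int[] parent = new int[16];

    public int Count { get; private set; }

    // Payload bytes per node in this sketch: one byte plus two ints.
    public const int BytesPerNode = sizeof(byte) + sizeof(int) + sizeof(int);

    public int AddNode(byte nodeKind, int name, int parentIndex)
    {
        if (Count == kind.Length)
        {
            // Bulk allocation: all arrays double in size at once.
            int newSize = Count * 2;
            Array.Resize(ref kind, newSize);
            Array.Resize(ref nameCode, newSize);
            Array.Resize(ref parent, newSize);
        }
        kind[Count] = nodeKind;
        nameCode[Count] = name;
        parent[Count] = parentIndex;
        return Count++;
    }

    public int ParentOf(int node) => parent[node];
}

static class Program
{
    static void Main()
    {
        var tree = new ArrayTree();
        int root = tree.AddNode(1, 0, -1);      // <doc>
        for (int i = 0; i < 10000; i++)
        {
            tree.AddNode(1, 1, root);           // <a/>
        }
        Console.WriteLine($"{tree.Count} nodes, {ArrayTree.BytesPerNode} payload bytes per node");
    }
}
```

In a layout like this, memory grows in steps when the arrays are resized, not by a fixed amount per node, which is one reason to compare two large documents and divide the difference by the number of added nodes.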
I ran this test with SaxonCS:
    private void buildDocWithElements(TreeModel model, int count)
    {
        long mem = GC.GetTotalMemory(true);
        StringBuilder sb = new StringBuilder("<doc>");
        for (int i = 0; i < count; i++)
        {
            sb.Append("<a/>");
        }
        sb.Append("</doc>");
        Processor proc = new Processor();
        DocumentBuilder db = proc.NewDocumentBuilder();
        db.TreeModel = model;
        XdmNode doc = db.Build(new StringReader(sb.ToString()));
        sb = null;
        Console.WriteLine("Memory: " + model + " " + count + " elements = " + (GC.GetTotalMemory(true) - mem));
    }

    private void buildDocWithAttributes(TreeModel model, int count)
    {
        long mem = GC.GetTotalMemory(true);
        StringBuilder sb = new StringBuilder("<doc>");
        for (int i = 0; i < count; i++)
        {
            sb.Append("<a b=''/>");
        }
        sb.Append("</doc>");
        Processor proc = new Processor();
        DocumentBuilder db = proc.NewDocumentBuilder();
        db.TreeModel = model;
        XdmNode doc = db.Build(new StringReader(sb.ToString()));
        sb = null;
        Console.WriteLine("Memory: " + model + " " + count + " attributes = " + (GC.GetTotalMemory(true) - mem));
    }

    [Test]
    public void TestMemoryUsed()
    {
        buildDocWithElements(TreeModel.TinyTree, 10000);
        buildDocWithElements(TreeModel.TinyTree, 20000);
        buildDocWithAttributes(TreeModel.TinyTree, 10000);
        buildDocWithAttributes(TreeModel.TinyTree, 20000);
        buildDocWithElements(TreeModel.LinkedTree, 10000);
        buildDocWithElements(TreeModel.LinkedTree, 20000);
        buildDocWithAttributes(TreeModel.LinkedTree, 10000);
        buildDocWithAttributes(TreeModel.LinkedTree, 20000);
    }

and it produced this output:
    Memory: TinyTree 10000 elements = 800992
    Memory: TinyTree 20000 elements = 992680
    Memory: TinyTree 10000 attributes = 900744
    Memory: TinyTree 20000 attributes = 1720944
    Memory: LinkedTree 10000 elements = 2064384
    Memory: LinkedTree 20000 elements = 4072008
    Memory: LinkedTree 10000 attributes = 4198024
    Memory: LinkedTree 20000 attributes = 8316768
But note that when we add 10000 attributes we are also adding 10000 elements.
My conclusions from this:
For the TinyTree:
* the cost for an additional empty element is (992680 - 800992) / 10000 = 19 bytes
* the cost for an additional empty element plus empty attribute is (1720944 - 900744) / 10000 = 82 bytes, so the attribute is 63 bytes
For the Linked Tree:
* the cost for an additional empty element is (4072008 - 2064384) / 10000 = 200 bytes
* the cost for an additional empty element plus empty attribute is (8316768 - 4198024) / 10000 = 412 bytes, so the attribute is 212 bytes
These are close to what I would predict from the design.
Measuring empty elements and attributes is a bit artificial. If we make the values in each case be a single ASCII character the numbers change to:

    Memory: TinyTree 10000 elements = 994320
    Memory: TinyTree 20000 elements = 1379024
    Memory: TinyTree 10000 attributes = 1176816
    Memory: TinyTree 20000 attributes = 2273088
    Memory: LinkedTree 10000 elements = 3103296
    Memory: LinkedTree 20000 elements = 6148136
    Memory: LinkedTree 10000 attributes = 4478456
    Memory: LinkedTree 20000 attributes = 8868808
meaning:
For the TinyTree:
* the cost for an additional single-character element is 38 bytes
* the cost for an additional single-character attribute is 110 - 19 = 91 bytes
For the Linked Tree:
* the cost for an additional single-character element is 304 bytes
* the cost for an additional single-character attribute is 439 - 200 = 239 bytes
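The arithmetic behind these figures can be reproduced with a few lines of C#. The sketch below uses only the measurements quoted in the two output listings; `MarginalCost.PerItem` is a hypothetical helper name. In the attribute tests the element carrying the attribute is itself empty, which is why the empty-element cost is subtracted in both the empty and the single-character case. Values are printed to one decimal place, so they can differ by a byte from the whole-number figures quoted in the message.

```csharp
using System;

static class MarginalCost
{
    // Bytes per added item, from two measurements that differ by deltaCount items.
    public static double PerItem(long memSmall, long memLarge, int deltaCount)
        => (memLarge - memSmall) / (double)deltaCount;
}

static class Program
{
    static void Print(string label, double bytes)
        => Console.WriteLine($"{label}: {bytes:F1} bytes");

    static void Main()
    {
        const int delta = 10000; // 20000 items minus 10000 items

        // Empty elements and empty attributes.
        double tinyEmptyElem = MarginalCost.PerItem(800992, 992680, delta);
        double tinyEmptyBoth = MarginalCost.PerItem(900744, 1720944, delta);
        double linkedEmptyElem = MarginalCost.PerItem(2064384, 4072008, delta);
        double linkedEmptyBoth = MarginalCost.PerItem(4198024, 8316768, delta);

        // Single-character element content and attribute values.
        double tinyCharElem = MarginalCost.PerItem(994320, 1379024, delta);
        double tinyCharBoth = MarginalCost.PerItem(1176816, 2273088, delta);
        double linkedCharElem = MarginalCost.PerItem(3103296, 6148136, delta);
        double linkedCharBoth = MarginalCost.PerItem(4478456, 8868808, delta);

        Print("TinyTree empty element", tinyEmptyElem);
        Print("TinyTree empty attribute", tinyEmptyBoth - tinyEmptyElem);
        Print("TinyTree single-character element", tinyCharElem);
        Print("TinyTree single-character attribute", tinyCharBoth - tinyEmptyElem);

        Print("LinkedTree empty element", linkedEmptyElem);
        Print("LinkedTree empty attribute", linkedEmptyBoth - linkedEmptyElem);
        Print("LinkedTree single-character element", linkedCharElem);
        Print("LinkedTree single-character attribute", linkedCharBoth - linkedEmptyElem);
    }
}
```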
Note: from the design (not from measurement) the size should be independent of the length of the name, provided the same names are used repeatedly.
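One way a tree can make node size independent of name length is to store each distinct name once and keep only an integer code per node. The sketch below shows that general idea; `NamePoolSketch` is a hypothetical class written for illustration and is not the SaxonCS implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name pool: each distinct name is stored once and mapped to an int code.
sealed class NamePoolSketch
{
    private readonly Dictionary<string, int> codes = new Dictionary<string, int>();
    private readonly List<string> names = new List<string>();

    public int DistinctNames => names.Count;

    // Returns the existing code for a known name, or allocates the next code.
    public int CodeFor(string name)
    {
        if (!codes.TryGetValue(name, out int code))
        {
            code = names.Count;
            names.Add(name);
            codes.Add(name, code);
        }
        return code;
    }

    public string NameOf(int code) => names[code];
}

static class Program
{
    static void Main()
    {
        var pool = new NamePoolSketch();
        int[] nodeNames = new int[10000];   // one 4-byte code per node
        for (int i = 0; i < nodeNames.Length; i++)
        {
            nodeNames[i] = pool.CodeFor("a-rather-long-element-name");
        }
        Console.WriteLine($"{nodeNames.Length} nodes share {pool.DistinctNames} stored name(s)");
    }
}
```

With this arrangement a long name costs extra memory only once, however many nodes use it; documents with many distinct names would still grow the pool.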
Michael Kay
Saxonica