Re: I think XML tools should handle XML files up to 2^64bytes
On Tue, 2018-11-13 at 13:32 +0000, Costello, Roger L. wrote:
> Hi Folks,
>
> I think XML tools (e.g., XML parsers, schema validators, XSLT
> processors) should handle XML files up to 2^64 bytes in size.

Hmm. My own text retrieval system [1] limited files to 2^23 blocks of 64 bytes each (32 bits) -- it needed bytes rather than characters so it could seek directly to any part of the file to extract a snippet for showing results. But that was on 32-bit systems; on a 64-bit system it might be able to address 2^59 blocks.

Many other systems at the time duplicated the text internally, a strategy which lets you guarantee some sort of integrity but massively increases the index size.

Since 64 bits lets you address more storage than most people can buy, and vastly more than can be parsed linearly by most XML tools in any reasonable amount of time, it's not a useful limit and not easily testable.

> Why that number? Here's why:
>
> The number 2^64 is:

[list of magical correspondences deleted]

You could choose any number and find lots of reasons to choose it. Using 63 bits lets you use negative numbers as an offset from the end of the file.

The reason i used fewer bits for the text retrieval system was first that storing only approximate locations in files meant storing less information - a smaller index, faster to process - and secondly that it let me use some of the bits in the address for flags, again saving space in the index. Some systems store garbage collection information inside address pointers.

2^47 bytes would still permit very large files and would give 8 bits for another purpose and still not use the sign bit, allowing negative offsets.

[...]

> The total number of IPv6 addresses generally given to a single LAN or
> subnet.

There are 39 books in the Old Testament.
There are 3 * 9 = 27 books in the New Testament.
There are 2 * 7 = 14 books in the Apocrypha.
There are 1 * 4 = 4 Gospels.

So XML systems should support at least 4^39 bytes.
Liam

[1] lq-text was (is) the open source version of nx-text, a commercial package i wrote but that we never sold. Michael Sperberg-McQueen has suggested a backronym of "Liquid Text" which i shall use if i ever do another release.
https://www.holoweb.net/liam/lq-text

--
Liam Quin, https://www.holoweb.net/liam/cv/
Web slave for vintage clipart http://www.fromoldbooks.org/
Available for XML/Document/Information Architecture/
XSL/XQuery/Web/Text Processing/A11Y work & consulting.
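
The address layout discussed in the message (a byte offset limited to 2^47, 8 spare bits for flags, the sign bit left free so that negative values can mean "offset from the end of the file", and 64-byte blocks for approximate locations) might be sketched roughly as follows. The figures 47, 8 and 64 come from the message; the placement of the flag bits directly above the address and all function names are assumptions made for illustration, and this is not lq-text's actual index format.

```python
BLOCK_SIZE = 64   # bytes per block, as in the message
ADDR_BITS = 47    # 2^47 addressable bytes, as in the message
FLAG_BITS = 8     # spare bits for flags, as in the message

ADDR_MASK = (1 << ADDR_BITS) - 1
FLAG_MASK = (1 << FLAG_BITS) - 1


def pack(offset: int, flags: int) -> int:
    """Pack a non-negative byte offset and a flag byte into one integer.

    Assumed layout: flags sit directly above the address bits, so the
    result uses 55 bits and never touches a 64-bit sign bit.
    """
    if not 0 <= offset <= ADDR_MASK:
        raise ValueError("offset does not fit in ADDR_BITS")
    if not 0 <= flags <= FLAG_MASK:
        raise ValueError("flags do not fit in FLAG_BITS")
    return (flags << ADDR_BITS) | offset


def unpack(word: int) -> tuple[int, int]:
    """Return (offset, flags) from a value produced by pack()."""
    return word & ADDR_MASK, (word >> ADDR_BITS) & FLAG_MASK


def resolve(offset: int, file_size: int) -> int:
    """Turn a possibly negative offset into an absolute position.

    Negative offsets count back from the end of the file.
    """
    position = offset if offset >= 0 else file_size + offset
    if not 0 <= position <= file_size:
        raise ValueError("offset lies outside the file")
    return position


def block_number(byte_offset: int) -> int:
    """Approximate location: which 64-byte block holds this byte."""
    return byte_offset // BLOCK_SIZE


def block_start(block: int) -> int:
    """Byte position to seek to for a given block number."""
    return block * BLOCK_SIZE
```

Storing `block_number(offset)` instead of the offset itself is what keeps each index entry small: a reader seeks to `block_start(...)` and scans at most 64 bytes to find the exact match.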