
Re: Granularity

  • From: Michael Sokolov <sokolov@ifactory.com>
  • To: Len Bullard <cbullard@hiwaay.net>
  • Date: Thu, 05 Jan 2012 21:21:47 -0500

Good question!

It depends on the content.

Most human-readable texts are already broken down into conceptual units: articles, sections, chapters, entries, etc.  We try to pick a unit that will at least fill the screen with text, and then impose maximum size constraints based on the delivery channel's capacity.  Viewing is not the only thing that informs the choice, though; search often figures into it as well.  Ideally search results are 1-1 with viewable chunks; this leads to a natural, easily grasped interface and makes search implementation straightforward.
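To make the "pick a unit, then cap its size" idea concrete, here is a minimal Python sketch, not the code we actually run. It assumes a hypothetical book/chapter/section layout, a made-up limit of 200 characters, and that container elements carry no text of their own. It returns the largest elements whose text fits the limit.

```python
import xml.etree.ElementTree as ET

MAX_CHARS = 200  # hypothetical limit standing in for the delivery channel's capacity


def text_len(elem):
    """Number of text characters inside elem, including all descendants."""
    return sum(len(t) for t in elem.itertext())


def chunk(elem, max_chars=MAX_CHARS):
    """Yield the largest elements whose text fits within max_chars.

    Assumption: an oversized element holds its text in child elements,
    so descending into the children loses nothing. An oversized leaf is
    yielded as-is because there is nothing smaller to fall back to.
    """
    children = list(elem)
    if text_len(elem) <= max_chars or not children:
        yield elem
    else:
        for child in children:
            yield from chunk(child, max_chars)


doc = ET.fromstring(
    "<book>"
    "<chapter id='c1'>"
    "<section id='s1'>" + "a" * 150 + "</section>"
    "<section id='s2'>" + "b" * 150 + "</section>"
    "</chapter>"
    "<chapter id='c2'>" + "c" * 100 + "</chapter>"
    "</book>"
)
ids = [e.get("id") for e in chunk(doc, max_chars=200)]
print(ids)  # ['s1', 's2', 'c2']
```

Chapter c1 is too big, so it is served as its two sections; chapter c2 fits and stays whole. If each yielded element is also the unit you index, search results stay 1-1 with viewable chunks.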

Sometimes texts (like novels) don't have natural breaks; in these cases search is less important, reading more so, and we just paginate according to the user's viewport size.
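A rough sketch of that kind of pagination, in Python. The `chars_per_page` parameter is an assumed stand-in for whatever the real viewport measurement gives you; real pagination would measure rendered text rather than count characters.

```python
def paginate(text, chars_per_page):
    """Greedily pack whole words into pages of at most chars_per_page characters.

    A single word longer than chars_per_page gets a page to itself
    and exceeds the limit.
    """
    pages, current, length = [], [], 0
    for word in text.split():
        extra = len(word) + (1 if current else 0)  # +1 for the joining space
        if current and length + extra > chars_per_page:
            pages.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += extra
    if current:
        pages.append(" ".join(current))
    return pages


pages = paginate("one two three four five", 9)
print(pages)  # ['one two', 'three', 'four five']
```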

Other texts impose their own specific chunking requirements (enormous court documents; dictionaries where you can search entries, senses (within an entry) or quotations (within a sense)) that fight against the simple rules. In these cases we try to recast the problem in more familiar terms, sometimes chunking at multiple levels at once for search, but displaying using anchors or pagination within a larger chunk.
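As an illustration of chunking at several levels for search while displaying a larger chunk, here is a small Python sketch. The entry/sense/quote element names and the id scheme are invented for the example, and it assumes senses are direct children of an entry and quotations direct children of a sense. Every indexed unit records which entry to display and which anchor to scroll to.

```python
import xml.etree.ElementTree as ET


def norm(elem):
    """All text inside elem, with whitespace collapsed."""
    return " ".join("".join(elem.itertext()).split())


def index_units(entry):
    """Index an entry at three levels; every row points back to the entry for display."""
    entry_id = entry.get("id")
    rows = [{"level": "entry", "display": entry_id, "anchor": None, "text": norm(entry)}]
    for sense in entry.findall("sense"):
        rows.append({"level": "sense", "display": entry_id,
                     "anchor": sense.get("id"), "text": norm(sense)})
        for quote in sense.findall("quote"):
            rows.append({"level": "quote", "display": entry_id,
                         "anchor": quote.get("id"), "text": norm(quote)})
    return rows


def search(rows, term, level):
    """Return (chunk to display, anchor within it) for each hit at the given level."""
    return [(r["display"], r["anchor"])
            for r in rows if r["level"] == level and term in r["text"]]


entry = ET.fromstring(
    '<entry id="e1"><hw>bank</hw>'
    '<sense id="e1.s1"><def>edge of a river</def>'
    '<quote id="e1.s1.q1">sat on the bank</quote></sense>'
    '<sense id="e1.s2"><def>financial institution</def></sense>'
    '</entry>'
)
rows = index_units(entry)
print(search(rows, "river", "sense"))  # [('e1', 'e1.s1')]
print(search(rows, "sat", "quote"))    # [('e1', 'e1.s1.q1')]
```

The user can search at whichever level they like, but what comes back is always the entry plus an anchor, so the display side only ever deals with one kind of chunk.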

Machine to machine is, I think, informed by a different set of considerations: transaction boundaries, channel capacity again, the ability to roll back and retry, etc.  Basically it is a compromise between performance (large messages tend to perform better, up to memory limits) and robustness (small messages make a smaller crater when they fail).
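The crater-size trade-off can be sketched in a few lines of Python. Everything here is hypothetical: `send` stands for whatever transport you use, `flaky` simulates one poison record, and the retry count is arbitrary.

```python
def batches(items, size):
    """Split items into consecutive lists of at most size elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def send_all(items, size, send, retries=2):
    """Send items in batches, retrying each failed batch.

    Returns the items belonging to batches that never succeeded,
    i.e. the size of the crater.
    """
    failed = []
    for batch in batches(items, size):
        for _ in range(retries + 1):
            try:
                send(batch)
                break  # this batch went through
            except OSError:
                continue  # try the same batch again
        else:
            failed.extend(batch)  # every attempt failed
    return failed


def flaky(batch):
    """Hypothetical transport that always chokes on record 7."""
    if 7 in batch:
        raise OSError("record 7 cannot be delivered")


records = list(range(10))
lost_large = send_all(records, 5, flaky)
lost_small = send_all(records, 2, flaky)
print(lost_large)  # [5, 6, 7, 8, 9]
print(lost_small)  # [6, 7]
```

With batches of five, one bad record takes four good ones down with it; with batches of two, it takes one. The price is more round trips for the same data.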

As far as human-machine goes, it also depends to a certain extent on the software.  Word can handle much larger documents than in-browser editors, and features like autosave can mitigate the failure to save a large document, but generally speaking I'd say chunk size here is similar to the human-human piece.  I do sometimes end up poking around in 50MB XML documents in Emacs, sometimes even changing something, and it works fine, but I don't think that's a typical use case.  I find that 100MB is pretty much the limit for that sort of thing.

-Mike

On 1/5/2012 7:14 PM, Len Bullard wrote:

When building XML systems, how do you choose the best granularity for storing and retrieving fragments?

  • Machine to machine
  • Human to machine
  • Human to human

Part of the art is interpreting what branch and leaf combinations best give a role/user the most copacetic view.  How do you choose?  Does the user choose?

The proportion of XML consumed and emitted by machines or humans is not interesting, IME.  The cost and type of the value-add of the humans consuming and emitting XML is.  In documents, this is obvious.  Granularity.

len



  • References:
    • Granularity
      • From: "Len Bullard" <cbullard@hiwaay.net>
