  • From: John Cowan <johnwcowan@g...>
  • To: Rick Jelliffe <rjelliffe@a...>
  • Date: Wed, 10 Apr 2013 01:52:45 -0400

On Wed, Apr 10, 2013 at 1:23 AM, Rick Jelliffe <rjelliffe@a...> wrote:

This is a truly excellent post, which you should work up into an article somewhere.

1) Test driven development. Before (or as, or so soon after that no one notices) you make some software, you make a test for it. If the document has a fixed structure, you can test by instances. If the document is semi-structured or recursive, your test specification has to allow those kinds of structures too, and for XML such a specification is called a schema.

Examplotron is particularly nice here, or using Trang to generate a post hoc RELAX NG schema from a reasonable library of existing instances.  Any new instance that fails output validation is then added to the library, and the schema is regenerated.  (Unfortunately, Trang can't accept a schema on the input side when doing this, or you'd just need to keep the schema.  I'm working on a tool that will be much cruder than Trang but will have this capability.)
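
A minimal sketch of that validate-then-regenerate loop, in Python. It does not call Trang or produce RELAX NG: `infer_schema` is a hypothetical, deliberately crude stand-in that only records which child element names have been seen under each element, and every function name here is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def infer_schema(instances):
    """Crude stand-in for schema inference (NOT Trang, NOT RELAX NG):
    map each element name to the set of child element names seen under it."""
    schema = {}
    for xml_text in instances:
        for elem in ET.fromstring(xml_text).iter():
            schema.setdefault(elem.tag, set()).update(child.tag for child in elem)
    return schema


def is_valid(schema, xml_text):
    """An instance passes if every element is known and has only known children."""
    for elem in ET.fromstring(xml_text).iter():
        allowed = schema.get(elem.tag)
        if allowed is None:
            return False
        if any(child.tag not in allowed for child in elem):
            return False
    return True


def check_output(library, xml_text):
    """Validate one output instance against the schema inferred from the library.
    On failure, add the instance to the library and regenerate the schema."""
    schema = infer_schema(library)
    ok = is_valid(schema, xml_text)
    if not ok:
        library.append(xml_text)
        schema = infer_schema(library)
    return ok, schema


library = ["<doc><p/></doc>"]
ok, schema = check_output(library, "<doc><p/><blockquote/></doc>")
```

Because the schema is always rebuilt from the whole library, the library of instances is the thing that has to be kept, which is the limitation described above.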
 
2) Quality assurance. I work in a company with a globally distributed development and production system (it is so big that US content architects may forget they have brother content architects in other countries when casually posting :-).

Not me, dude.  My "two dozen" included our Indian siblings.  If there are schema developers elsewhere than (in caps) Content Architecture, I don't know about it.
 
3) Conway's Law. A successful system must have sub-system boundaries that match the organization. Formalizing a boundary that matches internal organizational boundaries helps reduce communication costs. Formalizing a boundary within a team needs to allow flexibility and agility; otherwise it will get in the way.

Indeed, that's what our internal schemas are basically for.  "Boundarylessness" is one of $EMPLOYER's so-called key values, which (as usual) is an indication that they aren't (yet) very good at it.
 
Where I would disagree with Simon is that I think the advent of JSON for point-to-point interchange actually means that you should probably always use a schema with XML: if you don't need a schema, perhaps you should be using JSON?

The problem with JSON is that arrays provide ordering and objects provide naming, but if you want named ordering you have to go a level deeper, which is annoying.  For a JSON document containing a sequence of paragraphs interspersed with blockquotes, you have to make each element of the outermost array a dummy object like {"type" : "paragraph", "content" : (whatever)}.  Not all JSON systems correctly handle the case of the top-level item being an array, either.
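
To make the named-ordering point concrete, here is a small sketch in Python. The "type" and "content" keys follow the dummy-object shape above; the sample strings and variable names are made up for illustration. The second half shows what happens with Python's standard `json` module if you try to use names alone: repeated names in one object collapse, so the sequence is lost.

```python
import json

# Named ordering: an outer array for order, one dummy object per item for the name.
document = [
    {"type": "paragraph", "content": "First paragraph."},
    {"type": "blockquote", "content": "A quoted passage."},
    {"type": "paragraph", "content": "Second paragraph."},
]

decoded = json.loads(json.dumps(document))
kinds = [item["type"] for item in decoded]

# Names without the extra level: Python's json module keeps only the last
# value for a repeated name, so the first paragraph disappears.
collapsed = json.loads('{"paragraph": "a", "blockquote": "b", "paragraph": "c"}')
```

The equivalent XML would just be a run of sibling elements, with no wrapper level needed.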
 
Actually, that is too much: what often trumps is how easy a format is to fit into your current ecosystem and capabilities:

Sure, local issues almost always trump global architecture in practice, unless there are *very* strong top-down drivers.

--
GMail doesn't have rotating .sigs, but you can see mine at http://www.ccil.org/~cowan/signatures

