Re: Ensuring samples are representative

  • From: "Liam R. E. Quin" <liam@w3.org>
  • To: yamahito <yamahito@gmail.com>, xml-dev@l...
  • Date: Mon, 17 Oct 2016 21:43:16 -0400

On Mon, 2016-10-17 at 21:34 +0000, yamahito wrote:
> Hi Folks,
> 
> I have an upcoming contract that will involve converting between XML
> formats.  My client wants their customer to provide a representative
> sample

In that case make sure your contract does not promise any level of
accuracy other than a best effort.

> The onus may be firmly on the end customer,

In which case get that in writing.

> What would people recommend to help a) ensure that the sample *is*
> representative and b) help target/prioritise the work on the
> transformation?

You can't ensure it. It may help to get the five longest, five
involving mathematics, five involving tables, five involving both, the
five oldest documents, and then every 17th document (or some other
prime number) by document number.

I say 17th because a prime stride is unlikely to line up with the batch
size; otherwise you might just get e.g. the first of every batch, and
discover that's always a cover letter. But it depends what people are
willing to send you. If they won't do that, ask for some complete sets,
if the documents come in groups.
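If the customer can export a catalogue of their documents, the selection above could be scripted rather than done by hand. This is only a minimal sketch: the Doc record and its fields (number, length, has_math, has_tables, created) are assumptions about what such a catalogue might contain, and pick_sample is a hypothetical helper. Within the mathematics and tables categories it simply takes the first five in catalogue order.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    # All fields are assumed; adapt them to whatever the customer can export.
    number: int        # document number
    length: int        # size in pages, bytes or elements
    has_math: bool
    has_tables: bool
    created: date


def pick_sample(docs, per_category=5, stride=17):
    """Return the document numbers to ask for, in ascending order."""
    chosen = set()

    def take(candidates):
        # Keep at most per_category documents from this category.
        chosen.update(d.number for d in list(candidates)[:per_category])

    take(sorted(docs, key=lambda d: d.length, reverse=True))   # longest
    take(d for d in docs if d.has_math)                        # mathematics
    take(d for d in docs if d.has_tables)                      # tables
    take(d for d in docs if d.has_math and d.has_tables)       # both
    take(sorted(docs, key=lambda d: d.created))                # oldest

    # Every stride-th document by document number; a prime stride is
    # unlikely to coincide with the size of a batch.
    chosen.update(d.number for d in docs if d.number % stride == 0)
    return sorted(chosen)
```

Categories can overlap, so the result may hold fewer documents than the sum of the category sizes.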

Ask the people working with the documents which ones are easy and which
are hard and why. But don't assume that what they find hard will be
hard for you - e.g. "the long tables are hard because they have 10,000
rows" doesn't bother a program, but "the single-page cover sheets are
all different because they came from a word processor" is another
matter.

Converting e.g. from word processing or page layout XML files to
higher-level markup is difficult - e.g. you may need to group list
items together into a list, coping with numbering that continues after
an intervening table...
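As a rough illustration of the list-grouping problem, here is a sketch using Python's standard xml.etree.ElementTree. The element names (body, item, list, table) and the start attribute are hypothetical stand-ins for whatever the real source and target formats use, and it assumes that only a table may interrupt a list without resetting its numbering.

```python
import xml.etree.ElementTree as ET


def group_list_items(body):
    """Wrap runs of <item> children in <list>, resuming numbering after a <table>."""
    out = ET.Element(body.tag)
    current = None   # the <list> currently being filled, if any
    count = 0        # items emitted so far in this numbering sequence
    for child in body:
        if child.tag == "item":
            if current is None:
                current = ET.SubElement(out, "list")
                if count:
                    # The list continues after an interruption.
                    current.set("start", str(count + 1))
            current.append(child)
            count += 1
        else:
            current = None
            if child.tag != "table":
                count = 0   # anything other than a table ends the sequence
            out.append(child)
    return out
```

Real word-processor output usually needs more than this, for example detecting list items from paragraph style names or nesting levels, but the bookkeeping has the same shape.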

It's like a giant puzzle and can be a lot of fun to work out. The two
most important things I learned from doing SGML and XML conversions are
to automate as much as possible and to document your processes. Next
most important is using programming tools such as make, Perl, XSLT and
a revision control system, so that you never need to remember which
scripts to run or in what order.
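To illustrate the point about not having to remember the order, here is a small make-like driver sketched in Python. The two steps are hypothetical placeholders (plain string replacement is far too crude for real XML), and the directory layout is assumed; what matters is that the order of the steps is written down once, and that only files whose output is missing or out of date are rebuilt.

```python
from pathlib import Path


# Hypothetical steps: each takes a document's text and returns new text.
def normalise_whitespace(text):
    return " ".join(text.split())


def rename_root(text):
    return text.replace("<doc>", "<article>").replace("</doc>", "</article>")


# The order lives here, not in anyone's memory.
STEPS = [normalise_whitespace, rename_root]


def is_stale(source, target):
    """True if target is missing or older than source, much as make decides."""
    return not target.exists() or target.stat().st_mtime < source.stat().st_mtime


def convert_all(src_dir, out_dir, steps=STEPS):
    """Run every step, in order, on each stale *.xml file; return the names rebuilt."""
    out_dir.mkdir(parents=True, exist_ok=True)
    rebuilt = []
    for source in sorted(src_dir.glob("*.xml")):
        target = out_dir / source.name
        if is_stale(source, target):
            text = source.read_text(encoding="utf-8")
            for step in steps:
                text = step(text)
            target.write_text(text, encoding="utf-8")
            rebuilt.append(source.name)
    return rebuilt
```

Unlike a real Makefile, this sketch does not notice when the steps themselves change; keeping the scripts under revision control and forcing a full rebuild after each change covers that gap.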
Hope this helps,

Liam