
Re: xml over http - RFC 3023

  • From: "Andrew Welch" <andrew.j.welch@g...>
  • To: "Rick Jelliffe" <rjelliffe@a...>
  • Date: Mon, 1 Dec 2008 12:12:19 +0000

> For application/xml you ignore the first step and go straight to the
> document. If your data is usually in UTF-8 or ASCII, you could perhaps read
> in the first block from bytes to characters and (if the transcoder has not
> generated an exception) confirm that there is no XML encoding declaration or
> BOM or that the string "utf-8" does not appear in the XML encoding
> declaration, in which case you don't need to do anything more complicated.
> If your data is text/xml, you are indeed in a sea of complication, which is
> why text/xml has been discouraged for so long.
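For illustration, the quick check described in the quoted paragraph might be sketched in Python roughly as follows. This is a minimal sketch, not code from the thread: the function names `sniff_xml_encoding` and `simple_path_ok` are hypothetical, it works on the raw bytes of the first block rather than transcoding first, and it only recognises UTF-8 and UTF-16 BOMs and an ASCII-compatible XML declaration (so UTF-32 and EBCDIC documents are out of scope).

```python
import codecs
import re

# Encoding pseudo-attribute of an XML declaration at the very start of the document.
_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

# Assumption: only these BOMs are handled (UTF-32 is deliberately not covered).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_xml_encoding(first_block: bytes) -> str:
    """Return the encoding suggested by a BOM or XML declaration, else utf-8."""
    for bom, name in _BOMS:
        if first_block.startswith(bom):
            return name
    match = _DECL.match(first_block)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no encoding declaration: XML defaults to UTF-8.
    return "utf-8"


def simple_path_ok(first_block: bytes) -> bool:
    """True when the document can be read as UTF-8/ASCII with no further work."""
    return sniff_xml_encoding(first_block) in ("utf-8", "us-ascii", "ascii")
```

Calling `simple_path_ok(data[:1024])` on the first block of an application/xml response would then say whether the plain UTF-8 path is enough, or whether the declared encoding needs to be honoured.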

ok, that makes sense, thanks.


> Maybe, but the mechanism for this to occur, for Apache at least, is for someone
> to write it, contribute it, champion it and maintain it.

Champion a mechanism for a web server to serve XML?  Really?


> But the basic XML contract is that the encoding must be explicitly labelled
> by the sender (creator of the document) and the recipient should not guess
> but use the label. If this is too much for naive users, then XML is simply
> not the technology for them, and XML should not be blamed for not working in
> a situation it explicitly was designed to avoid. It is just like if someone
> does not know what + means they cannot use a calculator. It is not an
> indictment of mathematics if someone who does not know + cannot use a
> calculator. Character encoding is just as fundamental to computer
> programming as knowledge of the difference between floats and ints, for
> example: that Western computer science and IT courses have guaranteed the
> ignorance of their students in this is sad.

Er, ok.  You do realise there is a different expert somewhere else in
the world saying exactly the same thing about their specialist area.
(not sure I agree with that analogy either)



> In any case, I thought most people had written off RSS as unprocessable by
> generic XML tools, because so much RSS was not well-formed? I thought one
> reason for Atom was that the early RSS systems creators messed up their XML
> and RSS never recovered.  With RSS, you are not experiencing the
> failure of XML on the web; you may be experiencing the failure of non-WF XML
> (and the potential complexity of figuring out text/xml).

The vast majority of the feeds are RSS, very few are Atom, so from here
it looks like Atom has had little impact so far.  Processing the RSS
feeds is a pain but manageable using XSLT 2.0 calling out to TagSoup,
JTidy etc., and using a LexicalHandler to intercept the entities.

From my naive perspective, I would've thought the web server would
serve the XML with the correct encoding in the Content-Type so I don't
have to ignore it, and/or I could pass the XML parser a URL and it would
take care of it.  I'm not sure why I should be reading appendices of
the spec and writing low-level code for something that should be an
everyday task.  In that respect, I think, you could argue it hasn't
succeeded yet.
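
As a rough illustration of what a client currently has to do with the Content-Type header, here is a minimal Python sketch of the RFC 3023 precedence as I understand it: an explicit charset parameter wins; text/xml without one defaults to us-ascii; application/xml without one defers to the document's own BOM or encoding declaration. The function name `transport_encoding` is hypothetical, and only these two media types are considered.

```python
from email.message import Message
from typing import Optional


def transport_encoding(content_type: str) -> Optional[str]:
    """Return the encoding dictated by the HTTP header, or None to defer to the document."""
    # Reuse the stdlib MIME header parser rather than splitting on ';' by hand.
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()   # lower-cased, e.g. "text/xml"
    charset = msg.get_param("charset")    # None when the parameter is absent

    if isinstance(charset, str) and charset:
        # An explicit charset parameter is authoritative for both media types.
        return charset.lower()
    if media_type == "text/xml":
        # RFC 3023: text/xml with no charset parameter is treated as us-ascii,
        # whatever the XML declaration inside the document says.
        return "us-ascii"
    # application/xml with no charset: use the BOM / XML encoding declaration.
    return None
```

A `None` result means the parser can be handed the raw bytes and left to apply the XML specification's own detection rules.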



-- 
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/

