Let's not forget the balance between fidelity and usability. TBH, in my
experience usability tends to trump fidelity in most use cases,
especially if we set the bar to entry (re: understanding) too high.
Whilst I often bristle at terms such as MVP or KISS, there is a need to
consider affordable quality.
On 25/04/2017, David Carlisle <d.p.carlisle@g...> wrote:
> On 25 April 2017 at 19:49, Costello, Roger L. <costello@m...> wrote:
>
>> Hi Folks,
>>
>> XML 1.0 has a limited set of characters. Some other data formats have a
>> superset of characters – the other data formats may have characters that
>> would be illegal in XML.
>>
>> Suppose the other data format is to be converted to XML. How will the
>> illegal characters be handled?
>>
>> Other data format -> convert -> XML
>>
>> Example: the JSON data format has a superset of characters. Suppose you
>> want to convert the following JSON to XML:
>>
>> {
>> "key":"\u0000"
>> }
>>
>>
>>
>> \u0000 is a JSON encoding of the NUL (hex 0) character. Recall that the
>> NUL character is not allowed in XML.
>>
>> I am collecting requirements on the process of converting other data
>> formats into XML. Below is my list thus far. Do you agree with the list?
>>
>
> I agree it's a list.
>
>
>
>
>> Are there requirements that you would add/delete?
>>
>> 1. The conversion must result in legal XML. Thus, conversion of the above
>> JSON must not produce this:
>>
>> <key>�</key>
>>
>> That is not legal (well-formed) XML.
>>
> This should go without saying: it is implied by "conversion to XML". There
> is no such thing as XML which is not well formed; it's just not XML.
>
>> 2. The conversion must be round-trippable. The operation must be lossless.
>> Thus, it is not acceptable to convert the above JSON to this:
>>
>> <key/>
>>
>> Data has been lost. That is a lossy operation and is not round-trippable.
>>
>
> A good requirement to have.
>
>> 3. The conversion must output standard XML. The XML must not contain
>> syntax/encoding that is specific to the other data format. The XML must
>> be processable using standard XML tools. Thus, it is not acceptable to
>> convert the above JSON to this:
>>
>> <key>\u0000</key>
>>
>> That has a JSON-specific encoding embedded within XML. If we wanted, say,
>> to do a string comparison on the value of <key>, the application would
>> need to understand the JSON syntax.
>>
>
> Without a definition of "standard XML" I don't think this requirement means
> anything.
> The content of an XML element is always in some format specified outside of
> XML. If you have <p>Hello World</p>, you need to understand English to make
> sense of the content, which is no different from understanding that \u0000
> means null (if that is what it means in this context).
>
>
>> 4. The conversion must output readable text. No hexadecimal text output.
>> Thus, it is not acceptable to convert this:
>>
>> {
>> "message": "Hello \u000C World"
>> }
>>
>>
>>
>> to this:
>>
>>
>>
>> <message>48656c6c6f200c20576f726c64</message>
>>
>
> This doesn't seem a useful restriction.
>
>
>>
>> Well, that’s a start. What are the other requirements for converting
>> illegal characters to XML?
>>
>>
>>
>> Have these requirements boxed me into a situation where no solution is
>> possible?
>>
>
> Impossible to say. If, for example, you use
>
> <message>hello <char>0</char> World</message>
>
> does that meet all four of your requirements? I can't tell.
> (That is, the content model of message is character data or char elements,
> and the content of char is a decimal number representing the Unicode
> character of that number.)
>
>
>
>>
>> /Roger
>>
>>
>>
>
> David
>
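
For anyone who wants to try David's <char> suggestion, here is a minimal
Python sketch of one possible reading of it. The function names
(is_xml10_char, encode, decode) are made up for illustration and do not
come from the thread or from any library. The character ranges follow
the Char production of XML 1.0, and the serialisation uses the standard
library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def encode(tag: str, value: str) -> ET.Element:
    """Build <tag> whose content is character data mixed with <char>N</char>
    elements, where N is the decimal code point of a character that XML 1.0
    cannot carry directly (hypothetical helper, not a standard API)."""
    elem = ET.Element(tag)
    elem.text = ""
    last = None  # most recent <char> child; legal text after it goes in its tail
    for ch in value:
        if is_xml10_char(ord(ch)):
            if last is None:
                elem.text += ch
            else:
                last.tail += ch
        else:
            last = ET.SubElement(elem, "char")
            last.text = str(ord(ch))
            last.tail = ""
    return elem


def decode(elem: ET.Element) -> str:
    """Invert encode(): rebuild the original string from the mixed content."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "char":
            raise ValueError(f"unexpected element <{child.tag}>")
        parts.append(chr(int(child.text)))
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    xml = ET.tostring(encode("message", "Hello \u000c World"), encoding="unicode")
    print(xml)  # <message>Hello <char>12</char> World</message>
    # Parse the serialised form again to check the round trip.
    assert decode(ET.fromstring(xml)) == "Hello \u000c World"
```

Under this reading the output is well-formed, the form feed survives a
round trip, and no JSON escape syntax or hex dump appears in the XML.
Whether it counts as "standard XML" is still open, since a consumer has
to know what <char> means. One caveat: a carriage return is a legal XML
character, but parsers normalise it to a line feed, so a strict round
trip would need to treat it as a <char> case as well.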