To give you a non-MS area where occasional non-XML characters may appear inside strings: look at the current ANSI/ISO proposals for serializing relational data into XML. None of the database companies (Oracle, IBM, Sybase, us, etc.) want to encode strings as base64.

To answer your question below: if we could at least allow the use of a character entity for an invalid XML character, that would already help.

Best regards
Michael

PS: Please cc me directly. Otherwise I will not see the answer until several weeks later...

> OK, assuming the data type *can* be changed: what encoding would you
> suggest for encoding arbitrary Unicode data (where control characters may
> appear, but only occasionally)?
>
> Surely not base64 (it's for byte streams, adds a lot of overhead and makes
> your XML unreadable to humans).
>
> BTW: another side of this problem is DOM's current approach. createText()
> doesn't have to throw an exception when the string contains forbidden
> characters. There is no standard method to test for XML character code
> compliance (note that there's also an issue regarding Java characters not
> being valid Unicode characters in all cases). DOM level 2 doesn't describe
> serialization, so current serializers in the best case throw an exception
> (which is pretty late...) or ignore the issue at all (producing broken
> XML).
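
To make the two points in this thread concrete (testing a string for XML character compliance, and escaping the occasional forbidden character instead of base64-encoding the whole string), here is a minimal Python sketch. The function names are hypothetical and not part of any DOM or XML API. The range check follows the Char production of XML 1.0. Note that the escaped output for forbidden characters is not well-formed XML 1.0, which rejects character references to such characters; it only illustrates the relaxation suggested in this message.

```python
def is_xml_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_xml_chars(text: str) -> list:
    """List (index, character) pairs for characters XML 1.0 forbids."""
    return [(i, ch) for i, ch in enumerate(text) if not is_xml_char(ch)]


def escape_with_char_refs(text: str) -> str:
    """Escape markup characters, and write forbidden characters as
    numeric character references (hypothetical relaxed behaviour:
    a conforming XML 1.0 parser would reject these references)."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif is_xml_char(ch):
            out.append(ch)
        else:
            out.append("&#x%X;" % ord(ch))
    return "".join(out)
```

A check like `find_invalid_xml_chars` could be run before a string is handed to a DOM text node, so the problem surfaces at creation time rather than at serialization time.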