Re: Notations
At 07:56 PM 9/29/98 -0400, david@m... wrote:

>HTML and XML should have similar public IDs, since they're both W3C
>specs -- the public ID will probably include the w3.org domain name.
>What do you use, Eliot?

For graphic notations, I usually use an omitted system identifier. For other notations, I use the URL of the spec, if I know it. I didn't realize there was an Adobe-defined FPI for EPS (is there one for PDF? Frame MIF? Frame binary?).

The purpose of the external identifier for a notation is to uniquely identify the notation, presumably by identifying the authoritative documentation for that notation. This has two purposes:

1. To allow observing humans to determine what a particular notation is all about and have some hope of figuring out how to process it.

2. To allow the mapping of local notation names (i.e., on data ("unparsed") entity declarations and NOTATION attributes) to the processor for that notation.

This latter function is *identical* to the way that object references are mapped to objects in COM. If you dig into how COM object connections are managed, you'll discover that the Windows registry is, in part, nothing but a mapping table that gets you from local (to your machine) names for notations to the UUIDs of the COM objects that implement those notations, which are then mapped to the local program names on your machine (e.g., a .exe, .dll, or .ocx file). This is just like for notations: the local name for the notation ("EPS") maps to the universally unique name for the notation (+//ISBN 0-201-18127-4::Adobe//NOTATION PostScript Language Ref. Manual//EN), which maps to the local processor object that interprets the notation (e.g., acroread.exe).

I find this interesting for two reasons.

First, it suggests that the notation mechanism is the correct solution for the problem, because someone else came up with essentially the same solution for essentially the same problem.
Second, during the XML discussions, Microsoft often complained that indirection was too hard in various contexts. However, here is Microsoft using pretty sophisticated indirection in the heart of their operating systems. Hmmmm. Maybe it's not so hard after all. Or is it simply that in the case of COM, as for notations, there's simply no way to avoid the indirection, so you have to [expletive deleted] it up and deal with it? Hmmmm.

The main difference between what's happening in COMland and what notations do is that in COMland the unique name is completely opaque. It is unique because the generation algorithm depends on a bunch of variables that pretty well guarantee uniqueness, but those same variables also guarantee opacity. External IDs can be just as unique, but they require things like registration authorities and name management processes in order to remain human-understandable and meaningful.

One of the things this means is that FPIs can, if constructed in clever ways, be "researchable" (as Martin Bryan said) in the absence of a known mapping, while UUIDs are pretty much just noise unless you already have the mapping. I can tell you one thing: the Windows registry would be a heck of a lot easier to debug if you could tell by looking at a UUID what it named, or at least get a clue.

This then leads to a question: do I use public IDs, URLs, or UUIDs for my notations? I think that I would *never* use UUIDs as notation identifiers, because they are too opaque. But I would definitely use them as the right-hand side of my local mapping table, assuming that I'm using COM-based software (which, until someone provides a usable SGML editor on Linux {other than psgml--sorry, I'm dependent on graphical interfaces for structured editing}, I'm forced to do).
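To make the "researchable" point concrete, here is a minimal Python sketch that splits an FPI into its named parts. The `FPI` class and `parse_fpi` function are hypothetical names invented for this illustration, and the parsing is deliberately simplified: it assumes the basic owner//class description//language layout and ignores optional pieces such as the unavailable-text indicator and display version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FPI:
    """Parts of a formal public identifier (simplified, illustrative only)."""
    registered: Optional[bool]  # True for "+//", False for "-//", None if no prefix
    owner: str
    text_class: str
    description: str
    language: str


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI on its "//" delimiters into human-readable fields."""
    parts = [p.strip() for p in fpi.split("//")]
    if parts and parts[0] in ("+", "-"):
        # A leading "+" marks a registered owner, "-" an unregistered one.
        registered: Optional[bool] = parts[0] == "+"
        parts = parts[1:]
    else:
        registered = None
    if len(parts) < 3:
        raise ValueError("not enough fields for an FPI: %r" % fpi)
    owner, text_id, language = parts[0], parts[1], parts[2]
    # The text identifier is the public text class, a space, then the description.
    text_class, _, description = text_id.partition(" ")
    return FPI(registered, owner, text_class, description, language)


eps = parse_fpi(
    "+//ISBN 0-201-18127-4::Adobe//NOTATION PostScript Language Ref. Manual//EN"
)
print(eps.owner)        # the owner field names an ISBN and Adobe
print(eps.description)  # the description names the governing document
```

Even without any mapping table, the owner and description fields tell a human where to start looking, which is exactly what a bare UUID cannot do.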
Once I properly implement generalized notation processing for PHyLIS (www.phylis.com), you will actually see things like this in the "entity" mapping catalog PHyLIS uses:

    PUBLIC "x" "{00000014-0000-0010-8000-00AA006D2EA4}"

Where "x" is the external ID for the notation (notation name, URL, or FPI, it doesn't matter) and "{00000014-0000-0010-8000-00AA006D2EA4}" is the UUID of the COM object that implements PHyLIS' notation processor interface on your machine for that notation.

Within PHyLIS, the processing will be:

1. Get a reference to data with a notation (for example, a request to construct a grove from a data entity with the notation "x").

2. Look up the external ID of the notation for the data entity.

3. For the external ID, look up the UUID of the implementing object.

4. Use that UUID as the argument to create_object() (in VB, not sure what it would be in Python, but there must be something).

5. Windows handles resolving the UUID to an executable.

When configuring PHyLIS, you would register the COM objects you want to use to process various notations, just as you register helper apps in your Web browser, using some PHyLIS-provided interface (or by modifying the XML document(s) PHyLIS will use for configuration--you can bet I'm not going near the registry for that). Big difference--no dependency on file extensions, as there is with MIME types (at least on Windows; Unix systems may be smarter). In fact, the external ID of the data entity is irrelevant; the notation governs.

Of course, you might define a very generic notation, like "graphic", where the processor uses other means to determine how to really process the graphic (it might use MIME types), but that's ok--if it makes sense to do that for you, no reason not to.
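The lookup chain in steps 1 through 5 above can be sketched in Python roughly as follows. This is not PHyLIS code: `parse_catalog`, `resolve_processor`, and the `create_object` callback are hypothetical names made up for the illustration, and the catalog parser only understands the single `PUBLIC "..." "..."` entry form shown above. On Windows with the pywin32 package, `win32com.client.Dispatch` is the usual way to instantiate a COM object from a class ID string, but that is not exercised here; a stand-in factory is passed in instead so the sketch runs anywhere.

```python
import re
from typing import Any, Callable, Dict

# Matches catalog lines of the form: PUBLIC "external-id" "{uuid}"
_ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')


def parse_catalog(text: str) -> Dict[str, str]:
    """Build the external-ID -> class-ID table from catalog text."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping


def resolve_processor(
    notation_name: str,
    notation_decls: Dict[str, str],
    catalog: Dict[str, str],
    create_object: Callable[[str], Any],
) -> Any:
    """Follow local notation name -> external ID -> class ID -> object."""
    try:
        external_id = notation_decls[notation_name]  # step 2
        class_id = catalog[external_id]              # step 3
    except KeyError as exc:
        raise LookupError("no processor mapping for %s" % exc) from exc
    # Step 4: hand the class ID to the object factory. Step 5 (class ID to
    # executable) is left to whatever the factory delegates to.
    return create_object(class_id)


# Hypothetical data: notation "EPS" is declared with external ID "x".
catalog = parse_catalog('PUBLIC "x" "{00000014-0000-0010-8000-00AA006D2EA4}"')
declarations = {"EPS": "x"}


def fake_create_object(class_id: str) -> str:
    """Stand-in for a real COM factory, assumed for this sketch."""
    return "processor instance for " + class_id


print(resolve_processor("EPS", declarations, catalog, fake_create_object))
```

Note that nothing in the chain looks at a file name or extension: only the declared notation and the catalog decide which processor is created.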
In the case of things like graphics, there's already a well established mechanism for making graphics self-defining for type (magic numbers), so why make the entity declaration redundant and risk lying (how many times have you changed the format of a graphic and grumbled about having to update the entity declaration?)? But not all data types have this facility, so you still need something like notations to handle that case. You also need notations to indicate that special interpretations should be applied to an element (after parsing, of course), which is what notation attributes do.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)