MathML (and implications for XML)
I have read quickly through the MathML (970515) draft and have some
(hopefully constructive) comments to make - any crossmember of xml-dev
and html-math-wg is welcome to crosspost them. Before giving detailed
comments, I must say that I think it's an extremely useful document and
covers all of the areas that I - as a mathematically oriented scientist -
would like to see. The initial discussion is very useful and I shall
borrow some of the flavour of it when redrafting Chemical Markup
Language.

An archetypal XML DTD
---------------------

Since MathML is one of the very first XML DTDs to be published it
naturally sets a style which others may imitate. In general I think it
does this well, though it is at the mercy of a still fluid XML-lang and
XML-link spec. I appreciate that some of this was probably written some
time before the latest XML drafts. Specific comments in this area are:

3.1.4 'By default, XML processors remove all leading and trailing
whitespace ... between the begin/end tags and collapse any internal w/s
to a single space character'. My current understanding is that
*validating* parsing removes the start and end w/s but does not collapse
the internal w/s, but that WF-parsing passes the whole lot unchanged,
including the leading/trailing w/s. [I'm usually wrong on this, but it's
a problem area :-)].

7.1 ['</' is not allowed in CDATA]. My reading of XML 2.7 is that '</'
is unrecognised within CDATA (indeed only ']]>' can terminate it). This
might allow significant simplification to MathML 7.1 and allow the
elimination of two sets of tags.

MathML proposes two generic means of extending functionality, one
through attributes and the other through macros.

7.2.3 the OTHER attribute has the syntax:

    OTHER="name1='val1' name2='val2'..."

and essentially allows a means of adding additional attributes
independently of the DTD. Personally I'm sympathetic to this (as long as
the attributes are ones *I*'ve thought of :-).
This is 'not to encourage software developers to use this as a loophole
for circumventing the MathML core markup'... but as we all know this is
the sort of unchecked semantics that people love and which soon leads to
non-interoperable documents and processors. I'd be frightened of it in
the Chemical community. This is a point which is important for XML in
general.

5.3 Macros. This is the ability to create macros to avoid repetition of
verbose markup and seems particularly appropriate to math. (I think it
has a similar, but smaller, role in chemistry.) As far as I can see it is
totally compatible with XML/SGML, ***BUT it requires a pre-processor***
(I have been calling this a pre-parser).

<PROPOSAL>
There will be a role for a pre-parser in XML and one of its functions
will be to apply macros. Can we work towards a standard set of
operations that a pre-parser might carry out?
</PROPOSAL>

XML-LINK. The document is written with little reference to XML-link (not
surprising, since it's new and AFAIK JUMBO is the only tool that
implements it even at prototype level). However I think there are at
least the following areas where XML-link mechanism might be alternatives:

7.1 Display and in-line notations. The draft assumes that the MATH
component of a document is embedded in the HTML at the point that it
occurs in natural reading. XML-LINK gives a mechanism for separating the
math and the text and combining them under the flexibility of the
linking mechanism. The problem occurs in exactly the same way in
chemistry - do we encode HCl in-line or as a display;

    HCl

This is a matter of style which may not be totally within the author's
control - the publisher or renderer or reader may have the power to alter
it. Since XML will approach this generically at the LINK level, I have
used constructs like:

    <P>this is <A HREF="#HCl" XML-LINK="SIMPLE" ACTUATE="AUTO" SHOW="EMBED">
    hydrogen chloride...</P>
    ...
    <MOL ID="HCl">
      <FORMULA>
        <XVAR CONVENTION="SMILES">Cl</XVAR>
        <!-- yes, I really meant to omit H! -->
      </FORMULA>
    </MOL>

This - in the present JUMBO - will in-line the formula for HCl. I am sure
that by use of stylesheets and BEHAVIOUR it would be possible to control
your equations to be at the para end, etc.

7.2.4 <MACTION>. I am sure that it is possible to recast this tag in
terms of XML-LINK BEHAVIOR. That saves a lot of hassle writing code
because it may already have been done...at least in part.

Communality with future XML DTDs
--------------------------------

As XML develops, CML gets smaller. This is wonderful. There are a number
of general components of MathML that will help CML and probably other
people as well. A particular example is VECTOR and MATRIX (4.2.9). It is
clear from the XML-WG that many people want a method of representing
(multidimensional) regular arrays of strongly typed data and also the
means for addressing into these. Some (including me) will try to push
for economy of expression and avoid the <SEP/> syntax. (At present CML
uses the following matrix syntax:

    <ARRAY ROWS="2" COLUMNS="3" TYPE="FLOAT">1 2 3 4 5 6</ARRAY>

and has a kludgy mechanism for repeated arrayElements or arrayElements
with whitespace.) Since some of our matrices are large I'd quite like to
drop <SEP/>, though recent XML-WG discussion has emphasised that space
is not an issue.

<PROPOSAL>
MathML, CML, and other XML enthusiasts should strive towards a common
*extensible* way of representing arrays and matrices
</PROPOSAL>

Interoperability with HTML
--------------------------

This is a key area and I'm not clear from the MathML spec exactly what
the mechanism is. AFAIK CML and MathML are the first DTDs to tackle the
question of how to interoperate with HTML. As we know there are
syntactic problems of how to combine two or more DTDs (DTD fragments).

<AXIOM>
It should ultimately be possible to create a joint HTML/*ML document
which can be validated (i.e. not just well-formed).
</AXIOM>

This raises considerable problems in general since HTML content models
do not allow for <MATH> or <CML> or other foreign tags. In CML I 'solve'
this by embedding chunks of HTML within CML documents - i.e. the CML
document 'owns' the HTML. It's not clear in MathML which document
contains chunks of the other (this is a general XML/HTML problem which
has to be addressed). MathML also provides for a subset of HTML within
the <MATH> container - I assume it's a subset because it has to be
processed and rendered by the MathML processor and I'm extremely
sympathetic to this problem - I've spent far too much time hacking HTML
rendering. At present I favour a solution where CML (and MathML) are
separated from the HTML and connected by XML-LINK as in the previous
section.

<PROPOSAL>
XML should investigate mechanisms for HTML and *ML interoperability
</PROPOSAL>

Interoperability with CML
-------------------------

AFAICS there are no namespace collisions between the MathML tagset and
CML so it's straightforward to write:

    <!DOCTYPE CML SYSTEM "cml.dtd" [
      <!ENTITY % mathml SYSTEM "http://www.w3.org/some/where/mathml.dtd">
      %mathml;
    ]>

and then use MathML tags. This is more luck than good planning :-), but
CML has been careful to restrict its tagset.

Linking between variables
-------------------------

If I write:

    x = y + 3      (I)

and later

    2x = y + 4     (II)

I would 'normally' deduce that

    x = 1          (III)

and

    y = -2         (IV)

However, there is nothing in MathML AFAICS that allows one to specify
that the 'x' in (I) is the same x as in (II). [Please forgive me if I've
missed this]. For many applications we need to label a variable or
function as having the same value and semantics throughout a document,
e.g. 'Determination of <A HREF="#c"> the velocity of light</A>'. In this
example I would point to some central target which represented the
variable 'c', though I'm not clear how MathML would manage this in
equations.
This is a very important requirement for re-usable scientific
publications, though perhaps ambitious at this stage.

    P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
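
A footnote on the CML ARRAY syntax quoted under 'Communality with future
XML DTDs': the Python sketch below shows one way a processor might turn
the flat, whitespace-separated content into rows. It is only an
illustration, not how CML or JUMBO actually does it. The function name
read_array is made up here, row-major ordering is an assumption (the post
does not say which order is used), and only TYPE="FLOAT" is handled.

```python
import xml.etree.ElementTree as ET


def read_array(xml_text):
    """Parse one <ARRAY> element into a list of rows (hypothetical helper)."""
    elem = ET.fromstring(xml_text)
    rows = int(elem.get("ROWS"))
    cols = int(elem.get("COLUMNS"))
    if elem.get("TYPE") != "FLOAT":
        raise ValueError('this sketch only handles TYPE="FLOAT"')
    # Elements are separated by whitespace, so no <SEP/> markers are needed.
    values = [float(token) for token in (elem.text or "").split()]
    if len(values) != rows * cols:
        raise ValueError(
            "expected %d values, found %d" % (rows * cols, len(values))
        )
    # Assumption: values are stored row by row (row-major order).
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


matrix = read_array('<ARRAY ROWS="2" COLUMNS="3" TYPE="FLOAT">1 2 3 4 5 6</ARRAY>')
print(matrix)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

The ROWS and COLUMNS attributes carry the shape, so the content itself
needs no per-element markup. A common extensible syntax would still have
to settle the ordering and how to handle elements containing whitespace.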