Re: A processing instruction for robots
* Walter Underwood
|
| Comments are welcome.

First thought: this is fine for very simple uses, but for more complex uses something along the lines of the robots.txt file would be very nice. How about a variant PI that can point to a robots.rdf resource?

Second thought: "and the index attribute must be first". This is nice for implementors, but it is likely to clash with the expectations of users, and the cost of more generality is very low for implementors. Why not follow the <URL: http://www.w3.org/TR/xml-stylesheet/ > style of specifying PI pseudo-attributes?

Also: The robot PI, says the spec, "should be in the internal subset (not in an external DTD or parameter entity). Since robots may be non-validating, a robots PI in the external subset might not be seen by the robot."

I think this is misleading, since "the internal subset" is usually short for "the internal DTD subset". A better way of putting it might be "It should be in the document entity (not in an external entity, including the external DTD subset and external parameter entities). Since robots may skip external entities, PIs in external entities might not be seen by the robot."

However, I don't think this will do either. Entities are what the storage structure of SGML/XML documents is composed of, and I think this spec needs to take some sort of stand as to how entities map to WWW resources, and which entities the PI is really talking about.

One way is to say that every resource is an entity, and every web-accessible entity is a resource.
Then one might say that the robots PI refers to

 a) the entity in which it is found

 b) the entity in which it is found and all entities included by this entity via entity references, regardless of any robots PIs in these included entities

 c) the entity in which it is found, and if "follow" is set to yes, all entities included by this entity via entity references, regardless of any robots PIs in these included entities

 d) the entity in which it is found, and if "sub-entities" is set to yes, all entities included by this entity via entity references, regardless of any robots PIs in these included entities

Once one agrees on a policy I think this is worth a subsection in the spec, regardless of the choice made. b) is probably the easiest to implement, since many APIs do not expose entity structure. It might not be the best choice, though.

--Lars M.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
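
To illustrate the pseudo-attribute suggestion in the message above, here is a minimal Python sketch of order-independent parsing of a PI's data, loosely in the style of the xml-stylesheet recommendation. The function name parse_pseudo_attributes is hypothetical, and the value "no" is an assumption; only the names "index" and "follow" and the value "yes" appear in the message. The sketch is more lenient than the recommendation (it does not insist on whitespace between pseudo-attributes and does not expand character or entity references), so treat it as an indication of how little generality costs, not as an implementation of any spec.

```python
import re

# One pseudo-attribute: a name, "=", and a value in double or single quotes.
_PSEUDO_ATTR = re.compile(
    r"""\s*([A-Za-z_][\w.\-]*)\s*=\s*(?:"([^"<]*)"|'([^'<]*)')"""
)


def parse_pseudo_attributes(pi_data):
    """Return a dict of the pseudo-attributes in the data part of a PI.

    The order of the pseudo-attributes does not matter. A repeated name
    or anything that is not a name="value" pair raises ValueError.
    """
    attrs = {}
    pos = 0
    while pos < len(pi_data):
        # Stop once only trailing whitespace remains.
        if pi_data[pos:].strip() == "":
            break
        match = _PSEUDO_ATTR.match(pi_data, pos)
        if match is None:
            raise ValueError("malformed pseudo-attribute at offset %d" % pos)
        name = match.group(1)
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if name in attrs:
            raise ValueError("duplicate pseudo-attribute: %s" % name)
        attrs[name] = value
        pos = match.end()
    return attrs


# "follow" comes before "index" here, and both are still found.
attrs = parse_pseudo_attributes('follow="no" index="yes"')
print(attrs["index"], attrs["follow"])
```

With a parser like this, a robot looks pseudo-attributes up by name, so a rule such as "the index attribute must be first" is no longer needed.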