What ambiguities are you referring to? What lack of clarity do you mean? The grammars for XHTML and HTML are virtually identical, and they're both specified in the same way.
You might have missed my point. I've already replied to the SGML issue that was brought up by driptray's earlier comment.
Other than that, the point I was trying to make was that the strictness of XML's syntax makes it more obvious to XHTML authors and XHTML parser writers exactly what tree the document should be parsed into, without the complication of having to look at the DTD to get semantic information.
With HTML, there can be lots of ways to get the same tree, and because part of it is built on HTML-specific semantics it's not immediately obvious what the tree is supposed to be. Instead, you have to understand the meaning of each element, and its own specific properties of where it starts and ends in relation to other specific elements.
For example, a paragraph ends when a new paragraph is seen, when a list begins (I think), when a header is seen, and so on. It doesn't end when an image is inserted, even though from looking at the page it might sometimes seem intuitive that it does. Just by looking at it, is the text after an image going to be in the same block as the text before it? In HTML this isn't obvious from the markup, but in XHTML the specification requires it to be visible in the markup itself. These are really semantic rules built from element-by-element special cases, and they make HTML's syntax more complicated. They're not obvious without knowing the exact HTML specs, which generally makes it more difficult to write an HTML document that's marked up the way it's supposed to be.
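To make the paragraph rule concrete, here's a rough sketch in Python. It is not the real HTML parsing algorithm: Python's html.parser only tokenizes, so the implied-close rule is added by hand, and the names CLOSES_P, VOID, ParagraphTracker and trace are made up for illustration. The tag sets are a small assumed subset, not the full list from the HTML specs.

```python
from html.parser import HTMLParser

# Illustrative subset only: start tags assumed to end an open <p>.
CLOSES_P = {"p", "ul", "ol", "h1", "h2", "h3"}
# Elements with no end tag, so they are never left open.
VOID = {"img", "br", "hr"}


class ParagraphTracker(HTMLParser):
    """Records (tag, open-element stack) for each start tag seen."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # The element-specific rule: some start tags implicitly close a <p>.
        if tag in CLOSES_P and self.stack and self.stack[-1] == "p":
            self.stack.pop()
        self.seen.append((tag, tuple(self.stack)))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open element.
            while self.stack.pop() != tag:
                pass


def trace(markup):
    tracker = ParagraphTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.seen
```

Feeding it `<div><p>one<img src='a.png'>two<p>three</div>` shows the image landing inside the first paragraph, while the second `<p>` silently closes the first and becomes its sibling. Nothing in the markup itself says so; the result depends entirely on which tags are listed in the sets.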
Obviously this can be figured out by looking at the HTML DTD, and an SGML parser can and will validate an HTML document that way. Any experienced author will know to check the DTD or the specs, and they probably will. Lots of authors won't, though.
Because XHTML doesn't have those more complicated semantic rules determining its syntax, an author is required to specify exactly what tree they want. If they weren't required to, as in ordinary HTML, the parser might not report an error when they made a mistake. It could still accept the document but use a semantics-based assumption to build an incorrect document tree. IMHO this makes miscommunication between everyone involved less likely in XHTML, i.e. less ambiguity.
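As a small illustration of that strictness, here's a sketch using Python's standard xml.etree.ElementTree as a stand-in for a generic XML parser. It only checks well-formedness, not validity against the XHTML DTD, and the helper name parse_or_error is made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_or_error(markup):
    """Return the tag names of the root's children, or None if not well-formed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # An XML parser refuses to guess where an unclosed element ends.
        return None
    return [child.tag for child in root]
```

With every paragraph closed explicitly, `<div><p>one</p><p>two</p></div>` parses into a div with two p children. Leave the end tags out, as in `<div><p>one<p>two</div>`, and the parser reports an error instead of assuming a tree on the author's behalf.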
jesterzog Fight the light