When should I use a CDATA Marked Section?
Answer:
You should almost never need to use CDATA Sections. The CDATA mechanism was
designed to let an author quote fragments of text containing markup characters (the
open-angle-bracket and the ampersand), for example when documenting XML (this FAQ uses
CDATA Sections quite a lot, for obvious reasons). A CDATA Section turns off markup
recognition for the duration of the section (it gets turned on again only by the closing
sequence of double end-square-brackets and a close-angle-bracket).
Consequently, nothing in a CDATA section can ever be recognised as anything to do
with markup: it's just a string of opaque characters, and if you use an XML
transformation language like XSLT, any markup characters in it will get turned into their
character entity equivalents (such as &lt; and &amp;).
If you try, for example, to use:
some text with <![CDATA[<em>markup</em>]]> in it.
in the expectation that the embedded markup would remain untouched, it won't: it will
just output
some text with &lt;em&gt;markup&lt;/em&gt; in it.
In other words, CDATA Sections cannot preserve the embedded markup as markup.
Normally this is exactly what you want because this technique was designed to let people
do things like write documentation about markup. It was not designed to allow the
passing of little chunks of (possibly invalid) unparsed HTML embedded inside your own
XML through to a subsequent process—because that would risk invalidating the output.
As a result you cannot expect to keep markup untouched simply because it looked as if it
was safely ‘hidden’ inside a CDATA section: it can't be used as a magic shield to
preserve HTML markup for future use as markup, only as characters.
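The behaviour described above can be checked with a small sketch. It assumes Python's standard-library xml.etree.ElementTree as a stand-in for whatever XML processor you use (the FAQ itself names no tool other than XSLT), and the enclosing p element is a hypothetical wrapper added only so the fragment is a well-formed document.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <p> around the FAQ's example sentence.
source = "<p>some text with <![CDATA[<em>markup</em>]]> in it.</p>"
para = ET.fromstring(source)

# The parser delivers the CDATA content as ordinary characters:
# there is no <em> child element, only text that happens to contain "<em>".
assert para.find("em") is None
assert para.text == "some text with <em>markup</em> in it."

# On output, the markup characters are escaped as entity references.
serialized = ET.tostring(para, encoding="unicode")
print(serialized)
# <p>some text with &lt;em&gt;markup&lt;/em&gt; in it.</p>
```

Other parsers and serializers may differ in detail (some can be told to re-emit a CDATA Section), but in each case the content comes through as characters, not as elements.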