Alias: How to force a default namespace to use prefix
Sorry if I missed something important; I'm quite new to namespace issues.
But I'm stuck on the last remaining step of the whole transformation
process.
Everything works nicely, except that the XHTML namespace is set as the default
namespace, so no prefix (preferably 'html') is included in element names when
the document is serialized back to a string.
I'm getting:
<html xmlns="http://www.w3.org/1999/xhtml">
<body> some <b> bold </b> text </body>
</html>
But I need:
<html xmlns:html="http://www.w3.org/1999/xhtml">
<html:body> some <html:b> bold </html:b> text </html:body>
</html>
Why? Because in reality I pick pieces of HTML (often corrupt!) from a database,
transform them into valid XHTML and finally assemble them into another, bigger
XML document with multiple namespaces. Specifically, I am building an RSS/Atom feed.
So my questions are:
How do I force a default namespace to use a prefix?
Is this a matter for the parser or for the serializer (transformer)?
How do I pick the prefix name for the namespace? Preferably 'html'.
Here is my code:
// set up the Neko parser: turn on tag balancing and namespace handling
org.cyberneko.html.parsers.DOMParser parser = new DOMParser();
parser.setFeature(
    "http://cyberneko.org/html/features/balance-tags", true);
parser.setProperty(
    "http://cyberneko.org/html/properties/names/elems", "lower");
parser.setFeature(
    "http://cyberneko.org/html/features/override-namespaces",
    true);
parser.setFeature(
    "http://cyberneko.org/html/features/insert-namespaces",
    true);
parser.setProperty(
    "http://cyberneko.org/html/properties/namespaces-uri",
    "http://www.w3.org/1999/xhtml");
// parse the HTML fragment, fix it and return a full, valid XML document
parser.parse(
    new InputSource(
        new StringReader(htmlFragment)));
return parser.getDocument();
// ...OK, let's serialize it back to a string!
// prepare the serializer
StringWriter sw = new StringWriter();
Transformer t = TransformerFactory.newInstance()
    .newTransformer();
t.setOutputProperty(OutputKeys.METHOD, "xml");
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
// serialize the DOM tree
t.transform(new DOMSource(node), new StreamResult(sw));
String outputXHTML = sw.toString();
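For illustration, here is a minimal sketch of one possible approach, not a confirmed solution: since the prefix is stored on the DOM nodes themselves, it can be changed on the tree between the parsing and serializing steps. The sketch uses only standard JDK classes; the JDK's namespace-aware parser stands in for NekoHTML, and the class and method names (XhtmlPrefixer, usePrefix, rename) are hypothetical. Whether a NekoHTML-built DOM supports setPrefix in the same way is an assumption I have not verified.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of NekoHTML or the JDK.
class XhtmlPrefixer {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Stand-in for the NekoHTML step: parse well-formed, namespaced XHTML.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Gives every XHTML element in the document the requested prefix.
    static void usePrefix(Document doc, String prefix) {
        Element root = doc.getDocumentElement();
        rename(root, prefix);
        // Declare the prefix explicitly on the root element.
        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                "xmlns:" + prefix, XHTML_NS);
    }

    private static void rename(Node node, String prefix) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) node;
            if (XHTML_NS.equals(el.getNamespaceURI())) {
                el.setPrefix(prefix);        // body becomes html:body
                el.removeAttribute("xmlns"); // drop the default declaration, if any
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            rename(c, prefix);
        }
    }

    // Same serialization as in the question.
    static String serialize(Node node) throws Exception {
        StringWriter sw = new StringWriter();
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.transform(new DOMSource(node), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html xmlns=\"" + XHTML_NS + "\">"
                + "<body> some <b> bold </b> text </body></html>");
        usePrefix(doc, "html");
        System.out.println(serialize(doc));
    }
}
```

In this sketch the root element is prefixed too (html:html), because it lives in the same namespace as its children. If the fragments are later imported into a feed document, the prefix declaration may need to be placed on whichever element ends up enclosing them.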
P.S.
The NekoHTML parser is a real treasure! It helps a lot with closing HTML
tags, unbalanced tags, etc. Thanks, Andy.