You try to extract the links from an XHTML document like this:
xml sel -t -m "//a" -c . -n page.xhtml
The document (shown below) contains an <a> element, but
the query returns no matches:
<html xmlns="http://www.w3.org/1999/xhtml"><body>
<a href="http://example.com">A link</a>
</body></html>
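The problem is the
xmlns="http://www.w3.org/1999/xhtml" attribute
on the root element, which means that the root element and all elements
below it have this URI as part of their name. An unprefixed name in an
XPath expression, such as //a, only matches elements that are in no namespace.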
To match namespaced elements you must bind the namespace to a prefix and prepend it to the name:
xml sel -N x="http://www.w3.org/1999/xhtml" -t -m "//x:a" -c . -n page.xhtml
XML documents can also declare different namespace prefixes on any element in the document. To make namespaces easier to handle, XMLStarlet (versions 1.2.1+) uses the namespace prefixes declared on the root element of the input document. The default namespace is bound to the prefixes "_" and "DEFAULT" (in versions 1.5.0+). So another way to handle the previous example would be:
xml sel -t -m "//_:a" -c . -n page.xhtml
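For use in a script, the two approaches can be wrapped side by side. This is only a minimal sketch: the function names links_explicit and links_default are hypothetical, and the XML variable is an assumption added for installations where the binary is named xmlstarlet rather than xml. It uses -v "@href" to print just the link target instead of copying the whole element with -c.

```shell
#!/bin/sh
# Assumption: the binary is called "xml"; some distributions install it as
# "xmlstarlet", in which case set XML=xmlstarlet before calling these helpers.
XML=${XML:-xml}

# Hypothetical helper: print the href of every XHTML link in file $1,
# binding the namespace explicitly to the prefix "x" with -N.
links_explicit() {
    "$XML" sel -N x="http://www.w3.org/1999/xhtml" \
        -t -m "//x:a" -v "@href" -n "$1"
}

# Hypothetical helper: the same query, relying on the "_" prefix that
# XMLStarlet binds to the root element's default namespace (versions 1.5.0+).
links_default() {
    "$XML" sel -t -m "//_:a" -v "@href" -n "$1"
}
```

Calling links_explicit page.xhtml should print one link target per line. The explicit form does not depend on how the input document declares its namespaces.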
This feature can be disabled (versions 1.6.0+) by the global
--no-doc-namespace option. When should you
disable it? Suppose you are writing a script that handles
XML documents that look like this:
<data xmlns:a="http://example.com">
<a:important-data>...</a:important-data>
</data>
and also this:
<data xmlns:b="http://example.com">
<b:important-data>...</b:important-data>
</data>
Since both documents use the same namespace, they are
equivalent, even though the prefixes happen to be different.
By using --no-doc-namespace and binding the
namespace with -N, you can be sure that XMLStarlet's
behaviour will be independent of the input document.
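Such a script might look like the sketch below. The helper name important_data and the prefix d are hypothetical choices, and the XML variable is an assumption for systems where the binary is installed as xmlstarlet. Because --no-doc-namespace is a global option, it is placed before the sel command.

```shell
#!/bin/sh
# Assumption: the binary is called "xml"; set XML=xmlstarlet if your
# distribution uses that name instead.
XML=${XML:-xml}

# Hypothetical helper: print the text of every important-data element in the
# given files. The prefix "d" is chosen by this script and bound with -N, so
# the prefixes used inside the documents (a, b, ...) do not matter.
# Requires XMLStarlet 1.6.0+ for --no-doc-namespace.
important_data() {
    "$XML" --no-doc-namespace sel -N d="http://example.com" \
        -t -m "//d:important-data" -v . -n "$@"
}
```

Both example documents should give the same result with important_data, since the query refers to the namespace URI through the script's own prefix only.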
Delete the namespace declarations and all elements in non-default namespaces from the following XML document:
Input (file ns2.xml):
<doc xmlns="http://www.a.com/xyz" xmlns:ns="http://www.c.com/xyz">
<A>test</A>
<B>
<ns:C>xyz</ns:C>
</B>
</doc>
Command:
xml ed -N N="http://www.c.com/xyz" -d '//N:*' ns2.xml | \
sed -e 's/ xmlns.*=".*"//g'
Output:
<doc>
<A>test</A>
<B/>
</doc>
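Note that the sed pattern above is greedy: on a line that carries other attributes after the namespace declarations, it would remove those as well. A tighter pattern is sketched below. The function name strip_xmlns is hypothetical, and the sketch assumes that attribute values are double-quoted and that each declaration sits on a single line.

```shell
#!/bin/sh
# Hypothetical filter: remove xmlns and xmlns:prefix declarations from stdin
# while leaving any other attributes on the same line untouched.
# [^= ]* covers the optional ":prefix" part; [^"]* stops at the closing quote
# of the declaration instead of running to the last quote on the line.
strip_xmlns() {
    sed -e 's/ xmlns[^= ]*="[^"]*"//g'
}
```

It can be used in place of the sed call in the pipeline, for example: xml ed -N N="http://www.c.com/xyz" -d '//N:*' ns2.xml | strip_xmlns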