1. Namespaces and default namespace

One of the most frequently asked questions about the XmlStarlet 'select' and 'edit' options is: "Why does my XPath expression match nothing when it looks right to me?" A common cause is that the namespace has not been properly declared for the XPath expression. This chapter gives several examples that illustrate the issues you might encounter.

For example, the following XHTML document has a default namespace declaration:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Query Page</title>
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<meta name="robots" content="noindex,nofollow" />
</head>
<body>
...
</body>
</html>

The following query, which looks correct at first glance, is meant to print all links

xml sel -t -m "//a" -c . -n 

but it returns nothing. The problem is that the query does not address the <a> element in the right namespace. In XPath, an unprefixed element name such as a matches only elements that are in no namespace, and every prefix used in an expression must be bound to a namespace URI. Because the input XML declares <html xmlns="http://www.w3.org/1999/xhtml">, the same namespace has to be declared for XPath (or XSLT). There is another important detail: namespaces are compared by URI, not by prefix, so the prefix chosen for the query does not have to match anything in the document. The query below returns the expected result

xml sel -N x="http://www.w3.org/1999/xhtml" -t -m "//x:a" -c . -n
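As a minimal sketch of the rule that only the URI matters, the commands below run the same query with two different prefixes against a small hypothetical file, links.xhtml, which is not part of the original example. The last command uses the standard XPath function local-name() to ignore the namespace altogether; this is an alternative approach, not something the query above requires. The XML variable is an assumption to cope with installations where the tool is named xmlstarlet rather than xml.

```shell
# Hypothetical sample file, created only for this illustration.
cat > links.xhtml <<'EOF'
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<a href="one.html">One</a>
</body>
</html>
EOF

# Assumption: the tool is installed as either `xml` or `xmlstarlet`.
XML=$(command -v xml || command -v xmlstarlet)

# No namespace bound in the query: nothing is selected.
none=$("$XML" sel -t -m "//a" -v "@href" -n links.xhtml)

# Two different prefixes bound to the same URI select the same element.
with_x=$("$XML" sel -N x="http://www.w3.org/1999/xhtml" -t -m "//x:a" -v "@href" -n links.xhtml)
with_h=$("$XML" sel -N h="http://www.w3.org/1999/xhtml" -t -m "//h:a" -v "@href" -n links.xhtml)

# Matching on the local name sidesteps the namespace entirely.
by_local=$("$XML" sel -t -m "//*[local-name()='a']" -v "@href" -n links.xhtml)

printf 'none=[%s] x=[%s] h=[%s] local=[%s]\n' "$none" "$with_x" "$with_h" "$by_local"
```

The local-name() form is convenient for quick queries, but it also matches elements named a in any other namespace, so binding a prefix with -N is the more precise choice.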

Example of deleting namespace declarations

Delete the namespace declarations and all elements that are not in the default namespace from the following XML document:

Input (file ns2.xml)

<doc xmlns="http://www.a.com/xyz" xmlns:ns="http://www.c.com/xyz">
  <A>test</A>
  <B>
    <ns:C>xyz</ns:C>
  </B>
</doc>

Command:

xml ed -N N="http://www.c.com/xyz" -d '//N:*' ns2.xml | sed -e 's/ xmlns.*=".*"//g'

Output

<doc>
  <A>test</A>
  <B/>
</doc>
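Note that the sed pattern in the command above is greedy: each .* runs to the last double quote on the line, so any ordinary attribute that follows a namespace declaration in the same tag would be deleted with it. That is harmless for ns2.xml, but a narrower pattern is safer in general. The sketch below shows one possible variant, not part of the original example; it assumes double-quoted attribute values and uses a hypothetical start tag with an extra id attribute.

```shell
# Hypothetical start tag with an ordinary attribute after the declarations.
line='<doc xmlns="http://www.a.com/xyz" xmlns:ns="http://www.c.com/xyz" id="d1">'

# Original pattern: greedy, so it also removes id="d1".
greedy=$(printf '%s\n' "$line" | sed -e 's/ xmlns.*=".*"//g')

# Narrower pattern: removes one xmlns or xmlns:prefix declaration at a time.
narrow=$(printf '%s\n' "$line" | sed -e 's/ xmlns\(:[^= ]*\)\{0,1\}="[^"]*"//g')

printf '%s\n%s\n' "$greedy" "$narrow"
```

Either way, this is a textual edit of the serialized output rather than an XML-aware operation, so it is best kept for simple documents like the one shown here.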