Editing, Validating and Querying XML with the XMLStarlet command line utility

XMLStarlet is a free multi-platform command line utility built on top of the libxml2 and libxslt libraries that can be used to query, edit, validate and format XML files.

Installing XMLStarlet:

To install XMLStarlet on OS X through MacPorts, run this command (after making sure that MacPorts has been installed):

sudo port install xmlstarlet

To install XMLStarlet on Linux, run the command for your distribution's package manager (apt-get on Debian and Ubuntu, yum on Red Hat-based systems):

sudo apt-get install xmlstarlet
sudo yum install xmlstarlet

Getting Help:

To get a list of options for XMLStarlet, run this command after installing the utility:

xmlstarlet

You will be presented with output similar to this:

XMLStarlet Toolkit: Command line utilities for XML
Usage: xml [<options>] <command> [<cmd-options>]
where <command> is one of:
  ed    (or edit)      - Edit/Update XML document(s)
  sel   (or select)    - Select data or query XML document(s) (XPATH, etc)
  tr    (or transform) - Transform XML document(s) using XSLT
  val   (or validate)  - Validate XML document(s) (well-formed/DTD/XSD/RelaxNG)
  fo    (or format)    - Format XML document(s)
  el    (or elements)  - Display element structure of XML document
  c14n  (or canonic)   - XML canonicalization
  ls    (or list)      - List directory as XML
  esc   (or escape)    - Escape special XML characters
  unesc (or unescape)  - Unescape special XML characters
  pyx   (or xmln)      - Convert XML into PYX format (based on ESIS - ISO 8879)
  p2x   (or depyx)     - Convert PYX into XML
<options> are:
  --version            - show version
  --help               - show help
Wherever file name mentioned in command help it is assumed
that URL can be used instead as well.
 
Type: xml <command> --help <ENTER> for command help
 
XMLStarlet is a command line toolkit to query/edit/check/transform
XML documents (for more information see http://xmlstar.sourceforge.net/)

Validating XML:

The following examples are all based on this XML file (data.xml):

<?xml version="1.0"?>
<groceries>
  <item>
    <name>Bread</name>
    <quantity>1</quantity>
    <cost>1.50</cost>
  </item>
  <item>
    <name>Milk</name>
    <quantity>1</quantity>
    <cost>2.00</cost>
  </item>
  <item>
    <name>Energy Bars</name>
    <quantity>5</quantity>
    <cost>1.25</cost>
  </item>
</groceries>

To validate the contents of the above mentioned file, run this command:

xmlstarlet val data.xml

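By default, the val command only checks that the file is well-formed, that is, that its structure and syntax are correct. To validate against a DTD declared inside the XML file, add the “-E” option; to validate against an external DTD, XSD schema or RelaxNG schema, use “-d”, “-s” or “-r” followed by the schema file.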

If the file has a problem, it is reported as invalid. When the “-e” option is added, the parser's error messages are shown as well, similar to the output below:

data.xml:6: parser error : expected '>'
    <cost>1.50</costs>
                    ^
data.xml - invalid
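
In a script, the exit status of val is usually more useful than the printed message. The following is a minimal sketch, assuming xmlstarlet is on the PATH; good.xml and bad.xml are throwaway files made up for this example, and the script skips the check if the utility is not installed.

```shell
#!/bin/sh
# Sketch: use the exit status of "xmlstarlet val" in a script.
# good.xml and bad.xml are hypothetical files created only for this example.
workdir=$(mktemp -d)

cat > "$workdir/good.xml" <<'EOF'
<?xml version="1.0"?>
<groceries>
  <item>
    <name>Bread</name>
    <quantity>1</quantity>
    <cost>1.50</cost>
  </item>
</groceries>
EOF

# Same content, but with a mismatched closing tag on the cost line.
sed 's|</cost>|</costs>|' "$workdir/good.xml" > "$workdir/bad.xml"

if command -v xmlstarlet >/dev/null 2>&1; then
  for f in "$workdir/good.xml" "$workdir/bad.xml"; do
    # -q: do not list the file, only set the exit status
    if xmlstarlet val -q "$f"; then
      echo "$(basename "$f"): well-formed"
    else
      echo "$(basename "$f"): NOT well-formed"
      # -e: print the parser's error messages on stderr
      xmlstarlet val -e "$f" || true
    fi
  done
else
  echo "xmlstarlet is not installed; skipping the check" >&2
fi
```

The same pattern works as a guard in front of any step that expects valid XML, since a non-zero exit status can stop the script before bad data is processed.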

Querying XML:

To select specific data from the XML file, run a command like this:

xmlstarlet sel -t -m //item -s D:T:U name -v name -n data.xml

… where “-t” starts a query template, “-m //item” matches elements named “item” at any depth, “-s D:T:U name” sorts the matches on the “name” field in descending (D) order, comparing the values as text (T) with uppercase letters placed first (U), “-v name” prints the value of the “name” field, and “-n” prints a newline character after each result.
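
The same template options can be combined in other ways. The sketch below is illustrative only: it assumes xmlstarlet is on the PATH, writes a copy of data.xml to a temporary directory, and uses variable names (names, bulk, count, priced) invented for this example. Besides the options described above, it uses an XPath predicate in “-m”, the XPath count() function in “-v”, and “-o” to print a literal string.

```shell
#!/bin/sh
# Sketch: a few variations on the "sel" template, run against a copy of data.xml.
workdir=$(mktemp -d)

cat > "$workdir/data.xml" <<'EOF'
<?xml version="1.0"?>
<groceries>
  <item>
    <name>Bread</name>
    <quantity>1</quantity>
    <cost>1.50</cost>
  </item>
  <item>
    <name>Milk</name>
    <quantity>1</quantity>
    <cost>2.00</cost>
  </item>
  <item>
    <name>Energy Bars</name>
    <quantity>5</quantity>
    <cost>1.25</cost>
  </item>
</groceries>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # Item names in descending text order (the query from the article).
  names=$(xmlstarlet sel -t -m //item -s D:T:U name -v name -n "$workdir/data.xml")
  # Only the items whose quantity is greater than 1 (XPath predicate).
  bulk=$(xmlstarlet sel -t -m "//item[quantity > 1]" -v name -n "$workdir/data.xml")
  # A single computed value: the number of item elements.
  count=$(xmlstarlet sel -t -v "count(//item)" -n "$workdir/data.xml")
  # Two fields per line, separated by a literal string given with -o.
  priced=$(xmlstarlet sel -t -m //item -v name -o ": " -v cost -n "$workdir/data.xml")

  printf 'Sorted names:\n%s\n\n' "$names"
  printf 'Quantity above 1:\n%s\n\n' "$bulk"
  printf 'Number of items: %s\n\n' "$count"
  printf 'Name and cost:\n%s\n' "$priced"
else
  echo "xmlstarlet is not installed; skipping the queries" >&2
fi
```

With the sample file, the sorted query should list Milk, Energy Bars and Bread in that order, and the predicate query should return only Energy Bars.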

To convert an XMLStarlet query into XSLT, add a “-C” flag directly after the “sel” argument. The query is then not run; the generated stylesheet is printed instead:

xmlstarlet sel -C -t -m //item -s D:T:U name -v name -n data.xml

The above command will output XSLT data similar to this:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:exslt="http://exslt.org/common"
 xmlns:math="http://exslt.org/math"
 xmlns:date="http://exslt.org/dates-and-times"
 xmlns:func="http://exslt.org/functions"
 xmlns:set="http://exslt.org/sets"
 xmlns:str="http://exslt.org/strings"
 xmlns:dyn="http://exslt.org/dynamic"
 xmlns:saxon="http://icl.com/saxon"
 xmlns:xalanredirect="org.apache.xalan.xslt.extensions.Redirect"
 xmlns:xt="http://www.jclark.com/xt"
 xmlns:libxslt="http://xmlsoft.org/XSLT/namespace"
 xmlns:test="http://xmlsoft.org/XSLT/"
 extension-element-prefixes="exslt math date func set str dyn saxon xalanredirect xt libxslt test"
 exclude-result-prefixes="math str">
<xsl:output omit-xml-declaration="yes" indent="no"/>
<xsl:param name="inputFile">-</xsl:param>
<xsl:template match="/">
  <xsl:call-template name="t1"/>
</xsl:template>
<xsl:template name="t1">
  <xsl:for-each select="//item">
    <xsl:sort order="descending" data-type="text" case-order="upper-first" select="name"/>
    <xsl:value-of select="name"/>
    <xsl:value-of select="'&#10;'"/>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

More Information:

For more details on any of XMLStarlet’s functions, run a command similar to these:

xmlstarlet ed --help
xmlstarlet sel --help
xmlstarlet val --help
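
As a starting point for the ed command, here is a minimal sketch, assuming xmlstarlet is on the PATH and using a copy of data.xml in a temporary directory. The output file names (updated.xml, trimmed.xml) and the new price are made up for this example. By default ed writes the edited document to standard output and leaves the input file untouched.

```shell
#!/bin/sh
# Sketch: basic edits with "xmlstarlet ed" on a copy of data.xml.
workdir=$(mktemp -d)

cat > "$workdir/data.xml" <<'EOF'
<?xml version="1.0"?>
<groceries>
  <item>
    <name>Bread</name>
    <quantity>1</quantity>
    <cost>1.50</cost>
  </item>
  <item>
    <name>Milk</name>
    <quantity>1</quantity>
    <cost>2.00</cost>
  </item>
  <item>
    <name>Energy Bars</name>
    <quantity>5</quantity>
    <cost>1.25</cost>
  </item>
</groceries>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # -u selects the nodes to update by XPath, -v supplies the new text value.
  xmlstarlet ed -u "//item[name='Milk']/cost" -v 2.25 \
    "$workdir/data.xml" > "$workdir/updated.xml"

  # -d deletes every node matched by the XPath expression.
  xmlstarlet ed -d "//item[name='Bread']" \
    "$workdir/data.xml" > "$workdir/trimmed.xml"

  # Check the results with sel.
  new_cost=$(xmlstarlet sel -t -v "//item[name='Milk']/cost" "$workdir/updated.xml")
  remaining=$(xmlstarlet sel -t -v "count(//item)" "$workdir/trimmed.xml")
  echo "Milk now costs: $new_cost"
  echo "Items left after deleting Bread: $remaining"
else
  echo "xmlstarlet is not installed; skipping the edits" >&2
fi
```

Writing to a new file and checking it with sel or val before replacing the original is the cautious route; the ed help output also lists an option for editing a file in place.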

 
