xmllint: XML parsing, XPath queries, DTD/XSD validation, pretty-print. pwsh Select-Xml + System.Xml alternatives. libxml2 vs Saxon for XSLT across all 5 shells
Equivalents in every shell
`xmllint --xpath "//book[@id='1']/title/text()" books.xml`
Installed from `libxml2-utils` (Debian/Ubuntu) or `libxml2` (Fedora/Arch/macOS Homebrew). Common flags: `--xpath <expr>` runs an XPath 1.0 query; `--format` pretty-prints (indents); `--noout` suppresses the dump of the parsed tree (combine it with `--valid` or `--schema` to validate silently); `--valid` validates against the DTD declared in the document; `--schema <file.xsd>` validates against an XSD schema; `--relaxng <file.rng>` validates against Relax NG; `--shell` opens an interactive XPath REPL. Newline behaviour: older libxml2 releases print `--xpath` node-set results concatenated with no separator, and XPath 1.0 has no `string-join()` to add one, so take one node at a time with a positional predicate such as `(//book/title)[2]`, or post-process the output. CAVEAT: libxml2 supports XPath 1.0 ONLY. For XPath 2.0/3.0 features (regex, higher-order functions) use Saxon-HE, which supports XPath 3.1; `xmlstarlet` has a CLI similar to xmllint's and is also 1.0-only.
pwsh: `Select-Xml -Path books.xml -XPath "//book[@id='1']/title" | ForEach-Object { $_.Node.InnerText }`
pwsh-native: `Select-Xml -Path <file> -XPath <expr>` returns `SelectXmlInfo` objects wrapping the matching nodes. Access the XML node via `.Node` and its text content via `.Node.InnerText` or `.Node.'#text'`. For XSD validation: `$settings = New-Object System.Xml.XmlReaderSettings; $settings.ValidationType = 'Schema'; $settings.Schemas.Add('', 'schema.xsd') | Out-Null; $reader = [System.Xml.XmlReader]::Create('books.xml', $settings); while ($reader.Read()) { }; $reader.Close()`. `Schemas.Add()` returns the added `XmlSchema`, not the schema set, so its result cannot be assigned to `Schemas` inline; validation errors surface as exceptions while the reader runs. This .NET pattern is verbose. For schema validation in CI, shelling out to `xmllint --schema schema.xsd document.xml --noout` is much cleaner. pwsh 7+ on Linux/macOS: `xmllint` may not be installed by default; install it with `apt install libxml2-utils` or `brew install libxml2`.
cmd: `xmllint --xpath "//book[@id='1']/title/text()" books.xml`
No cmd-native XML tool. Install via `winget install libxml2` or `choco install xmllint` (package IDs vary, so check your package manager's listing first). Alternative: call into PowerShell: `powershell -Command "Select-Xml -Path books.xml -XPath '//book[@id=1]/title' | Foreach { $_.Node.InnerText }"`. For pure cmd users, WSL is the lowest-friction path. CAVEAT: cmd's quoting is fragile. cmd treats single quotes as ordinary characters, not as quote delimiters, so wrap the whole XPath expression in double quotes; single quotes inside it then work as XPath string delimiters. If the expression itself needs double quotes, escape them as `\"`: `xmllint --xpath "//book[@id=\"1\"]/title" books.xml`.
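As a minimal sketch of the basic `--xpath` query from this section, the script below builds a small sample file and looks up one title by `id`. The file contents, the `library` root element, and the `book_title` helper are assumptions made for illustration; only the `xmllint --xpath` call itself comes from the text above. Wrapping the path in `string()` keeps the query within XPath 1.0 and yields an empty line, rather than an error, when no book matches.

```shell
#!/bin/sh
# Hypothetical sample document: element and attribute names are assumptions.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cat > "$tmpdir/books.xml" <<'EOF'
<?xml version="1.0"?>
<library>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</library>
EOF

# book_title ID FILE: print the title of the book whose id attribute equals ID.
# string() returns the text of the first match, or an empty string when
# nothing matches, so an unknown ID does not make xmllint fail.
# ID is pasted into the expression, so it must not contain a single quote.
book_title() {
  xmllint --xpath "string(//book[@id='$1']/title)" "$2"
}

book_title 1 "$tmpdir/books.xml"
```

Calling `book_title 2 "$tmpdir/books.xml"` looks up the second book the same way; the helper is a convenience for scripts, not part of xmllint.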
Worked examples
Query an XPath expression
`xmllint --xpath "//book[1]/title/text()" books.xml`
pwsh: `(Select-Xml -Path books.xml -XPath "//book[1]/title").Node.InnerText`
Pretty-print (indent) an XML document
`xmllint --format input.xml > output.xml`
pwsh: `$xml = [xml](Get-Content input.xml); $sw = New-Object System.IO.StringWriter; $xw = New-Object System.Xml.XmlTextWriter($sw); $xw.Formatting = "Indented"; $xml.Save($xw); $sw.ToString() | Out-File output.xml`
Validate against an XSD schema
`xmllint --schema schema.xsd document.xml --noout`
pwsh: `$xml = New-Object System.Xml.XmlDocument; $xml.Schemas.Add("", "schema.xsd") | Out-Null; $xml.Load("document.xml"); $xml.Validate($null)`
Validate well-formedness only (no schema)
`xmllint --noout document.xml && echo "valid"`
pwsh: `try { [xml](Get-Content document.xml) | Out-Null; "valid" } catch { "invalid: $_" }`
cmd: `xmllint --noout document.xml && echo valid`
Gotchas
- **libxml2 is XPath 1.0 ONLY.** XPath 1.0 lacks regex matching (`matches()`), higher-order functions, sequence operations, `let` bindings, native dates, and `if/then/else`. For XPath 2.0+ features you need a different processor: Saxon-HE (XPath/XQuery 3.1, runs as `java -jar saxon-he.jar`). `xmlstarlet` is not an alternative here: its CLI is similar to xmllint's, and it is also 1.0-only. Common 1.0 workarounds: instead of `matches(@name, "^foo")`, use `starts-with(@name, "foo")`; instead of `current-date()`, have the calling script insert the date into the expression. If your XPath fails with cryptic errors, check whether you are using a 2.0 feature in a 1.0 processor.
- **XML namespaces silently break unqualified XPath queries.** A document like `<root xmlns="http://example.com"><item>x</item></root>` looks unproblematic, but `xmllint --xpath "//item" file.xml` matches NOTHING (xmllint reports an empty XPath set), because `item` is in the `http://example.com` namespace and an unqualified name test only matches the no-namespace `item`. Cheap workaround that ignores the namespace: `xmllint --xpath "//*[local-name()='item']" file.xml`. Proper fix that binds a prefix: run `xmllint --shell file.xml`, then `setns ex=http://example.com`, then `xpath //ex:item`. Use `local-name()='…'` for one-shot queries where namespace correctness doesn't matter; bind the namespaces properly in scripts you need to maintain.
- **`--xpath` output is concatenated without separators.** On older libxml2 releases, querying `//book/title` against 3 books returns `<title>A</title><title>B</title><title>C</title>` smashed together (newer releases, reportedly 2.9.9 and later, print one node per line). `string-join(//book/title, ' ')` is NOT a fix with xmllint: `string-join` is an XPath 2.0 function, so it needs a 2.0+ processor such as Saxon-HE. Workarounds in 1.0: fetch one node at a time by position, e.g. `--xpath "string((//book/title)[2])"`, looping up to `count(//book/title)`; or run `xmllint --shell file.xml` and use `cat //book/title`, which prints the matched nodes with a separator line between them; or pipe through `sed -e 's/></>\n</g'` to insert newlines between adjacent tags (GNU sed; fragile, but works for simple cases).
- **XInclude and external entities: security implications.** xmllint does NOT process `<xi:include>` elements by default; XInclude (which pulls in other files) runs only with `--xinclude`. It also does not resolve external general entities (`<!ENTITY foo SYSTEM "file:///etc/passwd">`) unless `--noent` (substitute entities) or `--loaddtd` (fetch the external DTD) is specified; both are off by default in current libxml2 releases. (CVE-2014-3660, sometimes cited in this context, is an entity-expansion denial of service, not an XXE file read.) If you intentionally want entity expansion: `xmllint --noent --loaddtd file.xml`. For UNTRUSTED XML input (anything from a network or a user upload), NEVER use `--noent` or `--loaddtd`: they expose you to XML eXternal Entity (XXE) attacks that read server files. In .NET / pwsh, setting `XmlResolver = $null` on an `XmlDocument` disables external entity resolution (the default in modern .NET; older versions are vulnerable).
- **XSLT support: libxslt + xsltproc, NOT xmllint.** xmllint can VALIDATE but cannot TRANSFORM via XSLT; that is the job of `xsltproc`, its sibling tool from libxslt: `xsltproc stylesheet.xsl input.xml > output.xml`. xsltproc supports XSLT 1.0 only (libxslt is mature but 1.0-only). For XSLT 2.0/3.0, use Saxon-HE. For Relax NG validation: `xmllint --relaxng schema.rng file.xml` (Relax NG is often more readable than XSD for hand-written schemas). For Schematron (rule-based validation), xmllint has no full ISO Schematron support (some builds offer a limited `--schematron` option); use the XSLT-based ISO Schematron implementation instead: its XSLT 1.0 variant with xsltproc, or `iso-schematron-xslt2` with Saxon, since the XSLT 2.0 variant cannot run under xsltproc.
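The namespace and one-result-per-line gotchas above can be sketched together in plain XPath 1.0. The script below is an illustration under stated assumptions: the sample file and the `each_text` helper are hypothetical, and the loop re-parses the file once per match, so it suits small documents only. It relies on `count()` printing a plain integer, which holds for small counts.

```shell
#!/bin/sh
# Hypothetical sample with a default namespace, mirroring the gotcha above.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cat > "$tmpdir/ns.xml" <<'EOF'
<?xml version="1.0"?>
<root xmlns="http://example.com">
  <item>a</item>
  <item>b</item>
  <item>c</item>
</root>
EOF

# each_text EXPR FILE: print the string value of every node matched by EXPR,
# one per line, using XPath 1.0 only. count() gives the number of matches;
# each node is then fetched by position with (EXPR)[i] and string().
each_text() {
  n=$(xmllint --xpath "count($1)" "$2") || return 1
  i=1
  while [ "$i" -le "$n" ]; do
    xmllint --xpath "string(($1)[$i])" "$2"
    i=$((i + 1))
  done
}

# Unqualified //item matches nothing: the elements live in a namespace.
each_text "//item" "$tmpdir/ns.xml"

# local-name() ignores the namespace, so all three items are found.
each_text "//*[local-name()='item']" "$tmpdir/ns.xml"
```

The first call prints nothing and the second prints `a`, `b`, and `c` on separate lines. For scripts that must respect namespaces, binding a prefix as described above remains the more maintainable route.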