Feels less cluttered, easier to read and more consistent. Still keep tovalue, tobytes etc that are more basic functions this only renamed format related functions. Also there is an exceptin for to/fromjson as it comes from jq. Also fixes lots of spelling errors while reading thru.
2.8 KiB
XML can be decoded and encoded into jq values in two ways, elements as object or array. Which variant to use depends a bit what you want to do. The object variant might be easier to query for a specific value but array might be easier to use to generate xml or to query after all elements of some kind etc.
Encoding is done using the to_xml
function and it will figure what variant that is used based on the input value.
Is has two optional options indent
and attribute_prefix
.
Elements as object
Element can have different shapes depending on body text, attributes and children:
<a key="value">text</a>
is{"a":{"#text":"text","@key":"value"}}
, has text (#text
) and attributes (@key
)<a>text</a>
is{"a":"text"}
<a><b>text</b></a>
is{"a":{"b":"text"}}
one child with only text and no attributes<a><b/><b>text</b></a>
is{"a":{"b":["","text"]}}
two children with same name end up in an array<a><b/><b key="value">text</b></a>
is{"a":{"b":["",{"#text":"text","@key":"value"}]}}
If there is #seq
attribute it encodes the child element order. Use -o seq=true
to include sequence number when decoding,
otherwise order might be lost.
# decode as object is the default
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o seq=true
{
"a": {
"b": [
{
"#seq": 0
},
{
"#seq": 1,
"#text": "bbb"
}
],
"c": {
"#seq": 2,
"#text": "ccc",
"@attr": "value"
}
}
}
# access text of the <c> element
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq '.a.c["#text"]'
"ccc"
# decode to object and encode to xml
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o seq=true 'to_xml({indent:2})'
<a>
<b></b>
<b>bbb</b>
<c attr="value">ccc</c>
</a>
Elements as array
Elements are arrays of the shape ["#text": "body text", "attr_name", {key: "attr value"}|null, [<child element>, ...]]
.
# decode as array
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o array=true
[
"a",
null,
[
[
"b",
null,
[]
],
[
"b",
{
"#text": "bbb"
},
[]
],
[
"c",
{
"#text": "ccc",
"attr": "value"
},
[]
]
]
]
# decode to array and encode to xml
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o array=true -o seq=true 'to_xml({indent:2})'
<a>
<b></b>
<b>bbb</b>
<c attr="value">ccc</c>
</a>
# access text of the <c> element, the object variant above is probably easier to use
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -o array=true '.[2][2][1]["#text"]'
"ccc"