mirror of https://github.com/wader/fq.git synced 2024-11-22 07:16:49 +03:00

format,interp: Refactor json, yaml, etc into formats; also move out related functions

json, yaml, toml, xml, html, csv are now normal formats, and most of them also participate
in probing (html and csv do not).

Also fixes a bunch of bugs in to/fromxml, to/fromjq, etc.
Mattias Wadman 2022-06-01 16:55:55 +02:00
parent 8b5cc89641
commit cae288e6be
95 changed files with 2826 additions and 2048 deletions

.gitattributes vendored

@ -2,3 +2,4 @@
*.fqtest eol=lf
*.json eol=lf
*.jq eol=lf
*.xml eol=lf


@ -61,6 +61,7 @@ bsd_loopback_frame,
[bson](doc/formats.md#bson),
bzip2,
[cbor](doc/formats.md#cbor),
[csv](doc/formats.md#csv),
dns,
dns_tcp,
elf,
@ -82,6 +83,7 @@ hevc_nalu,
hevc_pps,
hevc_sps,
hevc_vps,
[html](doc/formats.md#html),
icc_profile,
icmp,
icmpv6,
@ -120,6 +122,7 @@ sll_packet,
tar,
tcp_segment,
tiff,
toml,
udp_datagram,
vorbis_comment,
vorbis_packet,
@ -130,6 +133,8 @@ vpx_ccr,
wav,
webp,
xing,
[xml](doc/formats.md#xml),
yaml,
[zip](doc/formats.md#zip)
[#]: sh-end
@ -286,3 +291,4 @@ Licenses of direct dependencies:
- golang/snappy https://github.com/golang/snappy/blob/master/LICENSE (BSD)
- github.com/BurntSushi/toml https://github.com/BurntSushi/toml/blob/master/COPYING (MIT)
- gopkg.in/yaml.v3 https://github.com/go-yaml/yaml/blob/v3/LICENSE (MIT)
- github.com/creasty/defaults https://github.com/creasty/defaults/blob/master/LICENSE (MIT)


@ -31,6 +31,7 @@
|[`bson`](#bson) |Binary&nbsp;JSON |<sub></sub>|
|`bzip2` |bzip2&nbsp;compression |<sub>`probe`</sub>|
|[`cbor`](#cbor) |Concise&nbsp;Binary&nbsp;Object&nbsp;Representation |<sub></sub>|
|[`csv`](#csv) |Comma&nbsp;separated&nbsp;values |<sub></sub>|
|`dns` |DNS&nbsp;packet |<sub></sub>|
|`dns_tcp` |DNS&nbsp;packet&nbsp;(TCP) |<sub></sub>|
|`elf` |Executable&nbsp;and&nbsp;Linkable&nbsp;Format |<sub></sub>|
@ -52,6 +53,7 @@
|`hevc_pps` |H.265/HEVC&nbsp;Picture&nbsp;Parameter&nbsp;Set |<sub></sub>|
|`hevc_sps` |H.265/HEVC&nbsp;Sequence&nbsp;Parameter&nbsp;Set |<sub></sub>|
|`hevc_vps` |H.265/HEVC&nbsp;Video&nbsp;Parameter&nbsp;Set |<sub></sub>|
|[`html`](#html) |HyperText&nbsp;Markup&nbsp;Language |<sub></sub>|
|`icc_profile` |International&nbsp;Color&nbsp;Consortium&nbsp;profile |<sub></sub>|
|`icmp` |Internet&nbsp;Control&nbsp;Message&nbsp;Protocol |<sub></sub>|
|`icmpv6` |Internet&nbsp;Control&nbsp;Message&nbsp;Protocol&nbsp;v6 |<sub></sub>|
@ -61,7 +63,7 @@
|`ipv4_packet` |Internet&nbsp;protocol&nbsp;v4&nbsp;packet |<sub>`ip_packet`</sub>|
|`ipv6_packet` |Internet&nbsp;protocol&nbsp;v6&nbsp;packet |<sub>`ip_packet`</sub>|
|`jpeg` |Joint&nbsp;Photographic&nbsp;Experts&nbsp;Group&nbsp;file |<sub>`exif` `icc_profile`</sub>|
|`json` |JSON |<sub></sub>|
|`json` |JavaScript&nbsp;Object&nbsp;Notation |<sub></sub>|
|[`macho`](#macho) |Mach-O&nbsp;macOS&nbsp;executable |<sub></sub>|
|[`matroska`](#matroska) |Matroska&nbsp;file |<sub>`aac_frame` `av1_ccr` `av1_frame` `avc_au` `avc_dcr` `flac_frame` `flac_metadatablocks` `hevc_au` `hevc_dcr` `image` `mp3_frame` `mpeg_asc` `mpeg_pes_packet` `mpeg_spu` `opus_packet` `vorbis_packet` `vp8_frame` `vp9_cfm` `vp9_frame`</sub>|
|[`mp3`](#mp3) |MP3&nbsp;file |<sub>`id3v2` `id3v1` `id3v11` `apev2` `mp3_frame`</sub>|
@ -90,6 +92,7 @@
|`tar` |Tar&nbsp;archive |<sub>`probe`</sub>|
|`tcp_segment` |Transmission&nbsp;control&nbsp;protocol&nbsp;segment |<sub></sub>|
|`tiff` |Tag&nbsp;Image&nbsp;File&nbsp;Format |<sub>`icc_profile`</sub>|
|`toml` |Tom's&nbsp;Obvious,&nbsp;Minimal&nbsp;Language |<sub></sub>|
|`udp_datagram` |User&nbsp;datagram&nbsp;protocol |<sub>`udp_payload`</sub>|
|`vorbis_comment` |Vorbis&nbsp;comment |<sub>`flac_picture`</sub>|
|`vorbis_packet` |Vorbis&nbsp;packet |<sub>`vorbis_comment`</sub>|
@ -100,12 +103,14 @@
|`wav` |WAV&nbsp;file |<sub>`id3v2` `id3v1` `id3v11`</sub>|
|`webp` |WebP&nbsp;image |<sub>`vp8_frame`</sub>|
|`xing` |Xing&nbsp;header |<sub></sub>|
|[`xml`](#xml) |Extensible&nbsp;Markup&nbsp;Language |<sub></sub>|
|`yaml` |YAML&nbsp;Ain't&nbsp;Markup&nbsp;Language |<sub></sub>|
|[`zip`](#zip) |ZIP&nbsp;archive |<sub>`probe`</sub>|
|`image` |Group |<sub>`gif` `jpeg` `mp4` `png` `tiff` `webp`</sub>|
|`inet_packet` |Group |<sub>`ipv4_packet` `ipv6_packet`</sub>|
|`ip_packet` |Group |<sub>`icmp` `icmpv6` `tcp_segment` `udp_datagram`</sub>|
|`link_frame` |Group |<sub>`bsd_loopback_frame` `ether8023_frame` `sll2_packet` `sll_packet`</sub>|
|`probe` |Group |<sub>`adts` `ar` `avro_ocf` `bitcoin_blkdat` `bzip2` `elf` `flac` `gif` `gzip` `jpeg` `json` `macho` `matroska` `mp3` `mp4` `mpeg_ts` `ogg` `pcap` `pcapng` `png` `tar` `tiff` `wav` `webp` `zip`</sub>|
|`probe` |Group |<sub>`adts` `ar` `avro_ocf` `bitcoin_blkdat` `bzip2` `elf` `flac` `gif` `gzip` `jpeg` `json` `macho` `matroska` `mp3` `mp4` `mpeg_ts` `ogg` `pcap` `pcapng` `png` `tar` `tiff` `toml` `wav` `webp` `xml` `yaml` `zip`</sub>|
|`tcp_stream` |Group |<sub>`dns` `rtmp`</sub>|
|`udp_payload` |Group |<sub>`dns`</sub>|
@ -280,6 +285,27 @@ Supports `torepr`
- https://en.wikipedia.org/wiki/CBOR
- https://www.rfc-editor.org/rfc/rfc8949.html
### csv
#### Options
|Name |Default|Description|
|- |- |-|
|`comma` |, |Separator character|
|`comment`|# |Comment line character|
#### Examples
Decode file using csv options
```
$ fq -d csv -o comma="," -o comment="#" . file
```
Decode value as csv
```
... | csv({comma:",",comment:"#"})
```
### flac_frame
#### Options
@ -320,6 +346,27 @@ Decode value as hevc_au
... | hevc_au({length_size:4})
```
### html
#### Options
|Name |Default|Description|
|- |- |-|
|`array`|false |Decode as nested arrays|
|`seq` |false |Use seq attribute to preserve element order|
#### Examples
Decode file using html options
```
$ fq -d html -o array=false -o seq=false . file
```
Decode value as html
```
... | html({array:false,seq:false})
```
### macho
Supports decoding vanilla and FAT Mach-O binaries.
@ -456,6 +503,27 @@ Current only supports plain RTMP (not RTMPT or encrypted variants etc) with AMF0
- https://rtmp.veriskope.com/docs/spec/
- https://rtmp.veriskope.com/pdf/video_file_format_spec_v10.pdf
### xml
#### Options
|Name |Default|Description|
|- |- |-|
|`array`|false |Decode as nested arrays|
|`seq` |false |Use seq attribute to preserve element order|
#### Examples
Decode file using xml options
```
$ fq -d xml -o array=false -o seq=false . file
```
Decode value as xml
```
... | xml({array:false,seq:false})
```
### zip
Supports ZIP64.

File diff suppressed because it is too large

Image changed: 125 KiB before, 128 KiB after


@ -620,14 +620,14 @@ zip> ^D
- `fromxmlentities` Decode XML entities.
- `toxmlentities` Encode XML entities.
- `fromurlpath` Decode URL path component.
- `tourlpath` Encode URL path component.
- `tourlpath` Encode URL path component. Whitespace is encoded as %20.
- `fromurlencode` Decode URL query encoding.
- `tourlencode` Encode URL to query encoding.
- `tourlencode` Encode URL to query encoding. Whitespace is encoded as "+".
- `fromurlquery` Decode URL query into object. For duplicate keys the value will be an array.
- `tourlquery` Encode object into query string.
- `fromurl` Decode URL into object.
```jq
> "schema://user:pass@host/path?key=value#fragement" | fromurl
> "schema://user:pass@host/path?key=value#fragment" | fromurl
{
"fragment": "fragment",
"host": "host",


@ -21,8 +21,11 @@ $ fq -n _registry.groups.probe
"tiff",
"webp",
"zip",
"mp3",
"mpeg_ts",
"wav",
"mp3",
"json"
"json",
"toml",
"xml",
"yaml"
]


@ -13,6 +13,8 @@ import (
_ "github.com/wader/fq/format/bson"
_ "github.com/wader/fq/format/bzip2"
_ "github.com/wader/fq/format/cbor"
_ "github.com/wader/fq/format/crypto"
_ "github.com/wader/fq/format/csv"
_ "github.com/wader/fq/format/dns"
_ "github.com/wader/fq/format/elf"
_ "github.com/wader/fq/format/fairplay"
@ -25,6 +27,7 @@ import (
_ "github.com/wader/fq/format/jpeg"
_ "github.com/wader/fq/format/json"
_ "github.com/wader/fq/format/macho"
_ "github.com/wader/fq/format/math"
_ "github.com/wader/fq/format/matroska"
_ "github.com/wader/fq/format/mp3"
_ "github.com/wader/fq/format/mp4"
@ -38,10 +41,14 @@ import (
_ "github.com/wader/fq/format/raw"
_ "github.com/wader/fq/format/rtmp"
_ "github.com/wader/fq/format/tar"
_ "github.com/wader/fq/format/text"
_ "github.com/wader/fq/format/tiff"
_ "github.com/wader/fq/format/toml"
_ "github.com/wader/fq/format/vorbis"
_ "github.com/wader/fq/format/vpx"
_ "github.com/wader/fq/format/wav"
_ "github.com/wader/fq/format/webp"
_ "github.com/wader/fq/format/xml"
_ "github.com/wader/fq/format/yaml"
_ "github.com/wader/fq/format/zip"
)


@ -250,6 +250,20 @@ out ... | cbor | torepr
out References and links
out https://en.wikipedia.org/wiki/CBOR
out https://www.rfc-editor.org/rfc/rfc8949.html
"help(csv)"
out csv: Comma separated values decoder
out Options:
out comma=, Separator character
out comment=# Comment line character
out Examples:
out # Decode file as csv
out $ fq -d csv . file
out # Decode value as csv
out ... | csv
out # Decode file using csv options
out $ fq -d csv -o comma="," -o comment="#" . file
out # Decode value as csv
out ... | csv({comma:",",comment:"#"})
"help(dns)"
out dns: DNS packet decoder
out Examples:
@ -409,6 +423,20 @@ out # Decode file as hevc_vps
out $ fq -d hevc_vps . file
out # Decode value as hevc_vps
out ... | hevc_vps
"help(html)"
out html: HyperText Markup Language decoder
out Options:
out array=false Decode as nested arrays
out seq=false Use seq attribute to preserve element order
out Examples:
out # Decode file as html
out $ fq -d html . file
out # Decode value as html
out ... | html
out # Decode file using html options
out $ fq -d html -o array=false -o seq=false . file
out # Decode value as html
out ... | html({array:false,seq:false})
"help(icc_profile)"
out icc_profile: International Color Consortium profile decoder
out Examples:
@ -473,7 +501,7 @@ out $ fq -d jpeg . file
out # Decode value as jpeg
out ... | jpeg
"help(json)"
out json: JSON decoder
out json: JavaScript Object Notation decoder
out Examples:
out # Decode file as json
out $ fq -d json . file
@ -726,6 +754,13 @@ out # Decode file as tiff
out $ fq -d tiff . file
out # Decode value as tiff
out ... | tiff
"help(toml)"
out toml: Tom's Obvious, Minimal Language decoder
out Examples:
out # Decode file as toml
out $ fq -d toml . file
out # Decode value as toml
out ... | toml
"help(udp_datagram)"
out udp_datagram: User datagram protocol decoder
out Examples:
@ -796,6 +831,27 @@ out # Decode file as xing
out $ fq -d xing . file
out # Decode value as xing
out ... | xing
"help(xml)"
out xml: Extensible Markup Language decoder
out Options:
out array=false Decode as nested arrays
out seq=false Use seq attribute to preserve element order
out Examples:
out # Decode file as xml
out $ fq -d xml . file
out # Decode value as xml
out ... | xml
out # Decode file using xml options
out $ fq -d xml -o array=false -o seq=false . file
out # Decode value as xml
out ... | xml({array:false,seq:false})
"help(yaml)"
out yaml: YAML Ain't Markup Language decoder
out Examples:
out # Decode file as yaml
out $ fq -d yaml . file
out # Decode value as yaml
out ... | yaml
"help(zip)"
out zip: ZIP archive decoder
out Supports ZIP64.


@ -1,8 +1,5 @@
# appendix_a.json from https://github.com/cbor/test-vectors
# TODO: "w0kBAAAAAAAAAAA=" "wkkBAAAAAAAAAAA=" semantic bigint
# NOTE: "O///////////" test uses bigint and is correct but test success currently relies on -18446744073709551616
# in input json being turned into a float as it can't be represented in json and cbor decoded bigint will also be
# converted to a float when comparing.
$ fq -i -d json . appendix_a.json
json> length
82
@ -16,7 +13,7 @@ json> map(select(.decoded) | (.cbor | frombase64 | cbor | torepr) as $a | select
},
"test": {
"cbor": "wkkBAAAAAAAAAAA=",
"decoded": 18446744073709552000,
"decoded": 18446744073709551616,
"hex": "c249010000000000000000",
"roundtrip": true
}
@ -29,7 +26,7 @@ json> map(select(.decoded) | (.cbor | frombase64 | cbor | torepr) as $a | select
},
"test": {
"cbor": "w0kBAAAAAAAAAAA=",
"decoded": -18446744073709552000,
"decoded": -18446744073709551617,
"hex": "c349010000000000000000",
"roundtrip": true
}

format/crypto/hash.go Normal file

@ -0,0 +1,80 @@
package crypto
import (
"crypto/md5"
//nolint: gosec
"crypto/sha1"
"crypto/sha256"
"crypto/sha512"
"embed"
"fmt"
"hash"
"io"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/interp"
//nolint: staticcheck
"golang.org/x/crypto/md4"
"golang.org/x/crypto/sha3"
)
//go:embed hash.jq
var hashFS embed.FS
func init() {
interp.RegisterFunc1("_tohash", toHash)
interp.RegisterFS(hashFS)
}
func hashFn(s string) hash.Hash {
switch s {
case "md4":
return md4.New()
case "md5":
return md5.New()
case "sha1":
return sha1.New()
case "sha256":
return sha256.New()
case "sha512":
return sha512.New()
case "sha3_224":
return sha3.New224()
case "sha3_256":
return sha3.New256()
case "sha3_384":
return sha3.New384()
case "sha3_512":
return sha3.New512()
default:
return nil
}
}
type toHashOpts struct {
Name string
}
func toHash(_ *interp.Interp, c any, opts toHashOpts) any {
inBR, err := interp.ToBitReader(c)
if err != nil {
return err
}
h := hashFn(opts.Name)
if h == nil {
return fmt.Errorf("unknown hash function %s", opts.Name)
}
if _, err := io.Copy(h, bitio.NewIOReader(inBR)); err != nil {
return err
}
outBR := bitio.NewBitReader(h.Sum(nil), -1)
bb, err := interp.NewBinaryFromBitReader(outBR, 8, 0)
if err != nil {
return err
}
return bb
}

format/crypto/hash.jq Normal file

@ -0,0 +1,9 @@
def tomd4: _tohash({name: "md4"});
def tomd5: _tohash({name: "md5"});
def tosha1: _tohash({name: "sha1"});
def tosha256: _tohash({name: "sha256"});
def tosha512: _tohash({name: "sha512"});
def tosha3_224: _tohash({name: "sha3_224"});
def tosha3_256: _tohash({name: "sha3_256"});
def tosha3_384: _tohash({name: "sha3_384"});
def tosha3_512: _tohash({name: "sha3_512"});

format/crypto/pem.go Normal file

@ -0,0 +1,14 @@
package crypto
import (
"embed"
"github.com/wader/fq/pkg/interp"
)
//go:embed pem.jq
var pemFS embed.FS
func init() {
interp.RegisterFS(pemFS)
}

format/crypto/pem.jq Normal file

@ -0,0 +1,20 @@
# https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail
def frompem:
( tobytes
| tostring
| capture("-----BEGIN(.*?)-----(?<s>.*?)-----END(.*?)-----"; "mg").s
| _frombase64({encoding: "std"})
) // error("no pem header or footer found");
def topem($label):
( tobytes
| _tobase64({encoding: "std"})
| ($label | if $label != "" then " " + $label end) as $label
| [ "-----BEGIN\($label)-----"
, .
, "-----END\($label)-----"
, ""
]
| join("\n")
);
def topem: topem("");

format/crypto/testdata/pem.fqtest vendored Normal file

@ -0,0 +1,7 @@
$ fq -i
null> "abc" | topem
"-----BEGIN-----\nYWJj\n-----END-----\n"
null> "abc" | topem | "before" + . + "between" + . + "after" | frompem | tostring
"abc"
"abc"
null> ^D

format/csv/csv.go Normal file

@ -0,0 +1,106 @@
package csv
import (
"bytes"
"embed"
"encoding/csv"
"errors"
"fmt"
"io"
"github.com/wader/fq/format"
"github.com/wader/fq/internal/gojqextra"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
)
//go:embed csv.jq
var csvFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.CSV,
Description: "Comma separated values",
ProbeOrder: format.ProbeOrderText,
DecodeFn: decodeCSV,
DecodeInArg: format.CSVLIn{
Comma: ",",
Comment: "#",
},
Functions: []string{"_todisplay"},
Files: csvFS,
})
interp.RegisterFunc1("_tocsv", toCSV)
}
func decodeCSV(d *decode.D, in any) any {
ci, _ := in.(format.CSVLIn)
var rvs []any
br := d.RawLen(d.Len())
r := csv.NewReader(bitio.NewIOReader(br))
r.TrimLeadingSpace = true
r.LazyQuotes = true
if ci.Comma != "" {
r.Comma = rune(ci.Comma[0])
}
if ci.Comment != "" {
r.Comment = rune(ci.Comment[0])
}
for {
r, err := r.Read()
if errors.Is(err, io.EOF) {
break
} else if err != nil {
return err
}
var vs []any
for _, s := range r {
vs = append(vs, s)
}
rvs = append(rvs, vs)
}
d.Value.V = &scalar.S{Actual: rvs}
d.Value.Range.Len = d.Len()
return nil
}
type ToCSVOpts struct {
Comma string
}
func toCSV(_ *interp.Interp, c []any, opts ToCSVOpts) any {
b := &bytes.Buffer{}
w := csv.NewWriter(b)
if opts.Comma != "" {
w.Comma = rune(opts.Comma[0])
}
for _, row := range c {
rs, ok := gojqextra.Cast[[]any](row)
if !ok {
return fmt.Errorf("expected row to be an array, got %s", gojqextra.TypeErrorPreview(row))
}
vs, ok := gojqextra.NormalizeToStrings(rs).([]any)
if !ok {
panic("not array")
}
var ss []string
for _, v := range vs {
s, ok := v.(string)
if !ok {
return fmt.Errorf("expected row record to be scalars, got %s", gojqextra.TypeErrorPreview(v))
}
ss = append(ss, s)
}
if err := w.Write(ss); err != nil {
return err
}
}
w.Flush()
return b.String()
}

format/csv/csv.jq Normal file

@ -0,0 +1,3 @@
def tocsv($opts): _tocsv($opts);
def tocsv: _tocsv(null);
def _csv__todisplay: tovalue;


@ -1,3 +1,13 @@
/test:
1,2,3
$ fq -d csv . /test
[
[
"1",
"2",
"3"
]
]
$ fq -i
null> "a,b,c,d" | fromcsv | ., tocsv
[
@ -27,4 +37,11 @@ null> "a\t\"b\t c\"\td" | fromcsv({comma:"\t"}) | ., tocsv({comma: "\t"})
]
]
"a\t\"b\t c\"\td\n"
null> [[bsl(1;100)]] | tocsv | ., fromcsv
"1267650600228229401496703205376\n"
[
[
"1267650600228229401496703205376"
]
]
null> ^D


@ -1,5 +1,13 @@
package format
// TODO: do before-format somehow and topology sort?
const (
ProbeOrderBinUnique = 0 // binary with unlikely overlap
ProbeOrderBinFuzzy = 50 // binary with possible overlap
ProbeOrderText = 100 // text format
)
// TODO: change to CamelCase?
//nolint:revive
const (
ALL = "all"
@ -39,6 +47,7 @@ const (
BSON = "bson"
BZIP2 = "bzip2"
CBOR = "cbor"
CSV = "csv"
DNS = "dns"
DNS_TCP = "dns_tcp"
ELF = "elf"
@ -61,6 +70,7 @@ const (
HEVC_PPS = "hevc_pps"
HEVC_SPS = "hevc_sps"
HEVC_VPS = "hevc_vps"
HTML = "html"
ICC_PROFILE = "icc_profile"
ICMP = "icmp"
ICMPV6 = "icmpv6"
@ -99,6 +109,7 @@ const (
TAR = "tar"
TCP_SEGMENT = "tcp_segment"
TIFF = "tiff"
TOML = "toml"
UDP_DATAGRAM = "udp_datagram"
VORBIS_COMMENT = "vorbis_comment"
VORBIS_PACKET = "vorbis_packet"
@ -109,6 +120,8 @@ const (
WAV = "wav"
WEBP = "webp"
XING = "xing"
XML = "xml"
YAML = "yaml"
ZIP = "zip"
)
@ -274,3 +287,18 @@ type Mp4In struct {
type ZipIn struct {
Uncompress bool `doc:"Uncompress and probe files"`
}
type XMLIn struct {
Seq bool `doc:"Use seq attribute to preserve element order"`
Array bool `doc:"Decode as nested arrays"`
}
type HTMLIn struct {
Seq bool `doc:"Use seq attribute to preserve element order"`
Array bool `doc:"Decode as nested arrays"`
}
type CSVLIn struct {
Comma string `doc:"Separator character"`
Comment string `doc:"Comment line character"`
}

format/json/jq.go Normal file

@ -0,0 +1,14 @@
package json
import (
"embed"
"github.com/wader/fq/pkg/interp"
)
//go:embed jq.jq
var jqFS embed.FS
func init() {
interp.RegisterFS(jqFS)
}

format/json/jq.jq Normal file

@ -0,0 +1,96 @@
# to jq-flavoured json
def _tojq($opts):
def _is_ident: test("^[a-zA-Z_][a-zA-Z_0-9]*$");
def _key: if _is_ident | not then tojson end;
def _f($opts; $indent):
def _r($prefix):
( type as $t
| if $t == "null" then tojson
elif $t == "string" then tojson
elif $t == "number" then tojson
elif $t == "boolean" then tojson
elif $t == "array" then
if length == 0 then "[]"
else
[ "[", $opts.compound_newline
, ( [ .[]
| $prefix, $indent
, _r($prefix+$indent), $opts.array_sep
]
| .[0:-1]
)
, $opts.compound_newline
, $prefix, "]"
]
end
elif $t == "object" then
if length == 0 then "{}"
else
[ "{", $opts.compound_newline
, ( [ to_entries[]
| $prefix, $indent
, (.key | _key), $opts.key_sep
, (.value | _r($prefix+$indent)), $opts.object_sep
]
| .[0:-1]
)
, $opts.compound_newline
, $prefix, "}"
]
end
else error("unknown type \($t)")
end
);
_r("");
( _f($opts; $opts.indent * " ")
| if _is_array then flatten | join("") end
);
def tojq($opts):
_tojq(
( { indent: 0,
key_sep: ":",
object_sep: ",",
array_sep: ",",
compound_newline: "",
} + $opts
| if .indent > 0 then
( .key_sep = ": "
| .object_sep = ",\n"
| .array_sep = ",\n"
| .compound_newline = "\n"
)
end
)
);
def tojq: tojq(null);
# from jq-flavoured json
def fromjq:
def _f:
( . as $v
| .term.type
| if . == "TermTypeNull" then null
elif . == "TermTypeTrue" then true
elif . == "TermTypeFalse" then false
elif . == "TermTypeString" then $v.term.str.str
elif . == "TermTypeNumber" then $v.term.number | tonumber
elif . == "TermTypeObject" then
( $v.term.object.key_vals // []
| map(
{ key: (.key // .key_string.str),
value: (.val.queries[0] | _f)
}
)
| from_entries
)
elif . == "TermTypeArray" then
( def _a: if .op then .left, .right | _a end;
[$v.term.array.query // empty | _a | _f]
)
else error("unknown term")
end
);
try
(_query_fromstring | _f)
catch
error("fromjq only supports constant literals");


@ -1,46 +1,93 @@
package json
import (
"bytes"
"embed"
stdjson "encoding/json"
"errors"
"fmt"
"io"
"math/big"
"github.com/wader/fq/format"
"github.com/wader/fq/internal/colorjson"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
"github.com/wader/gojq"
)
// TODO: should read multiple json values or just one?
// TODO: root not array/struct how to add unknown gaps?
// TODO: ranges not end up correct
// TODO: use jd.InputOffset() * 8?
//go:embed json.jq
var jsonFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.JSON,
Description: "JSON",
ProbeOrder: 100, // last
Description: "JavaScript Object Notation",
ProbeOrder: format.ProbeOrderText,
Groups: []string{format.PROBE},
DecodeFn: decodeJSON,
Functions: []string{"_todisplay"},
Files: jsonFS,
})
interp.RegisterFunc1("_tojson", toJSON)
}
func decodeJSON(d *decode.D, _ any) any {
br := d.RawLen(d.Len())
// keep in sync with gojq fromJSON
jd := stdjson.NewDecoder(bitio.NewIOReader(br))
jd.UseNumber()
var s scalar.S
if err := jd.Decode(&s.Actual); err != nil {
d.Fatalf(err.Error())
}
switch s.Actual.(type) {
case map[string]any,
[]any:
default:
d.Fatalf("root not object or array")
if err := jd.Decode(new(any)); !errors.Is(err, io.EOF) {
d.Fatalf("trailing data after top-level value")
}
s.Actual = gojq.NormalizeNumbers(s.Actual)
// switch s.Actual.(type) {
// case map[string]any,
// []any:
// default:
// d.Fatalf("top-level not object or array")
// }
d.Value.V = &s
d.Value.Range.Len = d.Len()
return nil
}
type ToJSONOpts struct {
Indent int
}
func toJSON(_ *interp.Interp, c any, opts ToJSONOpts) any {
// TODO: share
cj := colorjson.NewEncoder(
false,
false,
opts.Indent,
func(v any) any {
switch v := v.(type) {
case gojq.JQValue:
return v.JQValueToGoJQ()
case nil, bool, float64, int, string, *big.Int, map[string]any, []any:
return v
default:
panic(fmt.Sprintf("toValue not a JQValue value: %#v %T", v, v))
}
},
colorjson.Colors{},
)
bb := &bytes.Buffer{}
if err := cj.Marshal(c, bb); err != nil {
return err
}
return bb.String()
}

format/json/json.jq Normal file

@ -0,0 +1,3 @@
def tojson($opts): _tojson($opts);
def tojson: _tojson(null);
def _json__todisplay: tovalue;

format/json/testdata/bigint.fqtest vendored Normal file

@ -0,0 +1,10 @@
$ fq -n "{a: bsl(1;100)} | tojq | ., fromjq"
"{a:1267650600228229401496703205376}"
{
"a": 1267650600228229401496703205376
}
$ fq -n "{a: bsl(1;100)} | tojson | ., fromjson"
"{\"a\":1267650600228229401496703205376}"
{
"a": 1267650600228229401496703205376
}


@ -134,3 +134,15 @@ string
"white space": 123
}
----
[]
[]
----
[]
[]
----
{}
{}
----
{}
{}
----


@ -1,60 +1,158 @@
$ fq -d json . test.json
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|
0x00|7b 0a 20 20 20 20 22 61 22 3a 20 31 32 33 2c 0a|{. "a": 123,.|.: {} (json)
* |until 0x74.7 (end) (117) | |
$ fq -d json tovalue test.json
/probe.json:
{"a": 123}
/probe_scalar.json:
123
$ fq . /probe.json
{
"a": 123,
"b": [
1,
2,
3
"a": 123
}
$ fq . /probe_scalar.json
123
$ fq -rRs 'fromjson[] | (tojson | ., fromjson), "----", (tojson({indent:2}) | ., fromjson), "----"' variants.json
null
null
----
null
null
----
true
true
----
true
true
----
false
false
----
false
false
----
123
123
----
123
123
----
123.123
123.123
----
123.123
123.123
----
"string"
string
----
"string"
string
----
[1,2,3]
[
1,
2,
3
]
----
[
1,
2,
3
]
[
1,
2,
3
]
----
{"array":[true,false,null,1.2,"string",[1.2,3],{"a":1}],"escape \\\"":456,"false":false,"null":null,"number":1.2,"object":{"a":1},"string":"string","true":true,"white space":123}
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"c:": "string",
"d": null,
"e": 123.4
}
$ fq . test.json
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|
0x00|7b 0a 20 20 20 20 22 61 22 3a 20 31 32 33 2c 0a|{. "a": 123,.|.: {} (json)
* |until 0x74.7 (end) (117) | |
$ fq .b[1] test.json
2
$ fq . json.gz
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|.{}: json.gz (gzip)
0x00|1f 8b |.. | identification: raw bits (valid)
0x00| 08 | . | compression_method: "deflate" (8)
0x00| 00 | . | flags{}:
0x00| 65 0a 08 61 | e..a | mtime: 1627916901 (2021-08-02T15:08:21Z)
0x00| 00 | . | extra_flags: 0
0x00| 03 | . | os: "unix" (3)
0x0|7b 22 61 22 3a 20 31 32 33 7d 0a| |{"a": 123}.| | uncompressed: {} (json)
0x00| ab 56 4a 54 b2 52| .VJT.R| compressed: raw bits
0x10|30 34 32 ae e5 02 00 |042.... |
0x10| 20 ac d2 9c | ... | crc32: 0x9cd2ac20 (valid)
0x10| 0b 00 00 00| | ....|| isize: 11
$ fq tovalue json.gz
{
"compressed": "<13>q1ZKVLJSMDQyruUCAA==",
"compression_method": "deflate",
"crc32": 2631052320,
"extra_flags": 0,
"flags": {
"comment": false,
"extra": false,
"header_crc": false,
"name": false,
"reserved": 0,
"text": false
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"identification": "<2>H4s=",
"isize": 11,
"mtime": 1627916901,
"os": "unix",
"uncompressed": {
"a": 123
}
"string": "string",
"true": true,
"white space": 123
}
$ fq .uncompressed json.gz
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|
0x0|7b 22 61 22 3a 20 31 32 33 7d 0a| |{"a": 123}.| |.uncompressed: {} (json)
----
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"string": "string",
"true": true,
"white space": 123
}
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"string": "string",
"true": true,
"white space": 123
}
----
[]
[]
----
[]
[]
----
{}
{}
----
{}
{}
----

format/json/testdata/tofromjson.fqtest vendored Normal file

@ -0,0 +1,77 @@
$ fq -d json . test.json
{
"a": 123,
"b": [
1,
2,
3
],
"c:": "string",
"d": null,
"e": 123.4
}
$ fq -d json tovalue test.json
{
"a": 123,
"b": [
1,
2,
3
],
"c:": "string",
"d": null,
"e": 123.4
}
$ fq . test.json
{
"a": 123,
"b": [
1,
2,
3
],
"c:": "string",
"d": null,
"e": 123.4
}
$ fq .b[1] test.json
2
$ fq . json.gz
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|.{}: json.gz (gzip)
0x00|1f 8b |.. | identification: raw bits (valid)
0x00| 08 | . | compression_method: "deflate" (8)
0x00| 00 | . | flags{}:
0x00| 65 0a 08 61 | e..a | mtime: 1627916901 (2021-08-02T15:08:21Z)
0x00| 00 | . | extra_flags: 0
0x00| 03 | . | os: "unix" (3)
0x0|7b 22 61 22 3a 20 31 32 33 7d 0a| |{"a": 123}.| | uncompressed: {} (json)
0x00| ab 56 4a 54 b2 52| .VJT.R| compressed: raw bits
0x10|30 34 32 ae e5 02 00 |042.... |
0x10| 20 ac d2 9c | ... | crc32: 0x9cd2ac20 (valid)
0x10| 0b 00 00 00| | ....|| isize: 11
$ fq tovalue json.gz
{
"compressed": "<13>q1ZKVLJSMDQyruUCAA==",
"compression_method": "deflate",
"crc32": 2631052320,
"extra_flags": 0,
"flags": {
"comment": false,
"extra": false,
"header_crc": false,
"name": false,
"reserved": 0,
"text": false
},
"identification": "<2>H4s=",
"isize": 11,
"mtime": 1627916901,
"os": "unix",
"uncompressed": {
"a": 123
}
}
$ fq .uncompressed json.gz
{
"a": 123
}

format/json/testdata/trailing.fqtest vendored Normal file

@ -0,0 +1,4 @@
$ fq -n '"123 trailing" | fromjson._error.error'
exitcode: 5
stderr:
error: error at position 0xc: trailing data after top-level value


@ -16,5 +16,7 @@
"string": "string",
"true": true,
"white space": 123
}
},
[],
{}
]

format/math/radix.go Normal file

@ -0,0 +1,14 @@
package math
import (
"embed"
"github.com/wader/fq/pkg/interp"
)
//go:embed radix.jq
var radixFS embed.FS
func init() {
interp.RegisterFS(radixFS)
}

format/math/radix.jq Normal file

@ -0,0 +1,45 @@
def fromradix($base; $table):
( if _is_string | not then error("cannot fromradix convert: \(.)") end
| split("")
| reverse
| map($table[.])
| if . == null then error("invalid char \(.)") end
# state: [power, ans]
| reduce .[] as $c ([1,0];
( (.[0] * $base) as $b
| [$b, .[1] + (.[0] * $c)]
)
)
| .[1]
);
def fromradix($base):
fromradix($base; {
"0": 0, "1": 1, "2": 2, "3": 3,"4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9,
"a": 10, "b": 11, "c": 12, "d": 13, "e": 14, "f": 15, "g": 16,
"h": 17, "i": 18, "j": 19, "k": 20, "l": 21, "m": 22, "n": 23,
"o": 24, "p": 25, "q": 26, "r": 27, "s": 28, "t": 29, "u": 30,
"v": 31, "w": 32, "x": 33, "y": 34, "z": 35,
"A": 36, "B": 37, "C": 38, "D": 39, "E": 40, "F": 41, "G": 42,
"H": 43, "I": 44, "J": 45, "K": 46, "L": 47, "M": 48, "N": 49,
"O": 50, "P": 51, "Q": 52, "R": 53, "S": 54, "T": 55, "U": 56,
"V": 57, "W": 58, "X": 59, "Y": 60, "Z": 61,
"@": 62, "_": 63,
});
def toradix($base; $table):
( if type != "number" then error("cannot toradix convert: \(.)") end
| if . == 0 then "0"
else
( [ recurse(if . > 0 then _intdiv(.; $base) else empty end) | . % $base]
| reverse
| .[1:]
| if $base <= ($table | length) then
map($table[.]) | join("")
else
error("base too large")
end
)
end
);
def toradix($base):
toradix($base; "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@_");


@ -1,10 +1,4 @@
$ fq -i
null> "abc" | topem
"-----BEGIN-----\nYWJj\n-----END-----\n"
null> "abc" | topem | "before" + . + "between" + . + "after" | frompem | tostring
"abc"
"abc"
null>
null> (0,1,1024,99999999999999999999) as $n | (2,8,16,62,64) as $r | "\($r): \($n) \($n | toradix($r)) \($n | toradix($r) | fromradix($r))" | println
2: 0 0 0
8: 0 0 0


@ -17,7 +17,7 @@ var mp3Frame decode.Group
func init() {
interp.RegisterFormat(decode.Format{
Name: format.MP3,
ProbeOrder: 20, // after most others (silent samples and jpeg header can look like mp3 sync)
ProbeOrder: format.ProbeOrderBinFuzzy, // after most others (silent samples and jpeg header can look like mp3 sync)
Description: "MP3 file",
Groups: []string{format.PROBE},
DecodeFn: mp3Decode,


@ -10,7 +10,7 @@ import (
func init() {
interp.RegisterFormat(decode.Format{
Name: format.MPEG_TS,
ProbeOrder: 10, // make sure to be after gif, both start with 0x47
ProbeOrder: format.ProbeOrderBinFuzzy, // make sure to be after gif, both start with 0x47
Description: "MPEG Transport Stream",
Groups: []string{format.PROBE},
DecodeFn: tsDecode,


@ -16,7 +16,7 @@ func init() {
}
// transform to binary using fn
func makeBinaryTransformFn(fn func(r io.Reader) (io.Reader, error)) func(i *interp.Interp, c any) any {
func makeBinaryTransformFn(fn func(r io.Reader) (io.Reader, error)) func(_ *interp.Interp, c any) any {
return func(_ *interp.Interp, c any) any {
inBR, err := interp.ToBitReader(c)
if err != nil {

format/text/encoding.go Normal file

@ -0,0 +1,242 @@
package text
import (
"bytes"
"embed"
"encoding/base64"
"encoding/hex"
"fmt"
"io"
"strings"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/interp"
"golang.org/x/text/encoding"
"golang.org/x/text/encoding/charmap"
"golang.org/x/text/encoding/unicode"
)
//go:embed encoding.jq
var textFS embed.FS
func init() {
interp.RegisterFunc0("fromhex", func(_ *interp.Interp, c string) any {
b, err := hex.DecodeString(c)
if err != nil {
return err
}
bb, err := interp.NewBinaryFromBitReader(bitio.NewBitReader(b, -1), 8, 0)
if err != nil {
return err
}
return bb
})
interp.RegisterFunc0("tohex", func(_ *interp.Interp, c string) any {
br, err := interp.ToBitReader(c)
if err != nil {
return err
}
buf := &bytes.Buffer{}
if _, err := io.Copy(hex.NewEncoder(buf), bitio.NewIOReader(br)); err != nil {
return err
}
return buf.String()
})
// TODO: other encodings and share?
base64Encoding := func(enc string) *base64.Encoding {
switch enc {
case "url":
return base64.URLEncoding
case "rawstd":
return base64.RawStdEncoding
case "rawurl":
return base64.RawURLEncoding
default:
return base64.StdEncoding
}
}
type fromBase64Opts struct {
Encoding string
}
interp.RegisterFunc1("_frombase64", func(_ *interp.Interp, c string, opts fromBase64Opts) any {
b, err := base64Encoding(opts.Encoding).DecodeString(c)
if err != nil {
return err
}
bin, err := interp.NewBinaryFromBitReader(bitio.NewBitReader(b, -1), 8, 0)
if err != nil {
return err
}
return bin
})
type toBase64Opts struct {
Encoding string
}
interp.RegisterFunc1("_tobase64", func(_ *interp.Interp, c string, opts toBase64Opts) any {
br, err := interp.ToBitReader(c)
if err != nil {
return err
}
bb := &bytes.Buffer{}
wc := base64.NewEncoder(base64Encoding(opts.Encoding), bb)
if _, err := io.Copy(wc, bitio.NewIOReader(br)); err != nil {
return err
}
wc.Close()
return bb.String()
})
strEncoding := func(s string) encoding.Encoding {
switch s {
case "UTF8":
return unicode.UTF8
case "UTF16":
return unicode.UTF16(unicode.LittleEndian, unicode.UseBOM)
case "UTF16LE":
return unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM)
case "UTF16BE":
return unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM)
case "CodePage037":
return charmap.CodePage037
case "CodePage437":
return charmap.CodePage437
case "CodePage850":
return charmap.CodePage850
case "CodePage852":
return charmap.CodePage852
case "CodePage855":
return charmap.CodePage855
case "CodePage858":
return charmap.CodePage858
case "CodePage860":
return charmap.CodePage860
case "CodePage862":
return charmap.CodePage862
case "CodePage863":
return charmap.CodePage863
case "CodePage865":
return charmap.CodePage865
case "CodePage866":
return charmap.CodePage866
case "CodePage1047":
return charmap.CodePage1047
case "CodePage1140":
return charmap.CodePage1140
case "ISO8859_1":
return charmap.ISO8859_1
case "ISO8859_2":
return charmap.ISO8859_2
case "ISO8859_3":
return charmap.ISO8859_3
case "ISO8859_4":
return charmap.ISO8859_4
case "ISO8859_5":
return charmap.ISO8859_5
case "ISO8859_6":
return charmap.ISO8859_6
case "ISO8859_6E":
return charmap.ISO8859_6E
case "ISO8859_6I":
return charmap.ISO8859_6I
case "ISO8859_7":
return charmap.ISO8859_7
case "ISO8859_8":
return charmap.ISO8859_8
case "ISO8859_8E":
return charmap.ISO8859_8E
case "ISO8859_8I":
return charmap.ISO8859_8I
case "ISO8859_9":
return charmap.ISO8859_9
case "ISO8859_10":
return charmap.ISO8859_10
case "ISO8859_13":
return charmap.ISO8859_13
case "ISO8859_14":
return charmap.ISO8859_14
case "ISO8859_15":
return charmap.ISO8859_15
case "ISO8859_16":
return charmap.ISO8859_16
case "KOI8R":
return charmap.KOI8R
case "KOI8U":
return charmap.KOI8U
case "Macintosh":
return charmap.Macintosh
case "MacintoshCyrillic":
return charmap.MacintoshCyrillic
case "Windows874":
return charmap.Windows874
case "Windows1250":
return charmap.Windows1250
case "Windows1251":
return charmap.Windows1251
case "Windows1252":
return charmap.Windows1252
case "Windows1253":
return charmap.Windows1253
case "Windows1254":
return charmap.Windows1254
case "Windows1255":
return charmap.Windows1255
case "Windows1256":
return charmap.Windows1256
case "Windows1257":
return charmap.Windows1257
case "Windows1258":
return charmap.Windows1258
case "XUserDefined":
return charmap.XUserDefined
default:
return nil
}
}
type toStrEncodingOpts struct {
Encoding string
}
interp.RegisterFunc1("_tostrencoding", func(_ *interp.Interp, c string, opts toStrEncodingOpts) any {
h := strEncoding(opts.Encoding)
if h == nil {
return fmt.Errorf("unknown string encoding %s", opts.Encoding)
}
bb := &bytes.Buffer{}
if _, err := io.Copy(h.NewEncoder().Writer(bb), strings.NewReader(c)); err != nil {
return err
}
outBR := bitio.NewBitReader(bb.Bytes(), -1)
bin, err := interp.NewBinaryFromBitReader(outBR, 8, 0)
if err != nil {
return err
}
return bin
})
type fromStrEncodingOpts struct {
Encoding string
}
interp.RegisterFunc1("_fromstrencoding", func(_ *interp.Interp, c any, opts fromStrEncodingOpts) any {
inBR, err := interp.ToBitReader(c)
if err != nil {
return err
}
h := strEncoding(opts.Encoding)
if h == nil {
return fmt.Errorf("unknown string encoding %s", opts.Encoding)
}
bb := &bytes.Buffer{}
if _, err := io.Copy(bb, h.NewDecoder().Reader(bitio.NewIOReader(inBR))); err != nil {
return err
}
return bb.String()
})
interp.RegisterFS(textFS)
}

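The base64 variant selection in encoding.go above can be sketched as a standalone program (same helper logic, fq plumbing omitted): "url", "rawstd" and "rawurl" pick the corresponding stdlib encodings, and anything else (including "std") falls back to standard base64.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// base64Encoding mirrors the variant switch in encoding.go above.
func base64Encoding(enc string) *base64.Encoding {
	switch enc {
	case "url":
		return base64.URLEncoding
	case "rawstd":
		return base64.RawStdEncoding
	case "rawurl":
		return base64.RawURLEncoding
	default:
		return base64.StdEncoding
	}
}

func main() {
	in := []byte{0xfb, 0xff}
	for _, enc := range []string{"std", "url", "rawurl"} {
		// url variants map +/ to -_ and the raw variants drop the = padding
		fmt.Printf("%s: %s\n", enc, base64Encoding(enc).EncodeToString(in))
	}
}
```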
19
format/text/encoding.jq Normal file

@ -0,0 +1,19 @@
def toiso8859_1: _tostrencoding({encoding: "ISO8859_1"});
def fromiso8859_1: _fromstrencoding({encoding: "ISO8859_1"});
def toutf8: _tostrencoding({encoding: "UTF8"});
def fromutf8: _fromstrencoding({encoding: "UTF8"});
def toutf16: _tostrencoding({encoding: "UTF16"});
def fromutf16: _fromstrencoding({encoding: "UTF16"});
def toutf16le: _tostrencoding({encoding: "UTF16LE"});
def fromutf16le: _fromstrencoding({encoding: "UTF16LE"});
def toutf16be: _tostrencoding({encoding: "UTF16BE"});
def fromutf16be: _fromstrencoding({encoding: "UTF16BE"});
def frombase64($opts): _frombase64({encoding: "std"} + $opts);
def frombase64: _frombase64(null);
def tobase64($opts): _tobase64({encoding: "std"} + $opts);
def tobase64: _tobase64(null);
# TODO: compat: remove at some point
def hex: _binary_or_orig(tohex; fromhex);
def base64: _binary_or_orig(tobase64; frombase64);

153
format/text/url.go Normal file

@ -0,0 +1,153 @@
package text
import (
"net/url"
"github.com/wader/fq/internal/gojqextra"
"github.com/wader/fq/pkg/interp"
)
func init() {
interp.RegisterFunc0("fromurlencode", func(_ *interp.Interp, c string) any {
u, err := url.QueryUnescape(c)
if err != nil {
return err
}
return u
})
interp.RegisterFunc0("tourlencode", func(_ *interp.Interp, c string) any {
return url.QueryEscape(c)
})
interp.RegisterFunc0("fromurlpath", func(_ *interp.Interp, c string) any {
u, err := url.PathUnescape(c)
if err != nil {
return err
}
return u
})
interp.RegisterFunc0("tourlpath", func(_ *interp.Interp, c string) any {
return url.PathEscape(c)
})
fromURLValues := func(q url.Values) any {
qm := map[string]any{}
for k, v := range q {
if len(v) > 1 {
vm := []any{}
for _, v := range v {
vm = append(vm, v)
}
qm[k] = vm
} else {
qm[k] = v[0]
}
}
return qm
}
interp.RegisterFunc0("fromurlquery", func(_ *interp.Interp, c string) any {
q, err := url.ParseQuery(c)
if err != nil {
return err
}
return fromURLValues(q)
})
toURLValues := func(c map[string]any) url.Values {
qv := url.Values{}
for k, v := range c {
if va, ok := gojqextra.Cast[[]any](v); ok {
var ss []string
for _, s := range va {
if s, ok := gojqextra.Cast[string](s); ok {
ss = append(ss, s)
}
}
qv[k] = ss
} else if vs, ok := gojqextra.Cast[string](v); ok {
qv[k] = []string{vs}
}
}
return qv
}
interp.RegisterFunc0("tourlquery", func(_ *interp.Interp, c map[string]any) any {
// TODO: nicer
c, ok := gojqextra.NormalizeToStrings(c).(map[string]any)
if !ok {
panic("not map")
}
return toURLValues(c).Encode()
})
interp.RegisterFunc0("fromurl", func(_ *interp.Interp, c string) any {
u, err := url.Parse(c)
if err != nil {
return err
}
m := map[string]any{}
if u.Scheme != "" {
m["scheme"] = u.Scheme
}
if u.User != nil {
um := map[string]any{
"username": u.User.Username(),
}
if p, ok := u.User.Password(); ok {
um["password"] = p
}
m["user"] = um
}
if u.Host != "" {
m["host"] = u.Host
}
if u.Path != "" {
m["path"] = u.Path
}
if u.RawPath != "" {
m["rawpath"] = u.RawPath
}
if u.RawQuery != "" {
m["rawquery"] = u.RawQuery
m["query"] = fromURLValues(u.Query())
}
if u.Fragment != "" {
m["fragment"] = u.Fragment
}
return m
})
interp.RegisterFunc0("tourl", func(_ *interp.Interp, c map[string]any) any {
// TODO: nicer
c, ok := gojqextra.NormalizeToStrings(c).(map[string]any)
if !ok {
panic("not map")
}
str := func(v any) string { s, _ := gojqextra.Cast[string](v); return s }
u := url.URL{
Scheme: str(c["scheme"]),
Host: str(c["host"]),
Path: str(c["path"]),
Fragment: str(c["fragment"]),
}
if um, ok := gojqextra.Cast[map[string]any](c["user"]); ok {
username, password := str(um["username"]), str(um["password"])
if username != "" {
if password == "" {
u.User = url.User(username)
} else {
u.User = url.UserPassword(username, password)
}
}
}
if s, ok := gojqextra.Cast[string](c["rawquery"]); ok {
u.RawQuery = s
}
if qm, ok := gojqextra.Cast[map[string]any](c["query"]); ok {
u.RawQuery = toURLValues(qm).Encode()
}
return u.String()
})
}

5
format/toml/testdata/bigint.fqtest vendored Normal file

@ -0,0 +1,5 @@
$ fq -n "{a: bsl(1;100)} | totoml | ., fromtoml"
"a = \"1267650600228229401496703205376\"\n"
{
"a": "1267650600228229401496703205376"
}


@ -1,20 +1,28 @@
/probe.toml:
[test]
key = 123
$ fq . probe.toml
{
"test": {
"key": 123
}
}
# toml does not support null in arrays
# TODO: add uint64 norm test
$ fq -rRs 'fromjson[] | (walk(if type == "array" then map(select(. != null)) end) | try (totoml | ., fromtoml) catch .), "----"' variants.json
{}
totoml cannot be applied to: null
----
totoml cannot be applied to: boolean (true)
toml: top-level values must be Go maps or structs
----
totoml cannot be applied to: boolean (false)
toml: top-level values must be Go maps or structs
----
totoml cannot be applied to: number (123)
toml: top-level values must be Go maps or structs
----
totoml cannot be applied to: number (123.123)
toml: top-level values must be Go maps or structs
----
totoml cannot be applied to: string ("string")
toml: top-level values must be Go maps or structs
----
totoml cannot be applied to: array ([1,2,3])
toml: top-level values must be Go maps or structs
----
array = [true, false, 1.2, "string", [1.2, 3], {a = 1}]
"escape \\\"" = 456
@ -52,3 +60,12 @@ true = true
"white space": 123
}
----
toml: top-level values must be Go maps or structs
----
error at position 0x0: root object has no values
----
$ fq -n '"" | fromtoml'
exitcode: 5
stderr:
error: error at position 0x0: root object has no values

4
format/toml/testdata/trailing.fqtest vendored Normal file

@ -0,0 +1,4 @@
$ fq -n '"[a] trailing" | fromtoml._error.error'
exitcode: 5
stderr:
error: error at position 0xc: toml: line 1 (last key "a"): expected a top-level item to end with a newline, comment, or EOF, but got 't' instead

22
format/toml/testdata/variants.json vendored Normal file

@ -0,0 +1,22 @@
[
null,
true,
false,
123,
123.123,
"string",
[1, 2, 3],
{
"array": [ true, false, null, 1.2, "string", [1.2, 3], {"a": 1} ],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {"a": 1},
"string": "string",
"true": true,
"white space": 123
},
[],
{}
]

69
format/toml/toml.go Normal file

@ -0,0 +1,69 @@
package toml
import (
"bytes"
"embed"
"github.com/BurntSushi/toml"
"github.com/wader/fq/format"
"github.com/wader/fq/internal/gojqextra"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
)
//go:embed toml.jq
var tomlFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.TOML,
Description: "Tom's Obvious, Minimal Language",
ProbeOrder: format.ProbeOrderText,
Groups: []string{format.PROBE},
DecodeFn: decodeTOML,
Functions: []string{"_todisplay"},
Files: tomlFS,
})
interp.RegisterFunc0("totoml", toTOML)
}
func decodeTOML(d *decode.D, _ any) any {
br := d.RawLen(d.Len())
var r any
if _, err := toml.NewDecoder(bitio.NewIOReader(br)).Decode(&r); err != nil {
d.Fatalf("%s", err)
}
var s scalar.S
s.Actual = gojqextra.Normalize(r)
// TODO: better way to handle that an empty file is valid toml and parsed as an object
switch v := s.Actual.(type) {
case map[string]any:
if len(v) == 0 {
d.Fatalf("root object has no values")
}
case []any:
default:
d.Fatalf("root not object or array")
}
d.Value.V = &s
d.Value.Range.Len = d.Len()
return nil
}
func toTOML(_ *interp.Interp, c any) any {
if c == nil {
return gojqextra.FuncTypeError{Name: "totoml", V: c}
}
b := &bytes.Buffer{}
if err := toml.NewEncoder(b).Encode(gojqextra.Normalize(c)); err != nil {
return err
}
return b.String()
}

1
format/toml/toml.jq Normal file

@ -0,0 +1 @@
def _toml__todisplay: tovalue;


@ -23,7 +23,7 @@ var footerFormat decode.Group
func init() {
interp.RegisterFormat(decode.Format{
Name: format.WAV,
ProbeOrder: 10, // after most others (overlap some with webp)
ProbeOrder: format.ProbeOrderBinFuzzy, // after most others (overlap some with webp)
Description: "WAV file",
Groups: []string{format.PROBE},
DecodeFn: wavDecode,

206
format/xml/html.go Normal file

@ -0,0 +1,206 @@
package xml
import (
"embed"
"strings"
"github.com/wader/fq/format"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
"golang.org/x/net/html"
)
//go:embed html.jq
var htmlFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.HTML,
Description: "HyperText Markup Language",
DecodeFn: decodeHTML,
DecodeInArg: format.HTMLIn{
Seq: false,
Array: false,
},
Functions: []string{"_todisplay"},
Files: htmlFS,
})
}
func fromHTMLObject(n *html.Node, hi format.HTMLIn) any {
var f func(n *html.Node, seq int) any
f = func(n *html.Node, seq int) any {
attrs := map[string]any{}
switch n.Type {
case html.ElementNode:
for _, a := range n.Attr {
attrs["-"+a.Key] = a.Val
}
default:
// skip
}
nNodes := 0
for c := n.FirstChild; c != nil; c = c.NextSibling {
if c.Type == html.ElementNode {
nNodes++
}
}
nSeq := -1
if nNodes > 1 {
nSeq = 0
}
var textSb *strings.Builder
var commentSb *strings.Builder
for c := n.FirstChild; c != nil; c = c.NextSibling {
switch c.Type {
case html.ElementNode:
if e, ok := attrs[c.Data]; ok {
if ea, ok := e.([]any); ok {
attrs[c.Data] = append(ea, f(c, nSeq))
} else {
attrs[c.Data] = []any{e, f(c, nSeq)}
}
} else {
attrs[c.Data] = f(c, nSeq)
}
if nNodes > 1 {
nSeq++
}
case html.TextNode:
if !whitespaceRE.MatchString(c.Data) {
if textSb == nil {
textSb = &strings.Builder{}
}
textSb.WriteString(c.Data)
}
case html.CommentNode:
if !whitespaceRE.MatchString(c.Data) {
if commentSb == nil {
commentSb = &strings.Builder{}
}
commentSb.WriteString(c.Data)
}
default:
// skip other nodes
}
if textSb != nil {
attrs["#text"] = strings.TrimSpace(textSb.String())
}
if commentSb != nil {
attrs["#comment"] = strings.TrimSpace(commentSb.String())
}
}
if hi.Seq && seq != -1 {
attrs["#seq"] = seq
}
if len(attrs) == 0 {
return ""
} else if len(attrs) == 1 && attrs["#text"] != nil {
return attrs["#text"]
}
return attrs
}
return f(n, -1)
}
func fromHTMLArray(n *html.Node) any {
var f func(n *html.Node) any
f = func(n *html.Node) any {
attrs := map[string]any{}
switch n.Type {
case html.ElementNode:
for _, a := range n.Attr {
attrs[a.Key] = a.Val
}
default:
// skip
}
nodes := []any{}
var textSb *strings.Builder
var commentSb *strings.Builder
for c := n.FirstChild; c != nil; c = c.NextSibling {
switch c.Type {
case html.ElementNode:
nodes = append(nodes, f(c))
case html.TextNode:
if !whitespaceRE.MatchString(c.Data) {
if textSb == nil {
textSb = &strings.Builder{}
}
textSb.WriteString(c.Data)
}
case html.CommentNode:
if !whitespaceRE.MatchString(c.Data) {
if commentSb == nil {
commentSb = &strings.Builder{}
}
commentSb.WriteString(c.Data)
}
default:
// skip other nodes
}
}
if textSb != nil {
attrs["#text"] = strings.TrimSpace(textSb.String())
}
if commentSb != nil {
attrs["#comment"] = strings.TrimSpace(commentSb.String())
}
elm := []any{n.Data}
if len(attrs) > 0 {
elm = append(elm, attrs)
}
if len(nodes) > 0 {
elm = append(elm, nodes)
}
return elm
}
return f(n.FirstChild)
}
func decodeHTML(d *decode.D, in any) any {
hi, _ := in.(format.HTMLIn)
br := d.RawLen(d.Len())
var r any
// disabled scripting makes the parser descend into noscript tags etc
n, err := html.ParseWithOptions(bitio.NewIOReader(br), html.ParseOptionEnableScripting(false))
if err != nil {
d.Fatalf("%s", err)
}
if hi.Array {
r = fromHTMLArray(n)
} else {
r = fromHTMLObject(n, hi)
}
var s scalar.S
s.Actual = r
d.Value.V = &s
d.Value.Range.Len = d.Len()
return nil
}

1
format/xml/html.jq Normal file

@ -0,0 +1 @@
def _html__todisplay: tovalue;

5
format/xml/testdata/bigint.fqtest vendored Normal file

@ -0,0 +1,5 @@
$ fq -n "{a: bsl(1;100)} | toxml | ., fromxml"
"<a>1267650600228229401496703205376</a>"
{
"a": "1267650600228229401496703205376"
}

4
format/xml/testdata/defaultns.xml vendored Normal file

@ -0,0 +1,4 @@
<elm xmlns="a:b:c">
<ns1:aaa ns1:attr1="v1">1</ns1:aaa>
<bbb key="value">3</bbb>
</elm>


@ -1,4 +1,13 @@
$ fq -d raw -ni . all.xml multi_diff.xml multi_same.xml ns.xml simple.xml escape.xml
/test:
test
$ fq -d html . /test
{
"html": {
"body": "test",
"head": ""
}
}
$ fq -d raw -ni . all.xml multi_diff.xml multi_same.xml ns.xml simple.xml escape.xml noscript.html
null> inputs | {name: input_filename, str: (tobytes | tostring)} | slurp("files")
null> spew("files") | .name, (.str | fromhtml | ., (toxml({indent: 2}) | println))
"all.xml"
@ -146,6 +155,25 @@ null> spew("files") | .name, (.str | fromhtml | ., (toxml({indent: 2}) | println
</body>
<head></head>
</html>
"noscript.html"
{
"html": {
"body": {
"a": "text"
},
"head": {
"noscript": ""
}
}
}
<html>
<body>
<a>text</a>
</body>
<head>
<noscript></noscript>
</head>
</html>
null> spew("files") | .name, (.str | fromhtml({seq: true}) | ., (toxml({indent: 2}) | println))
"all.xml"
{
@ -331,6 +359,27 @@ null> spew("files") | .name, (.str | fromhtml({seq: true}) | ., (toxml({indent:
<a attr="&amp;&lt;&gt;">&amp;&lt;&gt;</a>
</body>
</html>
"noscript.html"
{
"html": {
"body": {
"#seq": 1,
"a": "text"
},
"head": {
"#seq": 0,
"noscript": ""
}
}
}
<html>
<head>
<noscript></noscript>
</head>
<body>
<a>text</a>
</body>
</html>
null> spew("files") | .name, (.str | fromhtml({array: true}) | ., (toxml({indent: 2}) | println))
"all.xml"
[
@ -565,4 +614,37 @@ null> spew("files") | .name, (.str | fromhtml({array: true}) | ., (toxml({indent
<a attr="&amp;&lt;&gt;">&amp;&lt;&gt;</a>
</body>
</html>
"noscript.html"
[
"html",
[
[
"head",
[
[
"noscript"
]
]
],
[
"body",
[
[
"a",
{
"#text": "text"
}
]
]
]
]
]
<html>
<head>
<noscript></noscript>
</head>
<body>
<a>text</a>
</body>
</html>
null> ^D

3
format/xml/testdata/noscript.html vendored Normal file

@ -0,0 +1,3 @@
<noscript>
<a>text</a>
</noscript>

4
format/xml/testdata/trailing.fqtest vendored Normal file

@ -0,0 +1,4 @@
$ fq -n '"<a></a> trailing" | fromxml'
exitcode: 5
stderr:
error: error at position 0xa: root element has trailing data


@ -1,6 +1,12 @@
/probe.xml:
<a></a>
$ fq . probe.xml
{
"a": ""
}
$ fq -d raw -ni . all.xml decl.xml multi_diff.xml multi_same.xml ns.xml simple.xml escape.xml
null> inputs | {name: input_filename, str: (tobytes | tostring)} | slurp("files")
null> spew("files") | .name, (.str | fromxml | ., (toxml({indent: 2}) | println))
null> spew("files") | .name, try (.str | fromxml | ., (toxml({indent: 2}) | println)) catch .
"all.xml"
{
"elm": {
@ -29,15 +35,9 @@ null> spew("files") | .name, (.str | fromxml | ., (toxml({indent: 2}) | println)
}
<elm></elm>
"multi_diff.xml"
{
"elm1": ""
}
<elm1></elm1>
"error at position 0x10: root element has trailing data"
"multi_same.xml"
{
"elm": ""
}
<elm></elm>
"error at position 0xe: root element has trailing data"
"ns.xml"
{
"elm": {
@ -51,14 +51,14 @@ null> spew("files") | .name, (.str | fromxml | ., (toxml({indent: 2}) | println)
"ns2:aaa": {
"#text": "2",
"-ns2:attr2": "v2",
"ccc": {
"-ns2:attr5": "v5"
},
"ns1:ccc": {
"-ns1:attr3": "v3"
},
"ns2:ccc": {
"-ns2:attr4": "v4"
},
"ns3:ccc": {
"-ns2:attr5": "v5"
}
}
}
@ -67,9 +67,9 @@ null> spew("files") | .name, (.str | fromxml | ., (toxml({indent: 2}) | println)
<aaa>3</aaa>
<ns1:aaa ns1:attr1="v1">1</ns1:aaa>
<ns2:aaa ns2:attr2="v2">2
<ccc ns2:attr5="v5"></ccc>
<ns1:ccc ns1:attr3="v3"></ns1:ccc>
<ns2:ccc ns2:attr4="v4"></ns2:ccc>
<ns3:ccc ns2:attr5="v5"></ns3:ccc>
</ns2:aaa>
</elm>
"simple.xml"
@ -85,7 +85,7 @@ null> spew("files") | .name, (.str | fromxml | ., (toxml({indent: 2}) | println)
}
}
<a attr="&amp;&lt;&gt;">&amp;&lt;&gt;</a>
null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2}) | println))
null> spew("files") | .name, try (.str | fromxml({seq: true}) | ., (toxml({indent: 2}) | println)) catch .
"all.xml"
{
"elm": {
@ -119,15 +119,9 @@ null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2
}
<elm></elm>
"multi_diff.xml"
{
"elm1": ""
}
<elm1></elm1>
"error at position 0x10: root element has trailing data"
"multi_same.xml"
{
"elm": ""
}
<elm></elm>
"error at position 0xe: root element has trailing data"
"ns.xml"
{
"elm": {
@ -146,6 +140,10 @@ null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2
"#seq": 1,
"#text": "2",
"-ns2:attr2": "v2",
"ccc": {
"#seq": 2,
"-ns2:attr5": "v5"
},
"ns1:ccc": {
"#seq": 0,
"-ns1:attr3": "v3"
@ -153,10 +151,6 @@ null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2
"ns2:ccc": {
"#seq": 1,
"-ns2:attr4": "v4"
},
"ns3:ccc": {
"#seq": 2,
"-ns2:attr5": "v5"
}
}
}
@ -166,7 +160,7 @@ null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2
<ns2:aaa ns2:attr2="v2">2
<ns1:ccc ns1:attr3="v3"></ns1:ccc>
<ns2:ccc ns2:attr4="v4"></ns2:ccc>
<ns3:ccc ns2:attr5="v5"></ns3:ccc>
<ccc ns2:attr5="v5"></ccc>
</ns2:aaa>
<aaa>3</aaa>
</elm>
@ -183,7 +177,7 @@ null> spew("files") | .name, (.str | fromxml({seq: true}) | ., (toxml({indent: 2
}
}
<a attr="&amp;&lt;&gt;">&amp;&lt;&gt;</a>
null> spew("files") | .name, (.str | fromxml({array: true}) | ., (toxml({indent: 2}) | println))
null> spew("files") | .name, try (.str | fromxml({array: true}) | ., (toxml({indent: 2}) | println)) catch .
"all.xml"
[
"elm",
@ -224,15 +218,9 @@ null> spew("files") | .name, (.str | fromxml({array: true}) | ., (toxml({indent:
]
<elm></elm>
"multi_diff.xml"
[
"elm1"
]
<elm1></elm1>
"error at position 0x10: root element has trailing data"
"multi_same.xml"
[
"elm"
]
<elm></elm>
"error at position 0xe: root element has trailing data"
"ns.xml"
[
"elm",
@ -268,7 +256,7 @@ null> spew("files") | .name, (.str | fromxml({array: true}) | ., (toxml({indent:
}
],
[
"ns3:ccc",
"ccc",
{
"ns2:attr5": "v5"
}
@ -288,7 +276,7 @@ null> spew("files") | .name, (.str | fromxml({array: true}) | ., (toxml({indent:
<ns2:aaa ns2:attr2="v2">2
<ns1:ccc ns1:attr3="v3"></ns1:ccc>
<ns2:ccc ns2:attr4="v4"></ns2:ccc>
<ns3:ccc ns2:attr5="v5"></ns3:ccc>
<ccc ns2:attr5="v5"></ccc>
</ns2:aaa>
<aaa>3</aaa>
</elm>
@ -337,5 +325,5 @@ null> {a: ["b", "c"]} | toxml
null> {a: [123, null, true, false]} | toxml
"<doc><a>123</a><a></a><a>true</a><a>false</a></doc>"
null> 123 | toxml
error: toxml cannot be applied to: string ("123")
error: toxml cannot be applied to: number (123)
null> ^D

448
format/xml/xml.go Normal file

@ -0,0 +1,448 @@
package xml
// object mode inspired by https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html
// TODO: keep <?xml>? root #desc?
// TODO: xml default indent?
import (
"bytes"
"embed"
"encoding/xml"
"html"
"regexp"
"sort"
"strconv"
"strings"
"github.com/wader/fq/format"
"github.com/wader/fq/internal/gojqextra"
"github.com/wader/fq/internal/proxysort"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
)
//go:embed xml.jq
var xmlFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.XML,
Description: "Extensible Markup Language",
ProbeOrder: format.ProbeOrderText,
Groups: []string{format.PROBE},
DecodeFn: decodeXML,
DecodeInArg: format.XMLIn{
Seq: false,
Array: false,
},
Functions: []string{"_todisplay"},
Files: xmlFS,
})
interp.RegisterFunc1("toxml", toXML)
interp.RegisterFunc0("fromxmlentities", func(_ *interp.Interp, c string) any {
return html.UnescapeString(c)
})
interp.RegisterFunc0("toxmlentities", func(_ *interp.Interp, c string) any {
return html.EscapeString(c)
})
}
var whitespaceRE = regexp.MustCompile(`^\s*$`)
type xmlNode struct {
XMLName xml.Name
Attrs []xml.Attr `xml:",attr"`
Chardata []byte `xml:",chardata"`
Comment []byte `xml:",comment"`
Nodes []xmlNode `xml:",any"`
}
func (n *xmlNode) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
n.Attrs = start.Attr
type node xmlNode
return d.DecodeElement((*node)(n), &start)
}
type xmlNS struct {
name string
url string
}
// xmlNNStack is used to undo namespace url resolving: after decoding, Name.Space holds the namespace url, not the "alias" name
type xmlNNStack []xmlNS
func (nss xmlNNStack) lookup(name xml.Name) string {
for i := len(nss) - 1; i >= 0; i-- {
ns := nss[i]
if name.Space == ns.url {
return ns.name
}
}
return ""
}
func (nss xmlNNStack) push(name string, url string) xmlNNStack {
n := append([]xmlNS{}, nss...)
n = append(n, xmlNS{name: name, url: url})
return xmlNNStack(n)
}
func fromXMLArray(n xmlNode) any {
var f func(n xmlNode, nss xmlNNStack) []any
f = func(n xmlNode, nss xmlNNStack) []any {
attrs := map[string]any{}
for _, a := range n.Attrs {
local, space := a.Name.Local, a.Name.Space
name := local
if space != "" {
if space == "xmlns" {
nss = nss.push(local, a.Value)
} else {
space = nss.lookup(a.Name)
}
name = space + ":" + local
}
attrs[name] = a.Value
}
if attrs["#text"] == nil && !whitespaceRE.Match(n.Chardata) {
attrs["#text"] = strings.TrimSpace(string(n.Chardata))
}
if attrs["#comment"] == nil && !whitespaceRE.Match(n.Comment) {
attrs["#comment"] = strings.TrimSpace(string(n.Comment))
}
nodes := []any{}
for _, c := range n.Nodes {
nodes = append(nodes, f(c, nss))
}
name, space := n.XMLName.Local, n.XMLName.Space
if space != "" {
space = nss.lookup(n.XMLName)
}
// only add if ns is found and not default ns
if space != "" {
name = space + ":" + name
}
elm := []any{name}
if len(attrs) > 0 {
elm = append(elm, attrs)
}
if len(nodes) > 0 {
elm = append(elm, nodes)
}
return elm
}
return f(n, nil)
}
func fromXMLObject(n xmlNode, xi format.XMLIn) any {
var f func(n xmlNode, seq int, nss xmlNNStack) any
f = func(n xmlNode, seq int, nss xmlNNStack) any {
attrs := map[string]any{}
for _, a := range n.Attrs {
local, space := a.Name.Local, a.Name.Space
name := local
if space != "" {
if space == "xmlns" {
nss = nss.push(local, a.Value)
} else {
space = nss.lookup(a.Name)
}
name = space + ":" + local
}
attrs["-"+name] = a.Value
}
for i, nn := range n.Nodes {
nSeq := i
if len(n.Nodes) == 1 {
nSeq = -1
}
local, space := nn.XMLName.Local, nn.XMLName.Space
name := local
if space != "" {
space = nss.lookup(nn.XMLName)
}
// only add if ns is found and not default ns
if space != "" {
name = space + ":" + name
}
if e, ok := attrs[name]; ok {
if ea, ok := e.([]any); ok {
attrs[name] = append(ea, f(nn, nSeq, nss))
} else {
attrs[name] = []any{e, f(nn, nSeq, nss)}
}
} else {
attrs[name] = f(nn, nSeq, nss)
}
}
if xi.Seq && seq != -1 {
attrs["#seq"] = seq
}
if attrs["#text"] == nil && !whitespaceRE.Match(n.Chardata) {
attrs["#text"] = strings.TrimSpace(string(n.Chardata))
}
if attrs["#comment"] == nil && !whitespaceRE.Match(n.Comment) {
attrs["#comment"] = strings.TrimSpace(string(n.Comment))
}
if len(attrs) == 0 {
return ""
} else if len(attrs) == 1 && attrs["#text"] != nil {
return attrs["#text"]
}
return attrs
}
return map[string]any{
n.XMLName.Local: f(n, -1, nil),
}
}
var wsRE *regexp.Regexp
func decodeXML(d *decode.D, in any) any {
xi, _ := in.(format.XMLIn)
br := d.RawLen(d.Len())
var r any
xd := xml.NewDecoder(bitio.NewIOReader(br))
xd.Strict = false
var n xmlNode
if err := xd.Decode(&n); err != nil {
d.Fatalf("%s", err)
}
if xi.Array {
r = fromXMLArray(n)
} else {
r = fromXMLObject(n, xi)
}
var s scalar.S
s.Actual = r
switch s.Actual.(type) {
case map[string]any,
[]any:
default:
d.Fatalf("root not object or array")
}
d.SeekAbs(xd.InputOffset() * 8)
if d.RE(&wsRE, `^\s*$`) == nil {
d.Fatalf("root element has trailing data")
}
d.Value.V = &s
d.Value.Range.Len = d.Len()
return nil
}
type ToXMLOpts struct {
Indent int
}
func toXMLObject(c any, opts ToXMLOpts) any {
var f func(name string, content any) (xmlNode, int)
f = func(name string, content any) (xmlNode, int) {
n := xmlNode{
XMLName: xml.Name{Local: name},
}
seq := -1
var orderSeqs []int
var orderNames []string
switch v := content.(type) {
case string:
n.Chardata = []byte(v)
case map[string]any:
for k, v := range v {
switch {
case k == "#seq":
seq, _ = strconv.Atoi(v.(string))
case k == "#text":
s, _ := v.(string)
n.Chardata = []byte(s)
case k == "#comment":
s, _ := v.(string)
n.Comment = []byte(s)
case strings.HasPrefix(k, "-"):
s, _ := v.(string)
n.Attrs = append(n.Attrs, xml.Attr{
Name: xml.Name{Local: k[1:]},
Value: s,
})
default:
switch v := v.(type) {
case []any:
if len(v) > 0 {
for _, c := range v {
nn, nseq := f(k, c)
n.Nodes = append(n.Nodes, nn)
orderNames = append(orderNames, k)
orderSeqs = append(orderSeqs, nseq)
}
} else {
nn, nseq := f(k, "")
n.Nodes = append(n.Nodes, nn)
orderNames = append(orderNames, k)
orderSeqs = append(orderSeqs, nseq)
}
default:
nn, nseq := f(k, v)
n.Nodes = append(n.Nodes, nn)
orderNames = append(orderNames, k)
orderSeqs = append(orderSeqs, nseq)
}
}
}
}
// if one #seq was found, assume all have them, otherwise sort by name
if len(orderSeqs) > 0 && orderSeqs[0] != -1 {
proxysort.Sort(orderSeqs, n.Nodes, func(ss []int, i, j int) bool { return ss[i] < ss[j] })
} else {
proxysort.Sort(orderNames, n.Nodes, func(ss []string, i, j int) bool { return ss[i] < ss[j] })
}
sort.Slice(n.Attrs, func(i, j int) bool {
a, b := n.Attrs[i].Name, n.Attrs[j].Name
return a.Space < b.Space || (a.Space == b.Space && a.Local < b.Local)
})
return n, seq
}
n, _ := f("doc", c)
if len(n.Nodes) == 1 && len(n.Attrs) == 0 && n.Comment == nil && n.Chardata == nil {
n = n.Nodes[0]
}
bb := &bytes.Buffer{}
e := xml.NewEncoder(bb)
e.Indent("", strings.Repeat(" ", opts.Indent))
if err := e.Encode(n); err != nil {
return err
}
if err := e.Flush(); err != nil {
return err
}
return bb.String()
}
// ["elm", {attrs}, [children]] -> <elm attrs...>children...</elm>
func toXMLArray(c any, opts ToXMLOpts) any {
var f func(elm []any) (xmlNode, bool)
f = func(elm []any) (xmlNode, bool) {
var name string
var attrs map[string]any
var children []any
for _, v := range elm {
switch v := v.(type) {
case string:
if name == "" {
name = v
}
case map[string]any:
if attrs == nil {
attrs = v
}
case []any:
if children == nil {
children = v
}
}
}
if name == "" {
return xmlNode{}, false
}
n := xmlNode{
XMLName: xml.Name{Local: name},
}
for k, v := range attrs {
switch k {
case "#comment":
s, _ := v.(string)
n.Comment = []byte(s)
case "#text":
s, _ := v.(string)
n.Chardata = []byte(s)
default:
s, _ := v.(string)
n.Attrs = append(n.Attrs, xml.Attr{
Name: xml.Name{Local: k},
Value: s,
})
}
}
sort.Slice(n.Attrs, func(i, j int) bool {
a, b := n.Attrs[i].Name, n.Attrs[j].Name
return a.Space < b.Space || (a.Space == b.Space && a.Local < b.Local)
})
for _, c := range children {
c, ok := c.([]any)
if !ok {
continue
}
if cn, ok := f(c); ok {
n.Nodes = append(n.Nodes, cn)
}
}
return n, true
}
ca, ok := c.([]any)
if !ok {
return gojqextra.FuncTypeError{Name: "toxml", V: c}
}
n, ok := f(ca)
if !ok {
// TODO: better error
return gojqextra.FuncTypeError{Name: "toxml", V: c}
}
bb := &bytes.Buffer{}
e := xml.NewEncoder(bb)
e.Indent("", strings.Repeat(" ", opts.Indent))
if err := e.Encode(n); err != nil {
return err
}
if err := e.Flush(); err != nil {
return err
}
return bb.String()
}
func toXML(_ *interp.Interp, c any, opts ToXMLOpts) any {
if v, ok := gojqextra.Cast[map[string]any](c); ok {
return toXMLObject(gojqextra.NormalizeToStrings(v), opts)
} else if v, ok := gojqextra.Cast[[]any](c); ok {
return toXMLArray(gojqextra.NormalizeToStrings(v), opts)
}
return gojqextra.FuncTypeError{Name: "toxml", V: c}
}
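The `["elm", {attrs}, [children]]` convention that `toXMLArray` walks above can be sketched outside fq. A minimal stand-alone Go rendering (a hypothetical helper, not fq's actual encoder) that interprets the same shape: first string is the element name, first object its attributes, first array its children.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// render walks the ["elm", {attrs}, [children]] array form and emits
// XML, sorting attributes by name like toXMLArray does.
func render(elm []any) string {
	var name string
	var attrs map[string]any
	var children []any
	for _, v := range elm {
		switch v := v.(type) {
		case string:
			if name == "" {
				name = v
			}
		case map[string]any:
			if attrs == nil {
				attrs = v
			}
		case []any:
			if children == nil {
				children = v
			}
		}
	}
	var sb strings.Builder
	sb.WriteString("<" + name)
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(&sb, " %s=%q", k, attrs[k])
	}
	sb.WriteString(">")
	for _, c := range children {
		if c, ok := c.([]any); ok {
			sb.WriteString(render(c))
		}
	}
	sb.WriteString("</" + name + ">")
	return sb.String()
}

func main() {
	doc := []any{"a", map[string]any{"id": "1"}, []any{[]any{"b"}}}
	fmt.Println(render(doc))
}
```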

2
format/xml/xml.jq Normal file
View File

@ -0,0 +1,2 @@
def toxml: toxml(null);
def _xml__todisplay: tovalue;

5
format/yaml/testdata/bigint.fqtest vendored Normal file
View File

@ -0,0 +1,5 @@
$ fq -n "{a: bsl(1;100)} | toyaml | ., fromyaml"
"a: \"1267650600228229401496703205376\"\n"
{
"a": "1267650600228229401496703205376"
}

4
format/yaml/testdata/trailing.fqtest vendored Normal file
View File

@ -0,0 +1,4 @@
$ fq -n '"- a\ntrailing" | fromyaml._error.error'
exitcode: 5
stderr:
error: error at position 0xc: yaml: line 2: could not find expected ':'

22
format/yaml/testdata/variants.json vendored Normal file
View File

@ -0,0 +1,22 @@
[
null,
true,
false,
123,
123.123,
"string",
[1, 2, 3],
{
"array": [ true, false, null, 1.2, "string", [1.2, 3], {"a": 1} ],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {"a": 1},
"string": "string",
"true": true,
"white space": 123
},
[],
{}
]

View File

@ -1,28 +1,37 @@
/probe.yaml:
test:
key: 123
$ fq . probe.yaml
{
"test": {
"key": 123
}
}
# TODO: add uint64 norm test
$ fq -rRs 'fromjson[] | (try (toyaml | ., fromyaml) catch .), "----"' variants.json
null
null
error at position 0x5: root not object or array
----
true
true
error at position 0x5: root not object or array
----
false
false
error at position 0x6: root not object or array
----
123
123
error at position 0x4: root not object or array
----
123.123
123.123
error at position 0x8: root not object or array
----
string
string
error at position 0x7: root not object or array
----
- 1
- 2
@ -80,3 +89,11 @@ white space: 123
"white space": 123
}
----
[]
[]
----
{}
{}
----

62
format/yaml/yaml.go Normal file
View File

@ -0,0 +1,62 @@
package yaml
// TODO: yaml type eval? walk eval?
import (
"embed"
"github.com/wader/fq/format"
"github.com/wader/fq/internal/gojqextra"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/decode"
"github.com/wader/fq/pkg/interp"
"github.com/wader/fq/pkg/scalar"
"gopkg.in/yaml.v3"
)
//go:embed yaml.jq
var yamlFS embed.FS
func init() {
interp.RegisterFormat(decode.Format{
Name: format.YAML,
Description: "YAML Ain't Markup Language",
ProbeOrder: format.ProbeOrderText,
Groups: []string{format.PROBE},
DecodeFn: decodeYAML,
Functions: []string{"_todisplay"},
Files: yamlFS,
})
interp.RegisterFunc0("toyaml", toYAML)
}
func decodeYAML(d *decode.D, _ any) any {
br := d.RawLen(d.Len())
var r any
if err := yaml.NewDecoder(bitio.NewIOReader(br)).Decode(&r); err != nil {
d.Fatalf("%s", err)
}
var s scalar.S
s.Actual = r
switch s.Actual.(type) {
case map[string]any,
[]any:
default:
d.Fatalf("root not object or array")
}
d.Value.V = &s
d.Value.Range.Len = d.Len()
return nil
}
func toYAML(_ *interp.Interp, c any) any {
b, err := yaml.Marshal(gojqextra.Normalize(c))
if err != nil {
return err
}
return string(b)
}
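The root-type check in `decodeYAML` above is what lets YAML (and the other new text formats) participate in probing safely: a file whose decoded root is a bare scalar such as `123` or `true` is rejected, so arbitrary files are not misdetected as YAML. The rule in isolation, on plain Go values:

```go
package main

import "fmt"

// rootOK mirrors decodeYAML's root check: only an object or an array
// root counts as a successful decode for probing purposes.
func rootOK(v any) bool {
	switch v.(type) {
	case map[string]any, []any:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(rootOK(map[string]any{"a": 1})) // true
	fmt.Println(rootOK([]any{1, 2}))            // true
	fmt.Println(rootOK(123))                    // false
}
```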

1
format/yaml/yaml.jq Normal file
View File

@ -0,0 +1 @@
def _yaml__todisplay: tovalue;

6
go.mod
View File

@ -14,6 +14,10 @@ require (
// bump: gomod-BurntSushi/toml command go get -d github.com/BurntSushi/toml@v$LATEST && go mod tidy
// bump: gomod-BurntSushi/toml link "Source diff $CURRENT..$LATEST" https://github.com/BurntSushi/toml/compare/v$CURRENT..v$LATEST
github.com/BurntSushi/toml v1.2.0
// bump: gomod-creasty-defaults /github\.com\/creasty\/defaults v(.*)/ https://github.com/creasty/defaults.git|^1
// bump: gomod-creasty-defaults command go get -d github.com/creasty/defaults@v$LATEST && go mod tidy
// bump: gomod-creasty-defaults link "Source diff $CURRENT..$LATEST" https://github.com/creasty/defaults/compare/v$CURRENT..v$LATEST
github.com/creasty/defaults v1.6.0
// bump: gomod-golang-snappy /github\.com\/golang\/snappy v(.*)/ https://github.com/golang/snappy.git|^0
// bump: gomod-golang-snappy command go get -d github.com/golang/snappy@v$LATEST && go mod tidy
// bump: gomod-golang-snappy link "Source diff $CURRENT..$LATEST" https://github.com/golang/snappy/compare/v$CURRENT..v$LATEST
@ -53,5 +57,7 @@ require (
require (
github.com/itchyny/timefmt-go v0.1.3 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e // indirect
golang.org/x/sys v0.0.0-20220627191245-f75cf1eec38b // indirect
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f // indirect
)

10
go.sum
View File

@ -1,17 +1,24 @@
github.com/BurntSushi/toml v1.2.0 h1:Rt8g24XnyGTyglgET/PRUNlrUeu9F5L+7FilkXfZgs0=
github.com/BurntSushi/toml v1.2.0/go.mod h1:CxXYINrC8qIiEnFrOxCa7Jy5BFHlXnUU2pbicEuybxQ=
github.com/creasty/defaults v1.6.0 h1:ltuE9cfphUtlrBeomuu8PEyISTXnxqkBIoQfXgv7BSc=
github.com/creasty/defaults v1.6.0/go.mod h1:iGzKe6pbEHnpMPtfDXZEr0NVxWnPTjb1bbDy08fPzYM=
github.com/golang/snappy v0.0.4 h1:yAGX7huGHXlcLOEtBnF4w7FQwA26wojNCwOYAEhLjQM=
github.com/golang/snappy v0.0.4/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/google/gopacket v1.1.19 h1:ves8RnFZPGiFnTS0uPQStjwru6uO6h+nlr9j6fL7kF8=
github.com/google/gopacket v1.1.19/go.mod h1:iJ8V8n6KS+z2U1A8pUwu8bW5SyEMkXJB8Yo/Vo+TKTo=
github.com/itchyny/timefmt-go v0.1.3 h1:7M3LGVDsqcd0VZH2U+x393obrzZisp7C0uEe921iRkU=
github.com/itchyny/timefmt-go v0.1.3/go.mod h1:0osSSCQSASBJMsIZnhAaF1C2fCBTJZXrnj37mG8/c+A=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/mitchellh/copystructure v1.2.0 h1:vpKXTN4ewci03Vljg/q9QvCGUDttBOGBIa15WveJJGw=
github.com/mitchellh/copystructure v1.2.0/go.mod h1:qLl+cE2AmVv+CoeAwDPye/v+N2HKCj9FbZEVFJRxO9s=
github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY=
github.com/mitchellh/mapstructure v1.5.0/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
github.com/mitchellh/reflectwalk v1.0.2 h1:G2LzWKi524PWgd3mLHV8Y5k7s6XUvT0Gef6zxSIeXaQ=
github.com/mitchellh/reflectwalk v1.0.2/go.mod h1:mSTlrgnPZtwu0c4WaC2kGObEpuNDbx0jmZXqmk4esnw=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e h1:fD57ERR4JtEqsWbfPhv4DMiApHyliiK5xCTNVSPiaAs=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/wader/gojq v0.12.1-0.20220703094036-0eed2734a1d7 h1:3IQ6iYU/tkMcEpYu64CfhzQZNemPevlQyOsiga5uN2o=
@ -39,7 +46,8 @@ golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20200130002326-2f3ba24bd6e7/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f h1:BLraFXnmrev5lT+xlilqcH8XK9/i0At2xKjWk4p6zsU=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

View File

@ -8,6 +8,7 @@ import (
"regexp"
"strings"
"github.com/creasty/defaults"
"github.com/mitchellh/mapstructure"
)
@ -21,6 +22,7 @@ func CamelToSnake(s string) string {
}
func ToStruct(m any, v any) error {
_ = defaults.Set(v)
ms, err := mapstructure.NewDecoder(&mapstructure.DecoderConfig{
MatchName: func(mapKey, fieldName string) bool {
return CamelToSnake(fieldName) == mapKey
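The `MatchName` hook above matches snake_case map keys against Go field names via `CamelToSnake`. A stand-alone sketch of such a conversion (a hypothetical implementation for illustration; fq's actual `CamelToSnake` may differ):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// camelToSnake inserts an underscore between a lower-case letter or
// digit and the following upper-case letter, then lower-cases the result.
var camelRE = regexp.MustCompile(`([a-z0-9])([A-Z])`)

func camelToSnake(s string) string {
	return strings.ToLower(camelRE.ReplaceAllString(s, "${1}_${2}"))
}

func main() {
	fmt.Println(camelToSnake("ProbeOrder")) // probe_order
}
```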

View File

@ -7,8 +7,10 @@ import (
"io"
"io/ioutil"
"math/big"
"regexp"
"github.com/wader/fq/internal/bitioextra"
"github.com/wader/fq/internal/ioextra"
"github.com/wader/fq/internal/recoverfn"
"github.com/wader/fq/pkg/bitio"
"github.com/wader/fq/pkg/ranges"
@ -104,8 +106,8 @@ func decode(ctx context.Context, br bitio.ReaderAtSeeker, group Group, opts Opti
switch vv := d.Value.V.(type) {
case *Compound:
// TODO: hack, changes V
vv.Err = formatErr
d.Value.V = vv
d.Value.Err = formatErr
}
if len(group) != 1 {
@ -171,7 +173,6 @@ func newDecoder(ctx context.Context, format Format, br bitio.ReaderAtSeeker, opt
RangeSorted: true,
Children: nil,
Description: opts.Description,
Format: &format,
}
return &D{
@ -184,6 +185,7 @@ func newDecoder(ctx context.Context, format Format, br bitio.ReaderAtSeeker, opt
RootReader: br,
Range: ranges.Range{Start: 0, Len: 0},
IsRoot: opts.IsRoot,
Format: &format,
},
Options: opts,
@ -1226,3 +1228,56 @@ func (d *D) FieldScalarFn(name string, sfn scalar.Fn, sms ...scalar.Mapper) *sca
}
return v
}
func (d *D) RE(reRef **regexp.Regexp, reStr string) []ranges.Range {
if *reRef == nil {
*reRef = regexp.MustCompile(reStr)
}
startPos := d.Pos()
rr := ioextra.ByteRuneReader{RS: bitio.NewIOReadSeeker(d.bitBuf)}
locs := (*reRef).FindReaderSubmatchIndex(rr)
if locs == nil {
return nil
}
d.SeekAbs(startPos)
var rs []ranges.Range
l := len(locs) / 2
for i := 0; i < l; i++ {
loc := locs[i*2 : i*2+2]
if loc[0] == -1 {
rs = append(rs, ranges.Range{Start: -1})
} else {
rs = append(rs, ranges.Range{
Start: startPos + int64(loc[0]*8),
Len: int64((loc[1] - loc[0]) * 8)},
)
}
}
return rs
}
func (d *D) FieldRE(reRef **regexp.Regexp, reStr string, mRef *map[string]string, sms ...scalar.Mapper) {
if *reRef == nil {
*reRef = regexp.MustCompile(reStr)
}
subexpNames := (*reRef).SubexpNames()
rs := d.RE(reRef, reStr)
if rs == nil {
return
}
for i, r := range rs {
if i == 0 || r.Start == -1 {
continue
}
d.SeekAbs(r.Start)
name := subexpNames[i]
value := d.FieldUTF8(name, int(r.Len/8), sms...)
if mRef != nil {
(*mRef)[name] = value
}
}
d.SeekAbs(rs[0].Stop())
}
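The new `d.RE` helper above runs a regexp against the decoder's reader and converts the byte offsets that `FindReaderSubmatchIndex` returns into bit ranges (offset×8, length×8), since fq's decoder tracks positions in bits. The offset-to-bit-range conversion in isolation, using only the standard library:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bitRanges returns [startBits, lenBits] per regexp submatch, mirroring
// how d.RE scales FindReaderSubmatchIndex's byte offsets by 8.
func bitRanges(reStr, s string) [][2]int {
	re := regexp.MustCompile(reStr)
	locs := re.FindReaderSubmatchIndex(strings.NewReader(s))
	var rs [][2]int
	for i := 0; i < len(locs)/2; i++ {
		start, end := locs[i*2], locs[i*2+1]
		rs = append(rs, [2]int{start * 8, (end - start) * 8})
	}
	return rs
}

func main() {
	for i, r := range bitRanges(`(\w+)=(\w+)`, "  foo=bar") {
		fmt.Printf("group %d: bit start=%d len=%d\n", i, r[0], r[1])
	}
}
```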

View File

@ -1,8 +1,5 @@
package decode
// TODO: Encoding, u16le, varint etc, encode?
// TODO: Value/Compound interface? can have per type and save memory
import (
"errors"
"sort"
@ -16,20 +13,23 @@ type Compound struct {
IsArray bool
RangeSorted bool
Children []*Value
Description string
Format *Format
Err error
}
// TODO: Encoding, u16le, varint etc, encode?
// TODO: Value/Compound interface? can have per type and save memory
// TODO: Make some fields optional somehow? map/slice?
type Value struct {
Parent *Value
Name string
V any // scalar.S or Compound (array/struct)
Index int // index in parent array/struct
Range ranges.Range
RootReader bitio.ReaderAtSeeker
IsRoot bool // TODO: rework?
Parent *Value
Name string
V any // scalar.S or Compound (array/struct)
Index int // index in parent array/struct
Range ranges.Range
RootReader bitio.ReaderAtSeeker
IsRoot bool // TODO: rework?
Format *Format // TODO: rework
Description string
Err error
}
type WalkFn func(v *Value, rootV *Value, depth int, rootDepth int) error
@ -148,12 +148,8 @@ func (v *Value) root(findSubRoot bool, findFormatRoot bool) *Value {
if findSubRoot && rootV.IsRoot {
break
}
if findFormatRoot {
if c, ok := rootV.V.(*Compound); ok {
if c.Format != nil {
break
}
}
if findFormatRoot && rootV.Format != nil {
break
}
rootV = rootV.Parent
@ -167,12 +163,9 @@ func (v *Value) FormatRoot() *Value { return v.root(true, true) }
func (v *Value) Errors() []error {
var errs []error
_ = v.WalkPreOrder(func(_ *Value, rootV *Value, _ int, _ int) error {
switch vv := rootV.V.(type) {
case *Compound:
if vv.Err != nil {
errs = append(errs, vv.Err)
}
_ = v.WalkPreOrder(func(v *Value, _ *Value, _ int, _ int) error {
if v.Err != nil {
errs = append(errs, v.Err)
}
return nil
})

View File

@ -561,17 +561,11 @@ func (dvb decodeValueBase) JQValueKey(name string) any {
case "_path":
return valuePath(dv)
case "_error":
switch vv := dv.V.(type) {
case *decode.Compound:
var formatErr decode.FormatError
if errors.As(vv.Err, &formatErr) {
return formatErr.Value()
}
return vv.Err
default:
return nil
var formatErr decode.FormatError
if errors.As(dv.Err, &formatErr) {
return formatErr.Value()
}
return nil
case "_bits":
return Binary{
br: dv.RootReader,
@ -585,23 +579,10 @@ func (dvb decodeValueBase) JQValueKey(name string) any {
unit: 8,
}
case "_format":
switch vv := dv.V.(type) {
case *decode.Compound:
if vv.Format != nil {
return vv.Format.Name
}
return nil
case *scalar.S:
// TODO: hack, Scalar interface?
switch vv.Actual.(type) {
case map[string]any, []any:
return "json"
default:
return nil
}
default:
return nil
if dv.Format != nil {
return dv.Format.Name
}
return nil
case "_out":
return dvb.out
case "_unknown":

View File

@ -142,18 +142,13 @@ func dumpEx(v *decode.Value, ctx *dumpCtx, depth int, rootV *decode.Value, rootD
if vv.Description != "" {
cfmt(colField, " %s", deco.Value.F(vv.Description))
}
if vv.Format != nil {
cfmt(colField, " (%s)", deco.Value.F(vv.Format.Name))
}
valueErr = vv.Err
case *scalar.S:
// TODO: rethink scalar array/struct (json format)
switch av := vv.Actual.(type) {
case map[string]any:
cfmt(colField, ": %s (%s)", deco.Object.F("{}"), deco.Value.F("json"))
cfmt(colField, ": %s", deco.Object.F("{}"))
case []any:
cfmt(colField, ": %s%s:%s%s (%s)", deco.Index.F("["), deco.Number.F("0"), deco.Number.F(strconv.Itoa(len(av))), deco.Index.F("]"), deco.Value.F("json"))
// TODO: format?
cfmt(colField, ": %s%s:%s%s", deco.Index.F("["), deco.Number.F("0"), deco.Number.F(strconv.Itoa(len(av))), deco.Index.F("]"))
default:
cprint(colField, ":")
if vv.Sym == nil {
@ -162,20 +157,23 @@ func dumpEx(v *decode.Value, ctx *dumpCtx, depth int, rootV *decode.Value, rootD
cfmt(colField, " %s", deco.ValueColor(vv.Sym).F(previewValue(vv.Sym, vv.SymDisplay)))
cfmt(colField, " (%s)", deco.ValueColor(vv.Actual).F(previewValue(vv.Actual, vv.ActualDisplay)))
}
}
if opts.Verbose && isInArray {
cfmt(colField, " %s", v.Name)
}
// TODO: similar to struct/array?
if vv.Description != "" {
cfmt(colField, fmt.Sprintf(" (%s)", deco.Value.F(vv.Description)))
}
if opts.Verbose && isInArray {
cfmt(colField, " %s", v.Name)
}
if vv.Description != "" {
cfmt(colField, " (%s)", deco.Value.F(vv.Description))
}
default:
panic(fmt.Sprintf("unreachable vv %#+v", vv))
}
if v.Format != nil {
cfmt(colField, " (%s)", deco.Value.F(v.Format.Name))
}
valueErr = v.Err
innerRange := v.InnerRange()
if opts.Verbose {

File diff suppressed because it is too large

View File

@ -1,240 +0,0 @@
include "internal";
include "binary";
# convert all scalars to strings, null as empty string (same as @csv)
def _walk_tostring:
walk(
if _is_null then ""
elif _is_scalar then tostring
end
);
# overloads builtin tojson to have options
def tojson($opts): _tojson({} + $opts);
def tojson: tojson(null);
def fromxml($opts): _fromxml({} + $opts);
def fromxml: _fromxml(null);
def toxml($opts): _walk_tostring | _toxml({} + $opts);
def toxml: toxml(null);
def fromhtml($opts): _fromhtml({} + $opts);
def fromhtml: fromhtml(null);
def fromyaml: _fromyaml;
def toyaml: _toyaml;
def fromtoml: _fromtoml;
def totoml: _totoml;
def fromcsv($opts): _fromcsv({comma: ",", comment: "#"} + $opts);
def fromcsv: fromcsv(null);
def tocsv($opts): _walk_tostring | _tocsv({comma: ","} + $opts);
def tocsv: tocsv(null);
def fromxmlentities: _fromxmlentities;
def toxmlentities: _toxmlentities;
def fromurlpath: _fromurlpath;
def tourlpath: _tourlpath;
def fromurlencode: _fromurlencode;
def tourlencode: _tourlencode;
def fromurlquery: _fromurlquery;
def tourlquery: _tourlquery;
def fromurl: _fromurl;
def tourl: _tourl;
def fromhex: _fromhex;
def tohex: _tohex;
def frombase64($opts): _frombase64({encoding: "std"} + $opts);
def frombase64: _frombase64(null);
def tobase64($opts): _tobase64({encoding: "std"} + $opts);
def tobase64: _tobase64(null);
def tomd4: _tohash({name: "md4"});
def tomd5: _tohash({name: "md5"});
def tosha1: _tohash({name: "sha1"});
def tosha256: _tohash({name: "sha256"});
def tosha512: _tohash({name: "sha512"});
def tosha3_224: _tohash({name: "sha3_224"});
def tosha3_256: _tohash({name: "sha3_256"});
def tosha3_384: _tohash({name: "sha3_384"});
def tosha3_512: _tohash({name: "sha3_512"});
# _tostrencoding/_fromstrencoding can do more but not exposed as functions yet
def toiso8859_1: _tostrencoding({encoding: "ISO8859_1"});
def fromiso8859_1: _fromstrencoding({encoding: "ISO8859_1"});
def toutf8: _tostrencoding({encoding: "UTF8"});
def fromutf8: _fromstrencoding({encoding: "UTF8"});
def toutf16: _tostrencoding({encoding: "UTF16"});
def fromutf16: _fromstrencoding({encoding: "UTF16"});
def toutf16le: _tostrencoding({encoding: "UTF16LE"});
def fromutf16le: _fromstrencoding({encoding: "UTF16LE"});
def toutf16be: _tostrencoding({encoding: "UTF16BE"});
def fromutf16be: _fromstrencoding({encoding: "UTF16BE"});
# https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail
# TODO: add test
def frompem:
( tobytes
| tostring
| capture("-----BEGIN(.*?)-----(?<s>.*?)-----END(.*?)-----"; "mg").s
| frombase64
) // error("no pem header or footer found");
def topem($label):
( tobytes
| tobase64
| ($label | if $label != "" then " " + $label end) as $label
| [ "-----BEGIN\($label)-----"
, .
, "-----END\($label)-----"
, ""
]
| join("\n")
);
def topem: topem("");
def fromradix($base; $table):
( if _is_string | not then error("cannot fromradix convert: \(.)") end
| split("")
| reverse
| map($table[.])
| if . == null then error("invalid char \(.)") end
# state: [power, ans]
| reduce .[] as $c ([1,0];
( (.[0] * $base) as $b
| [$b, .[1] + (.[0] * $c)]
)
)
| .[1]
);
def fromradix($base):
fromradix($base; {
"0": 0, "1": 1, "2": 2, "3": 3,"4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9,
"a": 10, "b": 11, "c": 12, "d": 13, "e": 14, "f": 15, "g": 16,
"h": 17, "i": 18, "j": 19, "k": 20, "l": 21, "m": 22, "n": 23,
"o": 24, "p": 25, "q": 26, "r": 27, "s": 28, "t": 29, "u": 30,
"v": 31, "w": 32, "x": 33, "y": 34, "z": 35,
"A": 36, "B": 37, "C": 38, "D": 39, "E": 40, "F": 41, "G": 42,
"H": 43, "I": 44, "J": 45, "K": 46, "L": 47, "M": 48, "N": 49,
"O": 50, "P": 51, "Q": 52, "R": 53, "S": 54, "T": 55, "U": 56,
"V": 57, "W": 58, "X": 59, "Y": 60, "Z": 61,
"@": 62, "_": 63,
});
def toradix($base; $table):
( if type != "number" then error("cannot toradix convert: \(.)") end
| if . == 0 then "0"
else
( [ recurse(if . > 0 then _intdiv(.; $base) else empty end) | . % $base]
| reverse
| .[1:]
| if $base <= ($table | length) then
map($table[.]) | join("")
else
error("base too large")
end
)
end
);
def toradix($base):
toradix($base; "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@_");
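The `fromradix` reduction above keeps a `[power, sum]` pair while walking the reversed digits; walking most-significant-first and multiplying the accumulator by the base computes the same value. A stand-alone Go sketch of that equivalence (hexadecimal digit table only, for brevity):

```go
package main

import "fmt"

// digits maps a character to its value, like fromradix's $table.
var digits = map[rune]int{
	'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7,
	'8': 8, '9': 9, 'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,
}

// fromRadix folds left to right: n = n*base + digit, equivalent to
// summing digit*power over the reversed string.
func fromRadix(s string, base int) int {
	n := 0
	for _, r := range s {
		n = n*base + digits[r]
	}
	return n
}

func main() {
	fmt.Println(fromRadix("ff", 16)) // 255
	fmt.Println(fromRadix("101", 2)) // 5
}
```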
# to jq-flavoured json
def _tojq($opts):
def _is_ident: test("^[a-zA-Z_][a-zA-Z_0-9]*$");
def _key: if _is_ident | not then tojson end;
def _f($opts; $indent):
def _r($prefix):
( type as $t
| if $t == "null" then tojson
elif $t == "string" then tojson
elif $t == "number" then tojson
elif $t == "boolean" then tojson
elif $t == "array" then
[ "[", $opts.compound_newline
, ( [ .[]
| $prefix, $indent
, _r($prefix+$indent), $opts.array_sep
]
| .[0:-1]
)
, $opts.compound_newline
, $prefix, "]"
]
elif $t == "object" then
[ "{", $opts.compound_newline
, ( [ to_entries[]
| $prefix, $indent
, (.key | _key), $opts.key_sep
, (.value | _r($prefix+$indent)), $opts.object_sep
]
| .[0:-1]
)
, $opts.compound_newline
, $prefix, "}"
]
else error("unknown type \($t)")
end
);
_r("");
( _f($opts; $opts.indent * " ")
| if _is_array then flatten | join("") end
);
def tojq($opts):
_tojq(
( { indent: 0,
key_sep: ":",
object_sep: ",",
array_sep: ",",
compound_newline: "",
} + $opts
| if .indent > 0 then
( .key_sep = ": "
| .object_sep = ",\n"
| .array_sep = ",\n"
| .compound_newline = "\n"
)
end
)
);
def tojq: tojq(null);
# from jq-flavoured json
def fromjq:
def _f:
( . as $v
| .term.type
| if . == "TermTypeNull" then null
elif . == "TermTypeTrue" then true
elif . == "TermTypeFalse" then false
elif . == "TermTypeString" then $v.term.str.str
elif . == "TermTypeNumber" then $v.term.number | tonumber
elif . == "TermTypeObject" then
( $v.term.object.key_vals
| map(
{ key: (.key // .key_string.str),
value: (.val.queries[0] | _f)
}
)
| from_entries
)
elif . == "TermTypeArray" then
( def _a: if .op then .left, .right | _a end;
[$v.term.array.query | _a | _f]
)
else error("unknown term")
end
);
try
(_query_fromstring | _f)
catch
error("fromjq only supports constant literals");
# TODO: compat remove at some point
def hex: _binary_or_orig(tohex; fromhex);
def base64: _binary_or_orig(tobase64; frombase64);

View File

@ -6,4 +6,6 @@
| select(.key != "all")
| "def \(.key)($opts): decode(\(.key | tojson); $opts);"
, "def \(.key): decode(\(.key | tojson); {});"
, "def from\(.key)($opts): decode(\(.key | tojson); $opts) | if ._error then error(._error.error) end;"
, "def from\(.key): from\(.key)({});"
] | join("\n")

View File

@ -2,8 +2,6 @@ include "internal";
include "options";
include "binary";
include "decode";
include "encoding";
def intdiv(a; b): _intdiv(a; b);

View File

@ -36,7 +36,6 @@ import (
//go:embed interp.jq
//go:embed internal.jq
//go:embed options.jq
//go:embed encoding.jq
//go:embed binary.jq
//go:embed decode.jq
//go:embed format_decode.jq

View File

@ -5,11 +5,17 @@ include "decode";
def _display_default_opts:
options({depth: 1});
def _display_default_opts:
options({depth: 1});
def _todisplay:
( format as $f
# TODO: not sure about the error check here
| if $f == null or ._error != null then error("value is not a format root or has errors") end
| _format_func($f; "_todisplay")
);
def display($opts):
( options($opts) as $opts
( . as $c
| options($opts) as $opts
| try _todisplay catch $c
| if _can_display then _display($opts)
else
( if _is_string and $opts.raw_string then print

View File

@ -122,6 +122,7 @@ bsd_loopback_frame BSD loopback frame
bson Binary JSON
bzip2 bzip2 compression
cbor Concise Binary Object Representation
csv Comma separated values
dns DNS packet
dns_tcp DNS packet (TCP)
elf Executable and Linkable Format
@ -143,6 +144,7 @@ hevc_nalu H.265/HEVC Network Access Layer Unit
hevc_pps H.265/HEVC Picture Parameter Set
hevc_sps H.265/HEVC Sequence Parameter Set
hevc_vps H.265/HEVC Video Parameter Set
html HyperText Markup Language
icc_profile International Color Consortium profile
icmp Internet Control Message Protocol
icmpv6 Internet Control Message Protocol v6
@ -152,7 +154,7 @@ id3v2 ID3v2 metadata
ipv4_packet Internet protocol v4 packet
ipv6_packet Internet protocol v6 packet
jpeg Joint Photographic Experts Group file
json JSON
json JavaScript Object Notation
macho Mach-O macOS executable
matroska Matroska file
mp3 MP3 file
@ -181,6 +183,7 @@ sll_packet Linux cooked capture encapsulation
tar Tar archive
tcp_segment Transmission control protocol segment
tiff Tag Image File Format
toml Tom's Obvious, Minimal Language
udp_datagram User datagram protocol
vorbis_comment Vorbis comment
vorbis_packet Vorbis packet
@ -191,6 +194,8 @@ vpx_ccr VPX Codec Configuration Record
wav WAV file
webp WebP image
xing Xing header
xml Extensible Markup Language
yaml YAML Ain't Markup Language
zip ZIP archive
$ fq -X
exitcode: 2

View File

@ -1,36 +0,0 @@
$ fq -i
null> {a: bsl(1;100)} | repl
> object> tojq | ., fromjq
"{a:1267650600228229401496703205376}"
{
"a": 1267650600228229401496703205376
}
> object> tojson | ., fromjson
"{\"a\":1267650600228229401496703205376}"
{
"a": 1267650600228229401496703205376
}
> object> toyaml | ., fromyaml
"a: \"1267650600228229401496703205376\"\n"
{
"a": "1267650600228229401496703205376"
}
> object> totoml | ., fromtoml
"a = \"1267650600228229401496703205376\"\n"
{
"a": "1267650600228229401496703205376"
}
> object> toxml | ., fromxml
"<a>1267650600228229401496703205376</a>"
{
"a": "1267650600228229401496703205376"
}
> object> ^D
null> [[bsl(1;100)]] | tocsv | ., fromcsv
"1267650600228229401496703205376\n"
[
[
"1267650600228229401496703205376"
]
]
null> ^D

View File

@ -1,136 +0,0 @@
$ fq -rRs 'fromjson[] | (tojson | ., fromjson), "----", (tojson({indent:2}) | ., fromjson), "----"' variants.json
null
null
----
null
null
----
true
true
----
true
true
----
false
false
----
false
false
----
123
123
----
123
123
----
123.123
123.123
----
123.123
123.123
----
"string"
string
----
"string"
string
----
[1,2,3]
[
1,
2,
3
]
----
[
1,
2,
3
]
[
1,
2,
3
]
----
{"array":[true,false,null,1.2,"string",[1.2,3],{"a":1}],"escape \\\"":456,"false":false,"null":null,"number":1.2,"object":{"a":1},"string":"string","true":true,"white space":123}
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"string": "string",
"true": true,
"white space": 123
}
----
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"string": "string",
"true": true,
"white space": 123
}
{
"array": [
true,
false,
null,
1.2,
"string",
[
1.2,
3
],
{
"a": 1
}
],
"escape \\\"": 456,
"false": false,
"null": null,
"number": 1.2,
"object": {
"a": 1
},
"string": "string",
"true": true,
"white space": 123
}
----

View File

@ -1 +0,0 @@
# TODO

View File

@ -1,7 +1,6 @@
$ fq -i -n '"[]" | json'
json> (.) | ., tovalue, type, length?
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|
0x0|5b 5d| |[]| |.: [0:0] (json)
[]
[]
"array"
0

View File

@ -1,7 +1,6 @@
$ fq -i -n '"{}" | json'
json> (.) | ., tovalue, type, length?
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|
0x0|7b 7d| |{}| |.: {} (json)
{}
{}
"object"
0