## Implement a decoder

### Steps to add new decoder

- Create a directory `format/<name>`
- Copy some similar decoder, `format/bson/bson.go` is quite small, to `format/<name>/<name>.go`
- Clean up and fill in the register struct, rename `format.BSON` and add it
  to `format/format.go`, and don't forget to change the string constant.
- Add an import to `format/all/all.go`

### Some general tips

- The main goal is to produce a tree structure that is user-friendly and easy to work with.
  Prefer a nice and easy to query tree structure over a nice decoder implementation.
- Use the same names, symbols, constant number bases etc as in the specification.
  But maybe in lowercase to be jq/JSON-ish.
- Decode only ranges you know the meaning of. If possible let the "parent" decide what to do with unknown
  bits by using `*Decode*Len/Range/Limit` functions. fq will also automatically add "unknown" fields if
  it finds gaps.
- Try to not decode too much as one value.
  A length encoded int could be two fields, but maybe a length prefixed string should be one.
  Flags can be a struct with bit-fields.
- Map as many values as possible to more symbolic values.
- Endian is inherited inside one format decoder, defaults to big endian for a new format decoder
- Make sure zero length input, no frames found etc fail decoding
- If the format is in the probe group make sure to validate input to make it non-ambiguous with other decoders
- Try to keep decoder code as declarative as possible
- Split into multiple sub formats if possible. Makes it possible to use them separately.
- Validate/Assert
- Error/Fatal/panic
- Is format probeable or not
- Can new formats be added to other formats
- Does the new format include existing formats

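
Several of the tips above (flags as bit-fields, symbolic mapping, big endian by default, a length prefixed string as one value, failing on zero length) can be sketched in plain Go. This is only an illustration under stated assumptions: the record layout, `Flags`, `Record`, `typeNames` and `decodeRecord` are all made up for the example, and it uses the standard library instead of fq's decode API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical record layout, invented for this example only:
//   1 byte  flags (bit 7 compressed, bit 6 encrypted, bits 0-3 type)
//   2 bytes big endian name length
//   N bytes name

// typeNames maps raw type numbers to symbolic names.
var typeNames = map[uint8]string{0: "audio", 1: "video", 2: "subtitle"}

// Flags is the flags byte split into separate bit-fields.
type Flags struct {
	Compressed bool
	Encrypted  bool
	Type       uint8  // raw value as found in the input
	TypeSym    string // symbolic value, "unknown" when unmapped
}

// Record is the decoded tree for one record.
type Record struct {
	Flags Flags
	Name  string // length prefix and bytes presented as one value
}

// decodeRecord decodes one record and fails on short or empty input.
func decodeRecord(b []byte) (Record, error) {
	if len(b) < 3 {
		return Record{}, errors.New("record too short")
	}
	f := Flags{
		Compressed: b[0]&0x80 != 0,
		Encrypted:  b[0]&0x40 != 0,
		Type:       b[0] & 0x0f,
	}
	sym, ok := typeNames[f.Type]
	if !ok {
		sym = "unknown"
	}
	f.TypeSym = sym

	n := int(binary.BigEndian.Uint16(b[1:3]))
	if n == 0 {
		return Record{}, errors.New("zero length name")
	}
	if len(b) < 3+n {
		return Record{}, errors.New("name exceeds input")
	}
	return Record{Flags: f, Name: string(b[3 : 3+n])}, nil
}

func main() {
	r, err := decodeRecord([]byte{0x81, 0x00, 0x02, 'f', 'q'})
	fmt.Println(r, err)
}
```

Keeping both the raw `Type` and the symbolic `TypeSym` means queries can use whichever is more convenient.
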
### Development tips

I usually use `-d <format>` and `v` while developing, that way you will get a decode tree
even if it fails. `v` gives verbose output and also includes stacktrace.

```sh
go run fq.go -d <format> v file
```

If the format is inside some other format it can be handy to first extract the bits and run
the decoder directly. For example if working on an `aac_frame` decoder issue:

```sh
fq '.tracks[0].samples[1234] | tobytes' file.mp4 > aac_frame_1234
fq -d aac_frame v aac_frame_1234
```

Sometimes nested decoding fails, then maybe a good way is to change the parent decoder to
use `d.RawLen()` etc instead of `d.FormatLen()` etc temporarily to extract the bits. Hopefully
there will be some option to do this in the future.

When researching or investigating something I can recommend using `watchexec`, `modd` etc to
make things more comfortable. Also using vscode/delve for debugging should work fine once
launch `args` are set up etc.

```sh
watchexec "go run fq.go -d aac_frame v aac_frame"
```

Some different ways to run tests:
```sh
# run all tests
make test
# run all go tests
go test ./...
# run all tests for one format
go test -run TestFQTests/mp4 ./format/
# write all actual outputs
make actual
# write actual output for specific tests
WRITE_ACTUAL=1 go test -run ...
# color diff
DIFF_COLOR=1 go test ...
```

To lint source use:
```sh
make lint
```

Generate documentation. Requires [FFmpeg](https://github.com/FFmpeg/FFmpeg) and [Graphviz](https://gitlab.com/graphviz/graphviz):
```sh
make doc
```

TODO: `make fuzz`

## Debug

To split debug and normal output, even when using the REPL, write `log` package output
and stderr to a file that can be followed with `tail -f` in another terminal:
```sh
LOGFILE=/tmp/log go run fq.go ... 2>>/tmp/log
```

gojq execution debug:
```sh
GOJQ_DEBUG=1 go run -tags debug fq.go ...
```

Memory and CPU profile (will open a browser):
```sh
make memprof ARGS=". file"
make cpuprof ARGS=". test.mp3"
```

## From start to decoded value

```
main:main()
    cli.Main(default registry)
        interp.New(registry, std os interp implementation)
        interp.(*Interp).Main()
            interp.jq _main/0:
                args.jq _args_parse/2
                populate filenames for input/0
                interp.jq inputs/0
                    foreach valid input/0 output
                        interp.jq open
                            funcs.go _open
                        interp.jq decode
                            funcs.go _decode
                                decode.go Decode(...)
                                    ...
                        interp.jq eval expr
                            funcs.go _eval
                        interp.jq display
                            funcs.go _display
                                for interp.(decodeValueBase).Display()
                                    dump.go
                                        print tree
                    empty output
```

## bitio and other io packages

```
*os.File, *bytes.Buffer
^
ctxreadseeker.Reader defers blocking io operations to a goroutine to make them cancellable
^
progressreadseeker.Reader approximates how much of a file has been read
^
aheadreadseeker.Reader does readahead caching
^
| (io.ReadSeeker interface)
|
bitio.Reader (implements bitio.Bit* interfaces)
^
| (bitio.Bit* interfaces)
|
bitio.Buffer convenience wrapper to read bytes from bit reader, create section readers etc
SectionBitReader
MultiBitReader
```

## jq oddities

```sh
jq -n '[1,2,3,4] | .[null:], .[null:2], .[2:null], .[:null]'
```
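
As far as I understand it, jq treats a `null` slice bound the same as a bound that is left out, so the command above is expected to give the whole array, the first two elements, the last two elements and the whole array again. A small Go model of that rule, where the `slice` helper and the use of `nil` pointers for `null` are assumptions made for this sketch:

```go
package main

import "fmt"

// slice models a jq-style slice where a nil (null) bound means "not given":
// a nil from is treated as 0 and a nil to as len(a).
// Simplified: negative and out of range indexes are not handled.
func slice(a []int, from, to *int) []int {
	lo, hi := 0, len(a)
	if from != nil {
		lo = *from
	}
	if to != nil {
		hi = *to
	}
	return a[lo:hi]
}

func main() {
	two := 2
	a := []int{1, 2, 3, 4}
	// Same order as the jq expression: .[null:], .[null:2], .[2:null], .[:null]
	fmt.Println(slice(a, nil, nil), slice(a, nil, &two), slice(a, &two, nil), slice(a, nil, nil))
}
```
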

## Set up Docker Desktop with a golang Windows container

```sh
git clone https://github.com/StefanScherer/windows-docker-machine.git
cd windows-docker-machine
vagrant up 2016-box
cd ../fq
docker --context 2016-box run --rm -ti -v "C:${PWD//\//\\}:C:${PWD//\//\\}" -w "$PWD" golang:1.17.5-windowsservercore-ltsc2016
```

## Implementation details

- fq uses a gojq fork that can be found at https://github.com/wader/gojq/tree/fq (the "fq" branch)
- fq uses a readline fork that can be found at https://github.com/wader/readline/tree/fq (the "fq" branch)
- cli readline uses raw mode, which prevents ctrl-c from being delivered as a SIGINT

## Dependencies and source origins

- [gojq](https://github.com/itchyny/gojq) fork that can be found at https://github.com/wader/gojq/tree/fq<br>
  Issues and PRs related to fq:<br>
  [#43](https://github.com/itchyny/gojq/issues/43) Support for functions written in go when used as a library<br>
  [#46](https://github.com/itchyny/gojq/pull/46) Support custom internal functions<br>
  [#56](https://github.com/itchyny/gojq/issues/56) String format query with no operator using %#v or %#+v panics<br>
  [#65](https://github.com/itchyny/gojq/issues/65) Try-catch with custom function<br>
  [#67](https://github.com/itchyny/gojq/pull/67) Add custom iterator function support which enables implementing a REPL in jq<br>
  [#81](https://github.com/itchyny/gojq/issues/81) path/1 behaviour and path expression question<br>
  [#86](https://github.com/itchyny/gojq/issues/86) ER: basic TCO<br>
  [#109](https://github.com/itchyny/gojq/issues/109) jq halt_error behaviour difference<br>
  [#113](https://github.com/itchyny/gojq/issues/113) error/0 and error/1 behavior difference<br>
  [#117](https://github.com/itchyny/gojq/issues/117) Negative number modulus *big.Int behaves differently to int<br>
  [#118](https://github.com/itchyny/gojq/issues/118) Regression introduced by "remove fork analysis from tail call optimization (ref #86)"<br>
  [#122](https://github.com/itchyny/gojq/issues/122) Slow performance for large error values that ends up using typeErrorPreview()<br>
  [#125](https://github.com/itchyny/gojq/pull/125) improve performance of join by make it internal<br>
  [#141](https://github.com/itchyny/gojq/issues/141) Empty array flatten regression since "improve flatten performance by reducing copy"

- [readline](https://github.com/chzyer/readline) fork that can be found at https://github.com/wader/readline/tree/fq
- [gopacket](https://github.com/google/gopacket) for TCP and IPv4 reassembly
- [mapstructure](https://github.com/mitchellh/mapstructure) for convenient JSON/map conversion
- [go-difflib](https://github.com/pmezard/go-difflib) for diff tests
- [golang.org/x/text](https://pkg.go.dev/golang.org/x/text) for text encoding conversions
- [float16.go](https://android.googlesource.com/platform/tools/gpu/+/gradle_2.0.0/binary/float16.go) to convert bits into 16-bit floats

## Release process

Run and follow instructions:
```sh
make release=1.2.3
```