# sq: swiss-army knife for data `sq` is a command line tool that provides `jq`-style access to structured data sources: SQL databases, or document formats like CSV or Excel. `sq` can perform cross-source joins, execute database-native SQL, and output to a multitude of formats including JSON, Excel, CSV, HTML, Markdown and XML, or insert directly to a SQL database. `sq` can also inspect sources to view metadata about the source structure (tables, columns, size) and has commands for common database operations such as copying or dropping tables. ## Install For other installation options, see [here](https://github.com/neilotoole/sq/wiki#install). It is strongly advised to install [shell completion](#shell-completion). ### macOS ```shell brew install neilotoole/sq/sq ``` ### Windows ``` scoop bucket add sq https://github.com/neilotoole/sq scoop install sq ``` ### Linux #### install.sh The easiest method is to use [install.sh](./install.sh): ```shell /bin/sh -c "$(curl -fsSL https://sq.io/install.sh)" ``` The script detects if any of `apt`, `yum`, `apk` or `brew` are installed, and then installs via the usual procedure. > Note that `https://sq.io/install.sh` is simply a redirect > to [https://raw.githubusercontent.com/neilotoole/sq/master/install.sh](https://raw.githubusercontent.com/neilotoole/sq/master/install.sh). You can of course directly use `apt`, `yum` etc. if desired. See the wiki for [more installation options](https://github.com/neilotoole/sq/wiki#install). ## Shell completion Shell completion is available for `bash`, `zsh`, `fish`, and `powershell`. It is strongly recommended to install it. Execute `sq completion --help` for the procedure. ## Quickstart Use `sq help` to see command help. The [tutorial](https://github.com/neilotoole/sq/wiki/Tutorial) is the best place to start. The [cookbook](https://github.com/neilotoole/sq/wiki/Cookbook) has recipes for common actions. The major concept is: `sq` operates on data sources, which are treated as SQL databases (even if the source is really a CSV or XLSX file etc). In a nutshell, you `sq add` a source (giving it a `handle`), and then execute commands against the source. ### Sources Initially there are no sources. ```shell $ sq ls ``` Let's add a source. First we'll add a SQLite database, but this could also be Postgres, SQL Server, Excel, etc. Download the sample DB, and `sq add` the source. We use `-h` to specify a _handle_ to use. ```shell $ wget https://sq.io/testdata/sakila.db $ sq add ./sakila.db -h @sakila_sl3 @sakila_sl3 sqlite3 sakila.db $ sq ls -v HANDLE DRIVER LOCATION OPTIONS @sakila_sl3* sqlite3 sqlite3:/root/sakila.db $ sq ping @sakila_sl3 @sakila_sl3 1ms pong $ sq src @sakila_sl3 sqlite3 sakila.db ``` The `sq ping` command simply pings the source to verify that it's available. `sq src` lists the _active source_, which in our case is `@sakila_sl3`. You can change the active source using `sq src @other_src`. When there's an active source specified, you can usually omit the handle from `sq` commands. Thus you could instead do: ```shell $ sq ping @sakila_sl3 1ms pong ``` ### Query Fundamentally, `sq` is for querying data. Using our jq-style syntax: ```shell $ sq '.actor | .actor_id < 100 | .[0:3]' actor_id first_name last_name last_update 1 PENELOPE GUINESS 2020-02-15T06:59:28Z 2 NICK WAHLBERG 2020-02-15T06:59:28Z 3 ED CHASE 2020-02-15T06:59:28Z ``` The above query selected some rows from the `actor` table. You could also use native SQL, e.g.: ```shell $ sq sql 'SELECT * FROM actor WHERE actor_id < 100 LIMIT 3' actor_id first_name last_name last_update 1 PENELOPE GUINESS 2020-02-15T06:59:28Z 2 NICK WAHLBERG 2020-02-15T06:59:28Z 3 ED CHASE 2020-02-15T06:59:28Z ``` But we're flying a bit blind here: how did we know about the `actor` table? ### Inspect `sq inspect` is your friend (output abbreviated): ```shell HANDLE DRIVER NAME FQ NAME SIZE TABLES LOCATION @sakila_sl3 sqlite3 sakila.db sakila.db/main 5.6MB 21 sqlite3:/Users/neilotoole/work/sq/sq/drivers/sqlite3/testdata/sakila.db TABLE ROWS COL NAMES actor 200 actor_id, first_name, last_name, last_update address 603 address_id, address, address2, district, city_id, postal_code, phone, last_update category 16 category_id, name, last_update ``` Use the `--verbose` (`-v`) flag to see more detail. And use `--json` (`-j`) to output in JSON ( output abbreviated): ```shell $ sq inspect -j { "handle": "@sakila_sl3", "name": "sakila.db", "driver": "sqlite3", "db_version": "3.31.1", "location": "sqlite3:///root/sakila.db", "size": 5828608, "tables": [ { "name": "actor", "table_type": "table", "row_count": 200, "columns": [ { "name": "actor_id", "position": 0, "primary_key": true, "base_type": "numeric", "column_type": "numeric", "kind": "decimal", "nullable": false } ``` Combine `sq inspect` with [jq](https://stedolan.github.io/jq/) for some useful capabilities. Here's how to [list](https://github.com/neilotoole/sq/wiki/Cookbook#list-name-of-each-table-in-a-source) all the table names in the active source: ```shell $ sq inspect -j | jq -r '.tables[] | .name' actor address category city country customer [...] ``` And here's how you could [export](https://github.com/neilotoole/sq/wiki/Cookbook#export-all-tables-to-csv) each table to a CSV file: ```shell $ sq inspect -j | jq -r '.tables[] | .name' | xargs -I % sq .% --csv --output %.csv $ ls actor.csv city.csv customer_list.csv film_category.csv inventory.csv rental.csv staff.csv address.csv country.csv film.csv film_list.csv language.csv sales_by_film_category.csv staff_list.csv category.csv customer.csv film_actor.csv film_text.csv payment.csv sales_by_store.csv store.csv ``` Note that you can also inspect an individual table: ```shell $ sq inspect -v @sakila_sl3.actor TABLE ROWS TYPE SIZE NUM COLS COL NAMES COL TYPES actor 200 table - 4 actor_id, first_name, last_name, last_update numeric, VARCHAR(45), VARCHAR(45), TIMESTAMP ``` ### Insert Output Into Database Source `sq` query results can be output in various formats (JSON, XML, CSV, etc), and can also be " outputted" as an *insert* into database sources. That is, you can use `sq` to insert results from a Postgres query into a MySQL table, or copy an Excel worksheet into a SQLite table, or a push a CSV file into a SQL Server table etc. > **Note:** If you want to copy a table inside the same (database) source, use `sq tbl copy` > instead, which uses the database's native table copy functionality. For this example, we'll insert an Excel worksheet into our `@sakila_sl3` SQLite database. First, we download the XLSX file, and `sq add` it as a source. ```shell $ wget https://sq.io/testdata/xl_demo.xlsx $ sq add ./xl_demo.xlsx --opts header=true @xl_demo_xlsx xlsx xl_demo.xlsx $ sq @xl_demo_xlsx.person uid username email address_id 1 neilotoole neilotoole@apache.org 1 2 ksoze kaiser@soze.org 2 3 kubla kubla@khan.mn NULL [...] ``` Now, execute the same query, but this time `sq` inserts the results into a new table (`person`) in `@sakila_sl3`: ```shell $ sq @xl_demo_xlsx.person --insert @sakila_sl3.person Inserted 7 rows into @sakila_sl3.person $ sq inspect -v @sakila_sl3.person TABLE ROWS TYPE SIZE NUM COLS COL NAMES COL TYPES person 7 table - 4 uid, username, email, address_id INTEGER, TEXT, TEXT, INTEGER $ sq @sakila_sl3.person uid username email address_id 1 neilotoole neilotoole@apache.org 1 2 ksoze kaiser@soze.org 2 3 kubla kubla@khan.mn NULL [...] ``` ### Cross-Source Join `sq` has rudimentary support for cross-source joins. That is, you can join an Excel worksheet with a CSV file, or Postgres table, etc. > **Note:** The current mechanism for these joins is highly naive: `sq` copies the joined table from > each source to a "scratch database" (SQLite by default), and then performs the JOIN using the > scratch database's SQL interface. Thus, performance is abysmal for larger tables. There are massive > optimizations to be made, but none have been implemented yet. See the [tutorial](https://github.com/neilotoole/sq/wiki/Tutorial#join) for further details, but given an Excel source `@xl_demo` and a CSV source `@csv_demo`, you can do: ```shell $ sq '@csv_demo.data, @xl_demo.address | join(.D == .address_id) | .C, .city' C city neilotoole@apache.org Washington kaiser@soze.org Ulan Bator nikola@tesla.rs Washington augustus@caesar.org Ulan Bator plato@athens.gr Washington ``` ### Table Commands `sq` provides several handy commands for working with tables. Note that these commands work directly against SQL database sources, using their native SQL commands. ```shell $ sq tbl copy .actor .actor_copy Copied table: @sakila_sl3.actor --> @sakila_sl3.actor_copy (200 rows copied) $ sq tbl truncate .actor_copy Truncated 200 rows from @sakila_sl3.actor_copy $ sq tbl drop .actor_copy Dropped table @sakila_sl3.actor_copy ``` ### UNIX Pipes For file-based sources (such as CSV or XLSX), you can `sq add` the source file, but you can also pipe it: ```shell $ cat ./example.xlsx | sq .Sheet1 ``` Similarly, you can inspect: ```shell $ cat ./example.xlsx | sq inspect ``` ## Data Source Drivers `sq` knows how to deal with a data source type via a _driver_ implementation. To view the installed/supported drivers: ```shell $ sq driver ls DRIVER DESCRIPTION sqlite3 SQLite postgres PostgreSQL sqlserver Microsoft SQL Server / Azure SQL Edge mysql MySQL csv Comma-Separated Values tsv Tab-Separated Values json JSON jsona JSON Array: LF-delimited JSON arrays jsonl JSON Lines: LF-delimited JSON objects xlsx Microsoft Excel XLSX ``` ## Output Formats `sq` has many output formats: - `--table`: Text/Table - `--json`: JSON - `--jsona`: JSON Array - `--jsonl`: JSON Lines - `--csv` / `--tsv` : CSV / TSV - `--xlsx`: XLSX (Microsoft Excel) - `--html`: HTML - `--xml`: XML - `--markdown`: Markdown - `--raw`: Raw (bytes) ## Changelog See [CHANGELOG.md](./CHANGELOG.md). ## Acknowledgements - Thanks to [Diego Souza](https://github.com/diegosouza) for creating the [Arch Linux package](https://aur.archlinux.org/packages/sq-bin). - Much inspiration is owed to [jq](https://stedolan.github.io/jq/). - See [`go.mod`](https://github.com/neilotoole/sq/blob/master/go.mod) for a list of third-party packages. - Additionally, `sq` incorporates modified versions of: - [`olekukonko/tablewriter`](https://github.com/olekukonko/tablewriter) - [`segmentio/encoding`](https://github.com/segmentio/encoding) for JSON encoding. - The [_Sakila_](https://dev.mysql.com/doc/sakila/en/) example databases were lifted from [jOOQ](https://github.com/jooq/jooq), which in turn owe their heritage to earlier work on Sakila. ## Similar / Related / Noteworthy Projects - [usql](https://github.com/xo/usql) - [textql](https://github.com/dinedal/textql) - [golang-migrate](https://github.com/golang-migrate/migrate) - [octosql](https://github.com/cube2222/octosql) - [rq](https://github.com/dflemstr/rq)