html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1162234441,https://api.github.com/repos/simonw/sqlite-utils/issues/446,1162234441,IC_kwDOCGYnMM5FRkpJ,9599,simonw,2022-06-21T19:28:35Z,2022-06-21T19:28:35Z,OWNER,"`just -l` now does this:
```
% just -l
Available recipes:
    black         # Apply Black
    cog           # Rebuild docs with cog
    default       # Run tests and linters
    lint          # Run linters: black, flake8, mypy, cog
    test *options # Run pytest with supplied options
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1277328147,Use Just to automate running tests and linters locally,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162231111,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162231111,IC_kwDOCGYnMM5FRj1H,9599,simonw,2022-06-21T19:25:44Z,2022-06-21T19:25:44Z,OWNER,Pushed that prototype to a branch.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162223668,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162223668,IC_kwDOCGYnMM5FRiA0,9599,simonw,2022-06-21T19:19:22Z,2022-06-21T19:22:15Z,OWNER,"Built a prototype of `--fast` for the `sqlite-utils memory` command:

```
% time sqlite-utils memory taxi.csv 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count' --fast
passenger_count  COUNT(*)  AVG(total_amount)
---------------  --------  -----------------
                 128020    32.2371511482553
0                42228     17.0214016766151
1                1533197   17.6418833067999
2                286461    18.0975870711456
3                72852     17.9153958710923
4                25510     18.452774990196
5                50291     17.2709248175672
6                32623     17.6002964166367
7                2         87.17
8                2         95.705
9                1         113.6
sqlite-utils memory taxi.csv --fast 12.71s user 0.48s system 104% cpu 12.627 total
```
Takes 13s - about the same time as calling `sqlite3 :memory: ...` directly as seen in https://til.simonwillison.net/sqlite/one-line-csv-operations

Without the `--fast` option that takes several minutes (262s = 4m20s)!

Here's the prototype so far:

```diff
diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..1c83ef6 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -14,6 +14,8 @@ import io
 import itertools
 import json
 import os
+import shutil
+import subprocess
 import sys
 import csv as csv_std
 import tabulate
@@ -1669,6 +1671,7 @@ def query(
     is_flag=True,
     help=""Analyze resulting tables and output results"",
 )
+@click.option(""--fast"", is_flag=True, help=""Fast mode, only works with CSV and TSV"")
 @load_extension_option
 def memory(
     paths,
@@ -1692,6 +1695,7 @@ def memory(
     save,
     analyze,
     load_extension,
+    fast,
 ):
     """"""Execute SQL query against an in-memory database, optionally populated by imported data
 
@@ -1719,6 +1723,22 @@ def memory(
     \b
         sqlite-utils memory animals.csv --schema
     """"""
+    if fast:
+        if (
+            attach
+            or flatten
+            or param
+            or encoding
+            or no_detect_types
+            or analyze
+            or load_extension
+        ):
+            raise click.ClickException(
+                ""--fast mode does not support any of the following options: --attach, --flatten, --param, --encoding, --no-detect-types, --analyze, --load-extension""
+            )
+        # TODO: Figure out and pass other supported options
+        memory_fast(paths, sql)
+        return
     db = sqlite_utils.Database(memory=True)
     # If --dump or --save or --analyze used but no paths detected, assume SQL query is a path:
     if (dump or save or schema or analyze) and not paths:
@@ -1791,6 +1811,33 @@ def memory(
     )
 
 
+def memory_fast(paths, sql):
+    if not shutil.which(""sqlite3""):
+        raise click.ClickException(""sqlite3 not found in PATH"")
+    args = [""sqlite3"", "":memory:"", ""-cmd"", "".mode csv""]
+    table_names = []
+
+    def name(path):
+        base_name = pathlib.Path(path).stem or ""t""
+        table_name = base_name
+        prefix = 1
+        while table_name in table_names:
+            prefix += 1
+            table_name = ""{}_{}"".format(base_name, prefix)
+        return table_name
+
+    for path in paths:
+        table_name = name(path)
+        table_names.append(table_name)
+        args.extend(
+            [""-cmd"", "".import {} {}"".format(pathlib.Path(path).resolve(), table_name)]
+        )
+
+    args.extend([""-cmd"", "".mode column""])
+    args.append(sql)
+    subprocess.run(args)
+
+
 def _execute_query(
     db, sql, param, raw, table, csv, tsv, no_headers, fmt, nl, arrays, json_cols
 ):
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1162186856,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1162186856,IC_kwDOCGYnMM5FRZBo,9599,simonw,2022-06-21T18:48:46Z,2022-06-21T18:48:46Z,OWNER,"That fixed it: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,Incorrect syntax highlighting in docs CLI reference,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162179354,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162179354,IC_kwDOCGYnMM5FRXMa,9599,simonw,2022-06-21T18:44:03Z,2022-06-21T18:44:03Z,OWNER,The thing I like about that `--fast` option is that it could selectively use this alternative mechanism just for the files for which it can work (CSV and TSV files). I could also add a `--fast` option to `sqlite-utils memory` which could then kick in only for operations that involve just TSV and CSV files.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161869859,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1161869859,IC_kwDOCGYnMM5FQLoj,9599,simonw,2022-06-21T15:00:42Z,2022-06-21T15:00:42Z,OWNER,Deploying that to https://sqlite-utils.datasette.io/en/latest/cli-reference.html#insert,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,Incorrect syntax highlighting in docs CLI reference,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161857806,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1161857806,IC_kwDOCGYnMM5FQIsO,9599,simonw,2022-06-21T14:55:51Z,2022-06-21T14:58:14Z,OWNER,"https://stackoverflow.com/a/44379513 suggests that the fix is:

    .. code-block:: text

Or set this in `conf.py`:

    highlight_language = ""none""

I like that better - I don't like that all `::` blocks default to being treated as Python code.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,Incorrect syntax highlighting in docs CLI reference,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1161849874,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1161849874,IC_kwDOCGYnMM5FQGwS,9599,simonw,2022-06-21T14:49:12Z,2022-06-21T14:49:12Z,OWNER,"Since there are all sorts of existing options for `sqlite-utils insert` that won't work with this, maybe it would be better to have an entirely separate command - this for example:

    sqlite-utils fast-insert data.db mytable data.csv","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1160991031,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1160991031,IC_kwDOCGYnMM5FM1E3,9599,simonw,2022-06-21T00:35:20Z,2022-06-21T00:35:20Z,OWNER,Relevant TIL: https://til.simonwillison.net/sqlite/one-line-csv-operations,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052693,https://api.github.com/repos/simonw/sqlite-utils/issues/297,882052693,IC_kwDOCGYnMM40kw5V,9599,simonw,2021-07-18T12:57:54Z,2022-06-21T13:17:15Z,OWNER,"Another implementation option would be to use the CSV virtual table mechanism. This could avoid shelling out to the `sqlite3` binary, but requires solving the harder problem of compiling and distributing a loadable SQLite module: https://www.sqlite.org/csv.html

(Would be neat to produce a Python wheel of this, see https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/)

This would also help solve the challenge of making this optimization available to the `sqlite-utils memory` command. That command operates against an in-memory database so it's not obvious how it could shell out to a binary.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,