{"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248527646", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248527646, "node_id": "IC_kwDOCGYnMM5KawUe", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2022-09-15T19:34:59Z", "updated_at": "2022-09-15T20:23:12Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/486?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\nBase: **96.47**% // Head: **96.52**% // Increases project coverage by **`+0.04%`** :tada:\n> Coverage data is based on head [(`0acbc68`)](https://codecov.io/gh/simonw/sqlite-utils/pull/486?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) compared to base [(`d9b9e07`)](https://codecov.io/gh/simonw/sqlite-utils/commit/d9b9e075f07a20f1137cd2e34ed5d3f1a3db4ad8?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> Patch coverage: 100.00% of modified lines in pull request are covered.\n\n> :exclamation: Current head 0acbc68 differs from pull request most recent head d5db749. Consider uploading reports for the commit d5db749 to get more accurate results\n\n
Additional details and impacted files\n\n\n```diff\n@@ Coverage Diff @@\n## main #486 +/- ##\n==========================================\n+ Coverage 96.47% 96.52% +0.04% \n==========================================\n Files 6 6 \n Lines 2642 2646 +4 \n==========================================\n+ Hits 2549 2554 +5 \n+ Misses 93 92 -1 \n```\n\n\n| [Impacted Files](https://codecov.io/gh/simonw/sqlite-utils/pull/486?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage \u0394 | |\n|---|---|---|\n| [sqlite\\_utils/cli.py](https://codecov.io/gh/simonw/sqlite-utils/pull/486/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-c3FsaXRlX3V0aWxzL2NsaS5weQ==) | `95.86% <100.00%> (\u00f8)` | |\n| [sqlite\\_utils/utils.py](https://codecov.io/gh/simonw/sqlite-utils/pull/486/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-c3FsaXRlX3V0aWxzL3V0aWxzLnB5) | `94.98% <100.00%> (+0.47%)` | :arrow_up: |\n\nHelp us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n
\n\n[:umbrella: View full report at Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/486?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). \n:loudspeaker: Do you have feedback about the report comment? [Let us know in this issue](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1810#issuecomment-1248204219", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1810", "id": 1248204219, "node_id": "IC_kwDOBm6k_c5KZhW7", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-09-15T14:44:47Z", "updated_at": "2022-09-15T14:46:26Z", "author_association": "CONTRIBUTOR", "body": "A couple+ of possible use case examples:\r\n\r\n- someone has a collection of articles indexed with FTS; they want to publish a simple search tool over the results;\r\n- someone has an image collection and they want to be able to search over description text to return images;\r\n- someone has a set of locations with descriptions, and wants to run a query over places and descriptions and get results as a listing or on a map;\r\n- someone has a set of audio or video files with titles, descriptions and/or transcripts, and wants to be able to search over them and return playable versions of returned items.\r\n\r\nIn many cases, I suspect the raw content will be in one table, but the search table will be a second (eg FTS) table. 
Generally, the search may be over one or more joined tables, and the results constructed from one or more tables (which may or may not be distinct from the search tables).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374626873, "label": "Featured table(s) on the homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/485#issuecomment-1248597643", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/485", "id": 1248597643, "node_id": "IC_kwDOCGYnMM5KbBaL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:39:39Z", "updated_at": "2022-09-15T20:39:52Z", "author_association": "OWNER", "body": "A note from PR #486: https://github.com/simonw/sqlite-utils/issues/486#issuecomment-1248591268_\r\n\r\n> ```\r\n> sqlite-utils insert /tmp/t3.db t /tmp/big.json \r\n> [####################################] 100%\r\n> ```\r\n> This is actually not doing the right thing. The problem is that `sqlite-utils` doesn't include a streaming JSON parser, so it instead reads that entire JSON file into memory first (exhausting the progress bar to 100% instantly) and then does the rest of the work in-memory while the bar sticks at 100%.\r\n\r\nI decided to land this anyway. 
If a streaming JSON parser is added later it will start to work.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366423176, "label": "Progressbar not shown when inserting/upserting jsonlines file"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248484094", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248484094, "node_id": "IC_kwDOCGYnMM5Kalr-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T18:56:31Z", "updated_at": "2022-09-15T18:56:31Z", "author_association": "OWNER", "body": "Actually I quite like `--key X` - it could work for single nested objects too. You could insert a single record like this:\r\n\r\n```json\r\n{\r\n \"record\": {\r\n \"id\": 1\r\n }\r\n}\r\n```\r\n```\r\nsqlite-utils insert db.db records record.json --key record\r\n``` ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248565396", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248565396, "node_id": "IC_kwDOCGYnMM5Ka5iU", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:12:50Z", "updated_at": "2022-09-15T20:12:50Z", "author_association": "OWNER", "body": "Annoying `mypy` test failure:\r\n\r\n```\r\n/Users/runner/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/numpy/__init__.pyi:636:\r\nerror: Positional-only parameters are only supported in Python 3.8 and greater\r\n```\r\nLooks like this:\r\n- 
https://github.com/python/mypy/issues/13627", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248474806", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248474806, "node_id": "IC_kwDOCGYnMM5Kaja2", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T18:48:09Z", "updated_at": "2022-09-15T18:48:09Z", "author_association": "OWNER", "body": "Built a prototype of this that works really well:\r\n```diff\r\n diff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py\r\nindex c0b7bf1..f9a482c 100644\r\n--- a/sqlite_utils/utils.py\r\n+++ b/sqlite_utils/utils.py\r\n@@ -272,7 +272,19 @@ def rows_from_file(\r\n if format == Format.JSON:\r\n decoded = json.load(fp)\r\n if isinstance(decoded, dict):\r\n- decoded = [decoded]\r\n+ # TODO: Solve for if this isn't what people want\r\n+ # Does it have just one key that is a list of dicts?\r\n+ list_keys = [\r\n+ k\r\n+ for k in decoded\r\n+ if isinstance(decoded[k], list)\r\n+ and decoded[k]\r\n+ and all(isinstance(o, dict) for o in decoded[k])\r\n+ ]\r\n+ if len(list_keys) == 1:\r\n+ decoded = decoded[list_keys[0]]\r\n+ else:\r\n+ decoded = [decoded]\r\n if not isinstance(decoded, list):\r\n raise RowsFromFileBadJSON(\"JSON must be a list or a dictionary\")\r\n return decoded, Format.JSON\r\n```\r\nI used that to build this: https://gist.github.com/simonw/0e6901974a14ab7d56c2746a04d72c8c\r\n\r\nOne problem though: right now, if you do this `sqlite-utils` treats it as a single object and adds a `tags` column with JSON in it:\r\n```\r\necho '{\"title\": \"Hi\", \"tags\": [{\"t\": \"one\"}]}' | sqlite-utils insert db.db t -\r\n```\r\nIf I implement this new mechanism 
the above line would behave differently - which would be a backwards incompatible change.\r\n\r\nSo I probably need some kind of opt-in mechanism for this. And I need a good name for it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248475718", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248475718, "node_id": "IC_kwDOCGYnMM5KajpG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T18:49:05Z", "updated_at": "2022-09-15T18:49:53Z", "author_association": "OWNER", "body": "Here's how I used my prototype to build [that Gist](https://gist.github.com/simonw/0e6901974a14ab7d56c2746a04d72c8c):\r\n\r\n sqlite-utils memory ~/Downloads/CVR_Export_20220908084311/*.json --schema > database.sql\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1810#issuecomment-1248290151", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1810", "id": 1248290151, "node_id": "IC_kwDOBm6k_c5KZ2Vn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T15:51:04Z", "updated_at": "2022-09-15T15:51:25Z", "author_association": "OWNER", "body": "I could prototype this idea as a `datasette-featured-tables` plugin that delivers its own custom `index.html` template.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 
0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374626873, "label": "Featured table(s) on the homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248591268", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248591268, "node_id": "IC_kwDOCGYnMM5Ka_2k", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:36:02Z", "updated_at": "2022-09-15T20:40:03Z", "author_association": "OWNER", "body": "I had a big CSV file lying around, I converted it to other formats like this:\r\n\r\n sqlite-utils insert /tmp/t.db t /tmp/en.openfoodfacts.org.products.csv --csv\r\n sqlite-utils rows /tmp/t.db t --nl > /tmp/big.nl\r\n sqlite-utils rows /tmp/t.db t > /tmp/big.json\r\n\r\nThen tested the progress bar like this:\r\n\r\n sqlite-utils insert /tmp/t2.db t /tmp/big.nl --nl\r\n\r\nOutput:\r\n\r\n```\r\nsqlite-utils insert /tmp/t2.db t /tmp/big.nl --nl\r\n [------------------------------------] 0%\r\n [#######-----------------------------] 20% 00:00:20\r\n```\r\nWith `--silent` it is silent.\r\n\r\nAnd for regular JSON:\r\n\r\n```\r\nsqlite-utils insert /tmp/t3.db t /tmp/big.json \r\n [####################################] 100%\r\n```\r\nThis is actually not doing the right thing. 
The problem is that `sqlite-utils` doesn't include a streaming JSON parser, so it instead reads that entire JSON file into memory first (exhausting the progress bar to 100% instantly) and then does the rest of the work in-memory while the bar sticks at 100%.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248522618", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248522618, "node_id": "IC_kwDOCGYnMM5KavF6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T19:29:20Z", "updated_at": "2022-09-15T19:29:20Z", "author_association": "OWNER", "body": "I think refactoring `sqlite-utils insert` to use `rows_from_file` needs to happen as part of this work.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1248440137", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406", "id": 1248440137, "node_id": "IC_kwDOCGYnMM5Kaa9J", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-09-15T18:13:50Z", "updated_at": "2022-09-15T18:13:50Z", "author_association": "NONE", "body": "I was wondering if you have any more thoughts on this? 
I have a tangible use case now: adding a \"vector\" column to a database to support semantic search using doc2vec embeddings ([example](https://psychemedia.github.io/storynotes/Lang_Doc2Vec.html); note that the `vtfunc` package may no longer be reliable...).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1128466114, "label": "Creating tables with custom datatypes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248593835", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248593835, "node_id": "IC_kwDOCGYnMM5KbAer", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:37:14Z", "updated_at": "2022-09-15T20:37:14Z", "author_association": "OWNER", "body": "I'm going to land this anyway. The lack of a streaming JSON parser is a separate issue, I don't think it should block landing this improvement.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1810#issuecomment-1248289857", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1810", "id": 1248289857, "node_id": "IC_kwDOBm6k_c5KZ2RB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T15:50:46Z", "updated_at": "2022-09-15T15:50:46Z", "author_association": "OWNER", "body": "Idea: allow the user to specify one or more featured tables. Each table is then shown as a summary on the homepage - with the total number of rows and the first 5 rows. 
If the table has search configured there's a search box too.\r\n\r\nIf the instance has only one database with only one table (excluding hidden tables) it gets featured automatically perhaps (maybe with a way to opt-out of that if you want to).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374626873, "label": "Featured table(s) on the homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248481303", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248481303, "node_id": "IC_kwDOCGYnMM5KalAX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T18:54:30Z", "updated_at": "2022-09-15T18:55:14Z", "author_association": "OWNER", "body": "Maybe this would make more sense as a mechanism where you can say \"Use the data in the key called X\" - but there's a special option for \"figure out that key automatically\".\r\n\r\nThe syntax then could be:\r\n\r\n`--list-key List`\r\n\r\nOr for automatic detection:\r\n\r\n`--list-key-auto`\r\n\r\nCould also go with `--key List` and `--key-auto` - but would that be as obvious as `--list-key`?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248621072", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248621072, "node_id": "IC_kwDOCGYnMM5KbHIQ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:56:09Z", "updated_at": "2022-09-15T20:56:09Z", "author_association": "OWNER", "body": "Prototype so 
far:\r\n```diff\r\ndiff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py\r\nindex 767b170..d96c507 100644\r\n--- a/sqlite_utils/cli.py\r\n+++ b/sqlite_utils/cli.py\r\n@@ -1762,6 +1762,17 @@ def query(\r\n is_flag=True,\r\n help=\"Analyze resulting tables and output results\",\r\n )\r\n+@click.option(\"--key\", help=\"read data from this key of the root object\")\r\n+@click.option(\r\n+ \"--auto-key\",\r\n+ is_flag=True,\r\n+ help=\"Find a key in the root object that is a list of objects\",\r\n+)\r\n+@click.option(\r\n+ \"--analyze\",\r\n+ is_flag=True,\r\n+ help=\"Analyze resulting tables and output results\",\r\n+)\r\n @load_extension_option\r\n def memory(\r\n paths,\r\n@@ -1784,6 +1795,8 @@ def memory(\r\n schema,\r\n dump,\r\n save,\r\n+ key,\r\n+ auto_key,\r\n analyze,\r\n load_extension,\r\n ):\r\n@@ -1838,7 +1851,9 @@ def memory(\r\n csv_table = stem\r\n stem_counts[stem] = stem_counts.get(stem, 1) + 1\r\n csv_fp = csv_path.open(\"rb\")\r\n- rows, format_used = rows_from_file(csv_fp, format=format, encoding=encoding)\r\n+ rows, format_used = rows_from_file(\r\n+ csv_fp, format=format, encoding=encoding, key=key, auto_key=auto_key\r\n+ )\r\n tracker = None\r\n if format_used in (Format.CSV, Format.TSV) and not no_detect_types:\r\n tracker = TypeTracker()\r\ndiff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py\r\nindex 8754554..2e69c26 100644\r\n--- a/sqlite_utils/utils.py\r\n+++ b/sqlite_utils/utils.py\r\n@@ -231,6 +231,8 @@ def rows_from_file(\r\n encoding: Optional[str] = None,\r\n ignore_extras: Optional[bool] = False,\r\n extras_key: Optional[str] = None,\r\n+ key: Optional[str] = None,\r\n+ auto_key: Optional[bool] = False,\r\n ) -> Tuple[Iterable[dict], Format]:\r\n \"\"\"\r\n Load a sequence of dictionaries from a file-like object containing one of four different formats.\r\n@@ -271,13 +273,31 @@ def rows_from_file(\r\n :param encoding: the character encoding to use when reading CSV/TSV data\r\n :param ignore_extras: ignore any extra fields on 
rows\r\n :param extras_key: put any extra fields in a list with this key\r\n+ :param key: read data from this key of the root object\r\n+ :param auto_key: find a key in the root object that is a list of objects\r\n \"\"\"\r\n if ignore_extras and extras_key:\r\n raise ValueError(\"Cannot use ignore_extras= and extras_key= together\")\r\n+ if key and auto_key:\r\n+ raise ValueError(\"Cannot use key= and auto_key= together\")\r\n if format == Format.JSON:\r\n decoded = json.load(fp)\r\n if isinstance(decoded, dict):\r\n- decoded = [decoded]\r\n+ if auto_key:\r\n+ list_keys = [\r\n+ k\r\n+ for k in decoded\r\n+ if isinstance(decoded[k], list)\r\n+ and decoded[k]\r\n+ and all(isinstance(o, dict) for o in decoded[k])\r\n+ ]\r\n+ if len(list_keys) == 1:\r\n+ decoded = decoded[list_keys[0]]\r\n+ elif key:\r\n+ # Raises KeyError, I think that's OK\r\n+ decoded = decoded[key]\r\n+ if not isinstance(decoded, list):\r\n+ decoded = [decoded]\r\n if not isinstance(decoded, list):\r\n raise RowsFromFileBadJSON(\"JSON must be a list or a dictionary\")\r\n return decoded, Format.JSON\r\n@@ -305,7 +325,9 @@ def rows_from_file(\r\n first_bytes = buffered.peek(2048).strip()\r\n if first_bytes.startswith(b\"[\") or first_bytes.startswith(b\"{\"):\r\n # TODO: Detect newline-JSON\r\n- return rows_from_file(buffered, format=Format.JSON)\r\n+ return rows_from_file(\r\n+ buffered, format=Format.JSON, key=key, auto_key=auto_key\r\n+ )\r\n else:\r\n dialect = csv.Sniffer().sniff(\r\n first_bytes.decode(encoding or \"utf-8-sig\", \"ignore\")\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248479485", "issue_url": 
"https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248479485, "node_id": "IC_kwDOCGYnMM5Kakj9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T18:52:52Z", "updated_at": "2022-09-15T18:53:45Z", "author_association": "OWNER", "body": "The most similar option I have at the moment is probably `--flatten`. What would good names for this option be?\r\n\r\n- `--auto-list`\r\n- `--auto-key`\r\n- `--inner-key`\r\n- `--auto-json`\r\n- `--find-list`\r\n- `--find-key`\r\n\r\nThose are all bad.\r\n\r\nAnother option: introduce a new explicit format for it. Right now the explicit formats you can use are:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/d9b9e075f07a20f1137cd2e34ed5d3f1a3db4ad8/docs/cli-reference.rst#L153-L158\r\n\r\nSo I could add a `:autojson` format.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248567323", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248567323, "node_id": "IC_kwDOCGYnMM5Ka6Ab", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:14:45Z", "updated_at": "2022-09-15T20:14:45Z", "author_association": "OWNER", "body": "There's a fix for `mypy` that has landed but isn't out in a release yet:\r\n- https://github.com/python/mypy/issues/13385\r\n\r\nFor the moment looks like pinning to Python 3.10.6 could help. 
Need to figure out how to do that in GitHub Actions though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248501824", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248501824, "node_id": "IC_kwDOCGYnMM5KaqBA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T19:10:48Z", "updated_at": "2022-09-15T19:10:48Z", "author_association": "OWNER", "body": "This feels pretty good:\r\n```\r\n% sqlite-utils memory ~/Downloads/CVR_Export_20220908084311/*.json --schema --auto-key\r\nCREATE TABLE [BallotTypeContestManifest] (\r\n [BallotTypeId] INTEGER,\r\n [ContestId] INTEGER\r\n);\r\nCREATE VIEW t1 AS select * from [BallotTypeContestManifest];\r\nCREATE VIEW t AS select * from [BallotTypeContestManifest];\r\nCREATE TABLE [BallotTypeManifest] (\r\n [Description] TEXT,\r\n [Id] INTEGER,\r\n [ExternalId] TEXT\r\n);\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248582147", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248582147, "node_id": "IC_kwDOCGYnMM5Ka9oD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:29:17Z", "updated_at": "2022-09-15T20:29:17Z", "author_association": "OWNER", "body": "This looks good to me. 
I need to run some manual tests before merging (it's a good sign that the automated tests pass though).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1810#issuecomment-1248187089", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1810", "id": 1248187089, "node_id": "IC_kwDOBm6k_c5KZdLR", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T14:31:36Z", "updated_at": "2022-09-15T14:31:36Z", "author_association": "OWNER", "body": "Twitter conversation that inspired this issue: https://twitter.com/psychemedia/status/1570410108785684481", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374626873, "label": "Featured table(s) on the homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/489#issuecomment-1248512739", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489", "id": 1248512739, "node_id": "IC_kwDOCGYnMM5Kasrj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T19:18:24Z", "updated_at": "2022-09-15T19:21:01Z", "author_association": "OWNER", "body": "Why doesn't `sqlite-utils insert` use the `rows_from_file` function I wonder?\r\n\r\nhttps://github.com/simonw/sqlite-utils/issues/279#issuecomment-864207841 says:\r\n\r\n> I can refactor `sqlite-utils insert` to use this new code too.\r\n\r\nMaybe I forgot to do that?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374939463, "label": "Ability to load JSON 
records held in a file with a single top level key that is a list of objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/486#issuecomment-1248568775", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/486", "id": 1248568775, "node_id": "IC_kwDOCGYnMM5Ka6XH", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-09-15T20:16:14Z", "updated_at": "2022-09-15T20:16:14Z", "author_association": "OWNER", "body": "https://github.com/actions/setup-python/blob/main/docs/advanced-usage.md#using-the-python-version-input says can set the full version:\r\n\r\n```\r\n- uses: actions/setup-python@v4\r\n with:\r\n python-version: \"3.10.6\" \r\n```\r\nI'll try that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1366512990, "label": "progressbar for inserts/upserts of all fileformats, closes #485"}, "performed_via_github_app": null}