{"html_url": "https://github.com/simonw/sqlite-utils/pull/57#issuecomment-527211047", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/57", "id": 527211047, "node_id": "MDEyOklzc3VlQ29tbWVudDUyNzIxMTA0Nw==", "user": {"value": 49260, "label": "amjith"}, "created_at": "2019-09-02T17:30:43Z", "updated_at": "2019-09-02T17:30:43Z", "author_association": "CONTRIBUTOR", "body": "I have merged the other PR (#56) into this one. \r\n\r\nI have incorporated your suggestions. Cheers!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 487987958, "label": "Add triggers while enabling FTS"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/56#issuecomment-527209840", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/56", "id": 527209840, "node_id": "MDEyOklzc3VlQ29tbWVudDUyNzIwOTg0MA==", "user": {"value": 49260, "label": "amjith"}, "created_at": "2019-09-02T17:23:21Z", "updated_at": "2019-09-02T17:23:21Z", "author_association": "CONTRIBUTOR", "body": "I have updated the other PR with the changes from this one and added tests. I have also changed the escaping from double quotes to brackets. \r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 487847945, "label": "Escape the table name in populate_fts and search."}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/531#issuecomment-1501017004", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/531", "id": 1501017004, "node_id": "IC_kwDOCGYnMM5Zd7Os", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2023-04-09T01:49:43Z", "updated_at": "2023-04-09T01:49:43Z", "author_association": "CONTRIBUTOR", "body": "I'm going to close this in favor of #536. Will try a cleaner approach to custom paths once that one is merged.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1620164673, "label": "Add paths for homebrew on Apple silicon"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/531#issuecomment-1465315726", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/531", "id": 1465315726, "node_id": "IC_kwDOCGYnMM5XVvGO", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2023-03-12T22:21:56Z", "updated_at": "2023-03-12T22:21:56Z", "author_association": "CONTRIBUTOR", "body": "Exactly, that's what I was running into. On my M2 MacBook, SpatiaLite ends up in what is -- for the moment -- a non-standard location, so even when I passed in the location with `--load-extension`, I still hit an error on `create-spatial-index`.\r\n\r\nWhat I learned doing this originally is that SQLite needs to load the extension for each connection, even if all the SpatiaLite stuff is already in the database. So that's why `init_spatialite()` gets called again.\r\n\r\nHere's the code where I hit the error: https://github.com/eyeseast/boston-parcels/blob/main/Makefile#L30 It works using this branch.\r\n\r\nI'm not attached to this solution if you can think of something better. And I'm not sure, TBH, my test would actually catch what I'm after here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1620164673, "label": "Add paths for homebrew on Apple silicon"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/508#issuecomment-1297788531", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/508", "id": 1297788531, "node_id": "IC_kwDOCGYnMM5NWq5z", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-31T22:54:33Z", "updated_at": "2022-11-17T15:11:16Z", "author_association": "CONTRIBUTOR", "body": "Maybe this is actually a problem in the python sqlite bindings. Given [SQLITE's stance on this](https://www.sqlite.org/invalidutf.html) they should probably use `encode('utf-8', 'surrogatepass')`. As far as I understand the error here won't actually be resolved by this PR as-is. We would need to modify the data with `surrogateescape`... :/ or modify the sqlite3 module to use `surrogatepass`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1430563092, "label": "Allow surrogates in parameters"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/499#issuecomment-1292401308", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/499", "id": 1292401308, "node_id": "IC_kwDOCGYnMM5NCHqc", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-26T17:54:26Z", "updated_at": "2022-10-26T17:54:51Z", "author_association": "CONTRIBUTOR", "body": "The problem with how it is currently is that the transformed fts table _will_ return incorrect results (unless the table was only 1 row or something), even if create_triggers was enabled previously. Maybe the simplest solution is to disable fts on a transformed table rather than try to recreate it? Thoughts?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1405196044, "label": "feat: recreate fts triggers after table transform"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/498#issuecomment-1274153135", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/498", "id": 1274153135, "node_id": "IC_kwDOCGYnMM5L8giv", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-11T06:34:31Z", "updated_at": "2022-10-11T06:34:31Z", "author_association": "CONTRIBUTOR", "body": "nevermind it was because I was running `db[table].transform`. The fts tables would still be there but the triggers would be dropped", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1404013495, "label": "fix: enable-fts permanently save triggers"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/480#issuecomment-1232356302", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/480", "id": 1232356302, "node_id": "IC_kwDOCGYnMM5JdEPO", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-08-31T01:51:49Z", "updated_at": "2022-08-31T01:51:49Z", "author_association": "CONTRIBUTOR", "body": "Thanks for pointing me to the right place", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1355433619, "label": "search_sql add include_rank option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040998433", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/407", "id": 1040998433, "node_id": "IC_kwDOCGYnMM4-DGAh", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-16T01:29:39Z", "updated_at": "2022-02-16T01:29:39Z", "author_association": "CONTRIBUTOR", "body": "Happy to do it and have it in the library. Going to use it a bunch. This whole SpatiaLite toolchain has become a huge part of my work in the past year.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1138948786, "label": "Add SpatiaLite helpers to CLI"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040580250", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/407", "id": 1040580250, "node_id": "IC_kwDOCGYnMM4-Bf6a", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-15T17:40:00Z", "updated_at": "2022-02-15T17:40:00Z", "author_association": "CONTRIBUTOR", "body": "@simonw I think this is ready for a look.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1138948786, "label": "Add SpatiaLite helpers to CLI"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1030002502", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1030002502, "node_id": "IC_kwDOCGYnMM49ZJdG", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-04T13:50:19Z", "updated_at": "2022-02-04T13:50:19Z", "author_association": "CONTRIBUTOR", "body": "Awesome. Thanks for your help getting it in. Will now look at adding CLI versions of this. It's going to be super helpful on a bunch of my projects.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029370537", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029370537, "node_id": "IC_kwDOCGYnMM49WvKp", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T20:25:58Z", "updated_at": "2022-02-03T20:25:58Z", "author_association": "CONTRIBUTOR", "body": "OK, I moved all the GIS helpers into `db.py` as methods on `Database` and `Table`, and I put `find_spatialite` back in `utils.py`. I deleted `gis.py`, since there's nothing left in it. Docs and tests are updated and passing.\r\n\r\nI think this is better.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029338360", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029338360, "node_id": "IC_kwDOCGYnMM49WnT4", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T19:43:56Z", "updated_at": "2022-02-03T19:43:56Z", "author_association": "CONTRIBUTOR", "body": "Works for me. I was just looking at how the FTS extensions work and they're just methods, too. So this can be consistent with that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029326568", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029326568, "node_id": "IC_kwDOCGYnMM49Wkbo", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T19:28:26Z", "updated_at": "2022-02-03T19:28:26Z", "author_association": "CONTRIBUTOR", "body": "> `from sqlite_utils.utils import find_spatialite` is part of the documented API already:\r\n> \r\n> https://sqlite-utils.datasette.io/en/3.22.1/python-api.html#finding-spatialite\r\n> \r\n> To avoid needing to bump the major version number to 4 to indicate a backwards incompatible change, we should keep a `from .gis import find_spatialite` line at the top of `utils.py` such that any existing code with that documented import continues to work.\r\n\r\nThis is fixed now. I had to take out the type annotations for `Database` and `Table` to avoid a circular import, but that's fine and may be moot if these become class methods.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029306428", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029306428, "node_id": "IC_kwDOCGYnMM49Wfg8", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T19:03:43Z", "updated_at": "2022-02-03T19:03:43Z", "author_association": "CONTRIBUTOR", "body": "I thought about adding these as methods on `Database` and `Table`, and I'm back and forth on it for the same reasons you are. It's certainly cleaner, and it's clearer what you're operating on. I could go either way. \r\n\r\nI do sort of like having all the Spatialite stuff in its own module, just because it's built around an extension you might not have or want, but I don't know if that's a good reason to have a different API.\r\n\r\nYou could have `init_spatialite` add methods to `Database` and `Table`, so they're only there if you have Spatialite set up. Is that too clever? It feels too clever.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029180984", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029180984, "node_id": "IC_kwDOCGYnMM49WA44", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T16:42:04Z", "updated_at": "2022-02-03T16:42:04Z", "author_association": "CONTRIBUTOR", "body": "Fixed my spelling. That's a useful thing.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029175907", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/385", "id": 1029175907, "node_id": "IC_kwDOCGYnMM49V_pj", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T16:36:54Z", "updated_at": "2022-02-03T16:36:54Z", "author_association": "CONTRIBUTOR", "body": "@simonw Not sure if you've seen this, but any chance you can run the tests?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1102899312, "label": "Add new spatialite helper methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/326#issuecomment-916119657", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/326", "id": 916119657, "node_id": "IC_kwDOCGYnMM42muBp", "user": {"value": 191622, "label": "meatcar"}, "created_at": "2021-09-09T13:54:10Z", "updated_at": "2021-09-09T13:54:10Z", "author_association": "CONTRIBUTOR", "body": "dupe of #293?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 991237645, "label": "Test against 3.10-dev"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/203#issuecomment-1404070841", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/203", "id": 1404070841, "node_id": "IC_kwDOCGYnMM5TsGu5", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T18:47:18Z", "updated_at": "2023-01-25T18:47:18Z", "author_association": "CONTRIBUTOR", "body": "i'll adopt this PR to make the changes @simonw suggested https://github.com/simonw/sqlite-utils/pull/203#issuecomment-753567932", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743384829, "label": "changes to allow for compound foreign keys"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/189#issuecomment-717359145", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/189", "id": 717359145, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNzM1OTE0NQ==", "user": {"value": 35681, "label": "adamwolf"}, "created_at": "2020-10-27T16:20:32Z", "updated_at": "2020-10-27T16:20:32Z", "author_association": "CONTRIBUTOR", "body": "No problem. I added a test. Let me know if it looks sufficient or if you want me to tweak something!\r\n\r\nIf you don't mind, would you tag this PR as \"hacktoberfest-accepted\"? If you do mind, no problem and I'm sorry for asking :) My kiddos like the shirts.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 729818242, "label": "Allow iterables other than Lists in m2m records"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688573964", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/146", "id": 688573964, "node_id": "MDEyOklzc3VlQ29tbWVudDY4ODU3Mzk2NA==", "user": {"value": 96218, "label": "simonwiles"}, "created_at": "2020-09-08T01:55:07Z", "updated_at": "2020-09-08T01:55:07Z", "author_association": "CONTRIBUTOR", "body": "Okay, I've rewritten this PR to preserve the batching behaviour but still fix #145, and rebased the branch to account for the `db.execute()` api change.  It's not terribly sophisticated -- if it attempts to insert a batch which has too many variables, the exception is caught, the batch is split in two and each half is inserted separately, and then it carries on as before with the same `batch_size`.  In the edge case where this gets triggered, subsequent batches will all be inserted in two groups too if they continue to have the same number of columns (which is presumably reasonably likely).  Do you reckon this is acceptable when set against the awkwardness of recalculating the `batch_size` on the fly?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688668680, "label": "Handle case where subsequent records (after first batch) include extra columns"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/146", "id": 688481317, "node_id": "MDEyOklzc3VlQ29tbWVudDY4ODQ4MTMxNw==", "user": {"value": 96218, "label": "simonwiles"}, "created_at": "2020-09-07T19:18:55Z", "updated_at": "2020-09-07T19:18:55Z", "author_association": "CONTRIBUTOR", "body": "Just force-pushed to update d042f9c with more formatting changes to satisfy `black==20.8b1` and pass the GitHub Actions \"Test\" workflow.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688668680, "label": "Handle case where subsequent records (after first batch) include extra columns"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688479163", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/146", "id": 688479163, "node_id": "MDEyOklzc3VlQ29tbWVudDY4ODQ3OTE2Mw==", "user": {"value": 96218, "label": "simonwiles"}, "created_at": "2020-09-07T19:10:33Z", "updated_at": "2020-09-07T19:11:57Z", "author_association": "CONTRIBUTOR", "body": "@simonw -- I've gone ahead and updated the documentation to reflect the changes introduced in this PR.  IMO it's ready to merge now.\r\n\r\nIn writing the documentation changes, I began to wonder about the value and role of `batch_size` at all, tbh.  May I assume it was originally intended to prevent using the entire row set to determine columns and column types, and that this was a performance consideration?  If so, this PR entirely undermines its purpose.  I've been passing in excess of 500,000 rows at a time to `insert_all()` with these changes and although I'm sure the performance difference is measurable it's not really noticeable; given #145, I don't know that any performance advantages outweigh the problems doing it this way removes.  What do you think about just dropping the argument and defaulting to the maximum `batch_size` permissible given `SQLITE_MAX_VARS`?  Are there other reasons one might want to restrict `batch_size` that I've overlooked?  I could open a new issue to discuss/implement this.\r\n\r\nOf course the documentation will need to change again too if/when something is done about #147.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688668680, "label": "Handle case where subsequent records (after first batch) include extra columns"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/118", "id": 655643078, "node_id": "MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA==", "user": {"value": 79913, "label": "tsibley"}, "created_at": "2020-07-08T17:05:59Z", "updated_at": "2020-07-08T17:05:59Z", "author_association": "CONTRIBUTOR", "body": "> The only thing missing from this PR is updates to the documentation.\r\n\r\nAh, yes, thanks for this reminder! I've repushed with doc bits added.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 651844316, "label": "Add insert --truncate option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/118", "id": 655239728, "node_id": "MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA==", "user": {"value": 79913, "label": "tsibley"}, "created_at": "2020-07-08T02:16:42Z", "updated_at": "2020-07-08T02:16:42Z", "author_association": "CONTRIBUTOR", "body": "I fixed my original oops by moving the `DELETE FROM $table` out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next.\r\n\r\nI wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making `Table.insert_all` (or other operations) consistent and robust in the face of errors.\r\n\r\nFor example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted):\r\n\r\n```diff\r\ndiff --git a/sqlite_utils/db.py b/sqlite_utils/db.py\r\nindex d6b9ecf..4107ceb 100644\r\n--- a/sqlite_utils/db.py\r\n+++ b/sqlite_utils/db.py\r\n@@ -1028,6 +1028,11 @@ class Table(Queryable):\r\n         batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))\r\n         self.last_rowid = None\r\n         self.last_pk = None\r\n+        with self.db.conn:\r\n+            # Explicit BEGIN is necessary because Python's sqlite3 doesn't\r\n+            # issue implicit BEGINs for DDL, only DML.  We mix DDL and DML\r\n+            # below and might execute DDL first, e.g. for table creation.\r\n+            self.db.conn.execute(\"BEGIN\")\r\n             if truncate and self.exists():\r\n                 self.db.conn.execute(\"DELETE FROM [{}];\".format(self.name))\r\n             for chunk in chunks(itertools.chain([first_record], records), batch_size):\r\n@@ -1038,7 +1043,11 @@ class Table(Queryable):\r\n                         # Use the first batch to derive the table names\r\n                         column_types = suggest_column_types(chunk)\r\n                         column_types.update(columns or {})\r\n-                    self.create(\r\n+                        # Not self.create() because that is wrapped in its own\r\n+                        # transaction and Python's sqlite3 doesn't support\r\n+                        # nested transactions.\r\n+                        self.db.create_table(\r\n+                            self.name,\r\n                             column_types,\r\n                             pk,\r\n                             foreign_keys,\r\n@@ -1139,7 +1148,6 @@ class Table(Queryable):\r\n                     flat_values = list(itertools.chain(*values))\r\n                     queries_and_params = [(sql, flat_values)]\r\n \r\n-            with self.db.conn:\r\n                 for query, params in queries_and_params:\r\n                     try:\r\n                         result = self.db.conn.execute(query, params)\r\n```\r\n\r\nbut that fails in tests because other methods call `insert/upsert/insert_all/upsert_all` in the middle of their transactions, so the BEGIN statement throws an error (no nested transactions allowed).\r\n\r\nStepping back, it would be nice to make the transaction handling systematic and predictable. One way to do this is to make the `sqlite_utils/db.py` code generally not begin or commit any transactions, and require the caller to do that instead. This lets the caller mix and match the Python API calls into transactions as appropriate (which is impossible for the API methods themselves to fully determine). Then, make `sqlite_utils/cli.py` begin and commit a transaction in each `@cli.command` function, making each command robust and consistent in the face of errors. The big change here, and why I didn't just submit a patch, is that it dramatically changes the Python API to _require_ callers to begin a transaction rather than just immediately calling methods.\r\n\r\nThere is also the caveat that for each transaction, an explicit `BEGIN` is also necessary so that DDL as well as DML (as well as `SELECT`s) are consistent and rolled back on error. There are several bugs.python.org discussions around this particular problem of DDL and some plans to make it better and consistent with DBAPI2, eventually. In the meantime, the sqlite-utils Database class could be a context manager which supports the incantations necessary to do proper transactions. This would still be a Python API change for callers but wouldn't expose them to the weirdness of the sqlite3's default transaction handling.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 651844316, "label": "Add insert --truncate option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/118", "id": 655052451, "node_id": "MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ==", "user": {"value": 79913, "label": "tsibley"}, "created_at": "2020-07-07T18:45:23Z", "updated_at": "2020-07-07T18:45:23Z", "author_association": "CONTRIBUTOR", "body": "Ah, I see the problem. The truncate is inside a loop I didn't realize was there.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 651844316, "label": "Add insert --truncate option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/118", "id": 655018966, "node_id": "MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng==", "user": {"value": 79913, "label": "tsibley"}, "created_at": "2020-07-07T17:41:06Z", "updated_at": "2020-07-07T17:41:06Z", "author_association": "CONTRIBUTOR", "body": "Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 651844316, "label": "Add insert --truncate option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029317527", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1029317527, "node_id": "IC_kwDOCGYnMM49WiOX", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-03T19:18:02Z", "updated_at": "2022-02-03T19:18:02Z", "author_association": "CONTRIBUTOR", "body": "Taking part of the conversation from #385 here.\r\n\r\n> Would sqlite-utils add-geometry-column ... be a good CLI enhancement. for example?\r\n\r\nYes. And also `sqlite-utils create-spatial-index` would be great to have. My plan would be to add those once the Python API is settled.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1013698557", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1013698557, "node_id": "IC_kwDOCGYnMM48a8_9", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-01-15T15:15:22Z", "updated_at": "2022-01-15T15:15:22Z", "author_association": "CONTRIBUTOR", "body": "@simonw I have a PR here https://github.com/simonw/sqlite-utils/pull/385 that adds Spatialite helpers on the Python side. Please let me know how it looks.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012413729", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1012413729, "node_id": "IC_kwDOCGYnMM48WDUh", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-01-13T18:50:00Z", "updated_at": "2022-01-13T18:50:00Z", "author_association": "CONTRIBUTOR", "body": "One more thing I'm going to add: A method to add a geometry column, which I'll need to do to create a spatial index on a table.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012253198", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1012253198, "node_id": "IC_kwDOCGYnMM48VcIO", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-01-13T15:39:14Z", "updated_at": "2022-01-13T15:39:14Z", "author_association": "CONTRIBUTOR", "body": "Other thing: If there get to be enough utils, I think it's worth moving all the spatialite stuff into its own file (`gis.py` or something) just so it's easier to find later.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012230212", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1012230212, "node_id": "IC_kwDOCGYnMM48VWhE", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-01-13T15:15:13Z", "updated_at": "2022-01-13T15:15:13Z", "author_association": "CONTRIBUTOR", "body": "Some proposals I'd add to sqlite-utils:\r\n\r\nSome version of this, from [geojson-to-sqlite](https://github.com/simonw/geojson-to-sqlite/blob/main/geojson_to_sqlite/utils.py#L124-L130):\r\n\r\n```python\r\ndef init_spatialite(db, lib):\r\n    db.conn.enable_load_extension(True)\r\n    db.conn.load_extension(lib)\r\n    # Initialize SpatiaLite if not yet initialized\r\n    if \"spatial_ref_sys\" in db.table_names():\r\n        return\r\n    db.conn.execute(\"select InitSpatialMetadata(1)\")\r\n```\r\n\r\nAlso a function for creating a spatial index:\r\n\r\n```python\r\ndb.conn.execute(\"select CreateSpatialIndex(?, ?)\", [table, \"geometry\"])\r\n```\r\n\r\nI don't know the nuances of updating a spatial index, or checking if one already exists. This could be a CLI method like:\r\n\r\n```sh\r\nsqlite-utils spatial-index spatial.db table-name column-name\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012158895", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/79", "id": 1012158895, "node_id": "IC_kwDOCGYnMM48VFGv", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-01-13T13:55:59Z", "updated_at": "2022-01-13T13:55:59Z", "author_association": "CONTRIBUTOR", "body": "Came here to add this. I might pick it up.\r\n\r\nWould also add a utility to create (and update and delete?) a spatial index. It's not much code but I have to look it up every time.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 557842245, "label": "Helper methods for working with SpatiaLite"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573389669", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/74", "id": 573389669, "node_id": "MDEyOklzc3VlQ29tbWVudDU3MzM4OTY2OQ==", "user": {"value": 15092, "label": "jayvdb"}, "created_at": "2020-01-12T07:21:17Z", "updated_at": "2020-01-12T07:21:17Z", "author_association": "CONTRIBUTOR", "body": "I guess there is some extra flag for `CliRunner.invoke` to check the exit code and raise the exception, or an extra assert should be added for that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 546073980, "label": "Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573388052", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/74", "id": 573388052, "node_id": "MDEyOklzc3VlQ29tbWVudDU3MzM4ODA1Mg==", "user": {"value": 15092, "label": "jayvdb"}, "created_at": "2020-01-12T06:51:30Z", "updated_at": "2020-01-12T06:51:30Z", "author_association": "CONTRIBUTOR", "body": "Thanks.  That showed me that there was a click cli runner error, and setting `export LANG=en_US.UTF-8` fixed it.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 546073980, "label": "Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/61#issuecomment-533818697", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/61", "id": 533818697, "node_id": "MDEyOklzc3VlQ29tbWVudDUzMzgxODY5Nw==", "user": {"value": 49260, "label": "amjith"}, "created_at": "2019-09-21T18:09:01Z", "updated_at": "2019-09-21T18:09:28Z", "author_association": "CONTRIBUTOR", "body": "@witeshadow The library version doesn't have helpers around CSV (at least not from what I can see in the code). \r\n\r\nBut here's a snippet that makes it easy to insert from CSV using the library. \r\n\r\n```\r\nimport csv\r\nfrom sqlite_utils import Database\r\n\r\ndb = Database(\"my_database.db\")\r\n\r\n# Keep the file open while inserting: `docs` is a lazy generator.\r\nwith open(\"filename.csv\", newline=\"\") as csv_file:\r\n    reader = csv.reader(csv_file)  # Create a CSV reader\r\n    headers = next(reader)  # First line is the header\r\n    docs = (dict(zip(headers, row)) for row in reader)\r\n\r\n    # Now you can use the `sqlite_utils` library.\r\n    db[\"table_name\"].insert_all(docs)\r\n```\r\n\r\nThis snippet is adapted from reading the CLI source code on how it implements the csv option.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 491219910, "label": "importing CSV to SQLite as library"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/557#issuecomment-1590531892", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/557", "id": 1590531892, "node_id": "IC_kwDOCGYnMM5ezZc0", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2023-06-14T06:09:21Z", "updated_at": "2023-06-14T06:09:21Z", "author_association": "CONTRIBUTOR", "body": "I put together a [simple script](https://github.com/chapmanjacobd/library/blob/42129c5ebe15f9d74653c0f5ca4ed0c991d383e0/xklb/scripts/dedupe_db.py) to upsert and remove duplicate rows based on business keys. If anyone has similar problems with above this might help\r\n\r\n```\r\nCREATE TABLE my_table (\r\n    id INTEGER PRIMARY KEY,\r\n    column1 TEXT,\r\n    column2 TEXT,\r\n    column3 TEXT\r\n);\r\n\r\nINSERT INTO my_table (column1, column2, column3)\r\nVALUES\r\n    ('Value 1', 'Duplicate 1', 'Duplicate A'),\r\n    ('Value 2', 'Duplicate 2', 'Duplicate B'),\r\n    ('Value 3', 'Duplicate 2', 'Duplicate C'),\r\n    ('Value 4', 'Duplicate 3', 'Duplicate D'),\r\n    ('Value 5', 'Duplicate 3', 'Duplicate E'),\r\n    ('Value 6', 'Duplicate 3', 'Duplicate F');\r\n```\r\n\r\n```\r\nlibrary dedupe-db test.db my_table --bk column2\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1740150327, "label": "Aliased ROWID option for tables created from alter=True commands"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/557#issuecomment-1577355134", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/557", "id": 1577355134, "node_id": "IC_kwDOCGYnMM5eBId-", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2023-06-05T19:26:26Z", "updated_at": "2023-06-05T19:26:26Z", "author_association": "CONTRIBUTOR", "body": "this isn't really actionable... I'm just being a whiny baby. I have tasted the milk of being able to use `upsert_all`, `insert_all`, etc without having to write DDL to create tables. The meat of the issue is that SQLITE doesn't make rowid stable between vacuums so it is not possible to take shortcuts", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1740150327, "label": "Aliased ROWID option for tables created from alter=True commands"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/556#issuecomment-1575310378", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/556", "id": 1575310378, "node_id": "IC_kwDOCGYnMM5d5VQq", "user": {"value": 601708, "label": "mcint"}, "created_at": "2023-06-04T01:21:15Z", "updated_at": "2023-06-04T01:21:15Z", "author_association": "CONTRIBUTOR", "body": "I've resolved my use, with the line-buffered output and while read loop for line buffered input, but I leave this here so the incremental saving or line-buffered use-case can be explicitly handled or rejected (or deferred).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1740026046, "label": "Support storing incrementally piped values"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/555#issuecomment-1592047502", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/555", "id": 1592047502, "node_id": "IC_kwDOCGYnMM5e5LeO", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2023-06-14T22:00:10Z", "updated_at": "2023-06-14T22:01:57Z", "author_association": "CONTRIBUTOR", "body": "You may want to try doing a performance comparison between this and just selecting all the ids with few constraints and then doing the filtering within Python.\r\n\r\nThat might seem like a lazy-programmer, inefficient way but queries with large result sets are a different profile than what databases like SQLite are designed for. That is not to say that SQLite is slow or that Python is always faster, but when you start reading >20% of an index there is an equilibrium that is reached. Especially when adding in writing extra temp tables and stuff to memory/disk. And especially given the `NOT IN` style of query...\r\n\r\nYou may also try chunking like this:\r\n\r\n```py\r\nfrom typing import Generator\r\n\r\ndef chunks(lst, n) -> Generator:\r\n    # Yield successive slices of at most n items\r\n    for i in range(0, len(lst), n):\r\n        yield lst[i : i + n]\r\n\r\n# Keep each query under the bound-parameter limit\r\nSQLITE_PARAM_LIMIT = 32765\r\n\r\ndata = []\r\nchunked = chunks(video_ids, SQLITE_PARAM_LIMIT)\r\nfor ids in chunked:\r\n    data.extend(\r\n        list(\r\n            db.query(\r\n                \"\"\"SELECT * from videos\r\n                WHERE id in (\"\"\"\r\n                + \",\".join([\"?\"] * len(ids))\r\n                + \")\",\r\n                (*ids,),\r\n            )\r\n        )\r\n    )\r\n```\r\n\r\nbut that actually won't work with your `NOT IN` requirements. 
You need to query the full resultset to check any row.\r\n\r\nSince you are doing stuff with files/videos in SQLITE you might be interested in my side project: https://github.com/chapmanjacobd/library", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1733198948, "label": "Filter table by a large bunch of ids"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/535#issuecomment-1592052320", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/535", "id": 1592052320, "node_id": "IC_kwDOCGYnMM5e5Mpg", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2023-06-14T22:05:28Z", "updated_at": "2023-06-14T22:05:28Z", "author_association": "CONTRIBUTOR", "body": "piping to `jq` is good enough usually", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1655860104, "label": "rows: --transpose or psql extended view-like functionality"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/529#issuecomment-1592110694", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/529", "id": 1592110694, "node_id": "IC_kwDOCGYnMM5e5a5m", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2023-06-14T23:11:47Z", "updated_at": "2023-06-14T23:12:12Z", "author_association": "CONTRIBUTOR", "body": "sorry i was wrong. `sqlite-utils --raw-lines` works correctly\r\n\r\n```\r\nsqlite-utils --raw-lines :memory: \"SELECT * FROM (VALUES ('test'), ('line2'))\" | cat -A\r\ntest$\r\nline2$\r\n\r\nsqlite-utils --csv --no-headers :memory: \"SELECT * FROM (VALUES ('test'), ('line2'))\" | cat -A\r\ntest$\r\nline2$\r\n```\r\n\r\nI think this was fixed somewhat recently", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1581090327, "label": "Microsoft line endings"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/527#issuecomment-1540900733", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/527", "id": 1540900733, "node_id": "IC_kwDOCGYnMM5b2Ed9", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2023-05-09T21:15:05Z", "updated_at": "2023-05-09T21:15:05Z", "author_association": "CONTRIBUTOR", "body": "Sorry, I completely missed your first comment whilst on Easter break.\r\n\r\nThis looks like a good practical compromise before v4. Thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1578790070, "label": "`Table.convert()` skips falsey values"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/525#issuecomment-1435318713", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/525", "id": 1435318713, "node_id": "IC_kwDOCGYnMM5VjTm5", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2023-02-17T21:55:01Z", "updated_at": "2023-02-17T21:55:01Z", "author_association": "CONTRIBUTOR", "body": "Meanwhile, a cheap workaround is to invalidate the registered function cache:\r\n``` python\r\ntable.convert(...)\r\ndb._registered_functions = set()\r\ntable.convert(...)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1575131737, "label": "Repeated calls to `Table.convert()` fail"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/525#issuecomment-1423387341", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/525", "id": 1423387341, "node_id": "IC_kwDOCGYnMM5U1yrN", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2023-02-08T23:48:52Z", "updated_at": "2023-02-09T00:17:30Z", "author_association": "CONTRIBUTOR", "body": "PR below", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1575131737, "label": "Repeated calls to `Table.convert()` fail"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/524#issuecomment-1419357290", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/524", "id": 1419357290, "node_id": "IC_kwDOCGYnMM5Umaxq", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2023-02-06T16:21:44Z", "updated_at": "2023-02-06T16:21:44Z", "author_association": "CONTRIBUTOR", "body": "SQLite doesn't have a native `DATETIME` type. It typically stores dates as strings (integer and real representations are also supported) and then has [functions](https://www.sqlite.org/lang_datefunc.html) to work with date-like values. Yes it's weird.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1572766460, "label": "Transformation type `--type DATETIME`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/523#issuecomment-1407264466", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/523", "id": 1407264466, "node_id": "IC_kwDOCGYnMM5T4SbS", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-28T02:41:14Z", "updated_at": "2023-01-28T02:41:14Z", "author_association": "CONTRIBUTOR", "body": "I also often then run another little script to cast all empty strings to null, but I'll save that for another issue if this gets accepted.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1560651350, "label": "Feature request: trim all leading and trailing white space for all columns for all tables in a database"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/520#issuecomment-1421571810", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/520", "id": 1421571810, "node_id": "IC_kwDOCGYnMM5Uu3bi", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2023-02-07T22:43:09Z", "updated_at": "2023-02-07T22:43:09Z", "author_association": "CONTRIBUTOR", "body": "Hey, isn't this essentially the same issue as #448 ?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1516644980, "label": "rows_from_file() raises confusing error if file-like object is not in binary mode"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304320521", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511", "id": 1304320521, "node_id": "IC_kwDOCGYnMM5NvloJ", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T22:54:09Z", "updated_at": "2022-11-04T22:59:54Z", "author_association": "CONTRIBUTOR", "body": "I ran `PRAGMA integrity_check` and it returned `ok`, but then I tried restoring from a backup and I didn't get this `IntegrityError: constraint failed` error. So I think it was just something wrong with my database. If it happens again I will first try to reindex and see if that fixes the issue.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1436539554, "label": "[insert_all, upsert_all] IntegrityError: constraint failed"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304078945", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511", "id": 1304078945, "node_id": "IC_kwDOCGYnMM5Nuqph", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T19:38:36Z", "updated_at": "2022-11-04T20:13:17Z", "author_association": "CONTRIBUTOR", "body": "Even more bizarre, the source db only has one record and the target table has no conflicting record:\r\n\r\n```\r\n875 0.3s lb:/ (main|\u271a2) [0|0]\ud83c\udf3a sqlite-utils tube_71.db 'select * from media where path = \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\"' | jq\r\n[\r\n  {\r\n    \"size\": null,\r\n    \"time_created\": null,\r\n    \"play_count\": 1,\r\n    \"language\": null,\r\n    \"view_count\": null,\r\n    \"width\": null,\r\n    \"height\": null,\r\n    \"fps\": null,\r\n    \"average_rating\": null,\r\n    \"live_status\": null,\r\n    \"age_limit\": null,\r\n    \"uploader\": null,\r\n    \"time_played\": 0,\r\n    \"path\": \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\",\r\n    \"id\": \"088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz/074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv\",\r\n    \"ie_key\": \"ArchiveOrg\",\r\n    \"playlist_path\": \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\",\r\n    \"duration\": 1424.05,\r\n    \"tags\": null,\r\n    \"title\": \"074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv\"\r\n  }\r\n]\r\n875 0.3s lb:/ (main|\u271a2) [0|0]\ud83e\udd67 sqlite-utils video.db 'select * from media where path = \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\"' | jq\r\n[]\r\n```\r\n\r\nI've been able to use this code successfully several times before 
so not sure what's causing the issue.\r\n\r\nI guess the way that I'm handling multiple databases is an issue, though it hasn't ever inserted into the source db, not sure what's different. The only reasonable explanation is that it is trying to insert into the source db from the source db for some reason? Or maybe sqlite3 is checking the source db for primary key violation because the table name is the same", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1436539554, "label": "[insert_all, upsert_all] IntegrityError: constraint failed"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/510#issuecomment-1318777114", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/510", "id": 1318777114, "node_id": "IC_kwDOCGYnMM5OmvEa", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-17T15:09:47Z", "updated_at": "2022-11-17T15:09:47Z", "author_association": "CONTRIBUTOR", "body": "Why close? Is the only problem that the _config table incorrectly says 4 for fts5? If so, that's still something that should be fixed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1434911255, "label": "Cannot enable FTS5 despite it being available"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/507#issuecomment-1297859539", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/507", "id": 1297859539, "node_id": "IC_kwDOCGYnMM5NW8PT", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-01T00:40:16Z", "updated_at": "2022-11-01T00:40:16Z", "author_association": "CONTRIBUTOR", "body": "Ideally people could fix their data if they run into this issue.\r\n\r\nIf you are using filenames try [convmv](https://linux.die.net/man/1/convmv)\r\n\r\n```\r\nconvmv --preserve-mtimes -f utf8 -t utf8 --notest -i -r .\r\n```\r\n\r\nmaybe this script will also help: \r\n\r\n```py\r\nimport argparse, shutil\r\nfrom pathlib import Path\r\n\r\nimport ftfy\r\n\r\nfrom xklb import utils\r\nfrom xklb.utils import log\r\n\r\n\r\ndef parse_args() -> argparse.Namespace:\r\n    parser = argparse.ArgumentParser()\r\n    parser.add_argument(\"paths\", nargs='*')\r\n    parser.add_argument(\"--verbose\", \"-v\", action=\"count\", default=0)\r\n    args = parser.parse_args()\r\n\r\n    log.info(utils.dict_filter_bool(args.__dict__))\r\n    return args\r\n\r\n\r\ndef rename_invalid_paths() -> None:\r\n    args = parse_args()\r\n\r\n    for path in args.paths:\r\n        log.info(path)\r\n        for p in sorted([str(p) for p in Path(path).rglob(\"*\")], key=len):\r\n            fixed = ftfy.fix_text(p, uncurl_quotes=False).replace(\"\\r\\n\", \"\\n\").replace(\"\\r\", \"\\n\").replace(\"\\n\", \"\")\r\n            if p != fixed:\r\n                try:\r\n                    shutil.move(p, fixed)\r\n                except FileNotFoundError:\r\n                    log.warning(\"FileNotFound. 
%s\", p)\r\n                else:\r\n                    log.info(fixed)\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    rename_invalid_paths()\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1430325103, "label": "conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/50", "id": 1303660293, "node_id": "IC_kwDOCGYnMM5NtEcF", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T14:38:36Z", "updated_at": "2022-11-04T14:38:36Z", "author_association": "CONTRIBUTOR", "body": "Where did you see the limit as 999? I believe the limit has been 32766 for quite some time. If you could detect which limit applies, this could speed up batch inserts of some types of data significantly.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 473083260, "label": "\"Too many SQL variables\" on large inserts"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/493#issuecomment-1264219650", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/493", "id": 1264219650, "node_id": "IC_kwDOCGYnMM5LWnYC", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:22:50Z", "updated_at": "2022-10-01T03:23:58Z", "author_association": "CONTRIBUTOR", "body": "this is likely what you are looking for: https://stackoverflow.com/a/51076749/697964\r\n\r\nbut yeah I would say just disable smart quotes", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1386562662, "label": "Tiny typographical error in install/uninstall docs"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1264218914", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1264218914, "node_id": "IC_kwDOCGYnMM5LWnMi", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:18:36Z", "updated_at": "2023-06-14T22:14:24Z", "author_association": "CONTRIBUTOR", "body": "> some good concrete use-cases in mind\r\n\r\nI actually found myself wanting something like this the past couple days. The use-case was databases with slightly different schema but same table names.\r\n\r\nhere is a full script:\r\n\r\n```\r\nimport argparse\r\nfrom pathlib import Path\r\n\r\nfrom sqlite_utils import Database\r\n\r\n\r\ndef connect(args, conn=None, **kwargs) -> Database:\r\n    db = Database(conn or args.database, **kwargs)\r\n    with db.conn:\r\n        db.conn.execute(\"PRAGMA main.cache_size = 8000\")\r\n    return db\r\n\r\n\r\ndef parse_args() -> argparse.Namespace:\r\n    parser = argparse.ArgumentParser()\r\n    parser.add_argument(\"database\")\r\n    parser.add_argument(\"dbs_folder\")\r\n    parser.add_argument(\"--db\", \"-db\", help=argparse.SUPPRESS)\r\n    parser.add_argument(\"--verbose\", \"-v\", action=\"count\", default=0)\r\n    args = parser.parse_args()\r\n\r\n    if args.db:\r\n        args.database = args.db\r\n    Path(args.database).touch()\r\n    args.db = connect(args)\r\n\r\n    return args\r\n\r\n\r\ndef merge_db(args, source_db):\r\n    source_db = str(Path(source_db).resolve())\r\n\r\n    s_db = connect(argparse.Namespace(database=source_db, verbose = args.verbose))\r\n    for table in s_db.table_names():\r\n        data = s_db[table].rows\r\n        args.db[table].insert_all(data, alter=True, replace=True)\r\n\r\n    args.db.conn.commit()\r\n\r\n\r\ndef merge_directory():\r\n    args = parse_args()\r\n    source_dbs = list(Path(args.dbs_folder).glob('*.db'))\r\n    for s_db in source_dbs:\r\n        merge_db(args, 
s_db)\r\n\r\n\r\nif __name__ == '__main__':\r\n    merge_directory()\r\n```\r\n\r\nedit: I've made some improvements to this and put it on PyPI:\r\n\r\n```\r\n$ pip install xklb\r\n$ lb merge-db -h\r\nusage: library merge-dbs DEST_DB SOURCE_DB ... [--only-target-columns] [--only-new-rows] [--upsert] [--pk PK ...] [--table TABLE ...]\r\n\r\n    Merge-DBs will insert new rows from source dbs to target db, table by table. If primary key(s) are provided,\r\n    and there is an existing row with the same PK, the default action is to delete the existing row and insert the new row\r\n    replacing all existing fields.\r\n\r\n    Upsert mode will update matching PK rows such that if a source row has a NULL field and\r\n    the destination row has a value then the value will be preserved instead of changed to the source row's NULL value.\r\n\r\n    Ignore mode (--only-new-rows) will insert only rows which don't already exist in the destination db\r\n\r\n    Test first by using temp databases as the destination db.\r\n    Try out different modes / flags until you are satisfied with the behavior of the program\r\n\r\n        library merge-dbs --pk path (mktemp --suffix .db) tv.db movies.db\r\n\r\n    Merge database data and tables\r\n\r\n        library merge-dbs --upsert --pk path video.db tv.db movies.db\r\n        library merge-dbs --only-target-columns --only-new-rows --table media,playlists --pk path audio-fts.db audio.db\r\n\r\n        library merge-dbs --pk id --only-tables subreddits reddit/81_New_Music.db audio.db\r\n        library merge-dbs --only-new-rows --pk subreddit,path --only-tables reddit_posts reddit/81_New_Music.db audio.db -v\r\n\r\npositional arguments:\r\n  database\r\n  source_dbs\r\n```\r\n\r\nAlso if you want to dedupe a table based on a \"business key\" which isn't explicitly your primary key(s) you can run this:\r\n\r\n```\r\n$ lb dedupe-db -h\r\nusage: library dedupe-dbs DATABASE TABLE --bk BUSINESS_KEYS [--pk PRIMARY_KEYS] [--only-columns 
COLUMNS]\r\n\r\n    Dedupe your database (not to be confused with the dedupe subcommand)\r\n\r\n    It should not need to be said but *backup* your database before trying this tool!\r\n\r\n    Dedupe-DB will help remove duplicate rows based on non-primary-key business keys\r\n\r\n        library dedupe-db ./video.db media --bk path\r\n\r\n    If --primary-keys is not provided table metadata primary keys will be used\r\n    If --only-columns is not provided all non-primary and non-business key columns will be upserted\r\n\r\npositional arguments:\r\n  database\r\n  table\r\n\r\noptions:\r\n  -h, --help            show this help message and exit\r\n  --skip-0\r\n  --only-columns ONLY_COLUMNS\r\n                        Comma separated column names to upsert\r\n  --primary-keys PRIMARY_KEYS, --pk PRIMARY_KEYS\r\n                        Comma separated primary keys\r\n  --business-keys BUSINESS_KEYS, --bk BUSINESS_KEYS\r\n                        Comma separated business keys\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258712931", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1258712931, "node_id": "IC_kwDOCGYnMM5LBm9j", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-09-26T22:31:58Z", "updated_at": "2022-09-26T22:31:58Z", "author_association": "CONTRIBUTOR", "body": "Right. The backup command will copy tables completely, but in the case of conflicting table names, the destination gets overwritten silently. That might not be what you want here. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258508215", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1258508215, "node_id": "IC_kwDOCGYnMM5LA0-3", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-09-26T19:22:14Z", "updated_at": "2022-09-26T19:22:14Z", "author_association": "CONTRIBUTOR", "body": "This might be fairly straightforward using SQLite's backup utility: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1256858763", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1256858763, "node_id": "IC_kwDOCGYnMM5K6iSL", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-09-24T04:50:59Z", "updated_at": "2022-09-24T04:52:08Z", "author_association": "CONTRIBUTOR", "body": "Instead of outputting binary data to stdout the interface might be better like this\r\n\r\n```\r\nsqlite-utils merge animals.db cats.db dogs.db\r\n```\r\n\r\nsimilar to `zip`, `ogr2ogr`, etc\r\n\r\nActually I think this might already be possible within `ogr2ogr`. I don't believe spatial data is a requirement though it might add an `ogc_id` column or something\r\n\r\n```\r\ncp cats.db animals.db\r\nogr2ogr -append animals.db dogs.db\r\nogr2ogr -append animals.db another.db\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/467#issuecomment-1224382336", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/467", "id": 1224382336, "node_id": "IC_kwDOCGYnMM5I-peA", "user": {"value": 50527, "label": "jefftriplett"}, "created_at": "2022-08-23T17:16:13Z", "updated_at": "2022-08-23T17:16:13Z", "author_association": "CONTRIBUTOR", "body": "> Should passing `alter=True` also drop any columns that aren't included in the new table structure?\r\n> \r\n> It could even spot column types that aren't correct and fix those.\r\n> \r\n> Is that consistent with the expectations set by how `alter=True` works elsewhere?\r\n\r\nI would lean towards not dropping them (or making a `drop=True` or `drop_columns=True` or `drop_missing_columns=True`) to make working with existing tables easier. \r\n\r\nI do like that sqlite-utils mostly just works with existing tables but it's also nice to add to existing fields in a few cases. \r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1348169997, "label": "Mechanism for ensuring a table has all the columns"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190277829", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190277829, "node_id": "IC_kwDOCGYnMM5G8jLF", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:19:15Z", "updated_at": "2022-07-20T13:19:15Z", "author_association": "CONTRIBUTOR", "body": "hadley wickham's melt and reshape could be good inspo: http://had.co.nz/reshape/introduction.pdf", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot command"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190272780", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190272780, "node_id": "IC_kwDOCGYnMM5G8h8M", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:14:54Z", "updated_at": "2022-07-20T13:14:54Z", "author_association": "CONTRIBUTOR", "body": "for example, i have data on votes that look like this:\r\n\r\n| ballot_id | option_id | choice |\r\n|-|-|-|\r\n| 1 | 1 | 0 | \r\n| 1 | 2 | 1 |\r\n| 1 | 3 | 0 |\r\n| 1 | 4 | 1 |\r\n| 2 | 1 | 1 |\r\n| 2 | 2 | 0 |\r\n| 2 | 3 | 1 |\r\n| 2 | 4 | 0 |\r\n\r\nand i want to reshape from this long form to this wide form:\r\n\r\n| ballot_id | option_id_1 | option_id_2 | option_id_3 | option_id_4 |\r\n|-|-|-|-|-|\r\n| 1 | 0 | 1 | 0 | 1 | \r\n| 2 | 1 | 0 | 1 | 0 | \r\n\r\ni could do such a thing like this.\r\n\r\n```sql\r\nselect ballot_id, \r\nsum(choice) filter (where option_id = 1) as option_id_1,\r\nsum(choice) filter (where option_id = 2) as option_id_2,\r\nsum(choice) filter (where option_id = 3) as option_id_3,\r\nsum(choice) filter (where option_id = 4) as option_id_4\r\nfrom vote\r\ngroup by ballot_id\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot command"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1179579878", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/449", "id": 1179579878, "node_id": "IC_kwDOCGYnMM5GTvXm", "user": {"value": 1690072, "label": "davidleejy"}, "created_at": "2022-07-09T17:41:32Z", "updated_at": "2022-07-09T17:41:50Z", "author_association": "CONTRIBUTOR", "body": "Learnt that the types in Sqlite-utils differ somewhat from those in Sqlite. I've changed my test to account for this difference and the test has passed successfully. I will submit a PR.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279863844, "label": "Utilities for duplicating tables and creating a table with the results of a query"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1174027079", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/449", "id": 1174027079, "node_id": "IC_kwDOCGYnMM5F-jtH", "user": {"value": 1690072, "label": "davidleejy"}, "created_at": "2022-07-04T17:33:04Z", "updated_at": "2022-07-04T17:48:43Z", "author_association": "CONTRIBUTOR", "body": "I've written the code and test. Would you be able to advise how to compare table columns in a pytest function properly? Experiencing a challenge when comparing columns.\r\n\r\nTest:\r\n```python\r\ndef test_duplicate(fresh_db):\r\n    table = fresh_db.create_table(\r\n        \"table1\",\r\n        {\r\n            \"text_col\": str,\r\n            \"float_col\": float,\r\n            \"int_col\": int,\r\n            \"bool_col\": bool,\r\n            \"bytes_col\": bytes,\r\n            \"datetime_col\": datetime.datetime,\r\n        },\r\n    )\r\n    dt = datetime.datetime.now()\r\n    b = bytes('hello world', 'utf-8')\r\n    data = {\"text_col\": \"Cleo\", \r\n            \"float_col\": 3.14,\r\n            \"int_col\": -2,\r\n            \"bool_col\": True,\r\n            \"bytes_col\": b,\r\n            \"datetime_col\": str(dt)}\r\n    table1 = fresh_db[\"table1\"]\r\n    row_id = table1.insert(data).last_rowid\r\n    table1.duplicate('table2')\r\n    table2 = fresh_db[\"table2\"]\r\n    assert data == table2.get(row_id)\r\n    assert table1.columns == table2.columns    # FAILS HERE\r\n```\r\n\r\nResult:\r\n![Screenshot 2022-07-05 at 1 31 55 AM](https://user-images.githubusercontent.com/1690072/177198814-daac48c9-5746-49d0-a14a-14fe181c5a2f.png)\r\n\r\nFailure is due to column types being named differently -- e.g. 'FLOAT' vs 'REAL', 'INTEGER' vs 'INT'. How should I go about comparing columns while accounting for equivalent types?\r\n\r\nOr did I miss out something in my duplication code correctly? 
Here's how I did it: in `db.py`, I've added the following code:\r\n```python\r\nclass Table(Queryable):\r\n    [...]\r\n    def duplicate(\r\n        self, \r\n        name_new: str\r\n    ) -> \"Table\":\r\n        \"\"\"\r\n        Duplicate this table in this database.\r\n\r\n        :param name_new: Name of new table.\r\n        \"\"\"\r\n        assert self.exists()\r\n        with self.db.conn:\r\n            sql = \"CREATE TABLE [{new_table}] AS SELECT * FROM [{table}];\".format(\r\n                new_table = name_new,\r\n                table = self.name,\r\n            )\r\n            self.db.execute(sql)\r\n        return self.db[name_new]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279863844, "label": "Utilities for duplicating tables and creating a table with the results of a query"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1297703307", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/448", "id": 1297703307, "node_id": "IC_kwDOCGYnMM5NWWGL", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2022-10-31T21:23:51Z", "updated_at": "2022-10-31T21:27:32Z", "author_association": "CONTRIBUTOR", "body": "The Windows aspect is a red herring: OP's sample above produces the same error on Linux. (Though I don't know what's going on with the CI).\r\n\r\nThe same error can also be obtained by passing an `io` from a file opened in non-binary mode (`'r'` as opposed to `'rb'`) to `rows_from_file()`. This is how I got here.\r\n\r\nThe fix for my case is easy: open the file in mode `'rb'`. The analogous fix for OP's problem also works: use `BytesIO` in place of `StringIO`.\r\n\r\nMinimal test case (derived from [utils.py](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/utils.py#L304)):\r\n\r\n``` python\r\nimport io\r\nfrom typing import cast\r\n\r\n#fp = io.StringIO(\"id,name\\n1,Cleo\") # error\r\nfp = io.BytesIO(bytes(\"id,name\\n1,Cleo\", encoding='utf-8')) # okay\r\nreader = io.BufferedReader(cast(io.RawIOBase, fp))\r\nreader.peek(1) # exception thrown here\r\n```\r\nI see the signature of `rows_from_file()` correctly has `fp: BinaryIO` but I guess you'd need either a runtime type check for that (not all `io`s have `mode()`), or to catch the `AttributeError` on `peek()` to produce a better error for users. 
Neither option is ideal.\r\n\r\nSome thoughts on testing binary-ness of `io`s in this SO question: https://stackoverflow.com/questions/44584829/how-to-determine-if-file-is-opened-in-binary-or-text-mode", "reactions": "{\"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279144769, "label": "Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto'"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1444474487", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/433", "id": 1444474487, "node_id": "IC_kwDOCGYnMM5WGO53", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2023-02-24T20:57:43Z", "updated_at": "2023-02-24T22:22:18Z", "author_association": "CONTRIBUTOR", "body": "I think I see what is happening here, although I haven't quite worked out a fix yet. Usually:\r\n\r\n* `click.progressbar.render_progress()` renders the cursor invisible on each invocation (update of the bar)\r\n* When the progress bar goes out of scope, the `__exit__()` method is invoked, which calls `render_finish()` to make the cursor re-appear.\r\n\r\n(See terminal escape sequences `BEFORE_BAR` and `AFTER_BAR` in click).\r\n\r\nHowever the sqlite-utils `utils.file_progress` context manager wraps `click.progressbar` and yields an instance of a helper class:\r\n\r\n``` python\r\n@contextlib.contextmanager     \r\ndef file_progress(file, silent=False, **kwargs):\r\n   ...\r\n        with click.progressbar(length=file_length, **kwargs) as bar:\r\n            yield UpdateWrapper(file, bar.update) \r\n```\r\n\r\nThe yielded `UpdateWrapper` goes out of scope quickly and `click.progressbar.__exit__()` is called. The cursor is made un-invisible. However `bar` is still live and so when the caller iterates on the yielded wrapper this invokes the bar's update method, calling `render_progress()`, each time printing the \"make cursor invisible\" escape code. The `progressbar.__exit__` function is not called again, so the cursor doesn't re-appear.\r\n\r\n", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1239034903, "label": "CLI eats my cursor"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1252898131", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/433", "id": 1252898131, "node_id": "IC_kwDOCGYnMM5KrbVT", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-09-20T20:51:21Z", "updated_at": "2022-09-20T20:56:07Z", "author_association": "CONTRIBUTOR", "body": "When I run `reset` it fixes my terminal. I suspect it is related to the progress bar\r\n\r\nhttps://linux.die.net/man/1/reset\r\n\r\n```\r\n950 1s /m/d/03_Downloads \ud83d\udc11 echo $TERM\r\nxterm-kitty\r\n\u2593\u2591\u2592\u2591 /m/d/03_Downloads \ud83c\udf0f kitty -v\r\nkitty 0.26.2 created by Kovid Goyal\r\n$ sqlite-utils insert test.db facility facility-boundary-us-all.csv --csv\r\nblah blah blah (no offense)\r\n$ <no cursor>\r\n$ reset\r\n$ <cursor lives again (resurrection [explicit])>\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1239034903, "label": "CLI eats my cursor"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/423#issuecomment-1189010812", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/423", "id": 1189010812, "node_id": "IC_kwDOCGYnMM5G3t18", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-19T12:47:39Z", "updated_at": "2022-07-19T12:47:39Z", "author_association": "CONTRIBUTOR", "body": "just ran into this!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1199158210, "label": ".extract() doesn't set foreign key when extracted columns contain NULL value"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059647114", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059647114, "node_id": "IC_kwDOCGYnMM4_KO6K", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-03-05T01:54:24Z", "updated_at": "2022-03-05T01:54:24Z", "author_association": "CONTRIBUTOR", "body": "I haven't tried this, but it looks like Pandas has a method for this: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065477258", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/411", "id": 1065477258, "node_id": "IC_kwDOCGYnMM4_geSK", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-03-11T20:14:59Z", "updated_at": "2022-03-11T20:14:59Z", "author_association": "CONTRIBUTOR", "body": "Good call on adding this to `create-table`, especially for stored columns. Having the stored/virtual split might make this tricky to implement, but I haven't gone any farther than thinking about what the CLI looks like. I'm going to try making the SQL side work first and figure that'll tell me more about what it needs.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160034488, "label": "Support for generated columns"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223554", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409", "id": 1264223554, "node_id": "IC_kwDOCGYnMM5LWoVC", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:42:50Z", "updated_at": "2022-10-01T03:42:50Z", "author_association": "CONTRIBUTOR", "body": "oh weird. it inserts into db2", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1149661489, "label": "`with db:` for transactions"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223363", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409", "id": 1264223363, "node_id": "IC_kwDOCGYnMM5LWoSD", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:41:45Z", "updated_at": "2022-10-01T03:41:45Z", "author_association": "CONTRIBUTOR", "body": "```\r\npytest xklb/check.py --pdb\r\n\r\nxklb/check.py:11: in test_transaction\r\n    assert list(db2[\"t\"].rows) == []\r\nE   AssertionError: assert [{'foo': 1}] == []\r\nE    +  where [{'foo': 1}] = list(<generator object Queryable.rows_where at 0x7f2d84d1f0d0>)\r\nE    +    where <generator object Queryable.rows_where at 0x7f2d84d1f0d0> = <Table t (foo)>.rows\r\n>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\r\n\r\n>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\r\n> /home/xk/github/xk/lb/xklb/check.py(11)test_transaction()\r\n      9     with db1.conn:\r\n     10         db1[\"t\"].insert({\"foo\": 1})\r\n---> 11         assert list(db2[\"t\"].rows) == []\r\n     12     assert list(db2[\"t\"].rows) == [{\"foo\": 1}]\r\n```\r\n\r\nIt fails because it is already inserted.\r\n\r\nbtw if you put these two lines in you pyproject.toml you can get `ipdb` in pytest\r\n\r\n```\r\n[tool.pytest.ini_options]\r\naddopts = \"--pdbcls=IPython.terminal.debugger:TerminalPdb --ignore=tests/data --capture=tee-sys --log-cli-level=ERROR\"\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1149661489, "label": "`with db:` for transactions"}, 
"performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1033332570", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/403", "id": 1033332570, "node_id": "IC_kwDOCGYnMM49l2da", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-09T04:22:43Z", "updated_at": "2022-02-09T04:22:43Z", "author_association": "CONTRIBUTOR", "body": "dddoooope", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1126692066, "label": "Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1032126353", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/403", "id": 1032126353, "node_id": "IC_kwDOCGYnMM49hP-R", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-08T01:45:15Z", "updated_at": "2022-02-08T01:45:31Z", "author_association": "CONTRIBUTOR", "body": "you can hack something like this to achieve this result:\r\n\r\n`sqlite-utils convert my_database my_table rowid \"{'id': value}\" --multi`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1126692066, "label": "Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1035057014", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402", "id": 1035057014, "node_id": "IC_kwDOCGYnMM49sbd2", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-10T15:30:28Z", "updated_at": "2022-02-10T15:30:40Z", "author_association": "CONTRIBUTOR", "body": "Yeah, the CLI experience is probably where any kind of multi-column, configured setup is going to fall apart. Sticking with GIS examples, one way I might think about this is using the [fiona CLI](https://fiona.readthedocs.io/en/latest/cli.html):\r\n\r\n```sh\r\n# assuming a database is already created and has SpatiaLite\r\nfio cat boundary.shp | sqlite-utils insert boundaries --conversion geometry GeometryGeoJSON -\r\n```\r\n\r\nAnyway, very interested to see where you land here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1125297737, "label": "Advanced class-based `conversions=` mechanism"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1032732242", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402", "id": 1032732242, "node_id": "IC_kwDOCGYnMM49jj5S", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-08T15:26:59Z", "updated_at": "2022-02-08T15:26:59Z", "author_association": "CONTRIBUTOR", "body": "What if you did something like this:\r\n\r\n```python\r\n\r\nclass Conversion:\r\n    def __init__(self, *args, **kwargs):\r\n        \"Put whatever settings you need here\"\r\n\r\n    def python(self, row, column, value): # not sure on args here\r\n        \"Python step to transform value\"\r\n        return value\r\n\r\n    def sql(self, row, column, value):\r\n        \"Return the actual sql that goes in the insert/update step, and maybe params\"\r\n        # value is the return of self.python()\r\n        return value, []\r\n```\r\n\r\nThis way, you're always passing an instance, which has methods that do the conversion. (Or you're passing a SQL string, as you would now.) The `__init__` could take column names, or SRID, or whatever other setup state you need per row, but the row is getting processed with the `python` and `sql` methods (or whatever you want to call them). This is pretty rough, so do what you will with names and args and such.\r\n\r\nYou'd then use it like this:\r\n\r\n```python\r\n# subclass might be unneeded here, if methods are present\r\nclass LngLatConversion(Conversion):\r\n    def __init__(self, x=\"longitude\", y=\"latitude\"):\r\n        self.x = x\r\n        self.y = y\r\n\r\n    def python(self, row, column, value):\r\n        x = row[self.x]\r\n        y = row[self.y]\r\n        return x, y\r\n\r\n    def sql(self, row, column, value):\r\n        # value is now a tuple, returned above\r\n        s = \"GeomFromText(POINT(? 
?))\"\r\n        return s, value\r\n\r\ntable.insert_all(rows, conversions={\"point\": LngLatConversion(\"lng\", \"lat\")})\r\n```\r\n\r\nI haven't thought through all the implementation details here, and it'll probably break in ways I haven't foreseen, but wanted to get this idea out of my head. Hope it helps.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1125297737, "label": "Advanced class-based `conversions=` mechanism"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1031791783", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402", "id": 1031791783, "node_id": "IC_kwDOCGYnMM49f-Sn", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-07T18:37:40Z", "updated_at": "2022-02-07T18:37:40Z", "author_association": "CONTRIBUTOR", "body": "I've never used it either, but it's interesting, right? Feel like I should try it for something. \r\n\r\nI'm trying to get my head around how this conversions feature might work, because I really like the idea of it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1125297737, "label": "Advanced class-based `conversions=` mechanism"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1031779460", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402", "id": 1031779460, "node_id": "IC_kwDOCGYnMM49f7SE", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-07T18:24:56Z", "updated_at": "2022-02-07T18:24:56Z", "author_association": "CONTRIBUTOR", "body": "I wonder if there's any overlap with the goals here and the `sqlite3` module's concept of adapters and converters: https://docs.python.org/3/library/sqlite3.html#sqlite-and-python-types\r\n\r\nI'm not sure that's _exactly_ what we're talking about here, but it might be a parallel with some useful ideas to borrow.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1125297737, "label": "Advanced class-based `conversions=` mechanism"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1077671779", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399", "id": 1077671779, "node_id": "IC_kwDOCGYnMM5AO_dj", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-03-24T14:11:33Z", "updated_at": "2022-03-24T14:11:43Z", "author_association": "CONTRIBUTOR", "body": "Coming back to this. I was about to add a utility function to [datasette-geojson]() to convert lat/lng columns to geometries. Thankfully I googled first. There's a SpatiaLite function for this: [MakePoint](https://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html#p0).\r\n\r\n```sql\r\nselect MakePoint(longitude, latitude) as geometry from places;\r\n```\r\n\r\nI'm not sure if that would work with `conversions`, since it needs two columns, but it's an option for tables that already have latitude, longitude columns.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124731464, "label": "Make it easier to insert geometries, with documentation and maybe code"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030741289", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399", "id": 1030741289, "node_id": "IC_kwDOCGYnMM49b90p", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-06T03:03:43Z", "updated_at": "2022-02-06T03:03:43Z", "author_association": "CONTRIBUTOR", "body": "> I wonder if there are any interesting non-geospatial canned conversions that it would be worth including?\r\n\r\nOff the top of my head:\r\n\r\n- Un-nesting JSON objects into columns\r\n- Splitting arrays\r\n- Normalizing dates and times\r\n- URL munging with `urlparse`\r\n- Converting strings to numbers\r\n\r\nSome of this is easy enough with SQL functions, some is easier in Python. Maybe that's where having pre-built classes gets really handy, because it saves you from thinking about which way it's implemented.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124731464, "label": "Make it easier to insert geometries, with documentation and maybe code"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740826", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399", "id": 1030740826, "node_id": "IC_kwDOCGYnMM49b9ta", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-06T02:59:10Z", "updated_at": "2022-02-06T02:59:10Z", "author_association": "CONTRIBUTOR", "body": "All this said, I don't think it's unreasonable to point people to dedicated tools like `geojson-to-sqlite`. If I'm dealing with a bunch of GeoJSON or Shapefiles, I need something to read those anyway (or I need to figure out virtual tables). But something like this might make it easier to build those libraries, or standardize the underlying parts.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124731464, "label": "Make it easier to insert geometries, with documentation and maybe code"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740653", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399", "id": 1030740653, "node_id": "IC_kwDOCGYnMM49b9qt", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-06T02:57:17Z", "updated_at": "2022-02-06T02:57:17Z", "author_association": "CONTRIBUTOR", "body": "I like the idea of having stock conversions you could import. I'd actually move them to a dedicated module (call it `sqlite_utils.conversions` or something), because it's different from other utilities. Maybe they even take configuration, or they're composable.\r\n\r\n```python\r\nfrom sqlite_utils.conversions import LongitudeLatitude\r\n\r\ndb[\"places\"].insert(\r\n    {\r\n        \"name\": \"London\",\r\n        \"lng\": -0.118092,\r\n        \"lat\": 51.509865,\r\n    },\r\n    conversions={\"point\": LongitudeLatitude(\"lng\", \"lat\")},\r\n)\r\n```\r\n\r\nI would definitely use that for every CSV I get with lat/lng columns where I actually need GeoJSON.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124731464, "label": "Make it easier to insert geometries, with documentation and maybe code"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1038336591", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/398", "id": 1038336591, "node_id": "IC_kwDOCGYnMM4948JP", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-13T18:48:21Z", "updated_at": "2022-02-13T18:49:49Z", "author_association": "CONTRIBUTOR", "body": "Been chipping away at this between other things and realized `sqlite-utils init-spatialite` is probably unnecessary. Any of the other commands requires running `db.init_spatialite` to have the extension functions available, and that will do everything `init-spatialite` would do.\r\n\r\nI think it's probably worth keeping a SpatiaLite flag on `create-database` in case you wanted to create all the spatial metadata up front. Otherwise, it's going to get added the first time you run `add-geometry-column` or `create-spatial-index`, which is probably fine in most cases.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124237013, "label": "Add SpatiaLite helpers to CLI"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030629879", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/398", "id": 1030629879, "node_id": "IC_kwDOCGYnMM49bin3", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-02-05T13:57:33Z", "updated_at": "2022-02-05T19:49:38Z", "author_association": "CONTRIBUTOR", "body": "I'm mostly using [geojson-to-sqlite](https://github.com/simonw/geojson-to-sqlite) at the moment. Even with shapefiles, I'm usually converting to GeoJSON and projecting to EPSG:4326 (with [ogr2ogr](https://gdal.org/programs/ogr2ogr.html)) first. \r\n\r\nI think an open question here is how much you want to leave to external libraries and how much you want here. My thinking has been that adding Spatialite helpers here would make external stuff easier, but it would be nice to have some standard way to insert geometries.\r\n\r\nI'm in the middle of adding GeoJSON and Spatialite support to [geocode-sqlite](https://github.com/eyeseast/geocode-sqlite), and that will probably use WKT. Since that's all points, I think I can just make the string inline. But for polygons, I'd generally use Shapely, which probably isn't a dependency you want to add to sqlite-utils.\r\n\r\nI've also been trying to get some of the approaches [here](https://www.gaia-gis.it/fossil/libspatialite/wiki?name=Supporting+GeoJSON) to work, but haven't had any success so far.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1124237013, "label": "Add SpatiaLite helpers to CLI"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009548580", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1009548580, "node_id": "IC_kwDOCGYnMM48LH0k", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-11T02:43:34Z", "updated_at": "2022-01-11T02:43:34Z", "author_association": "CONTRIBUTOR", "body": "thanks so much! always a pleasure to see how you work through these things", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008275546", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008275546, "node_id": "IC_kwDOCGYnMM48GRBa", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-09T11:01:15Z", "updated_at": "2022-01-09T13:37:51Z", "author_association": "CONTRIBUTOR", "body": "i don\u2019t want to be such a partisan for analyze, but the query planner deciding *not* to use an index based on information collected by analyze is not necessarily a bug, but could be the correct choice.\r\n\r\n<s>the original poster in that stack overflow doesn\u2019t say there\u2019s a performance regression </s>", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008166084", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008166084, "node_id": "IC_kwDOCGYnMM48F2TE", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:32:47Z", "updated_at": "2022-01-08T22:32:47Z", "author_association": "CONTRIBUTOR", "body": "or using \u201c pragma optimize\u201d", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164786", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008164786, "node_id": "IC_kwDOCGYnMM48F1-y", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:24:19Z", "updated_at": "2022-01-08T22:24:19Z", "author_association": "CONTRIBUTOR", "body": "the out-of-date scenario you describe could be addressed by automatically adding an analyze to the insert or convert commands if they implicate an index", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164116", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008164116, "node_id": "IC_kwDOCGYnMM48F10U", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:18:57Z", "updated_at": "2022-01-08T22:18:57Z", "author_association": "CONTRIBUTOR", "body": "the table where the query ran so badly was about 50k. \r\n\r\ni think the scenario should not be worse than no stats. \r\n\r\ni also did not know that sqlite was so different from postgres and needed an explicit analyze call.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008161965", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008161965, "node_id": "IC_kwDOCGYnMM48F1St", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:02:56Z", "updated_at": "2022-01-08T22:02:56Z", "author_association": "CONTRIBUTOR", "body": "for options 2 and 3, i would worry about discoverability. \r\n\r\nin other db\u2019s it is not necessary to explicitly call analyze for most indices. ie for postgres\r\n\r\n> The system regularly collects statistics on all of a table's columns. Newly-created non-expression indexes can immediately use these statistics to determine an index's usefulness.\r\n\r\ni suppose i would propose raising a warning if the stats table is created that explains what is going on and informs users about a --no-analyze argument.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007636709", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1007636709, "node_id": "IC_kwDOCGYnMM48D1Dl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-07T18:28:33Z", "updated_at": "2022-01-07T18:29:43Z", "author_association": "CONTRIBUTOR", "body": "i added an index to one table with sqlite-utils, and then a query that used to take about 1 second started taking hundreds of seconds. \r\n\r\nrunning analyze got me back to sub second speed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/353#issuecomment-991405755", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/353", "id": 991405755, "node_id": "IC_kwDOCGYnMM47F6a7", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-11T01:38:29Z", "updated_at": "2021-12-11T01:38:29Z", "author_association": "CONTRIBUTOR", "body": "wow! that's awesome! thanks so much, @simonw!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077102934, "label": "Allow passing a file of code to \"sqlite-utils convert\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/348#issuecomment-983155079", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/348", "id": 983155079, "node_id": "IC_kwDOCGYnMM46mcGH", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2021-12-01T00:28:40Z", "updated_at": "2021-12-01T00:28:40Z", "author_association": "CONTRIBUTOR", "body": "I'd use this. Right now, I tend to do `touch my.db` and then `enable-wal` or whatever else, but I'm never sure if that's a bad idea.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1067771698, "label": "Command for creating an empty database"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/278#issuecomment-864621099", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/278", "id": 864621099, "node_id": "MDEyOklzc3VlQ29tbWVudDg2NDYyMTA5OQ==", "user": {"value": 601708, "label": "mcint"}, "created_at": "2021-06-20T22:39:57Z", "updated_at": "2021-06-20T22:39:57Z", "author_association": "CONTRIBUTOR", "body": "Fair. I looked into it, it looks like it could be done, but it would be _a bit ugly_. I can upload and link a gist of my exploration. **Click** can parse a first argument while still recognizing it as a sub-command keyword. From there, the program could:\r\n1. ignore it preemptively if it matches a sub-command\r\n2. and/or check if a (db) file exists at the path.\r\n\r\nIt would then also need to set a shared db argument variable.\r\n\r\nClick also makes it easy to parse arguments from environment variables.  If you're amenable, I may submit a patch for only that, which would update each sub-command to check for a DB/SQLITE_UTILS_DB environment variable. The goal would be usage that looks like: `DB=./convenient.db sqlite-utils [operation] [args]`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 923697888, "label": "Support db as first parameter before subcommand, or as environment variable"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/272#issuecomment-861944202", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/272", "id": 861944202, "node_id": "MDEyOklzc3VlQ29tbWVudDg2MTk0NDIwMg==", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2021-06-16T01:41:03Z", "updated_at": "2021-06-16T01:41:03Z", "author_association": "CONTRIBUTOR", "body": "So, I do things like this a lot, too. I like the idea of piping in from stdin. Something like this would be nice to do in a makefile:\r\n\r\n```sh\r\ncat file.csv | sqlite-utils --csv --table data - 'SELECT * FROM data WHERE col=\"whatever\"' > filtered.csv\r\n```\r\n\r\nIf you assumed that you're always piping out the same format you're piping in, the option names don't have to change. Depends how much you want to change formats.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 921878733, "label": "Idea: import CSV to memory, run SQL, export in a single command"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-964205475", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 964205475, "node_id": "IC_kwDOCGYnMM45eJuj", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-11-09T14:31:29Z", "updated_at": "2021-11-09T14:31:29Z", "author_association": "CONTRIBUTOR", "body": "i was just reaching for a tool to do this this morning", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1032120014", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 1032120014, "node_id": "IC_kwDOCGYnMM49hObO", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-08T01:32:34Z", "updated_at": "2022-02-08T01:32:34Z", "author_association": "CONTRIBUTOR", "body": "if you are curious about prior art, https://github.com/jsnell/json-to-multicsv is really good!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-953911245", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 953911245, "node_id": "IC_kwDOCGYnMM4424fN", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2021-10-28T14:37:55Z", "updated_at": "2021-10-28T14:37:55Z", "author_association": "CONTRIBUTOR", "body": "I've been thinking about this a bit lately, doing a project that involves moving a lot of data in and out of SQLite files, datasette and GeoJSON. This has me leaning toward the idea that something like [`datasette query`](https://github.com/simonw/datasette/issues/1356) would be a better place to do async queries.\r\n\r\nI know there's a lot of overlap in sqlite-utils and datasette, and maybe keeping sqlite-utils synchronous would let datasette be entirely async and give a cleaner separation of implementations.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787121933", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787121933, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzEyMTkzMw==", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2021-02-27T19:18:57Z", "updated_at": "2021-02-27T19:18:57Z", "author_association": "CONTRIBUTOR", "body": "I think HTTPX gets it exactly right, with a clear separation between sync and async clients, each with a basically identical API. (I'm about to switch [feed-to-sqlite](https://github.com/eyeseast/feed-to-sqlite) over to it, from Requests, to eventually make way for async support.)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/145#issuecomment-683382252", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/145", "id": 683382252, "node_id": "MDEyOklzc3VlQ29tbWVudDY4MzM4MjI1Mg==", "user": {"value": 96218, "label": "simonwiles"}, "created_at": "2020-08-30T06:27:25Z", "updated_at": "2020-08-30T06:27:52Z", "author_association": "CONTRIBUTOR", "body": "Note: had to adjust the test above because trying to exhaust a `SQLITE_MAX_VARIABLE_NUMBER` of 250000 in 99 records requires 2526 columns, and trips the ` \"Rows can have a maximum of {} columns\".format(SQLITE_MAX_VARS)` check even before it trips the default `SQLITE_MAX_COLUMN` value (2000).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688659182, "label": "Bug when first record contains fewer columns than subsequent records"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682815377", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/139", "id": 682815377, "node_id": "MDEyOklzc3VlQ29tbWVudDY4MjgxNTM3Nw==", "user": {"value": 96218, "label": "simonwiles"}, "created_at": "2020-08-28T16:14:58Z", "updated_at": "2020-08-28T16:14:58Z", "author_association": "CONTRIBUTOR", "body": "Thanks!  And yeah, I had updating the docs on my list too :)  Will try to get to it this afternoon (budgeting time is fraught with uncertainty at the moment!).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 686978131, "label": "insert_all(..., alter=True) should work for new columns introduced after the first 100 records"}, "performed_via_github_app": null}