{"html_url": "https://github.com/simonw/datasette/issues/1177#issuecomment-1074017633", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1177", "id": 1074017633, "node_id": "IC_kwDOBm6k_c5ABDVh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T15:08:51Z", "updated_at": "2022-03-21T15:08:51Z", "author_association": "OWNER", "body": "Related:\r\n- #1062 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 780153562, "label": "Ability to stream all rows as newline-delimited JSON"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1074019047", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1074019047, "node_id": "IC_kwDOBm6k_c5ABDrn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T15:09:56Z", "updated_at": "2022-03-21T15:09:56Z", "author_association": "OWNER", "body": "I should research how much overhead creating a new connection costs - it may be that an easy way to solve this is to create a dedicated connection for the query and then close that connection at the end.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1660#issuecomment-1074136176", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1660", "id": 1074136176, "node_id": "IC_kwDOBm6k_c5ABgRw", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:38:46Z", "updated_at": "2022-03-21T16:38:46Z", "author_association": "OWNER", "body": "I'm going to refactor this stuff out and document it so it can be easily used by 
plugins:\r\n\r\nhttps://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L69-L103", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1170144879, "label": "Refactor and simplify Datasette routing and views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074142617", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074142617, "node_id": "IC_kwDOBm6k_c5ABh2Z", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:45:27Z", "updated_at": "2022-03-21T16:45:27Z", "author_association": "OWNER", "body": "Though at that point `check_permission` is such a light wrapper around `self.ds.permission_allowed()` that there's little point in it existing at all.\r\n\r\nSo maybe `check_permissions()` becomes `ds.permissions_allowed()`.\r\n\r\n`permission_allowed()` vs. 
`permissions_allowed()` is a bit of a subtle naming difference, but I think it works.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074143209", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074143209, "node_id": "IC_kwDOBm6k_c5ABh_p", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:46:05Z", "updated_at": "2022-03-21T16:46:05Z", "author_association": "OWNER", "body": "The other difference though is that `ds.permission_allowed(...)` works against an actor, while `check_permission()` works against a request (though just to access `request.actor`).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074141457", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074141457, "node_id": "IC_kwDOBm6k_c5ABhkR", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:44:09Z", "updated_at": "2022-03-21T16:44:09Z", "author_association": "OWNER", "body": "A slightly odd thing about these methods is that they either fail silently or they raise a `Forbidden` exception.\r\n\r\nMaybe they should instead return `True` or `False` and the calling code could decide if it wants to raise the exception? 
That would make them more usable and a little less surprising.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074158890", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074158890, "node_id": "IC_kwDOBm6k_c5ABl0q", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:57:15Z", "updated_at": "2022-03-21T16:57:15Z", "author_association": "OWNER", "body": "Idea: `ds.permission_allowed()` continues to just return `True` or `False`.\r\n\r\nA new `ds.ensure_permissions(...)` method is added which raises a `Forbidden` exception if a check fails (hence the different name).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074156779", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074156779, "node_id": "IC_kwDOBm6k_c5ABlTr", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:55:08Z", "updated_at": "2022-03-21T16:56:02Z", "author_association": "OWNER", "body": "One benefit of the current design of `check_permissions` that raises an exception is that the exception includes information on WHICH of the permission checks failed. Returning just `True` or `False` loses that information.\r\n\r\nI could return an object which evaluates to `False` but also carries extra information? 
Bit weird, I've never seen anything like that in other Python code.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074161523", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074161523, "node_id": "IC_kwDOBm6k_c5ABmdz", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T16:59:55Z", "updated_at": "2022-03-21T17:00:03Z", "author_association": "OWNER", "body": "Also calling that function `permissions_allowed()` is confusing because there is a plugin hook with a similar name already: https://docs.datasette.io/en/stable/plugin_hooks.html#permission-allowed-datasette-actor-action-resource", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1675#issuecomment-1074177827", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1675", "id": 1074177827, "node_id": "IC_kwDOBm6k_c5ABqcj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T17:14:31Z", "updated_at": "2022-03-21T17:14:31Z", "author_association": "OWNER", "body": "Updated documentation: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/docs/internals.rst#await-ensure_permissionsactor-permissions\r\n\r\n> This method allows multiple permissions to be checked at once. 
It raises a `datasette.Forbidden` exception if any of the checks are denied before one of them is explicitly granted.\r\n> \r\n> This is useful when you need to check multiple permissions at once. For example, an actor should be able to view a table if either one of the following checks returns `True` or not a single one of them returns `False`:\r\n\r\nThat's pretty hard to understand! I'm going to open a separate issue to reconsider if this is a useful enough abstraction given how confusing it is.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175648453, "label": "Extract out `check_permissions()` from `BaseView"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1676#issuecomment-1074178865", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1676", "id": 1074178865, "node_id": "IC_kwDOBm6k_c5ABqsx", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T17:15:27Z", "updated_at": "2022-03-21T17:15:27Z", "author_association": "OWNER", "body": "This method here: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/app.py#L632-L664", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175690070, "label": "Reconsider ensure_permissions() logic, can it be less confusing?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1676#issuecomment-1074180312", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1676", "id": 1074180312, "node_id": "IC_kwDOBm6k_c5ABrDY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T17:16:45Z", "updated_at": "2022-03-21T17:16:45Z", "author_association": "OWNER", "body": "When looking at this code earlier I assumed that the 
following would check each permission in turn and fail if any of them failed:\r\n```python\r\nawait self.ds.ensure_permissions(\r\n request.actor,\r\n [\r\n (\"view-table\", (database, table)),\r\n (\"view-database\", database),\r\n \"view-instance\",\r\n ]\r\n)\r\n```\r\nBut it's not quite that simple: if any of them fail, it fails... but if an earlier one returns `True` the whole stack passes even if there would have been a failure later on!\r\n\r\nIf that is indeed the right abstraction, I need to work to make the documentation as clear as possible.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175690070, "label": "Reconsider ensure_permissions() logic, can it be less confusing?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1677#issuecomment-1074184240", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1677", "id": 1074184240, "node_id": "IC_kwDOBm6k_c5ABsAw", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T17:20:17Z", "updated_at": "2022-03-21T17:20:17Z", "author_association": "OWNER", "body": "https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/views/base.py#L69-L77\r\n\r\nThis is weirdly different from how `check_permissions()` used to work, in that it doesn't differentiate between `None` and `False`.\r\n\r\nhttps://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L79-L103", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175694248, "label": "Remove `check_permission()` from `BaseView`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/276#issuecomment-1074479768", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/276", "id": 1074479768, "node_id": "IC_kwDOBm6k_c5AC0KY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:22:20Z", "updated_at": "2022-03-21T22:22:20Z", "author_association": "OWNER", "body": "I'm closing this issue because this is now solved by a number of neat plugins:\r\n\r\n- https://datasette.io/plugins/datasette-geojson-map shows the geometry from SpatiaLite columns on a map\r\n- https://datasette.io/plugins/datasette-leaflet-geojson can be used to display inline maps next to each column", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 324835838, "label": "Handle spatialite geometry columns better"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/339#issuecomment-1074479932", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/339", "id": 1074479932, "node_id": "IC_kwDOBm6k_c5AC0M8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:22:34Z", "updated_at": "2022-03-21T22:22:34Z", "author_association": "OWNER", "body": "Closing this as obsolete since Datasette no longer uses Sanic.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 340396247, "label": "Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1660#issuecomment-1074287177", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1660", "id": 1074287177, "node_id": "IC_kwDOBm6k_c5ACFJJ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T18:51:42Z", "updated_at": "2022-03-21T18:51:42Z", "author_association": "OWNER", "body": "`BaseView` is looking a LOT slimmer now that I've moved 
all of the permissions stuff out of it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1170144879, "label": "Refactor and simplify Datasette routing and views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1678#issuecomment-1074302559", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1678", "id": 1074302559, "node_id": "IC_kwDOBm6k_c5ACI5f", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:04:03Z", "updated_at": "2022-03-21T19:04:03Z", "author_association": "OWNER", "body": "Documentation: https://docs.datasette.io/en/latest/internals.html#await-check-visibility-actor-action-resource-none", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175715988, "label": "Make `check_visibility()` a documented API"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1660#issuecomment-1074321862", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1660", "id": 1074321862, "node_id": "IC_kwDOBm6k_c5ACNnG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:19:01Z", "updated_at": "2022-03-21T19:19:01Z", "author_association": "OWNER", "body": "I've simplified this a ton now. 
I'm going to keep working on this in the long-term but I think this issue can be closed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1170144879, "label": "Refactor and simplify Datasette routing and views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074331743", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074331743, "node_id": "IC_kwDOBm6k_c5ACQBf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:30:05Z", "updated_at": "2022-03-21T19:30:05Z", "author_association": "OWNER", "body": "https://github.com/simonw/datasette/blob/1a7750eb29fd15dd2eea3b9f6e33028ce441b143/datasette/app.py#L118-L122 sets it to 50ms for facet suggestion but that's not going to pass `ms < 50`:\r\n\r\n```python\r\n Setting(\r\n \"facet_suggest_time_limit_ms\",\r\n 50,\r\n \"Time limit for calculating a suggested facet\",\r\n ),\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074332325", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074332325, "node_id": "IC_kwDOBm6k_c5ACQKl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:30:44Z", "updated_at": "2022-03-21T19:30:44Z", "author_association": "OWNER", "body": "So it looks like even for facet suggestion `n=1000` always - it's never reduced to `n=1`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, 
"label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074332718", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074332718, "node_id": "IC_kwDOBm6k_c5ACQQu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:31:10Z", "updated_at": "2022-03-21T19:31:10Z", "author_association": "OWNER", "body": "How long does it take for SQLite to execute 1000 opcodes anyway?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074337997", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074337997, "node_id": "IC_kwDOBm6k_c5ACRjN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:37:08Z", "updated_at": "2022-03-21T19:37:08Z", "author_association": "OWNER", "body": "This is weird:\r\n```python\r\nimport sqlite3\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\n\r\ni = 0\r\n\r\ndef count():\r\n global i\r\n i += 1\r\n\r\n\r\ndb.set_progress_handler(count, 1)\r\n\r\ndb.execute(\"\"\"\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n)\r\nselect * from counter limit 10000;\r\n\"\"\")\r\n\r\nprint(i)\r\n```\r\nOutputs `24`. 
But if you try the same thing in the SQLite console:\r\n```\r\nsqlite> .stats vmstep\r\nsqlite> with recursive counter(x) as (\r\n ...> select 0\r\n ...> union\r\n ...> select x + 1 from counter\r\n ...> )\r\n ...> select * from counter limit 10000;\r\n...\r\nVM-steps: 200007\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074341924", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074341924, "node_id": "IC_kwDOBm6k_c5ACSgk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:42:08Z", "updated_at": "2022-03-21T19:42:08Z", "author_association": "OWNER", "body": "Here's the Python-C implementation of `set_progress_handler`: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5b12c15455bd2a41c335/Modules/_sqlite/connection.c#L1177-L1201\r\n\r\nIt calls `sqlite3_progress_handler(self->db, n, progress_callback, ctx);`\r\n\r\nhttps://www.sqlite.org/c3ref/progress_handler.html says:\r\n\r\n> The parameter N is the approximate number of [virtual machine instructions](https://www.sqlite.org/opcode.html) that are evaluated between successive invocations of the callback X\r\n\r\nSo maybe VM-steps and virtual machine instructions are different things?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074347023", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 
1074347023, "node_id": "IC_kwDOBm6k_c5ACTwP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:48:59Z", "updated_at": "2022-03-21T19:48:59Z", "author_association": "OWNER", "body": "Posed a question about that here: https://sqlite.org/forum/forumpost/de9ff10fa7", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1676#issuecomment-1074378472", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1676", "id": 1074378472, "node_id": "IC_kwDOBm6k_c5ACbbo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T20:18:10Z", "updated_at": "2022-03-21T20:18:10Z", "author_association": "OWNER", "body": "Maybe there is a better name for this method that helps emphasize its cascading nature.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175690070, "label": "Reconsider ensure_permissions() logic, can it be less confusing?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074439309", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074439309, "node_id": "IC_kwDOBm6k_c5ACqSN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:28:58Z", "updated_at": "2022-03-21T21:28:58Z", "author_association": "OWNER", "body": "David Raymond solved it there: https://sqlite.org/forum/forumpost/330c8532d8a88bcd\r\n\r\n> Don't forget to step through the results. 
All .execute() has done is prepared it.\r\n>\r\n> db.execute(query).fetchall()\r\n\r\nSure enough, adding that gets the VM steps number up to 190,007 which is close enough that I'm happy.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074446576", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074446576, "node_id": "IC_kwDOBm6k_c5ACsDw", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:38:27Z", "updated_at": "2022-03-21T21:38:27Z", "author_association": "OWNER", "body": "OK here's a microbenchmark script:\r\n```python\r\nimport sqlite3\r\nimport timeit\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\ndb_with_progress_handler_1 = sqlite3.connect(\":memory:\")\r\ndb_with_progress_handler_1000 = sqlite3.connect(\":memory:\")\r\n\r\ndb_with_progress_handler_1.set_progress_handler(lambda: None, 1)\r\ndb_with_progress_handler_1000.set_progress_handler(lambda: None, 1000)\r\n\r\ndef execute_query(db):\r\n cursor = db.execute(\"\"\"\r\n with recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n )\r\n select * from counter limit 10000;\r\n \"\"\")\r\n list(cursor.fetchall())\r\n\r\n\r\nprint(\"Without progress_handler\")\r\nprint(timeit.timeit(lambda: execute_query(db), number=100))\r\n\r\nprint(\"progress_handler every 1000 ops\")\r\nprint(timeit.timeit(lambda: execute_query(db_with_progress_handler_1000), number=100))\r\n\r\nprint(\"progress_handler every 1 op\")\r\nprint(timeit.timeit(lambda: execute_query(db_with_progress_handler_1), number=100))\r\n```\r\nResults:\r\n```\r\n% python3 bench.py\r\nWithout progress_handler\r\n0.8789225700311363\r\nprogress_handler every 1000 
ops\r\n0.8829826560104266\r\nprogress_handler every 1 op\r\n2.8892734259716235\r\n```\r\n\r\nSo running every 1000 ops makes almost no difference at all, but running every single op is a 3.2x performance degradation.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074458506", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074458506, "node_id": "IC_kwDOBm6k_c5ACu-K", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:53:47Z", "updated_at": "2022-03-21T21:53:47Z", "author_association": "OWNER", "body": "Oh interesting, it turns out there is ONE place in the code that sets the `ms` to less than 20 - this test fixture: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/tests/fixtures.py#L224-L226", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074454687", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074454687, "node_id": "IC_kwDOBm6k_c5ACuCf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:48:02Z", "updated_at": "2022-03-21T21:48:02Z", "author_association": "OWNER", "body": "Here's another microbenchmark that measures how many nanoseconds it takes to run 1,000 vmops:\r\n\r\n```python\r\nimport sqlite3\r\nimport time\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\n\r\ni = 0\r\nout = []\r\n\r\ndef count():\r\n 
global i\r\n i += 1000\r\n out.append(((i, time.perf_counter_ns())))\r\n\r\ndb.set_progress_handler(count, 1000)\r\n\r\nprint(\"Start:\", time.perf_counter_ns())\r\nall = db.execute(\"\"\"\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n)\r\nselect * from counter limit 10000;\r\n\"\"\").fetchall()\r\nprint(\"End:\", time.perf_counter_ns())\r\n\r\nprint()\r\nprint(\"So how long does it take to execute 1000 ops?\")\r\n\r\nprev_time_ns = None\r\nfor i, time_ns in out:\r\n if prev_time_ns is not None:\r\n print(time_ns - prev_time_ns, \"ns\")\r\n prev_time_ns = time_ns\r\n```\r\nRunning it:\r\n```\r\n% python nanobench.py\r\nStart: 330877620374821\r\nEnd: 330877632515822\r\n\r\nSo how long does it take to execute 1000 ops?\r\n47290 ns\r\n49573 ns\r\n48226 ns\r\n45674 ns\r\n53238 ns\r\n47313 ns\r\n52346 ns\r\n48689 ns\r\n47092 ns\r\n87596 ns\r\n69999 ns\r\n52522 ns\r\n52809 ns\r\n53259 ns\r\n52478 ns\r\n53478 ns\r\n65812 ns\r\n```\r\n87596ns is 0.087596ms - so even a measurement rate of every 1000 ops is easily fine-grained enough to capture differences of less than 0.1ms.\r\n\r\nIf anything I could bump that default 1000 up - and I can definitely eliminate the `if ms < 50` branch entirely.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074459746", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074459746, "node_id": "IC_kwDOBm6k_c5ACvRi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:55:45Z", "updated_at": "2022-03-21T21:55:45Z", "author_association": "OWNER", "body": "I'm going to change the original logic to set n=1 for times that are `<= 20ms` - and update the 
comments to make it more obvious what is happening.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1671#issuecomment-1074465536", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1671", "id": 1074465536, "node_id": "IC_kwDOBm6k_c5ACwsA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:04:31Z", "updated_at": "2022-03-21T22:04:31Z", "author_association": "OWNER", "body": "Oh this is fascinating! I replicated the bug (thanks for the steps to reproduce) and it looks like this is down to the following:\r\n\r\n\"image\"\r\n\r\nAgainst views, `where has_expired = 1` returns different results from `where has_expired = '1'`\r\n\r\nThis doesn't happen against tables because of SQLite's [type affinity](https://www.sqlite.org/datatype3.html#type_affinity) mechanism, which handles the type conversion automatically.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1174655187, "label": "Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1671#issuecomment-1074470568", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1671", "id": 1074470568, "node_id": "IC_kwDOBm6k_c5ACx6o", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:11:14Z", "updated_at": "2022-03-21T22:12:49Z", "author_association": "OWNER", "body": "I wonder if this will be a problem with generated columns, or with SQLite strict tables?\r\n\r\nMy hunch is that 
strict tables will continue to work without any changes, because https://www.sqlite.org/stricttables.html says nothing about their impact on comparison operations. I should test this to make absolutely sure though.\r\n\r\nGenerated columns have a type, so my hunch is they will continue to work fine too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1174655187, "label": "Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1671#issuecomment-1074468450", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1671", "id": 1074468450, "node_id": "IC_kwDOBm6k_c5ACxZi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:08:35Z", "updated_at": "2022-03-21T22:10:00Z", "author_association": "OWNER", "body": "Relevant section of the SQLite documentation: [3.2. Affinity Of Expressions](https://www.sqlite.org/datatype3.html#affinity_of_expressions):\r\n\r\n> When an expression is a simple reference to a column of a real table (not a [VIEW](https://www.sqlite.org/lang_createview.html) or subquery) then the expression has the same affinity as the table column.\r\n\r\nIn your example, `has_expired` is no longer a simple reference to a column of a real table, hence the bug.\r\n\r\nThen [4.2. Type Conversions Prior To Comparison](https://www.sqlite.org/datatype3.html#type_conversions_prior_to_comparison) fills in the rest:\r\n\r\n> SQLite may attempt to convert values between the storage classes INTEGER, REAL, and/or TEXT before performing a comparison. Whether or not any conversions are attempted before the comparison takes place depends on the type affinity of the operands. 
", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1174655187, "label": "Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1671#issuecomment-1074478299", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1671", "id": 1074478299, "node_id": "IC_kwDOBm6k_c5ACzzb", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T22:20:26Z", "updated_at": "2022-03-21T22:20:26Z", "author_association": "OWNER", "body": "Thinking about options for fixing this...\r\n\r\nThe following query works fine:\r\n```sql\r\nselect * from test_view where cast(has_expired as text) = '1'\r\n```\r\nI don't want to start using this for every query, because one of the goals of Datasette is to help people who are learning SQL:\r\n- #1613\r\n\r\nIf someone clicks on \"View and edit SQL\" from a filtered table page I don't want them to have to wonder why that `cast` is there.\r\n\r\nBut... 
for querying views, the `cast` turns out to be necessary.\r\n\r\nSo one fix would be to get the SQL generating logic to use casts like this any time it is operating against a view.\r\n\r\nAn even better fix would be to detect which columns in a view come from a table and which ones might not, and only use casts for the columns that aren't definitely from a table.\r\n\r\nThe trick I was exploring here might be able to help with that:\r\n- #1293 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1174655187, "label": "Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073450588", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073450588, "node_id": "IC_kwDOCGYnMM4_-45c", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:32:58Z", "updated_at": "2022-03-21T03:32:58Z", "author_association": "OWNER", "body": "Then I ran this to convert `2016-03-27` etc to `2016/03/27` so I could see which ones were later converted:\r\n\r\n sqlite-utils convert test-dates.db dates date 'value.replace(\"-\", \"/\")'\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073448904", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073448904, "node_id": "IC_kwDOCGYnMM4_-4fI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:28:12Z", "updated_at": 
"2022-03-21T03:30:37Z", "author_association": "OWNER", "body": "Generating a test database using a pattern from https://www.geekytidbits.com/date-range-table-sqlite/\r\n```\r\nsqlite-utils create-database test-dates.db\r\nsqlite-utils create-table test-dates.db dates id integer date text --pk id\r\nsqlite-utils test-dates.db \"WITH RECURSIVE\r\n cnt(x) AS (\r\n SELECT 0\r\n UNION ALL\r\n SELECT x+1 FROM cnt\r\n LIMIT (SELECT ((julianday('2016-04-01') - julianday('2016-03-15'))) + 1)\r\n )\r\ninsert into dates (date) select date(julianday('2016-03-15'), '+' || x || ' days') as date FROM cnt;\"\r\n```\r\nAfter running that:\r\n```\r\n% sqlite-utils rows test-dates.db dates\r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": \"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": \"2016-03-24\"},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n```\r\nThen to make one of them invalid:\r\n\r\n sqlite-utils test-dates.db \"update dates set date = '//' where id = 10\"", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073451659", "issue_url": 
"https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073451659, "node_id": "IC_kwDOCGYnMM4_-5KL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:35:01Z", "updated_at": "2022-03-21T03:35:01Z", "author_association": "OWNER", "body": "I confirmed that if it fails for any value ALL values are left alone, since it runs in a transaction.\r\n\r\nHere's the code that does that:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L2523-L2526", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453230", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073453230, "node_id": "IC_kwDOCGYnMM4_-5iu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:40:37Z", "updated_at": "2022-03-21T03:40:37Z", "author_association": "OWNER", "body": "I think the options here should be:\r\n\r\n- On error, raise an exception and revert the transaction (the current default)\r\n- On error, leave the value as-is\r\n- On error, set the value to `None`\r\n\r\nThese need to be indicated by parameters to the `r.parsedate()` function.\r\n\r\nSome design options:\r\n\r\n- `ignore=True` to ignore errors - but how does it know if it should leave the value or set it to `None`? This is similar to other `ignore=True` parameters elsewhere in the Python API.\r\n- `errors=\"ignore\"`, `errors=\"set-null\"` - I don't like magic string values very much, but this is similar to Python's `str.encode(errors=)` mechanism\r\n- `errors=r.IGNORE` - using constants, which at least avoids magic strings. 
The other one could be `errors=r.SET_NULL`\r\n- `error=lambda v: None` or `error=lambda v: v` - this is a bit confusing though, introducing another callback that gets to have a go at converting the error if the first callback failed? And what happens if that lambda itself raises an error?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453370", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073453370, "node_id": "IC_kwDOCGYnMM4_-5k6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:41:06Z", "updated_at": "2022-03-21T03:41:06Z", "author_association": "OWNER", "body": "I'm going to try the `errors=r.IGNORE` option and see what that looks like once implemented.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073455905", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073455905, "node_id": "IC_kwDOCGYnMM4_-6Mh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:44:47Z", "updated_at": "2022-03-21T03:45:00Z", "author_association": "OWNER", "body": "This is quite nice:\r\n```\r\n% sqlite-utils convert test-dates.db dates date \"r.parsedate(value, errors=r.IGNORE)\"\r\n [####################################] 100%\r\n% sqlite-utils rows test-dates.db dates \r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": 
\"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": \"//\"},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n% sqlite-utils convert test-dates.db dates date \"r.parsedate(value, errors=r.SET_NULL)\"\r\n [####################################] 100%\r\n% sqlite-utils rows test-dates.db dates \r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": \"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": null},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456155", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073456155, "node_id": "IC_kwDOCGYnMM4_-6Qb", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:45:37Z", "updated_at": "2022-03-21T03:45:37Z", "author_association": "OWNER", "body": "Prototype:\r\n```diff\r\ndiff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py\r\nindex 8255b56..0a3693e 100644\r\n--- a/sqlite_utils/cli.py\r\n+++ b/sqlite_utils/cli.py\r\n@@ -2583,7 +2583,11 @@ def _generate_convert_help():\r\n \"\"\"\r\n ).strip()\r\n recipe_names = [\r\n- n for n in dir(recipes) if not n.startswith(\"_\") and n not in (\"json\", \"parser\")\r\n+ n\r\n+ for n in dir(recipes)\r\n+ if not n.startswith(\"_\")\r\n+ and n not in (\"json\", \"parser\")\r\n+ and callable(getattr(recipes, n))\r\n ]\r\n for name in recipe_names:\r\n fn = getattr(recipes, name)\r\ndiff --git a/sqlite_utils/recipes.py b/sqlite_utils/recipes.py\r\nindex 6918661..569c30d 100644\r\n--- a/sqlite_utils/recipes.py\r\n+++ b/sqlite_utils/recipes.py\r\n@@ -1,17 +1,38 @@\r\n from dateutil import parser\r\n import json\r\n \r\n+IGNORE = object()\r\n+SET_NULL = object()\r\n \r\n-def parsedate(value, dayfirst=False, yearfirst=False):\r\n+\r\n+def parsedate(value, dayfirst=False, yearfirst=False, errors=None):\r\n \"Parse a date and convert it to ISO date format: yyyy-mm-dd\"\r\n- return (\r\n- parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()\r\n- )\r\n+ try:\r\n+ return (\r\n+ parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst)\r\n+ .date()\r\n+ .isoformat()\r\n+ )\r\n+ except parser.ParserError:\r\n+ if errors is IGNORE:\r\n+ return value\r\n+ elif errors is SET_NULL:\r\n+ return None\r\n+ else:\r\n+ raise\r\n \r\n \r\n-def parsedatetime(value, dayfirst=False, yearfirst=False):\r\n+def parsedatetime(value, dayfirst=False, yearfirst=False, errors=None):\r\n \"Parse a datetime and convert 
it to ISO datetime format: yyyy-mm-ddTHH:MM:SS\"\r\n- return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()\r\n+ try:\r\n+ return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()\r\n+ except parser.ParserError:\r\n+ if errors is IGNORE:\r\n+ return value\r\n+ elif errors is SET_NULL:\r\n+ return None\r\n+ else:\r\n+ raise\r\n \r\n \r\n def jsonsplit(value, delimiter=\",\", type=str):\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456222", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073456222, "node_id": "IC_kwDOCGYnMM4_-6Re", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:45:52Z", "updated_at": "2022-03-21T03:45:52Z", "author_association": "OWNER", "body": "Needs tests and documentation.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073463375", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/415", "id": 1073463375, "node_id": "IC_kwDOCGYnMM4_-8BP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T04:02:36Z", "updated_at": "2022-03-21T04:02:36Z", "author_association": "OWNER", "body": "Thanks for the really clear steps to reproduce!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", 
"issue": {"value": 1171599874, "label": "Convert with `--multi` and `--dry-run` flag does not work"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073468996", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/415", "id": 1073468996, "node_id": "IC_kwDOCGYnMM4_-9ZE", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T04:14:42Z", "updated_at": "2022-03-21T04:14:42Z", "author_association": "OWNER", "body": "I can fix this like so:\r\n```\r\n% sqlite-utils convert demo.db demo foo '{\"foo\": \"bar\"}' --multi --dry-run\r\nabc\r\n --- becomes:\r\n{\"foo\": \"bar\"}\r\n\r\nWould affect 1 row\r\n```\r\nDiff is this:\r\n```diff\r\ndiff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py\r\nindex 0cf0468..b2a0440 100644\r\n--- a/sqlite_utils/cli.py\r\n+++ b/sqlite_utils/cli.py\r\n@@ -2676,7 +2676,10 @@ def convert(\r\n raise click.ClickException(str(e))\r\n if dry_run:\r\n # Pull first 20 values for first column and preview them\r\n- db.conn.create_function(\"preview_transform\", 1, lambda v: fn(v) if v else v)\r\n+ preview = lambda v: fn(v) if v else v\r\n+ if multi:\r\n+ preview = lambda v: json.dumps(fn(v), default=repr) if v else v\r\n+ db.conn.create_function(\"preview_transform\", 1, preview)\r\n sql = \"\"\"\r\n select\r\n [{column}] as value,\r\n```", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1171599874, "label": "Convert with `--multi` and `--dry-run` flag does not work"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1074243540", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/417", "id": 1074243540, "node_id": "IC_kwDOCGYnMM5AB6fU", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T18:08:03Z", "updated_at": "2022-03-21T18:08:03Z", 
"author_association": "OWNER", "body": "I've not really thought about standards as much here as I should. It looks like there are two competing specs for newline-delimited JSON!\r\n\r\nhttp://ndjson.org/ is the one I've been using in `sqlite-utils` - and https://github.com/ndjson/ndjson-spec#31-serialization says:\r\n\r\n> The JSON texts MUST NOT contain newlines or carriage returns.\r\n\r\nhttps://jsonlines.org/ is the other one. It is slightly less clear, but it does say this:\r\n\r\n> 2. Each Line is a Valid JSON Value\r\n>\r\n> The most common values will be objects or arrays, but any JSON value is permitted.\r\n\r\nMy interpretation of both of these is that newlines in the middle of a JSON object shouldn't be allowed.\r\n\r\nSo what's `jq` doing here? It looks to me like that `jq` format is its own thing - it's not actually compatible with either of those two loose specs described above.\r\n\r\nThe `jq` docs seem to call this \"whitespace-separated JSON\": https://stedolan.github.io/jq/manual/v1.6/#Invokingjq\r\n\r\nThe thing I like about newline-delimited JSON is that it's really trivial to parse - loop through each line, run it through `json.loads()` and that's it. No need to try and unwrap JSON objects that might span multiple lines.\r\n\r\nUnless someone has written a robust Python implementation of a `jq`-compatible whitespace-separated JSON parser, I'm inclined to leave this as is. 
I'd be fine adding some documentation that helps point people towards `jq -c` though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175744654, "label": "insert fails on JSONL with whitespace"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1074256603", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/417", "id": 1074256603, "node_id": "IC_kwDOCGYnMM5AB9rb", "user": {"value": 9954, "label": "blaine"}, "created_at": "2022-03-21T18:19:41Z", "updated_at": "2022-03-21T18:19:41Z", "author_association": "NONE", "body": "That makes sense; just a little hint that points folks towards doing the right thing might be helpful!\r\n\r\nfwiw, the reason I was using jq in the first place was just a quick way to extract one attribute from an actual JSON array. When I initially imported it, I got a table with a bunch of embedded JSON values, rather than a native table, because each array entry had two attributes, one with the data I _actually_ wanted. Not sure how common a use-case this is, though (and easily fixed, aside from the jq weirdness!)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175744654, "label": "insert fails on JSONL with whitespace"}, "performed_via_github_app": null}