{"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1045069481", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1045069481, "node_id": "IC_kwDOBm6k_c4-Sn6p", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-18T19:34:41Z", "updated_at": "2022-03-05T21:32:22Z", "author_association": "OWNER", "body": "I think I got format extraction working! https://regex101.com/r/A0bW1D/1\r\n\r\n ^/(?P[^/]+)/(?P(?:[^\\/\\-\\.]*|(?:\\-/)*|(?:\\-\\.)*|(?:\\-\\-)*)*?)(?:(?\\w+))?$\r\n\r\nI had to make that crazy inner one even more complicated to stop it from capturing `.` that was not part of `-.`.\r\n\r\n (?:[^\\/\\-\\.]*|(?:\\-/)*|(?:\\-\\.)*|(?:\\-\\-)*)*\r\n\r\nVisualized:\r\n\r\n\"image\"\r\n\r\nSo now I have a regex which can extract out the dot-encoded table name AND spot if there is an optional `.format` at the end:\r\n\r\n\"image\"\r\n\r\nIf I end up using this in Datasette it's going to need VERY comprehensive unit tests and inline documentation.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. 
?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1645#issuecomment-1059633902", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1645", "id": 1059633902, "node_id": "IC_kwDOBm6k_c4_KLru", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:03:06Z", "updated_at": "2022-03-05T01:03:06Z", "author_association": "OWNER", "body": "I agree: this is bad.\r\n\r\nIdeally, content served from `/static/` would apply best practices for static content serving - which to my mind means the following:\r\n\r\n- Where possible, serve with a far-future cache expiry header and use an asset URL that changes when the file itself changes\r\n- For assets without that, support conditional GET to avoid transferring the whole asset if it hasn't changed\r\n- Some kind of sensible mechanism for setting cache TTLs on assets that don't have a unique-file-per-version - in particular assets that might be served from plugins.\r\n\r\nDatasette half-implemented the first of these: if you view source on https://latest.datasette.io/ you'll see it links to `/-/static/app.css?cead5a` - which in the template looks like this:\r\n\r\nhttps://github.com/simonw/datasette/blob/dd94157f8958bdfe9f45575add934ccf1aba6d63/datasette/templates/base.html#L5\r\n\r\nI had forgotten I had implemented this! Here is how it is calculated:\r\n\r\nhttps://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L510-L516\r\n\r\nSo `app.css` right now could be safely served with a far-future cache header... 
only it isn't:\r\n```\r\n~ % curl -i 'https://latest.datasette.io/-/static/app.css?cead5a' \r\nHTTP/2 200 \r\ncontent-type: text/css\r\nx-databases: _memory, _internal, fixtures, extra_database\r\nx-cloud-trace-context: 9ddc825620eb53d30fc127d1c750f342\r\ndate: Sat, 05 Mar 2022 01:01:53 GMT\r\nserver: Google Frontend\r\ncontent-length: 16178\r\n```\r\nThe larger question though is what to do about other assets. I'm particularly interested in plugin assets, since visualization plugins like `datasette-vega` and `datasette-cluster-map` ship with large amounts of JavaScript and I'd really like that to be sensibly cached by default.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1154399841, "label": "Sensible `cache-control` headers for static assets, including those served by plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1645#issuecomment-1059634412", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1645", "id": 1059634412, "node_id": "IC_kwDOBm6k_c4_KLzs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:04:53Z", "updated_at": "2022-03-05T01:04:53Z", "author_association": "OWNER", "body": "The existing `app_css_hash` already isn't good enough, because I built that before `table.js` existed, and that file should obviously be smartly cached too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1154399841, "label": "Sensible `cache-control` headers for static assets, including those served by plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1645#issuecomment-1059634688", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1645", "id": 1059634688, "node_id": 
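The `?cead5a`-style asset URL discussed above can be sketched in miniature as follows. This is a hedged standalone version of the idea, not Datasette's actual `app_css_hash` code (that lives at the linked `app.py` lines): derive a short fingerprint from the asset's contents and append it as a query string, so the URL changes whenever the file does and the response can safely carry a far-future cache header.

```python
import hashlib

def asset_url(path, content):
    # Minimal sketch: a truncated strong hash of the file contents acts as
    # a cache-busting fingerprint; ~6 hex characters is plenty, since a
    # collision only means one stale cache hit, not a correctness bug.
    fingerprint = hashlib.sha256(content).hexdigest()[:6]
    return "{}?{}".format(path, fingerprint)
```

The fingerprint is deterministic for identical content, so templates can recompute it on every render (or cache it at startup, as Datasette does for `app.css`).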
"IC_kwDOBm6k_c4_KL4A", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:06:08Z", "updated_at": "2022-03-05T01:06:08Z", "author_association": "OWNER", "body": "It sounds like you can workaround this with Varnish configuration for the moment, but I'm going to bump this up the list of things to fix - it's particularly relevant now as I'd like to get a solution in place before Datasette 1.0, since it's likely to be beneficial to plugins and hence should be part of the stable, documented plugin interface.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1154399841, "label": "Sensible `cache-control` headers for static assets, including those served by plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1642#issuecomment-1059635969", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1642", "id": 1059635969, "node_id": "IC_kwDOBm6k_c4_KMMB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:11:17Z", "updated_at": "2022-03-05T01:11:17Z", "author_association": "OWNER", "body": "`pip install datasette` in a fresh virtual environment doesn't show any warnings.\r\n\r\nNeither does `pip install -e '.'` in a fresh checkout. 
Or `pip install -e '.[test]'`.\r\n\r\nClosing this as can't reproduce.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1152072027, "label": "Dependency issue with asgiref and uvicorn"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1640#issuecomment-1059636420", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1640", "id": 1059636420, "node_id": "IC_kwDOBm6k_c4_KMTE", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:13:26Z", "updated_at": "2022-03-05T01:13:26Z", "author_association": "OWNER", "body": "Hah, this is certainly unexpected.\r\n\r\nIt looks like this is the code in question: https://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L259-L266\r\n\r\nYou're right: it assumes that the file it is serving won't change length while it is serving it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1148725876, "label": "Support static assets where file length may change, e.g. 
logs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1640#issuecomment-1059638778", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1640", "id": 1059638778, "node_id": "IC_kwDOBm6k_c4_KM36", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:19:00Z", "updated_at": "2022-03-05T01:19:00Z", "author_association": "OWNER", "body": "The reason I implemented it like this was to support things like the `curl` progress bar if users decide to serve up large files using the `--static` mechanism.\r\n\r\nHere's the code that hooks it up to the URL resolver:\r\n\r\nhttps://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L1001-L1005\r\n\r\nWhich uses this function:\r\n\r\nhttps://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L285-L310\r\n\r\nOne option here would be to support a workaround that looks something like this:\r\n\r\n http://localhost:8001/my-static/log.txt?_unknown_size=1`\r\n\r\nThe URL routing code could then look out for that `?_unknown_size=1` option and, if it's present, omit the `content-length` header entirely.\r\n\r\nIt's a bit of a cludge, but it would be pretty straight-forward to implement.\r\n\r\nWould that work for you @broccolihighkicks?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1148725876, "label": "Support static assets where file length may change, e.g. 
logs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646247", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059646247, "node_id": "IC_kwDOCGYnMM4_KOsn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:51:03Z", "updated_at": "2022-03-05T01:51:03Z", "author_association": "OWNER", "body": "I considered two ways of doing this.\r\n\r\nFirst, have methods such as `db.query_df()` and `table.rows_df` which do the same as `.query()` and `table.rows` but return a DataFrame instead of a generator of dictionaries.\r\n\r\nSecond, have a compatibility class that is imported separately such as:\r\n```python\r\nfrom sqlite_utils.pandas import Database\r\n```\r\nThen have the `.query()` and `.rows` and other similar methods return dataframes.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646543", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059646543, "node_id": "IC_kwDOCGYnMM4_KOxP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:52:47Z", "updated_at": "2022-03-05T01:52:47Z", "author_association": "OWNER", "body": "I built a prototype of that second option and it looks pretty good:\r\n\r\n\"image\"\r\n\r\nHere's the `pandas.py` prototype:\r\n\r\n```python\r\nfrom .db import Database as _Database, Table as _Table, View as _View\r\nimport pandas as pd\r\nfrom typing import (\r\n Iterable,\r\n Union,\r\n Optional,\r\n)\r\n\r\n\r\nclass Database(_Database):\r\n def query(\r\n self, sql: str, params: Optional[Union[Iterable, dict]] = None\r\n ) -> pd.DataFrame:\r\n return pd.DataFrame(super().query(sql, 
params))\r\n\r\n def table(self, table_name: str, **kwargs) -> Union[\"Table\", \"View\"]:\r\n \"Return a table object, optionally configured with default options.\"\r\n klass = View if table_name in self.view_names() else Table\r\n return klass(self, table_name, **kwargs)\r\n\r\n\r\nclass PandasQueryable:\r\n def rows_where(\r\n self,\r\n where: str = None,\r\n where_args: Optional[Union[Iterable, dict]] = None,\r\n order_by: str = None,\r\n select: str = \"*\",\r\n limit: int = None,\r\n offset: int = None,\r\n ) -> pd.DataFrame:\r\n return pd.DataFrame(\r\n super().rows_where(\r\n where,\r\n where_args,\r\n order_by=order_by,\r\n select=select,\r\n limit=limit,\r\n offset=offset,\r\n )\r\n )\r\n\r\n\r\nclass Table(PandasQueryable, _Table):\r\n pass\r\n\r\n\r\nclass View(PandasQueryable, _View):\r\n pass\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646645", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059646645, "node_id": "IC_kwDOCGYnMM4_KOy1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T01:53:10Z", "updated_at": "2022-03-05T01:53:10Z", "author_association": "OWNER", "body": "I'm not an experienced enough Pandas user to know if this design is right or not. 
I'm going to leave this open for a while and solicit some feedback.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059647114", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059647114, "node_id": "IC_kwDOCGYnMM4_KO6K", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-03-05T01:54:24Z", "updated_at": "2022-03-05T01:54:24Z", "author_association": "CONTRIBUTOR", "body": "I haven't tried this, but it looks like Pandas has a method for this: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649193", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059649193, "node_id": "IC_kwDOCGYnMM4_KPap", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:00:02Z", "updated_at": "2022-03-05T02:00:02Z", "author_association": "OWNER", "body": "Yeah, I imagine there are plenty of ways to do this with Pandas already - I'm opportunistically looking for a way to provide better integration with the rest of the Pandas situation from the work I've done in `sqlite-utils` already.\r\n\r\nMight be that this isn't worth doing at all.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649213", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059649213, "node_id": "IC_kwDOCGYnMM4_KPa9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:00:10Z", "updated_at": "2022-03-05T02:00:10Z", "author_association": "OWNER", "body": "Requested feedback on Twitter here :https://twitter.com/simonw/status/1499927075930578948", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649803", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059649803, "node_id": "IC_kwDOCGYnMM4_KPkL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:02:41Z", "updated_at": "2022-03-05T02:02:41Z", "author_association": "OWNER", "body": "It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy:\r\n\r\n```\r\n...\r\nimport pandas as pd\r\npd.read_sql_query(db.conn, \"select * from articles\")\r\n# ImportError: Using URI string without sqlalchemy installed.\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059650190", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059650190, "node_id": "IC_kwDOCGYnMM4_KPqO", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:04:43Z", "updated_at": "2022-03-05T02:04:54Z", 
"author_association": "OWNER", "body": "To be honest, I'm having second thoughts about this now mainly because the idiom for turning a generator of dicts into a DataFrame is SO simple:\r\n\r\n```python\r\ndf = pd.DataFrame(db.query(\"select * from articles\"))\r\n```\r\nGiven it's that simple, I'm questioning if there's any value to adding this to `sqlite-utils` at all. This likely becomes a documentation thing instead!", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651056", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059651056, "node_id": "IC_kwDOCGYnMM4_KP3w", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:09:38Z", "updated_at": "2022-03-05T02:09:38Z", "author_association": "OWNER", "body": "OK, so reading results from existing `sqlite-utils` into a Pandas DataFrame turns out to be trivial.\r\n\r\nHow about writing a DataFrame to a database table?\r\n\r\nThat feels like it could a lot more useful.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651306", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059651306, "node_id": "IC_kwDOCGYnMM4_KP7q", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:10:49Z", "updated_at": "2022-03-05T02:10:49Z", "author_association": "OWNER", "body": "I could teach `.insert_all()` and `.upsert_all()` to optionally accept a DataFrame. 
A challenge there is `mypy` - if Pandas is an optional dependency, is it possible to declare types that accept a Union that includes DataFrame?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059652538", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059652538, "node_id": "IC_kwDOCGYnMM4_KQO6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T02:13:17Z", "updated_at": "2022-03-05T02:13:17Z", "author_association": "OWNER", "body": "> It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy:\r\n> \r\n> ```\r\n> ...\r\n> import pandas as pd\r\n> pd.read_sql_query(db.conn, \"select * from articles\")\r\n> # ImportError: Using URI string without sqlalchemy installed.\r\n> ```\r\nHah, no I was wrong about this: SQLAlchemy is not needed for SQLite to work, I just had the arguments the wrong way round:\r\n```python\r\npd.read_sql_query(\"select * from articles\", db.conn)\r\n# Shows a DataFrame\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059652834", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412", "id": 1059652834, "node_id": "IC_kwDOCGYnMM4_KQTi", "user": {"value": 596279, "label": "zaneselvans"}, "created_at": "2022-03-05T02:14:40Z", "updated_at": "2022-03-05T02:14:40Z", "author_association": "NONE", "body": "We do a lot of `df.to_sql()` to write into sqlite, mostly in [this 
module](https://github.com/catalyst-cooperative/pudl/blob/main/src/pudl/load.py#L25)", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160182768, "label": "Optional Pandas integration"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059802318", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059802318, "node_id": "IC_kwDOBm6k_c4_K0zO", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T17:34:33Z", "updated_at": "2022-03-05T17:34:33Z", "author_association": "OWNER", "body": "Wrote documentation:\r\n\r\n\"Dash\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. ?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1647#issuecomment-1059804577", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1647", "id": 1059804577, "node_id": "IC_kwDOBm6k_c4_K1Wh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T17:49:46Z", "updated_at": "2022-03-05T17:49:46Z", "author_association": "OWNER", "body": "My best guess is that this is an undocumented change in SQLite 3.38 - I get that test failure with that SQLite version.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160407071, "label": "Test failures with SQLite 3.37.0+ due to column affinity case"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1647#issuecomment-1059807598", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1647", "id": 
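One answer to the `mypy` question above: pandas only needs to exist for the type checker, not at runtime, so an optional-dependency Union can be declared under `typing.TYPE_CHECKING` with a string forward reference. A hypothetical sketch (`records_to_dicts` is an illustrative helper, not part of `sqlite-utils`):

```python
from typing import TYPE_CHECKING, Iterable, List, Union

if TYPE_CHECKING:
    # Only imported while type checking, so pandas stays optional at runtime.
    import pandas as pd

def records_to_dicts(records: Union[Iterable[dict], "pd.DataFrame"]) -> List[dict]:
    # Hypothetical normalisation step an insert_all() could apply: a
    # DataFrame becomes a list of row dicts, anything else is assumed to
    # already yield dicts. Duck-typing avoids importing pandas at runtime.
    if hasattr(records, "to_dict"):
        return records.to_dict(orient="records")
    return list(records)
```

The string annotation `"pd.DataFrame"` is never evaluated at runtime, so the module imports cleanly even when pandas is not installed.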
1059807598, "node_id": "IC_kwDOBm6k_c4_K2Fu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T18:06:56Z", "updated_at": "2022-03-05T18:08:00Z", "author_association": "OWNER", "body": "Had a look through the commits in https://github.com/sqlite/sqlite/compare/version-3.37.2...version-3.38.0 but couldn't see anything obvious that might have caused this.\r\n\r\nReally wish I had a good mechanism for running the test suite against different SQLite versions!\r\n\r\nMay have to revisit this old trick: https://til.simonwillison.net/sqlite/ld-preload", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160407071, "label": "Test failures with SQLite 3.37.0+ due to column affinity case"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1647#issuecomment-1059819628", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1647", "id": 1059819628, "node_id": "IC_kwDOBm6k_c4_K5Bs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T19:28:54Z", "updated_at": "2022-03-05T19:28:54Z", "author_association": "OWNER", "body": "OK, using that trick worked for testing this:\r\n\r\n docker run -it -p 8001:8001 ubuntu\r\n\r\nThen inside that container:\r\n\r\n apt-get install -y python3 build-essential tcl wget python3-pip git python3.8-venv\r\n\r\nFor each version of SQLite I wanted to test I needed to figure out the tarball URL - for example, for `3.38.0` I navigated to https://www.sqlite.org/src/timeline?t=version-3.38.0 and clicked the \"checkin\" link and copied the tarball link:\r\nhttps://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz\r\n\r\nThen to build it (the `CPPFLAGS` took some trial and error):\r\n```\r\ncd /tmp\r\nwget https://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz\r\ntar -xzvf SQLite-40fa792d.tar.gz\r\ncd 
SQLite-40fa792d\r\nCPPFLAGS=\"-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1\" ./configure\r\nmake\r\n```\r\nThen to test with Datasette:\r\n```\r\ncd /tmp\r\ngit clone https://github.com/simonw/datasette\r\ncd datasette\r\npython3 -m venv venv\r\nsource venv/bin/activate\r\npip install wheel # So bdist_wheel works in next step\r\npip install -e '.[test]'\r\nLD_PRELOAD=/tmp/SQLite-40fa792d/.libs/libsqlite3.so pytest\r\n```\r\n\r\nAfter some trial and error I proved that those tests passed with 3.36.0:\r\n```\r\ncd /tmp\r\nwget https://www.sqlite.org/src/tarball/5c9a6c06/SQLite-5c9a6c06.tar.gz\r\ntar -xzvf SQLite-5c9a6c06.tar.gz\r\ncd SQLite-5c9a6c06\r\nCPPFLAGS=\"-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1\" ./configure\r\nmake\r\ncd /tmp/datasette\r\nLD_PRELOAD=/tmp/SQLite-5c9a6c06/.libs/libsqlite3.so pytest tests/test_internals_database.py\r\n```\r\nBUT failed with 3.37.0:\r\n```\r\n# 3.37.0\r\ncd /tmp\r\nwget https://www.sqlite.org/src/tarball/bd41822c/SQLite-bd41822c.tar.gz\r\ntar -xzvf SQLite-bd41822c.tar.gz\r\ncd SQLite-bd41822c\r\nCPPFLAGS=\"-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1\" ./configure\r\nmake\r\ncd /tmp/datasette\r\nLD_PRELOAD=/tmp/SQLite-bd41822c/.libs/libsqlite3.so pytest tests/test_internals_database.py\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160407071, "label": "Test failures with SQLite 3.37.0+ due to column affinity case"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1647#issuecomment-1059821674", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1647", "id": 1059821674, "node_id": "IC_kwDOBm6k_c4_K5hq", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T19:44:32Z", "updated_at": "2022-03-05T19:44:32Z", "author_association": 
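When juggling `LD_PRELOAD` builds like the ones above, it's worth confirming which SQLite library actually got loaded before trusting a test run. A quick check:

```python
import sqlite3

def loaded_sqlite_version():
    # Ask the library itself which version is answering queries - with
    # LD_PRELOAD in play this is the authoritative way to confirm the
    # substituted build actually took effect.
    return sqlite3.connect(":memory:").execute(
        "select sqlite_version()"
    ).fetchone()[0]
```

Running this under each `LD_PRELOAD=...` invocation should print `3.36.0`, `3.37.0` or `3.38.0` to match the tarball that was built.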
"OWNER", "body": "I thought I'd need to introduce https://dirty-equals.helpmanual.io/types/string/ to help write tests for this, but I think I've found a good alternative that doesn't need a new dependency.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160407071, "label": "Test failures with SQLite 3.37.0+ due to column affinity case"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059822151", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059822151, "node_id": "IC_kwDOBm6k_c4_K5pH", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T19:48:35Z", "updated_at": "2022-03-05T19:48:35Z", "author_association": "OWNER", "body": "Those new docs: https://github.com/simonw/datasette/blob/d1cb73180b4b5a07538380db76298618a5fc46b6/docs/internals.rst#dash-encoding", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. ?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059822391", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059822391, "node_id": "IC_kwDOBm6k_c4_K5s3", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T19:50:12Z", "updated_at": "2022-03-05T19:50:12Z", "author_association": "OWNER", "body": "I'm going to move this work to a PR.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. 
?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1647#issuecomment-1059823119", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1647", "id": 1059823119, "node_id": "IC_kwDOBm6k_c4_K54P", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T19:56:27Z", "updated_at": "2022-03-05T19:56:27Z", "author_association": "OWNER", "body": "Updated this TIL with extra patterns I figured out: https://til.simonwillison.net/sqlite/ld-preload", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1160407071, "label": "Test failures with SQLite 3.37.0+ due to column affinity case"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059836599", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059836599, "node_id": "IC_kwDOBm6k_c4_K9K3", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T21:52:10Z", "updated_at": "2022-03-05T21:52:10Z", "author_association": "OWNER", "body": "Blogged about this here: https://simonwillison.net/2022/Mar/5/dash-encoding/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. 
?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059850369", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059850369, "node_id": "IC_kwDOBm6k_c4_LAiB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T23:28:56Z", "updated_at": "2022-03-05T23:28:56Z", "author_association": "OWNER", "body": "Lots of great conversations about the dash encoding implementation on Twitter: https://twitter.com/simonw/status/1500228316309061633\r\n\r\n@dracos helped me figure out a simpler regex: https://twitter.com/dracos/status/1500236433809973248\r\n\r\n`^/(?P[^/]+)/(?P
[^\\/\\-\\.]*|\\-/|\\-\\.|\\-\\-)*(?P\\.\\w+)?$`\r\n\r\n![image](https://user-images.githubusercontent.com/9599/156903088-c01933ae-4713-4e91-8d71-affebf70b945.png)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. ?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059851259, "node_id": "IC_kwDOBm6k_c4_LAv7", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T23:35:47Z", "updated_at": "2022-03-05T23:35:59Z", "author_association": "OWNER", "body": "This [comment from glyph](https://twitter.com/glyph/status/1500244937312329730) got me thinking:\r\n\r\n> Have you considered replacing % with some other character and then using percent-encoding?\r\n\r\nWhat happens if a table name includes a `%` character and that ends up getting mangled by a misbehaving proxy?\r\n\r\nI should consider `%` in the escaping system too. And maybe go with that suggestion of using percent-encoding directly but with a different character.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. 
?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059853526", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059853526, "node_id": "IC_kwDOBm6k_c4_LBTW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T23:49:59Z", "updated_at": "2022-03-05T23:49:59Z", "author_association": "OWNER", "body": "I want to try regular percentage encoding, except that it also encodes both the `-` and the `.` characters, AND it uses `-` instead of `%` as the encoding character.\r\n\r\nShould check what it does with emoji too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. ?_format=) works before 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1439#issuecomment-1059854864", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1439", "id": 1059854864, "node_id": "IC_kwDOBm6k_c4_LBoQ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-05T23:59:05Z", "updated_at": "2022-03-05T23:59:05Z", "author_association": "OWNER", "body": "OK, for that percentage thing: the Python core implementation of URL percentage escaping deliberately ignores two of the characters we want to escape: `.` and `-`:\r\n\r\nhttps://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L780-L783\r\n\r\n```python\r\n_ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'\r\n b'abcdefghijklmnopqrstuvwxyz'\r\n b'0123456789'\r\n b'_.-~')\r\n```\r\nIt also defaults to skipping `/` (passed as a `safe=` parameter to various things).\r\n\r\nI'm going to try borrowing and modifying the core of the Python implementation: 
https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L795-L814\r\n```python\r\nclass _Quoter(dict):\r\n \"\"\"A mapping from bytes numbers (in range(0,256)) to strings.\r\n String values are percent-encoded byte values, unless the key < 128, and\r\n in either of the specified safe set, or the always safe set.\r\n \"\"\"\r\n # Keeps a cache internally, via __missing__, for efficiency (lookups\r\n # of cached keys don't call Python code at all).\r\n def __init__(self, safe):\r\n \"\"\"safe: bytes object.\"\"\"\r\n self.safe = _ALWAYS_SAFE.union(safe)\r\n\r\n def __repr__(self):\r\n return f\"<Quoter {dict(self)}>\"\r\n\r\n def __missing__(self, b):\r\n # Handle a cache miss. Store quoted string in cache and return.\r\n res = chr(b) if b in self.safe else '%{:02X}'.format(b)\r\n self[b] = res\r\n return res\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 973139047, "label": "Rethink how .ext formats (v.s. ?_format=) works before 1.0"}, "performed_via_github_app": null}
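The scheme sketched in that last comment - percent-encoding semantics, but with `-` as the escape character and with `-` and `.` themselves escaped - could look something like this. A rough sketch of the idea, not Datasette's final implementation; working byte-wise over UTF-8 means emoji round-trip too:

```python
def dash_encode(s):
    # Percent-encoding, except "-" is the escape character and "-" and "."
    # are NOT in the safe set, so both get escaped as -2D and -2E.
    safe = set(
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_~"
    )
    return "".join(
        chr(b) if chr(b) in safe else "-{:02X}".format(b)
        for b in s.encode("utf-8")
    )

def dash_decode(s):
    # Reverse: every "-XX" triple is one percent-escaped byte.
    out = bytearray()
    i = 0
    while i < len(s):
        if s[i] == "-":
            out.append(int(s[i + 1 : i + 3], 16))
            i += 3
        else:
            out.append(ord(s[i]))
            i += 1
    return out.decode("utf-8")
```

Because every `-` in the output introduces exactly one two-hex-digit escape, decoding is unambiguous without the lookahead gymnastics the earlier `-/`-style regexes needed.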