{"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590153892", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590153892, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDE1Mzg5Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T03:10:45Z", "updated_at": "2020-02-24T03:13:03Z", "author_association": "OWNER", "body": "Some more detailed notes I made earlier:\r\n\r\nDatasette would run a single write thread per database. That thread gets an exclusive connection, and a queue. Plugins can add functions to the queue which will be called and given access to that connection.\r\n\r\nThe write thread for that database is created the first time a write is attempted.\r\n\r\nQuestion: should that thread have its own asyncio loop so that async techniques like httpx can be used within the thread? I think not at first - only investigate this if it turns out to be necessary in the future.\r\n\r\nThis thread will run as part of the Datasette process. This means there is always a risk that the thread will die in the middle of something because the server got restarted - so use transactions to limit risk of damage to database should that happen.\r\n\r\nI don\u2019t want web responses blocking waiting for stuff to happen here - so every task put on that queue will have a task ID, and that ID will be returned such that client code can poll for its completion.\r\n\r\nCould the request block for up to 0.5s just in case the write is really fast, then return a polling token if it isn't finished yet? Looks possible - `Queue.get` can block with a timeout.\r\n\r\nThere will be a `/-/writes` page which shows currently queued writes - so each one needs a human-readable description of some sort. (You can access a deque called `q.queue` to see what\u2019s in there)\r\n\r\nStretch goal: It would be cool if write operations could optionally handle their own progress reports. 
That way I can do some really nice UI around what\u2019s going on with these things.\r\n\r\nThis mechanism has a ton of potential. It may even be how we handle things like Twitter imports and suchlike - queued writing tasks.\r\n\r\nOne catch with this approach: if a plugin is reading from APIs etc it shouldn't block writes to the database while it is doing so. So sticking a function in the queue that does additional time consuming stuff is actually an anti pattern. Instead, plugins should schedule their API access in the main event loop and occasionally write just the updates they need to make to that write queue.\r\n\r\n### Implementation notes\r\n\r\nMaybe each item in the queue is a `(callable, uuid, reply_queue)` triple. You can do a blocking `.get()` on the `reply_queue` if you want to wait for the answer. The execution framework could look for the return value from `callable()` and automatically send it to `reply_queue`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590154309", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590154309, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDE1NDMwOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T03:14:10Z", "updated_at": "2020-02-24T03:14:10Z", "author_association": "OWNER", "body": "Some prior art: Charles Leifer implemented a `SqliteQueueDatabase` class that automatically queues writes for you: https://charlesleifer.com/blog/multi-threaded-sqlite-without-the-operationalerrors/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism 
for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/676#issuecomment-590209074", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/676", "id": 590209074, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDIwOTA3NA==", "user": {"value": 58088336, "label": "tunguyenatwork"}, "created_at": "2020-02-24T08:20:15Z", "updated_at": "2020-02-24T08:20:15Z", "author_association": "NONE", "body": "Awesome, thank you so much. I\u2019ll try it out and let you know.\n\nOn Sun, Feb 23, 2020 at 1:44 PM Simon Willison \nwrote:\n\n> You can try this right now like so:\n>\n> pip install https://github.com/simonw/datasette/archive/search-raw.zip\n>\n> Then use the following:\n>\n> ?_search=foo*&_searchmode=raw`\n>\n> \u2014\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> ,\n> or unsubscribe\n> \n> .\n>\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 568091133, "label": "?_searchmode=raw option for running FTS searches without escaping characters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590399600", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590399600, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDM5OTYwMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T15:56:10Z", "updated_at": "2020-02-24T15:56:23Z", "author_association": "OWNER", "body": "## Implementation plan\r\n\r\nMethod on Database class called `execute_write(sql)`\r\n\r\nWhich calls `.execute_write_fn(fn)` - so you can instead create a function that applies a whole batch of writes and pass that instead if you need to\r\n\r\nThrows an error if the database isn't mutable.\r\n\r\nAdd `._writer_thread` thread property to Database - we start that
thread the first time we need it. It blocks on `._writer_queue.get()`\r\n\r\nWe write to that queue with `WriteTask(fn, uuid, reply_queue)` namedtuples - then time-out block awaiting reply for 0.5s\r\n\r\nHave a `.write_status(uuid)` method that checks if `uuid` has completed\r\n\r\nThis should be enough to get it all working. MVP can skip the .5s timeout entirely\r\n\r\nBut... what about that progress bar supporting stretch goal?\r\n\r\nFor that let's have each write operation that's currently in progress have total and done integer properties. So I guess we can add those to the `WriteTask`.\r\n\r\nShould we have the ability to see what the currently executing write is? Seems useful.\r\n\r\nHopefully I can integrate https://github.com/tqdm/tqdm such that it calculates ETAs without actually trying to print to the console.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/675#issuecomment-590405736", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/675", "id": 590405736, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDQwNTczNg==", "user": {"value": 141844, "label": "aviflax"}, "created_at": "2020-02-24T16:06:27Z", "updated_at": "2020-02-24T16:06:27Z", "author_association": "NONE", "body": "> So yeah - if you're happy to design this I think it would be worth us adding.\r\n\r\nGreat! I\u2019ll give it a go.\r\n\r\n\r\n\r\n> Small design suggestion: allow `--copy` to be applied multiple times\u2026\r\n\r\nMakes a ton of sense, will do.\r\n\r\n> Also since Click arguments can take multiple options I don't think you need to have the `:` in there - although if it better matches Docker's own UI it might be more consistent to have it.\r\n\r\nGreat point. 
I double checked the docs for `docker cp` and in that context the colon is used to delimit a container and a path, while spaces are used to separate the source and target.\r\n\r\nThe usage string is:\r\n\r\n```text\r\ndocker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-\r\ndocker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH\r\n```\r\n\r\nso in fact it\u2019ll be more consistent to use a space to delimit the source and destination paths, like so:\r\n\r\n```shell\r\n$ datasette package --copy /the/source/path /the/target/path data.db\r\n```\r\n\r\nand I suppose the short-form version of the option should be `cp` like so:\r\n\r\n```shell\r\n$ datasette package -cp /the/source/path /the/target/path data.db\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 567902704, "label": "--cp option for datasette publish and datasette package for shipping additional files and directories"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590417366", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590417366, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDQxNzM2Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T16:27:10Z", "updated_at": "2020-02-24T16:27:10Z", "author_association": "OWNER", "body": "I wonder if I even need the `reply_queue` mechanism? 
Are the replies from writes generally even interesting?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590417619", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590417619, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDQxNzYxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T16:27:36Z", "updated_at": "2020-02-24T16:27:36Z", "author_association": "OWNER", "body": "Error handling could be tricky. Exceptions thrown in threads don't show up anywhere by default - I would need to explicitly catch them and decide what to do with them.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590430988", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590430988, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDQzMDk4OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T16:50:48Z", "updated_at": "2020-02-24T16:50:48Z", "author_association": "OWNER", "body": "I'm dropping the progress bar idea. This mechanism is supposed to guarantee exclusive access to the single write connection, which means it should be targeted by operations that are as short as possible. 
An operation running long enough to need a progress bar is too long!\r\n\r\nAny implementation of progress bars for long running write operations needs to happen elsewhere in the stack.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590436368", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590436368, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDQzNjM2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T17:00:21Z", "updated_at": "2020-02-24T17:00:21Z", "author_association": "OWNER", "body": "Interesting challenge: I would like to be able to \"await\" on `queue.get()` (with a timeout).\r\n\r\nProblem is: `queue.Queue()` is designed for threading and cannot be awaited. 
`asyncio.Queue` can be awaited but is not meant to be used with threads.\r\n\r\nhttps://stackoverflow.com/a/32894169 suggests using Janus, a thread-aware asyncio queue: https://github.com/aio-libs/janus", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590511601", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590511601, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDUxMTYwMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T19:38:27Z", "updated_at": "2020-02-24T19:38:27Z", "author_association": "OWNER", "body": "I tested this using the following code in a view (after `from sqlite_utils import Database`):\r\n```python\r\n db = next(iter(self.ds.databases.values()))\r\n db.execute_write_fn(lambda conn: Database(conn)[\"counter\"].insert({\"id\": 1, \"count\": 0}, pk=\"id\", ignore=True))\r\n db.execute_write(\"update counter set count = count + 1 where id = 1\")\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590517338", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590517338, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDUxNzMzOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T19:51:21Z", "updated_at": "2020-02-24T19:51:21Z", "author_association": "OWNER", "body": "I filed a question / feature request with Janus about supporting timeouts for `.get()` against async queues here: 
https://github.com/aio-libs/janus/issues/240\r\n\r\nI'm going to move ahead without needing that ability though. I figure SQLite writes are _fast_, and plugins can be trusted to implement just fast writes. So I'm going to support either fire-and-forget writes (they get added to the queue and a task ID is returned) or have the option to block awaiting the completion of the write (using Janus) but let callers decide which version they want. I may add optional timeouts some time in the future.\r\n\r\nI am going to make both `execute_write()` and `execute_write_fn()` awaitable functions though, for consistency with `.execute()` and to give me flexibility to change how they work in the future.\r\n\r\nI'll also add a `block=True` option to both of them which causes the function to wait for the write to be successfully executed - defaults to `False` (fire-and-forget mode).\r\n", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/682#issuecomment-590517744", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/682", "id": 590517744, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDUxNzc0NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T19:52:16Z", "updated_at": "2020-02-24T19:52:16Z", "author_association": "OWNER", "body": "Moving further development to a pull request: https://github.com/simonw/datasette/pull/683", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569613563, "label": "Mechanism for writing to database via a queue"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590518182", 
"issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590518182, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDUxODE4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T19:53:12Z", "updated_at": "2020-02-24T19:53:12Z", "author_association": "OWNER", "body": "Next steps are from comment https://github.com/simonw/datasette/issues/682#issuecomment-590517338\r\n> I'm going to move ahead without needing that ability though. I figure SQLite writes are _fast_, and plugins can be trusted to implement just fast writes. So I'm going to support either fire-and-forget writes (they get added to the queue and a task ID is returned) or have the option to block awaiting the completion of the write (using Janus) but let callers decide which version they want. I may add optional timeouts some time in the future.\r\n> \r\n> I am going to make both `execute_write()` and `execute_write_fn()` awaitable functions though, for consistency with `.execute()` and to give me flexibility to change how they work in the future.\r\n> \r\n> I'll also add a `block=True` option to both of them which causes the function to wait for the write to be successfully executed - defaults to `False` (fire-and-forget mode).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/675#issuecomment-590539805", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/675", "id": 590539805, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDUzOTgwNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T20:44:59Z", "updated_at": "2020-02-24T20:45:08Z", "author_association": "OWNER", "body": "Design looks great to me.\r\n\r\nI'm not keen on two letter short versions (`-cp`) - 
I'd rather either have a single character or no short form at all. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 567902704, "label": "--cp option for datasette publish and datasette package for shipping additional files and directories"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/681#issuecomment-590543398", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/681", "id": 590543398, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU0MzM5OA==", "user": {"value": 2181410, "label": "clausjuhl"}, "created_at": "2020-02-24T20:53:56Z", "updated_at": "2020-02-24T20:53:56Z", "author_association": "NONE", "body": "Excellent. I'll implement the simple plugin-solution now. And will have a go at a more mature plugin later. Thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 569317377, "label": "Cashe-header missing in http-response"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590592581", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590592581, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5MjU4MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:00:44Z", "updated_at": "2020-02-24T23:01:09Z", "author_association": "OWNER", "body": "I've been testing this out by running one-off demo plugins. 
I saved the following in a file called `write-plugins/log_asgi.py` (it's a hacked around copy of [asgi-log-to-sqlite](https://github.com/simonw/asgi-log-to-sqlite)) and then running `datasette data.db --plugins-dir=write-plugins/`:\r\n```python\r\nfrom datasette import hookimpl\r\nimport sqlite_utils\r\nimport time\r\n\r\n\r\nclass AsgiLogToSqliteViaWriteQueue:\r\n lookup_columns = (\r\n \"path\",\r\n \"user_agent\",\r\n \"referer\",\r\n \"accept_language\",\r\n \"content_type\",\r\n \"query_string\",\r\n )\r\n\r\n def __init__(self, app, db):\r\n self.app = app\r\n self.db = db\r\n self._tables_ensured = False\r\n\r\n async def ensure_tables(self):\r\n def _ensure_tables(conn):\r\n db = sqlite_utils.Database(conn)\r\n for column in self.lookup_columns:\r\n table = \"{}s\".format(column)\r\n if not db[table].exists():\r\n db[table].create({\"id\": int, \"name\": str}, pk=\"id\")\r\n if \"requests\" not in db.table_names():\r\n db[\"requests\"].create(\r\n {\r\n \"start\": float,\r\n \"method\": str,\r\n \"path\": int,\r\n \"query_string\": int,\r\n \"user_agent\": int,\r\n \"referer\": int,\r\n \"accept_language\": int,\r\n \"http_status\": int,\r\n \"content_type\": int,\r\n \"client_ip\": str,\r\n \"duration\": float,\r\n \"body_size\": int,\r\n },\r\n foreign_keys=self.lookup_columns,\r\n )\r\n await self.db.execute_write_fn(_ensure_tables)\r\n\r\n async def __call__(self, scope, receive, send):\r\n if not self._tables_ensured:\r\n self._tables_ensured = True\r\n await self.ensure_tables()\r\n\r\n response_headers = []\r\n body_size = 0\r\n http_status = None\r\n\r\n async def wrapped_send(message):\r\n nonlocal body_size, response_headers, http_status\r\n if message[\"type\"] == \"http.response.start\":\r\n response_headers = message[\"headers\"]\r\n http_status = message[\"status\"]\r\n\r\n if message[\"type\"] == \"http.response.body\":\r\n body_size += len(message[\"body\"])\r\n\r\n await send(message)\r\n\r\n start = time.time()\r\n await self.app(scope, 
receive, wrapped_send)\r\n end = time.time()\r\n\r\n path = str(scope[\"path\"])\r\n query_string = None\r\n if scope.get(\"query_string\"):\r\n query_string = \"?{}\".format(scope[\"query_string\"].decode(\"utf8\"))\r\n\r\n request_headers = dict(scope.get(\"headers\") or [])\r\n\r\n referer = header(request_headers, \"referer\")\r\n user_agent = header(request_headers, \"user-agent\")\r\n accept_language = header(request_headers, \"accept-language\")\r\n\r\n content_type = header(dict(response_headers), \"content-type\")\r\n\r\n def _log_to_database(conn):\r\n db = sqlite_utils.Database(conn)\r\n db[\"requests\"].insert(\r\n {\r\n \"start\": start,\r\n \"method\": scope[\"method\"],\r\n \"path\": lookup(db, \"paths\", path),\r\n \"query_string\": lookup(db, \"query_strings\", query_string),\r\n \"user_agent\": lookup(db, \"user_agents\", user_agent),\r\n \"referer\": lookup(db, \"referers\", referer),\r\n \"accept_language\": lookup(db, \"accept_languages\", accept_language),\r\n \"http_status\": http_status,\r\n \"content_type\": lookup(db, \"content_types\", content_type),\r\n \"client_ip\": scope.get(\"client\", (None, None))[0],\r\n \"duration\": end - start,\r\n \"body_size\": body_size,\r\n },\r\n alter=True,\r\n foreign_keys=self.lookup_columns,\r\n )\r\n\r\n await self.db.execute_write_fn(_log_to_database)\r\n\r\n\r\ndef header(d, name):\r\n return d.get(name.encode(\"utf8\"), b\"\").decode(\"utf8\") or None\r\n\r\n\r\ndef lookup(db, table, value):\r\n return db[table].lookup({\"name\": value}) if value else None\r\n\r\n\r\n@hookimpl\r\ndef asgi_wrapper(datasette):\r\n def wrap_with_class(app):\r\n return AsgiLogToSqliteViaWriteQueue(\r\n app, next(iter(datasette.databases.values()))\r\n )\r\n\r\n return wrap_with_class\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() 
methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590593120", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590593120, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5MzEyMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:02:30Z", "updated_at": "2020-02-24T23:02:30Z", "author_association": "OWNER", "body": "I'm going to muck around with a couple more demo plugins - in particular one derived from [datasette-upload-csvs](https://github.com/simonw/datasette-upload-csvs) - to make sure I'm comfortable with this API - then add a couple of tests and merge it with documentation that warns \"this is still an experimental feature and may change\".", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/675#issuecomment-590593247", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/675", "id": 590593247, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5MzI0Nw==", "user": {"value": 141844, "label": "aviflax"}, "created_at": "2020-02-24T23:02:52Z", "updated_at": "2020-02-24T23:02:52Z", "author_association": "NONE", "body": "> Design looks great to me.\r\n\r\nExcellent, thanks!\r\n\r\n> I'm not keen on two letter short versions (`-cp`) - I'd rather either have a single character or no short form at all.\r\n\r\nHmm, well, anyone running `datasette package` is probably at least somewhat familiar with UNIX CLIs\u2026 so how about `--cp` as a middle ground?\r\n\r\n```shell\r\n$ datasette package --cp /the/source/path /the/target/path data.db\r\n```\r\n\r\nI think I like it. 
Easy to remember!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 567902704, "label": "--cp option for datasette publish and datasette package for shipping additional files and directories"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590598248", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590598248, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5ODI0OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:18:50Z", "updated_at": "2020-02-24T23:18:50Z", "author_association": "OWNER", "body": "I'm not convinced by the return value of the `.execute_write_fn()` method:\r\n\r\nhttps://github.com/simonw/datasette/blob/ab2348280206bde1390b931ae89d372c2f74b87e/datasette/database.py#L79-L83\r\n\r\nDo I really need that `WriteResponse` class or can I do something nicer?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590598689", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590598689, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5ODY4OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:20:11Z", "updated_at": "2020-02-24T23:20:11Z", "author_association": "OWNER", "body": "I think `if block` it makes sense to return the return value of the function that was executed. 
Without it all I really need to do is return the `uuid` so something could theoretically poll for completion later on.\r\n\r\nBut is it weird having a function that returns different types depending on if you passed `block=True` or not? Should they be differently named functions?\r\n\r\nI'm OK with the `block=True` pattern changing the return value I think.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590599257", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590599257, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDU5OTI1Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:21:56Z", "updated_at": "2020-02-24T23:22:35Z", "author_association": "OWNER", "body": "Also: are UUIDs really necessary here or could I use a simpler form of task identifier? Like an in-memory counter variable that starts at 0 and increments every time this instance of Datasette issues a new task ID?\r\n\r\nThe neat thing about UUIDs is that I don't have to worry if there are multiple Datasette instances accepting writes behind a load balancer. That seems pretty unlikely (especially considering SQLite databases encourage only one process to be writing at a time)... but I am experimenting with PostgreSQL support in #670 so it's probably worth ensuring these task IDs really are globally unique.\r\n\r\nI'm going to stick with UUIDs. 
They're short-lived enough that their size doesn't really matter.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590606825", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590606825, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDYwNjgyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:47:38Z", "updated_at": "2020-02-24T23:47:38Z", "author_association": "OWNER", "body": "Another demo plugin: `delete_table.py`\r\n```python\r\nfrom datasette import hookimpl\r\nfrom datasette.utils import escape_sqlite\r\nfrom starlette.responses import HTMLResponse\r\nfrom starlette.endpoints import HTTPEndpoint\r\n\r\n\r\nclass DeleteTableApp(HTTPEndpoint):\r\n def __init__(self, scope, receive, send, datasette):\r\n self.datasette = datasette\r\n super().__init__(scope, receive, send)\r\n\r\n async def post(self, request):\r\n formdata = await request.form()\r\n database = formdata[\"database\"]\r\n db = self.datasette.databases[database]\r\n await db.execute_write(\"drop table {}\".format(escape_sqlite(formdata[\"table\"])))\r\n return HTMLResponse(\"Table has been deleted.\")\r\n\r\n\r\n@hookimpl\r\ndef asgi_wrapper(datasette):\r\n def wrap_with_asgi_auth(app):\r\n async def wrapped_app(scope, recieve, send):\r\n if scope[\"path\"] == \"/-/delete-table\":\r\n await DeleteTableApp(scope, recieve, send, datasette)\r\n else:\r\n await app(scope, recieve, send)\r\n\r\n return wrapped_app\r\n\r\n return wrap_with_asgi_auth\r\n```\r\nThen I saved this as `table.html` in the `write-templates/` directory:\r\n```html+django\r\n{% extends \"default:table.html\" %}\r\n\r\n{% block content %}\r\n
<form action=\"/-/delete-table\" method=\"post\">\r\n <input type=\"hidden\" name=\"database\" value=\"{{ database }}\">\r\n <input type=\"hidden\" name=\"table\" value=\"{{ table }}\">\r\n <input type=\"submit\" value=\"Delete this table\">\r\n</form>
\r\n{{ super() }}\r\n{% endblock %}\r\n```\r\n(Needs CSRF protection added)\r\n\r\nI ran Datasette like this:\r\n\r\n $ datasette --plugins-dir=write-plugins/ data.db --template-dir=write-templates/\r\n\r\nResult: I can delete tables!\r\n\r\n\"data__everything__30_132_rows_-_Mozilla_Firefox\"\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590607385", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590607385, "node_id": "MDEyOklzc3VlQ29tbWVudDU5MDYwNzM4NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:49:37Z", "updated_at": "2020-02-24T23:49:37Z", "author_association": "OWNER", "body": "Here's the `upload_csv.py` plugin file I've been playing with:\r\n```python\r\nfrom datasette import hookimpl\r\nfrom starlette.responses import PlainTextResponse, HTMLResponse\r\nfrom starlette.endpoints import HTTPEndpoint\r\nimport csv as csv_std\r\nimport codecs\r\nimport sqlite_utils\r\n\r\n\r\nclass UploadApp(HTTPEndpoint):\r\n def __init__(self, scope, receive, send, datasette):\r\n self.datasette = datasette\r\n super().__init__(scope, receive, send)\r\n\r\n def get_database(self):\r\n # For the moment just use the first one that's not immutable\r\n mutable = [db for db in self.datasette.databases.values() if db.is_mutable]\r\n return mutable[0]\r\n\r\n async def get(self, request):\r\n return HTMLResponse(\r\n await self.datasette.render_template(\r\n \"upload_csv.html\", {\"database_name\": self.get_database().name}\r\n )\r\n )\r\n\r\n async def post(self, request):\r\n formdata = await request.form()\r\n csv = formdata[\"csv\"]\r\n # csv.file is a SpooledTemporaryFile, I can read it directly\r\n 
filename = csv.filename\r\n # TODO: Support other encodings:\r\n reader = csv_std.reader(codecs.iterdecode(csv.file, \"utf-8\"))\r\n headers = next(reader)\r\n docs = (dict(zip(headers, row)) for row in reader)\r\n if filename.endswith(\".csv\"):\r\n filename = filename[:-4]\r\n # Import data into a table of that name using sqlite-utils\r\n db = self.get_database()\r\n\r\n def fn(conn):\r\n writable_conn = sqlite_utils.Database(db.path)\r\n writable_conn[filename].insert_all(docs, alter=True)\r\n return writable_conn[filename].count\r\n\r\n # Without block=True we may attempt 'select count(*) from ...'\r\n # before the table has been created by the write thread\r\n count = await db.execute_write_fn(fn, block=True)\r\n\r\n return HTMLResponse(\r\n await self.datasette.render_template(\r\n \"upload_csv_done.html\",\r\n {\r\n \"database\": self.get_database().name,\r\n \"table\": filename,\r\n \"num_docs\": count,\r\n },\r\n )\r\n )\r\n\r\n\r\n@hookimpl\r\ndef asgi_wrapper(datasette):\r\n def wrap_with_asgi_auth(app):\r\n async def wrapped_app(scope, recieve, send):\r\n if scope[\"path\"] == \"/-/upload-csv\":\r\n await UploadApp(scope, recieve, send, datasette)\r\n else:\r\n await app(scope, recieve, send)\r\n\r\n return wrapped_app\r\n\r\n return wrap_with_asgi_auth\r\n```\r\nI also dropped copies of the two template files from https://github.com/simonw/datasette-upload-csvs/tree/699e6ca591f36264bfc8e590d877e6852f274beb/datasette_upload_csvs/templates into my `write-templates/` directory.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/683#issuecomment-590608228", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/683", "id": 590608228, "node_id": 
"MDEyOklzc3VlQ29tbWVudDU5MDYwODIyOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-24T23:52:35Z", "updated_at": "2020-02-24T23:52:35Z", "author_association": "OWNER", "body": "I'm going to punt on the ability to introspect the write queue and poll for completion using a UUID for the moment. Can add those later.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 570101428, "label": ".execute_write() and .execute_write_fn() methods on Database"}, "performed_via_github_app": null}