

issue_comments


10 rows where issue = 569613563




590153892 · simonw (OWNER) · created 2020-02-24T03:10:45Z · updated 2020-02-24T03:13:03Z
https://github.com/simonw/datasette/issues/682#issuecomment-590153892

Some more detailed notes I made earlier:

Datasette would run a single write thread per database. That thread gets an exclusive connection, and a queue. Plugins can add functions to the queue which will be called and given access to that connection. The write thread for that database is created the first time a write is attempted.

Question: should that thread have its own asyncio loop so that async techniques like httpx can be used within the thread? I think not at first - only investigate this if it turns out to be necessary in the future.

This thread will run as part of the Datasette process. This means there is always a risk that the thread will die in the middle of something because the server got restarted - so use transactions to limit the risk of damage to the database should that happen.

I don't want web responses blocking waiting for stuff to happen here - so every task put on that queue will have a task ID, and that ID will be returned so that client code can poll for its completion. Could the request block for up to 0.5s just in case the write is really fast, then return a polling token if it isn't finished yet? Looks possible - `Queue.get` can block with a timeout.

There will be a `/-/writes` page which shows currently queued writes - so each one needs a human-readable description of some sort. (You can access a deque called `q.queue` to see what's in there.)

Stretch goal: it would be cool if write operations could optionally handle their own progress reports. That way I can do some really nice UI around what's going on with these things.

This mechanism has a ton of potential. It may even be how we handle things like Twitter imports and suchlike - queued writing tasks.

One catch with this approach: if a plugin is reading from APIs etc. it shouldn't block writes to the database while it is doing so. So sticking a function in the queue that does additional time-consuming stuff is actually an anti-pattern. Instead, plugins should schedule their API access in the main event loop and o…

Issue: Mechanism for writing to database via a queue (569613563)
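The two standard-library behaviours the notes above rely on - `Queue.get` blocking with a timeout, and the `q.queue` deque being inspectable for a `/-/writes`-style page - can be checked with a few lines (the task dict shape here is purely illustrative):

```python
import queue

q = queue.Queue()

# Queue.get really does support a timed block - it raises queue.Empty
# once the timeout expires, which is the "polling token" fallback path.
try:
    q.get(timeout=0.5)
except queue.Empty:
    pass

q.put({"task_id": "abc123", "description": "insert rows into counter"})

# q.queue is a plain deque, so currently queued writes (with their
# human-readable descriptions) can be listed without consuming them.
pending = list(q.queue)
print([t["description"] for t in pending])  # -> ['insert rows into counter']
```

Note that reading `q.queue` directly bypasses the queue's lock, so a production `/-/writes` page would want to snapshot it defensively.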
590154309 · simonw (OWNER) · 2020-02-24T03:14:10Z
https://github.com/simonw/datasette/issues/682#issuecomment-590154309

Some prior art: Charles Leifer implemented a `SqliteQueueDatabase` class that automatically queues writes for you: https://charlesleifer.com/blog/multi-threaded-sqlite-without-the-operationalerrors/
590399600 · simonw (OWNER) · created 2020-02-24T15:56:10Z · updated 2020-02-24T15:56:23Z
https://github.com/simonw/datasette/issues/682#issuecomment-590399600

## Implementation plan

- Method on the Database class called `execute_write(sql)`
- Which calls `.execute_write_fn(fn)` - so you can instead create a function that applies a whole batch of writes and pass that instead if you need to
- Throws an error if the database isn't mutable
- Add a `._writer_thread` thread property to Database - we start that thread the first time we need it. It blocks on `._writer_queue.get()`
- We write to that queue with `WriteTask(fn, uuid, reply_queue)` namedtuples - then time-out block awaiting the reply for 0.5s
- Have a `.write_status(uuid)` method that checks if `uuid` has completed

This should be enough to get it all working. The MVP can skip the 0.5s timeout entirely.

But... what about that progress bar stretch goal? For that, let's have each write operation that's currently in progress carry `total` and `done` integer properties. So I guess we can add those to the `WriteTask`. Should we have the ability to see what the currently executing write is? Seems useful.

Hopefully I can integrate https://github.com/tqdm/tqdm such that it calculates ETAs without actually trying to print to the console.
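A minimal sketch of this plan - the names (`execute_write`, `execute_write_fn`, `WriteTask`, `write_status`, `_writer_thread`, `_writer_queue`) come from the comment, but the implementation details below are assumptions, not Datasette's actual code:

```python
import queue
import threading
import uuid
from collections import namedtuple

WriteTask = namedtuple("WriteTask", ("fn", "task_id", "reply_queue"))

class Database:
    def __init__(self, conn, is_mutable=True):
        self.conn = conn          # exclusive write connection
        self.is_mutable = is_mutable
        self._writer_queue = queue.Queue()
        self._writer_thread = None   # started lazily on first write
        self._completed = {}         # task_id -> result

    def execute_write(self, sql):
        return self.execute_write_fn(lambda conn: conn.execute(sql))

    def execute_write_fn(self, fn):
        if not self.is_mutable:
            raise ValueError("Database is immutable")
        if self._writer_thread is None:
            self._writer_thread = threading.Thread(
                target=self._write_loop, daemon=True
            )
            self._writer_thread.start()
        task = WriteTask(fn, str(uuid.uuid4()), queue.Queue())
        self._writer_queue.put(task)
        try:
            # Block up to 0.5s in case the write completes quickly
            return task.reply_queue.get(timeout=0.5)
        except queue.Empty:
            # Still running - hand back the uuid for polling
            return task.task_id

    def write_status(self, task_id):
        return task_id in self._completed

    def _write_loop(self):
        while True:
            task = self._writer_queue.get()
            result = task.fn(self.conn)
            self._completed[task.task_id] = result
            task.reply_queue.put(result)
```

The writer thread serialises every write through the single connection, which is the whole point of the design: SQLite allows only one writer at a time, so the queue replaces lock contention with ordering.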
590417366 · simonw (OWNER) · 2020-02-24T16:27:10Z
https://github.com/simonw/datasette/issues/682#issuecomment-590417366

I wonder if I even need the `reply_queue` mechanism? Are the replies from writes generally even interesting?
590417619 · simonw (OWNER) · 2020-02-24T16:27:36Z
https://github.com/simonw/datasette/issues/682#issuecomment-590417619

Error handling could be tricky. Exceptions thrown in threads don't show up anywhere by default - I would need to explicitly catch them and decide what to do with them.
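One common way to handle this - a sketch, not the approach the issue settled on - is to catch exceptions inside the worker thread and ship them back over the reply queue, so the caller can re-raise them in its own context instead of losing them:

```python
import queue
import threading

def worker(task_queue):
    while True:
        fn, reply_queue = task_queue.get()
        try:
            reply_queue.put(("ok", fn()))
        except Exception as exc:
            # Without this, the exception would die silently in the thread
            reply_queue.put(("error", exc))

task_queue = queue.Queue()
threading.Thread(target=worker, args=(task_queue,), daemon=True).start()

def run_in_thread(fn):
    reply_queue = queue.Queue()
    task_queue.put((fn, reply_queue))
    status, value = reply_queue.get()
    if status == "error":
        raise value   # surfaces in the caller with the original traceback
    return value
```

This keeps the worker loop alive after a failed write while still making failures visible to whoever queued the task.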
590430988 · simonw (OWNER) · 2020-02-24T16:50:48Z
https://github.com/simonw/datasette/issues/682#issuecomment-590430988

I'm dropping the progress bar idea. This mechanism is supposed to guarantee exclusive access to the single write connection, which means it should be targeted by operations that are as short as possible. An operation running long enough to need a progress bar is too long! Any implementation of progress bars for long-running write operations needs to happen elsewhere in the stack.
590436368 · simonw (OWNER) · 2020-02-24T17:00:21Z
https://github.com/simonw/datasette/issues/682#issuecomment-590436368

Interesting challenge: I would like to be able to "await" on `queue.get()` (with a timeout). Problem is: `queue.Queue()` is designed for threading and cannot be awaited. `asyncio.Queue` can be awaited but is not meant to be used with threads. https://stackoverflow.com/a/32894169 suggests using Janus, a thread-aware asyncio queue: https://github.com/aio-libs/janus
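For reference, the standard-library workaround for the same problem is to push the blocking `queue.Queue.get()` into a thread pool with `run_in_executor`, so the event loop can await it; Janus avoids this extra thread hop by exposing sync and async views of one queue. A minimal sketch of the stdlib approach:

```python
import asyncio
import queue

async def await_sync_get(q, timeout):
    # queue.Queue.get() blocks its thread, so run it in the default
    # executor; the event loop stays free while we wait.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, lambda: q.get(timeout=timeout))

async def main():
    q = queue.Queue()
    # In the real design the writer thread would put the reply here
    q.put("write-completed")
    return await await_sync_get(q, timeout=0.5)

print(asyncio.run(main()))  # -> write-completed
```

If the timeout expires, `queue.Empty` propagates out of the awaited call, which maps naturally onto the "return a polling token instead" path.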
590511601 · simonw (OWNER) · 2020-02-24T19:38:27Z
https://github.com/simonw/datasette/issues/682#issuecomment-590511601

I tested this using the following code in a view (after `from sqlite_utils import Database`):

```python
db = next(iter(self.ds.databases.values()))
db.execute_write_fn(
    lambda conn: Database(conn)["counter"].insert(
        {"id": 1, "count": 0}, pk="id", ignore=True
    )
)
db.execute_write("update counter set count = count + 1 where id = 1")
```
590517338 · simonw (OWNER) · 2020-02-24T19:51:21Z
https://github.com/simonw/datasette/issues/682#issuecomment-590517338

I filed a question / feature request with Janus about supporting timeouts for `.get()` against async queues here: https://github.com/aio-libs/janus/issues/240

I'm going to move ahead without needing that ability though. I figure SQLite writes are _fast_, and plugins can be trusted to implement just fast writes. So I'm going to support either fire-and-forget writes (they get added to the queue and a task ID is returned) or have the option to block awaiting the completion of the write (using Janus) - but let callers decide which version they want. I may add optional timeouts some time in the future.

I am going to make both `execute_write()` and `execute_write_fn()` awaitable functions though, for consistency with `.execute()` and to give me flexibility to change how they work in the future. I'll also add a `block=True` option to both of them which causes the function to wait for the write to be successfully executed - defaults to `False` (fire-and-forget mode).

Reactions: +1 × 1
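A sketch of what awaitable `execute_write()` / `execute_write_fn()` with a `block` flag could look like - using the stdlib executor bridge rather than Janus, so the semantics described above are shown without the dependency; the internals are assumptions, not Datasette's implementation:

```python
import asyncio
import queue
import threading
import uuid

class Database:
    def __init__(self, conn):
        self.conn = conn
        self._queue = queue.Queue()
        threading.Thread(target=self._writer, daemon=True).start()

    def _writer(self):
        # Single writer thread: executes queued writes in FIFO order
        while True:
            fn, reply_queue = self._queue.get()
            result = fn(self.conn)
            if reply_queue is not None:
                reply_queue.put(result)

    async def execute_write_fn(self, fn, block=False):
        if not block:
            # Fire-and-forget: enqueue and return a task ID immediately
            task_id = str(uuid.uuid4())
            self._queue.put((fn, None))
            return task_id
        # Blocking mode: await the reply without stalling the event loop
        reply_queue = queue.Queue()
        self._queue.put((fn, reply_queue))
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, reply_queue.get)

    async def execute_write(self, sql, block=False):
        return await self.execute_write_fn(
            lambda conn: conn.execute(sql), block=block
        )
```

Because the writer thread drains the queue in order, a later `block=True` write also guarantees every earlier fire-and-forget write has been applied by the time it returns.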
590517744 · simonw (OWNER) · 2020-02-24T19:52:16Z
https://github.com/simonw/datasette/issues/682#issuecomment-590517744

Moving further development to a pull request: https://github.com/simonw/datasette/pull/683


```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
```
Powered by Datasette · About: simonw/datasette-graphql