issue_comments
13 rows where issue = 1400374908
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
1270923537 | https://github.com/simonw/datasette/issues/1836#issuecomment-1270923537 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5LwMER | fgregg 536941 | 2022-10-07T00:46:08Z | 2022-10-07T00:46:08Z | CONTRIBUTOR | i thought it was maybe to do with reading through all the files, but that does not seem to be the case if i make a little test file like: ```python # test_read.py import hashlib import sys import pathlib HASH_BLOCK_SIZE = 1024 * 1024 def inspect_hash(path): """Calculate the hash of a database, efficiently.""" m = hashlib.sha256() with path.open("rb") as fp: while True: data = fp.read(HASH_BLOCK_SIZE) if not data: break m.update(data) return m.hexdigest() inspect_hash(pathlib.Path(sys.argv[1])) ``` then a line in the Dockerfile like ```docker RUN python test_read.py nlrb.db && echo "[]" > /etc/inspect.json ``` just produces a layer of `3B` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1270936982 | https://github.com/simonw/datasette/issues/1836#issuecomment-1270936982 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5LwPWW | fgregg 536941 | 2022-10-07T00:52:41Z | 2022-10-07T00:52:41Z | CONTRIBUTOR | it's not that the inspect command is somehow changing the db files. if i set them to only read-only, the "inspect" layer still has the same very large size. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1270988081 | https://github.com/simonw/datasette/issues/1836#issuecomment-1270988081 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lwb0x | fgregg 536941 | 2022-10-07T01:19:01Z | 2022-10-07T01:27:35Z | CONTRIBUTOR | okay, some progress!! running some sql against a database file causes that file to get duplicated even if it doesn't apparently change the file. make a little test script like this: ```python # test_sql.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ``` then ```docker RUN python test_sql.py nlrb.db ``` produced a layer that's the same size as `nlrb.db`!! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1270992795 | https://github.com/simonw/datasette/issues/1836#issuecomment-1270992795 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lwc-b | fgregg 536941 | 2022-10-07T01:29:15Z | 2022-10-07T01:50:14Z | CONTRIBUTOR | fascinatingly, telling python to open sqlite in read only mode makes this layer have a size of 0 ```python # test_sql_ro.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}?mode=ro', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ``` that's quite weird because setting the file permissions to read only didn't do anything. (on reflection, that chmod isn't doing anything because the dockerfile commands are run as root) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271003212 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5LwfhM | fgregg 536941 | 2022-10-07T01:52:04Z | 2022-10-07T01:52:04Z | CONTRIBUTOR | and if we try immutable mode, which is how things are opened by `datasette inspect` we duplicate the files!!! ```python # test_sql_immutable.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}?immutable=1', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271004167 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271004167 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5LwfwH | simonw 9599 | 2022-10-07T01:53:05Z | 2022-10-07T01:53:05Z | OWNER | Oh this is interesting! Is your hunch here that running this line is causing the file to be stored as a second layer? https://github.com/simonw/datasette/blob/5aa359b86907d11b3ee601510775a85a90224da8/datasette/utils/__init__.py#L399 I guess it's possible that running a non-read-only query against the database causes one or two bytes to be changed (maybe a transaction ID or similar?) Modifying the `inspect` command to use `?mode=ro` seems sensible to me. Except.... it should already be opening those files in immutable mode according to this line: https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/cli.py#L172-L173 Here's what opening as a `immutables` does: https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/app.py#L258-L260 https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L98 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271006020 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271006020 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5LwgNE | simonw 9599 | 2022-10-07T01:54:07Z | 2022-10-07T01:54:07Z | OWNER | Just overlapped with your comment here: https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212 - which notes that opening with `?immutable=1` DOES seem to cause the file to be duplicated! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271008997 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271008997 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lwg7l | fgregg 536941 | 2022-10-07T02:00:37Z | 2022-10-07T02:00:49Z | CONTRIBUTOR | yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation! running a test of that now. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271020193 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271020193 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lwjqh | fgregg 536941 | 2022-10-07T02:15:05Z | 2022-10-07T02:21:08Z | CONTRIBUTOR | when i hack the connect method to open non mutable files with "mode=ro" and not "immutable=1" https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L79 then: ```bash 870 B RUN /bin/sh -c datasette inspect nlrb.db --inspect-file inspect-data.json ``` the `datasette inspect` layer is only the size of the json file! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271100651 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lw3Tr | fgregg 536941 | 2022-10-07T04:38:14Z | 2022-10-07T04:38:14Z | CONTRIBUTOR | > yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation! > > running a test of that now. this completely addressed #1480 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1271103097 | https://github.com/simonw/datasette/issues/1836#issuecomment-1271103097 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5Lw355 | fgregg 536941 | 2022-10-07T04:43:41Z | 2022-10-07T04:43:41Z | CONTRIBUTOR | @simonw, should i open up a new issue for investigating the differences between "immutable=1" and "mode=ro" and possibly switching to "mode=ro". Or would you like to keep that conversation in this issue? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1272344884 | https://github.com/simonw/datasette/issues/1836#issuecomment-1272344884 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5L1nE0 | simonw 9599 | 2022-10-08T15:41:28Z | 2022-10-08T15:41:28Z | OWNER | Let's switch to `mode=ro` when the `inspect` command runs, we can use this issue for that. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 | |
1272357976 | https://github.com/simonw/datasette/issues/1836#issuecomment-1272357976 | https://api.github.com/repos/simonw/datasette/issues/1836 | IC_kwDOBm6k_c5L1qRY | fgregg 536941 | 2022-10-08T16:56:51Z | 2022-10-08T16:56:51Z | CONTRIBUTOR | when you are running from docker, you **always** will want to run as `mode=ro` because the same thing that is causing duplication in the inspect layer will cause duplication in the final container read/write layer when `datasette serve` runs. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | docker image is duplicating db files somehow 1400374908 |
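
The comments above compare three ways of opening the same SQLite file from Python: the default URI, `?mode=ro`, and `?immutable=1`. The following is a minimal sketch of that comparison outside Docker, assuming a throwaway database with a `filing` table. The helper names `make_demo_db`, `count_rows`, and `try_write_read_only` are hypothetical and are not part of Datasette. It only checks the file's content hash and whether writes are rejected; it does not reproduce the Docker layer duplication reported in the thread, which may depend on the storage driver.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

HASH_BLOCK_SIZE = 1024 * 1024


def file_sha256(path):
    """Hash the file contents in blocks, as in the thread's test_read.py."""
    m = hashlib.sha256()
    with path.open("rb") as fp:
        while True:
            data = fp.read(HASH_BLOCK_SIZE)
            if not data:
                break
            m.update(data)
    return m.hexdigest()


def make_demo_db(path):
    """Create a tiny stand-in database (hypothetical; not nlrb.db)."""
    conn = sqlite3.connect(str(path))
    conn.execute("create table filing (id integer primary key)")
    conn.executemany("insert into filing (id) values (?)", [(1,), (2,), (3,)])
    conn.commit()
    conn.close()


def count_rows(path, query_string=""):
    """Run the thread's count query using a URI such as file:...?mode=ro."""
    conn = sqlite3.connect(f"file:{path}{query_string}", uri=True)
    try:
        return conn.execute("select count(*) from filing").fetchone()[0]
    finally:
        conn.close()


def try_write_read_only(path):
    """Return True if an insert succeeds on a mode=ro connection."""
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        conn.execute("insert into filing (id) values (999)")
        conn.commit()
        return True
    except sqlite3.OperationalError:
        # SQLite refuses writes on a read-only connection.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "demo.db"
        make_demo_db(db_path)
        before = file_sha256(db_path)
        for qs in ("", "?mode=ro", "?immutable=1"):
            rows = count_rows(db_path, qs)
            unchanged = file_sha256(db_path) == before
            print(qs or "(default)", "rows:", rows, "content unchanged:", unchanged)
        print("write allowed with mode=ro:", try_write_read_only(db_path))
```

Running the script prints, for each URI variant, the row count and whether the file's bytes changed after the query. To mirror the thread more closely, the same functions could be called from a Dockerfile `RUN` step and the resulting layer sizes compared.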
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);