issue_comments
8 rows where issue = 808008305
Columns: id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app
778812684 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778812684 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgxMjY4NA== | simonw 9599 | 2021-02-14T17:45:16Z | 2021-02-14T17:45:16Z | OWNER

Running this could take any CSV (or TSV) file and automatically detect the delimiter. If no header row is detected it could add `unknown1,unknown2` headers:

    sqlite-utils insert db.db data file.csv --sniff

(Using `--sniff` would imply `--csv`.)

This could be called `--sniffer` instead, but I like `--sniff` better.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
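A minimal sketch of the detection step such an option could build on, using the standard library's `csv.Sniffer`. The sample data and the `unknownN` fallback logic here are illustrative, not sqlite-utils code:

```python
import csv
import io

# Hypothetical sample: tab-separated values with no header row.
sample = "1\talpha\t3.5\n2\tbeta\t4.0\n3\tgamma\t5.5\n"

sniffer = csv.Sniffer()
dialect = sniffer.sniff(sample)          # picks "\t" as the delimiter here
has_header = sniffer.has_header(sample)  # False for this sample

rows = list(csv.reader(io.StringIO(sample), dialect))
if has_header:
    headers, rows = rows[0], rows[1:]
else:
    # Fall back to generated unknown1, unknown2, ... headers, as proposed above.
    headers = ["unknown{}".format(i + 1) for i in range(len(rows[0]))]

print(dialect.delimiter, headers)
```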
778815740 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778815740 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgxNTc0MA== | simonw 9599 | 2021-02-14T18:05:03Z | 2021-02-14T18:05:03Z | OWNER

The challenge here is how to read the first 2048 bytes and then reset the incoming file. The Python docs example looks like this:

```python
with open('example.csv', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read(1024))
    csvfile.seek(0)
    reader = csv.reader(csvfile, dialect)
```

Here's the relevant code in `sqlite-utils`: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L671-L679

A further challenge is going to be making the `--sniff` option work with the progress bar. Here's how `file_progress()` works: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/utils.py#L106-L113

If `file.raw` is `stdin`, can I do the equivalent of `csvfile.seek(0)` on it?

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
778816333 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778816333 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgxNjMzMw== | simonw 9599 | 2021-02-14T18:08:44Z | 2021-02-14T18:08:44Z | OWNER

No, you can't `.seek(0)` on stdin:

```
  File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 678, in insert_upsert_implementation
    json_file.raw.seek(0)
OSError: [Errno 29] Illegal seek
```

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
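One way to decide which branch applies up front, as a sketch: binary streams expose `seekable()`, and `sys.stdin.buffer` reports `False` when stdin is a pipe or a terminal, so the rewind can be skipped before it raises.

```python
import sys

raw = sys.stdin.buffer
if raw.seekable():
    # stdin redirected from a regular file: rewinding works.
    sample = raw.read(2048)
    raw.seek(0)
else:
    # Pipe or terminal: .seek(0) would raise OSError (Illegal seek),
    # so the sniffed bytes have to be buffered and replayed instead.
    sample = None
```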
778817494 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778817494 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgxNzQ5NA== | simonw 9599 | 2021-02-14T18:16:06Z | 2021-02-14T18:16:06Z | OWNER

Types involved:

```
(Pdb) type(json_file.raw)
<class '_io.FileIO'>
(Pdb) type(json_file)
<class 'encodings.utf_8.StreamReader'>
```

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
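For context, a sketch of how those two types typically come about, assuming the file is opened in binary mode and wrapped with `codecs.getreader()` (an assumption, though it is consistent with the `encodings.utf_8.StreamReader` shown above; the filename is a placeholder):

```python
import codecs

fp = open("example.csv", "rb")             # io.BufferedReader over an _io.FileIO
json_file = codecs.getreader("utf-8")(fp)  # encodings.utf_8.StreamReader

print(type(json_file))      # <class 'encodings.utf_8.StreamReader'>
print(type(json_file.raw))  # <class '_io.FileIO'>: StreamReader delegates
                            # unknown attributes to the wrapped stream, and
                            # the BufferedReader's .raw is the underlying FileIO
```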
778818639 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778818639 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgxODYzOQ== | simonw 9599 | 2021-02-14T18:22:38Z | 2021-02-14T18:22:38Z | OWNER

Maybe I shouldn't be using `StreamReader` at all: PEP 400 (https://www.python.org/dev/peps/pep-0400/) suggests that it should be deprecated in favour of `io.TextIOWrapper`.

I'm using `StreamReader` due to this line: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L667-L668

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
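A sketch of the `io.TextIOWrapper` substitution, assuming the linked line builds its reader with `codecs.getreader(encoding)` (which is what produces the `StreamReader` seen above); `open_text` is a hypothetical helper, not part of sqlite-utils:

```python
import codecs
import io


def open_text(fp, encoding="utf-8"):
    # Old pattern (codecs.StreamReader, discouraged by PEP 400):
    #   return codecs.getreader(encoding)(fp)
    # io.TextIOWrapper replacement over the same binary file object:
    return io.TextIOWrapper(fp, encoding=encoding)


# Usage sketch:
#   decoded = open_text(open("example.csv", "rb"))
```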
778821403 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778821403 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgyMTQwMw== | simonw 9599 | 2021-02-14T18:38:16Z | 2021-02-14T18:38:16Z | OWNER

There are two code paths here that matter:

- For a regular file, I can read the first 2048 bytes, then `.seek(0)` before continuing. That's easy.
- `stdin` is harder. I need to read and buffer the first 2048 bytes, then pass an object to `csv.reader()` which will replay that chunk and then play the rest of stdin.

I'm a bit stuck on the second one. Ideally I could use something like `itertools.chain()`, but I can't find an equivalent for file-like objects.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
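One way to get that `itertools.chain()`-style behaviour for a binary stream, as a rough sketch (none of this is existing sqlite-utils code): a small file-like wrapper that serves the already-read prefix first and then falls through to the underlying stream.

```python
import io


class ReplayableStream(io.RawIOBase):
    """Hypothetical wrapper: replays `prefix`, then reads from `stream`."""

    def __init__(self, prefix, stream):
        self._prefix = io.BytesIO(prefix)
        self._stream = stream

    def readable(self):
        return True

    def readinto(self, b):
        # Serve bytes from the buffered prefix until it is exhausted,
        # then hand off to the wrapped stream (e.g. sys.stdin.buffer).
        n = self._prefix.readinto(b)
        if n:
            return n
        return self._stream.readinto(b)


# Usage sketch: sniff the first chunk of stdin, then replay it for csv.reader().
#
#   first_bytes = sys.stdin.buffer.read(2048)
#   dialect = csv.Sniffer().sniff(first_bytes.decode("utf-8", "ignore"))
#   replayed = io.TextIOWrapper(
#       io.BufferedReader(ReplayableStream(first_bytes, sys.stdin.buffer)),
#       encoding="utf-8",
#   )
#   reader = csv.reader(replayed, dialect)
```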
778824361 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778824361 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgyNDM2MQ== | simonw 9599 | 2021-02-14T18:59:22Z | 2021-02-14T18:59:22Z | OWNER

I think I've got it. I can use `io.BufferedReader()` to get an object I can run `.peek(2048)` on, then wrap THAT in `io.TextIOWrapper`:

```python
encoding = encoding or "utf-8"
buffered = io.BufferedReader(json_file, buffer_size=4096)
decoded = io.TextIOWrapper(buffered, encoding=encoding, line_buffering=True)
if pk and len(pk) == 1:
    pk = pk[0]
if csv or tsv:
    if sniff:
        # Read the first 2048 bytes and use them to detect the dialect
        first_bytes = buffered.peek(2048)
        print('first_bytes', first_bytes)
```

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
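Continuing that idea as a standalone sketch (the stdin source, variable names and encoding default here are assumptions, not the eventual sqlite-utils implementation): the peeked bytes can be decoded and handed to `csv.Sniffer`, while `decoded` still yields the input from the beginning because `peek()` does not consume anything.

```python
import csv
import io
import sys

encoding = "utf-8"  # stands in for the `encoding` variable above
buffered = io.BufferedReader(sys.stdin.buffer, buffer_size=4096)
decoded = io.TextIOWrapper(buffered, encoding=encoding, line_buffering=True)

# peek() returns currently buffered bytes (possibly more or fewer than 2048)
# without advancing the stream.
first_bytes = buffered.peek(2048)
dialect = csv.Sniffer().sniff(first_bytes.decode(encoding, errors="ignore"))

reader = csv.reader(decoded, dialect)
for row in reader:
    print(row)
```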
778827570 | https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778827570 | https://api.github.com/repos/simonw/sqlite-utils/issues/230 | MDEyOklzc3VlQ29tbWVudDc3ODgyNzU3MA== | simonw 9599 | 2021-02-14T19:24:20Z | 2021-02-14T19:24:20Z | OWNER

Here's the implementation in CPython: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L204-L225

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --sniff option for sniffing delimiters 808008305 |
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```