issue_comments
7 rows where issue = 924990677
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
864103005 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864103005 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDEwMzAwNQ== | simonw 9599 | 2021-06-18T15:04:15Z | 2021-06-18T15:04:15Z | OWNER | To detect JSON, check to see if the stream starts with `[` or `{` - maybe do something more sophisticated than that. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864129273 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864129273 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDEyOTI3Mw== | simonw 9599 | 2021-06-18T15:47:47Z | 2021-06-18T15:47:47Z | OWNER | Detecting valid JSON is tricky - just because a stream starts with `[` or `{` doesn't mean the entire stream is valid JSON. You need to parse the entire stream to determine that for sure. One way to solve this would be with a custom state machine. Another would be to use the `ijson` streaming parser - annoyingly it throws the same exception class for invalid JSON for different reasons, but the `e.args[0]` for that exception includes human-readable text about the error - if it's anything other than `parse error: premature EOF` then it probably means the JSON was invalid. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864206308 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864206308 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDIwNjMwOA== | simonw 9599 | 2021-06-18T18:25:04Z | 2021-06-18T18:25:04Z | OWNER | Or... since I'm not using a streaming JSON parser at the moment, if I think something is JSON I can load the entire thing into memory to validate it. I still need to detect newline-delimited JSON. For that I can consume the first line of the input to see if it's a valid JSON object, then maybe sniff the second line too? This does mean that if the input is a single line of GIANT JSON it will all be consumed into memory at once, but that's going to happen anyway. So I need a function which, given a file pointer, consumes from it, detects the type, then returns that type AND a file pointer to the beginning of the file again. I can use `io.BufferedReader` for this. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864207841 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864207841 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDIwNzg0MQ== | simonw 9599 | 2021-06-18T18:28:40Z | 2021-06-18T18:28:46Z | OWNER | ```python def detect_format(fp): # ... return "csv", fp, dialect # or return "json", fp, parsed_data # or return "json-nl", fp, docs ``` The mixed return types here are ugly. In all of these cases what we really want is to return a generator of `{...}` objects. So maybe it returns that instead. ```python def filepointer_to_documents(fp): # ... yield from documents ``` I can refactor `sqlite-utils insert` to use this new code too. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864208476 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864208476 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDIwODQ3Ng== | simonw 9599 | 2021-06-18T18:30:08Z | 2021-06-18T23:30:19Z | OWNER | So maybe this is a function which can either be told the format or, if none is provided, it detects one for itself. ```python def rows_from_file(fp, format=None): # ... yield from rows ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864328927 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864328927 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDMyODkyNw== | simonw 9599 | 2021-06-19T00:25:08Z | 2021-06-19T00:25:17Z | OWNER | I tried writing this function with type hints, but eventually gave up: ```python def rows_from_file( fp: BinaryIO, format: Optional[Format] = None, dialect: Optional[Type[csv.Dialect]] = None, encoding: Optional[str] = None, ) -> Generator[dict, None, None]: if format == Format.JSON: decoded = json.load(fp) if isinstance(decoded, dict): decoded = [decoded] if not isinstance(decoded, list): raise RowsFromFileBadJSON("JSON must be a list or a dictionary") yield from decoded elif format == Format.CSV: decoded_fp = io.TextIOWrapper(fp, encoding=encoding or "utf-8-sig") yield from csv.DictReader(decoded_fp) elif format == Format.TSV: yield from rows_from_file( fp, format=Format.CSV, dialect=csv.excel_tab, encoding=encoding ) elif format is None: # Detect the format, then call this recursively buffered = io.BufferedReader(fp, buffer_size=4096) first_bytes = buffered.peek(2048).strip() if first_bytes[0] in (b"[", b"{"): # TODO: Detect newline-JSON yield from rows_from_file(fp, format=Format.JSON) else: dialect = csv.Sniffer().sniff(first_bytes.decode(encoding, "ignore")) yield from rows_from_file( fp, format=Format.CSV, dialect=dialect, encoding=encoding ) else: raise RowsFromFileError("Bad format") ``` mypy said: ``` sqlite_utils/utils.py:157: error: Argument 1 to "BufferedReader" has incompatible type "BinaryIO"; expected "RawIOBase" sqlite_utils/utils.py:163: error: Argument 1 to "decode" of "bytes" has incompatible type "Optional[str]"; expected "str" ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 | |
864330508 | https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864330508 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | MDEyOklzc3VlQ29tbWVudDg2NDMzMDUwOA== | simonw 9599 | 2021-06-19T00:34:24Z | 2021-06-19T00:34:24Z | OWNER | Got this working: % curl 'https://api.github.com/repos/simonw/datasette/issues' | sqlite-utils memory - 'select id from stdin' | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite-utils memory should handle TSV and JSON in addition to CSV 924990677 |
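
The approach discussed in the comments above (peek at the start of the stream, guess the format, then yield rows as dictionaries) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code that shipped in sqlite-utils: the names `detect_format` and `rows_from_file` are borrowed from the comments, but the signatures, the `sample_size` parameter, the format strings and the newline-delimited JSON heuristic are all assumptions made for this example. It slices the peeked bytes (`stripped[:1]`) rather than indexing them, because indexing a `bytes` object returns an `int` and would never equal `b"["` or `b"{"`.

```python
import csv
import io
import json


def detect_format(fp, sample_size=2048):
    """Guess "json", "json-nl", "csv" or "tsv" from the first bytes of fp.

    Returns (format_name, buffered). peek() does not advance the reader,
    so `buffered` still yields the stream from its beginning.
    Hypothetical helper: the heuristics here are assumptions.
    """
    buffered = io.BufferedReader(fp, buffer_size=max(4096, sample_size))
    sample = buffered.peek(sample_size)[:sample_size]
    stripped = sample.lstrip()
    # Slice instead of indexing: stripped[0] would be an int, not bytes.
    if stripped[:1] in (b"[", b"{"):
        first_line, newline, rest = stripped.partition(b"\n")
        # Newline-delimited JSON: a complete object on the first line,
        # followed by more non-blank content.
        if newline and rest.strip():
            try:
                if isinstance(json.loads(first_line), dict):
                    return "json-nl", buffered
            except ValueError:
                pass  # first line alone is not valid JSON
        return "json", buffered
    try:
        dialect = csv.Sniffer().sniff(
            sample.decode("utf-8", "ignore"), delimiters=",\t"
        )
    except csv.Error:
        return "csv", buffered  # assumption: fall back to CSV
    return ("tsv" if dialect.delimiter == "\t" else "csv"), buffered


def rows_from_file(fp, encoding="utf-8-sig"):
    """Yield dictionaries from a binary file object, detecting its format."""
    fmt, buffered = detect_format(fp)
    if fmt == "json":
        # The whole document is loaded into memory, as the comments accept.
        decoded = json.load(buffered)
        if isinstance(decoded, dict):
            decoded = [decoded]
        if not isinstance(decoded, list):
            raise ValueError("JSON must be a list or a dictionary")
        yield from decoded
    elif fmt == "json-nl":
        for line in buffered:
            if line.strip():
                yield json.loads(line)
    else:
        text = io.TextIOWrapper(buffered, encoding=encoding, newline="")
        delimiter = "\t" if fmt == "tsv" else ","
        yield from csv.DictReader(text, delimiter=delimiter)
```

For example, `list(rows_from_file(io.BytesIO(b"id,name\n1,Cleo\n")))` yields one dictionary per CSV row, with values left as strings. Wrapping the input in `io.BufferedReader` works at runtime for objects such as `io.BytesIO`, though a type checker may reject it, as the mypy output quoted in the comments shows.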
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);