html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864328927,https://api.github.com/repos/simonw/sqlite-utils/issues/279,864328927,MDEyOklzc3VlQ29tbWVudDg2NDMyODkyNw==,9599,simonw,2021-06-19T00:25:08Z,2021-06-19T00:25:17Z,OWNER,"I tried writing this function with type hints, but eventually gave up: ```python def rows_from_file( fp: BinaryIO, format: Optional[Format] = None, dialect: Optional[Type[csv.Dialect]] = None, encoding: Optional[str] = None, ) -> Generator[dict, None, None]: if format == Format.JSON: decoded = json.load(fp) if isinstance(decoded, dict): decoded = [decoded] if not isinstance(decoded, list): raise RowsFromFileBadJSON(""JSON must be a list or a dictionary"") yield from decoded elif format == Format.CSV: decoded_fp = io.TextIOWrapper(fp, encoding=encoding or ""utf-8-sig"") yield from csv.DictReader(decoded_fp) elif format == Format.TSV: yield from rows_from_file( fp, format=Format.CSV, dialect=csv.excel_tab, encoding=encoding ) elif format is None: # Detect the format, then call this recursively buffered = io.BufferedReader(fp, buffer_size=4096) first_bytes = buffered.peek(2048).strip() if first_bytes[0] in (b""["", b""{""): # TODO: Detect newline-JSON yield from rows_from_file(fp, format=Format.JSON) else: dialect = csv.Sniffer().sniff(first_bytes.decode(encoding, ""ignore"")) yield from rows_from_file( fp, format=Format.CSV, dialect=dialect, encoding=encoding ) else: raise RowsFromFileError(""Bad format"") ``` mypy said: ``` sqlite_utils/utils.py:157: error: Argument 1 to ""BufferedReader"" has incompatible type ""BinaryIO""; expected ""RawIOBase"" sqlite_utils/utils.py:163: error: Argument 1 to ""decode"" of ""bytes"" has incompatible type ""Optional[str]""; expected ""str"" ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",924990677,sqlite-utils memory should handle TSV and JSON in addition to CSV, https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864330508,https://api.github.com/repos/simonw/sqlite-utils/issues/279,864330508,MDEyOklzc3VlQ29tbWVudDg2NDMzMDUwOA==,9599,simonw,2021-06-19T00:34:24Z,2021-06-19T00:34:24Z,OWNER,"Got this working: % curl 'https://api.github.com/repos/simonw/datasette/issues' | sqlite-utils memory - 'select id from stdin' ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",924990677,sqlite-utils memory should handle TSV and JSON in addition to CSV, https://github.com/simonw/sqlite-utils/issues/282#issuecomment-864348954,https://api.github.com/repos/simonw/sqlite-utils/issues/282,864348954,MDEyOklzc3VlQ29tbWVudDg2NDM0ODk1NA==,9599,simonw,2021-06-19T03:34:42Z,2021-06-19T03:35:46Z,OWNER,"I built some prototype code here for something which looks at every row in a CSV import and records the likely types: https://gist.github.com/simonw/465f9356f175d1cf86957947dff501d4 This could be used by the command-line tools to figure out what `table.transform(types=...)` method to use at the end. This is a different approach to the pure SQL version I tried building in https://github.com/simonw/sqlite-utils/issues/179 - I think this is a better approach though, it's less prone to weird idiosyncrasies of SQLite types, and it's also easy for us to add on to the existing CSV import code in a way that won't require scanning the data twice.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925305186,Automatic type detection for CSV data, https://github.com/simonw/sqlite-utils/issues/179#issuecomment-864349066,https://api.github.com/repos/simonw/sqlite-utils/issues/179,864349066,MDEyOklzc3VlQ29tbWVudDg2NDM0OTA2Ng==,9599,simonw,2021-06-19T03:36:04Z,2021-06-19T03:36:04Z,OWNER,This work is going to happen in #282.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",709577625,sqlite-utils transform/insert --detect-types, https://github.com/simonw/sqlite-utils/issues/282#issuecomment-864349123,https://api.github.com/repos/simonw/sqlite-utils/issues/282,864349123,MDEyOklzc3VlQ29tbWVudDg2NDM0OTEyMw==,9599,simonw,2021-06-19T03:36:54Z,2021-06-19T03:36:54Z,OWNER,"I may change the default for `sqlite-utils insert` to detect types if I release `sqlite-utils` 4.0, as a backwards-incompatible change.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925305186,Automatic type detection for CSV data, https://github.com/simonw/sqlite-utils/issues/282#issuecomment-864350407,https://api.github.com/repos/simonw/sqlite-utils/issues/282,864350407,MDEyOklzc3VlQ29tbWVudDg2NDM1MDQwNw==,9599,simonw,2021-06-19T03:52:20Z,2021-06-19T03:52:20Z,OWNER,I'll have an environment variable for `--detect-types` so users who really want that as the default option can turn it on.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925305186,Automatic type detection for CSV data, https://github.com/simonw/sqlite-utils/issues/282#issuecomment-864354627,https://api.github.com/repos/simonw/sqlite-utils/issues/282,864354627,MDEyOklzc3VlQ29tbWVudDg2NDM1NDYyNw==,9599,simonw,2021-06-19T04:42:03Z,2021-06-19T04:42:03Z,OWNER,"Demo: curl -s 'https://api.github.com/users/simonw/repos?per_page=100' | \ sqlite-utils memory - 'select sum(size), sum(stargazers_count) from stdin limit 1' [{""sum(size)"": 2042547, ""sum(stargazers_count)"": 6769}] ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925305186,Automatic type detection for CSV data, https://github.com/simonw/sqlite-utils/issues/284#issuecomment-864358680,https://api.github.com/repos/simonw/sqlite-utils/issues/284,864358680,MDEyOklzc3VlQ29tbWVudDg2NDM1ODY4MA==,9599,simonw,2021-06-19T05:27:13Z,2021-06-19T05:27:13Z,OWNER,How easy is it to detect a `rowid` table? Is it as simple as `.pks` returning `None`? If so the documentation should mention that.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925320167,.transform(types=) turns rowid into a concrete column, https://github.com/simonw/sqlite-utils/issues/284#issuecomment-864358951,https://api.github.com/repos/simonw/sqlite-utils/issues/284,864358951,MDEyOklzc3VlQ29tbWVudDg2NDM1ODk1MQ==,9599,simonw,2021-06-19T05:30:00Z,2021-06-19T05:30:00Z,OWNER,If this can be fixed it will be in the `transform_sql()` method.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925320167,.transform(types=) turns rowid into a concrete column, https://github.com/simonw/sqlite-utils/issues/283#issuecomment-864416086,https://api.github.com/repos/simonw/sqlite-utils/issues/283,864416086,MDEyOklzc3VlQ29tbWVudDg2NDQxNjA4Ng==,9599,simonw,2021-06-19T14:49:06Z,2021-06-19T14:49:13Z,OWNER,"Once again, this is difficult because of the use of a generator here - `rows_from_file()` only yields rows, so there is no obvious mechanism for it to communicate back to the wrapping code that the detected format was CSV or TSV as opposed to JSON. I'm going to change `rows_from_file()` to return a `(generator, detected_format)` tuple.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925319214,memory: Shouldn't detect types for JSON, https://github.com/simonw/sqlite-utils/issues/284#issuecomment-864416785,https://api.github.com/repos/simonw/sqlite-utils/issues/284,864416785,MDEyOklzc3VlQ29tbWVudDg2NDQxNjc4NQ==,9599,simonw,2021-06-19T14:54:41Z,2021-06-19T14:54:41Z,OWNER,"```pycon >>> db = sqlite_utils.Database(memory=True) >>> db[""rowid_table""].insert({""name"": ""Cleo""}) >>> db[""regular_table""].insert({""id"": 1, ""name"": ""Cleo""}, pk=""id"")
>>> db[""rowid_table""].pks ['rowid'] >>> db[""regular_table""].pks ['id'] ``` I think I need an introspection property for working out if a table is a `rowid` table or not.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925320167,.transform(types=) turns rowid into a concrete column, https://github.com/simonw/sqlite-utils/issues/284#issuecomment-864416911,https://api.github.com/repos/simonw/sqlite-utils/issues/284,864416911,MDEyOklzc3VlQ29tbWVudDg2NDQxNjkxMQ==,9599,simonw,2021-06-19T14:55:45Z,2021-06-19T14:55:45Z,OWNER,"https://github.com/simonw/sqlite-utils/blob/dc94f4bb8cfe922bb2f9c89f8f0f29092ea63133/sqlite_utils/db.py#L805-L810 So I can indeed detect a `rowid` table by looking for no `is_pk` columns.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925320167,.transform(types=) turns rowid into a concrete column, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864417031,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864417031,MDEyOklzc3VlQ29tbWVudDg2NDQxNzAzMQ==,9599,simonw,2021-06-19T14:56:45Z,2021-06-19T14:56:45Z,OWNER,"```pycon >>> db = sqlite_utils.Database(memory=True) >>> db[""rowid_table""].insert({""name"": ""Cleo""})
>>> db[""regular_table""].insert({""id"": 1, ""name"": ""Cleo""}, pk=""id"")
>>> db[""rowid_table""].pks ['rowid'] >>> db[""regular_table""].pks ['id'] ``` But that's because the `.pks` property hides the difference: https://github.com/simonw/sqlite-utils/blob/dc94f4bb8cfe922bb2f9c89f8f0f29092ea63133/sqlite_utils/db.py#L805-L810 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864417133,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864417133,MDEyOklzc3VlQ29tbWVudDg2NDQxNzEzMw==,9599,simonw,2021-06-19T14:57:36Z,2021-06-19T14:57:36Z,OWNER,"So the logic is: ```python [column.name for column in self.columns if column.is_pk] ``` I need to decide on a property name. Existing names are documented here: https://sqlite-utils.datasette.io/en/stable/python-api.html#introspection","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864417493,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864417493,MDEyOklzc3VlQ29tbWVudDg2NDQxNzQ5Mw==,9599,simonw,2021-06-19T15:00:43Z,2021-06-19T15:00:43Z,OWNER,"I have to be careful about the language I use here. Here's the official definition: https://www.sqlite.org/rowidtable.html > A ""rowid table"" is any table in an SQLite schema that > > - is *not* a [virtual table](https://www.sqlite.org/vtab.html), and > - is *not* a [WITHOUT ROWID](https://www.sqlite.org/withoutrowid.html) table. > > Most tables in a typical SQLite database schema are rowid tables. > > Rowid tables are distinguished by the fact that they all have a unique, non-NULL, signed 64-bit integer [rowid](https://www.sqlite.org/lang_createtable.html#rowid) that is used as the access key for the data in the underlying [B-tree](https://www.sqlite.org/fileformat2.html#btree) storage engine. So it's not correct to call a table a ""rowid table"" only if it is missing its own primary keys. Maybe `table.has_rowid` is the right language to use here? No, that's no good - because tables with their own primary keys usually also have a rowid.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864417765,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864417765,MDEyOklzc3VlQ29tbWVudDg2NDQxNzc2NQ==,9599,simonw,2021-06-19T15:02:42Z,2021-06-19T15:02:42Z,OWNER,"Some options: - `table.rowid_only` - `table.rowid_as_pk` - `table.no_pks` - `table.no_pk` - `table.uses_rowid` - `table.use_rowid`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864417808,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864417808,MDEyOklzc3VlQ29tbWVudDg2NDQxNzgwOA==,9599,simonw,2021-06-19T15:03:00Z,2021-06-19T15:03:00Z,OWNER,I think I like `table.uses_rowid` best - it reads well.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864418188,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864418188,MDEyOklzc3VlQ29tbWVudDg2NDQxODE4OA==,9599,simonw,2021-06-19T15:05:53Z,2021-06-19T15:05:53Z,OWNER,"```python @property def uses_rowid(self): return not any(column for column in self.columns if column.is_pk) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/285#issuecomment-864418795,https://api.github.com/repos/simonw/sqlite-utils/issues/285,864418795,MDEyOklzc3VlQ29tbWVudDg2NDQxODc5NQ==,9599,simonw,2021-06-19T15:11:05Z,2021-06-19T15:11:14Z,OWNER,"Actually I'm going to go with `use_rowid` instead - because the table doesn't inherently use a rowid itself, but you should use one if you want to query it in a way that gives you back a primary key.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925410305,Introspection property for telling if a table is a rowid table, https://github.com/simonw/sqlite-utils/issues/284#issuecomment-864419283,https://api.github.com/repos/simonw/sqlite-utils/issues/284,864419283,MDEyOklzc3VlQ29tbWVudDg2NDQxOTI4Mw==,9599,simonw,2021-06-19T15:15:34Z,2021-06-19T15:15:34Z,OWNER,"I think this code is at fault: https://github.com/simonw/sqlite-utils/blob/5b257949d996fe43dc5d218d4308b88796a90740/sqlite_utils/db.py#L1017-L1023 It's using `.pks` which adds `rowid` if it's missing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",925320167,.transform(types=) turns rowid into a concrete column, https://github.com/simonw/sqlite-utils/issues/272#issuecomment-864476167,https://api.github.com/repos/simonw/sqlite-utils/issues/272,864476167,MDEyOklzc3VlQ29tbWVudDg2NDQ3NjE2Nw==,9599,simonw,2021-06-19T23:36:48Z,2021-06-19T23:36:48Z,OWNER,Wrote this up on my blog here: https://simonwillison.net/2021/Jun/19/sqlite-utils-memory/ - with a video demo here: https://www.youtube.com/watch?v=OUjd0rkc678,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",921878733,"Idea: import CSV to memory, run SQL, export in a single command",