html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1297703307,https://api.github.com/repos/simonw/sqlite-utils/issues/448,1297703307,IC_kwDOCGYnMM5NWWGL,167893,mcarpenter,2022-10-31T21:23:51Z,2022-10-31T21:27:32Z,CONTRIBUTOR,"The Windows aspect is a red herring: OP's sample above produces the same error on Linux. (Though I don't know what's going on with the CI). The same error can also be obtained by passing an `io` from a file opened in non-binary mode (`'r'` as opposed to `'rb'`) to `rows_from_file()`. This is how I got here. The fix for my case is easy: open the file in mode `'rb'`. The analogous fix for OP's problem also works: use `BytesIO` in place of `StringIO`. Minimal test case (derived from [utils.py](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/utils.py#L304)): ``` python import io from typing import cast #fp = io.StringIO(""id,name\n1,Cleo"") # error fp = io.BytesIO(bytes(""id,name\n1,Cleo"", encoding='utf-8')) # okay reader = io.BufferedReader(cast(io.RawIOBase, fp)) reader.peek(1) # exception thrown here ``` I see the signature of `rows_from_file()` correctly has `fp: BinaryIO` but I guess you'd need either a runtime type check for that (not all `io`s have `mode()`), or to catch the `AttributeError` on `peek()` to produce a better error for users. Neither option is ideal. 
Some thoughts on testing binary-ness of `io`s in this SO question: https://stackoverflow.com/questions/44584829/how-to-determine-if-file-is-opened-in-binary-or-text-mode","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1279144769,Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto', https://github.com/simonw/sqlite-utils/issues/239#issuecomment-1236214402,https://api.github.com/repos/simonw/sqlite-utils/issues/239,1236214402,IC_kwDOCGYnMM5JryKC,9599,simonw,2022-09-03T23:46:02Z,2022-09-03T23:46:02Z,OWNER,Yeah having a version of this that can setup m2m relationships would definitely be interesting.,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",816526538,sqlite-utils extract could handle nested objects, https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190449764,https://api.github.com/repos/simonw/sqlite-utils/issues/456,1190449764,IC_kwDOCGYnMM5G9NJk,45919695,jcmkk3,2022-07-20T15:45:54Z,2022-07-20T15:45:54Z,NONE,"> hadley wickham's melt and reshape could be good inspo: http://had.co.nz/reshape/introduction.pdf Note that Hadley has since implemented `pivot_longer` and `pivot_wider` instead of the previous verbs/functions that he used. Those can be found in the tidyr package and are probably the best reference which includes all of the learnings from years of user feedback. 
https://tidyr.tidyverse.org/articles/pivot.html","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1310243385,feature request: pivot command, https://github.com/simonw/datasette/issues/1727#issuecomment-1112889800,https://api.github.com/repos/simonw/datasette/issues/1727,1112889800,IC_kwDOBm6k_c5CVVnI,9599,simonw,2022-04-29T05:29:38Z,2022-04-29T05:29:38Z,OWNER,"OK, I just got the most incredible result with that! I started up a container running `bash` like this, from my `datasette` checkout. I'm mapping port 8005 on my laptop to port 8001 inside the container because laptop port 8001 was already doing something else: ``` docker run -it --rm --name my-running-script -p 8005:8001 -v ""$PWD"":/usr/src/myapp \ -w /usr/src/myapp nogil/python bash ``` Then in `bash` I ran the following commands to install Datasette and its dependencies: ``` pip install -e '.[test]' pip install datasette-pretty-traces # For debug tracing ``` Then I started Datasette against my `github.db` database (from github-to-sqlite.dogsheep.net/github.db) like this: ``` datasette github.db -h 0.0.0.0 --setting trace_debug 1 ``` I hit the following two URLs to compare the parallel v.s. not parallel implementations: - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10` - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10&_noparallel=1` And... the parallel one beat the non-parallel one decisively, on multiple page refreshes! Not parallel: 77ms Parallel: 47ms So yeah, I'm very confident this is a problem with the GIL. 
And I am absolutely **stunned** that @colesbury's fork ran Datasette (which has some reasonably tricky threading and async stuff going on) out of the box!","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/sqlite-utils/issues/159#issuecomment-1111506339,https://api.github.com/repos/simonw/sqlite-utils/issues/159,1111506339,IC_kwDOCGYnMM5CQD2j,154364,dracos,2022-04-27T21:35:13Z,2022-04-27T21:35:13Z,NONE,"Just stumbled across this, wondering why none of my deletes were working.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",702386948,.delete_where() does not auto-commit (unlike .insert() or .upsert()), https://github.com/simonw/datasette/issues/1419#issuecomment-893133496,https://api.github.com/repos/simonw/datasette/issues/1419,893133496,IC_kwDOBm6k_c41PCK4,9599,simonw,2021-08-05T03:22:44Z,2021-08-05T03:22:44Z,OWNER,"I ran into this exact same problem today! I only just learned how to use filter on aggregates: https://til.simonwillison.net/sqlite/sqlite-aggregate-filter-clauses A workaround I used is to add this to the deploy command: datasette publish cloudrun ... 
--install=pysqlite3-binary This will install the https://pypi.org/project/pysqlite3-binary package, which bundles a more recent SQLite version.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",959710008,`publish cloudrun` should deploy a more recent SQLite version, https://github.com/simonw/datasette/issues/1388#issuecomment-875738149,https://api.github.com/repos/simonw/datasette/issues/1388,875738149,MDEyOklzc3VlQ29tbWVudDg3NTczODE0OQ==,9599,simonw,2021-07-07T16:14:29Z,2021-07-07T16:14:29Z,OWNER,This sounds like a valuable feature for people running Datasette behind a proxy.,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",939051549,Serve using UNIX domain socket, https://github.com/simonw/datasette/issues/1258#issuecomment-808651088,https://api.github.com/repos/simonw/datasette/issues/1258,808651088,MDEyOklzc3VlQ29tbWVudDgwODY1MTA4OA==,9599,simonw,2021-03-27T04:41:52Z,2021-03-27T04:42:14Z,OWNER,"Right now they look like this: ```yaml databases: fixtures: queries: neighborhood_search: params: - text ``` In addition to being able to specify defaults, I'd also like to add other things in the future - most significantly the ability to specify a different input widget (e.g. textarea v.s. 
single-line input) So maybe this looks like: ```yaml params: - name: text default: """" - name: age widget: number ```","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",828858421,Allow canned query params to specify default values, https://github.com/simonw/datasette/issues/1262#issuecomment-802099264,https://api.github.com/repos/simonw/datasette/issues/1262,802099264,MDEyOklzc3VlQ29tbWVudDgwMjA5OTI2NA==,9599,simonw,2021-03-18T16:43:09Z,2021-03-18T16:43:09Z,OWNER,"I often find myself wanting this too, when I'm exploring a new dataset. I agree with Bob that this is a good candidate for a plugin. The plugin system isn't quite setup for this yet though - there isn't an obvious mechanism for adding extra sort orders or other interface elements that manipulate the query used by the table view in some way. I'm going to promote this issue to status of a plugin hook feature request - I have a hunch that a plugin hook that enables `order by random()` could enable a lot of other useful plugin features too.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",834602299,Plugin hook that could support 'order by random()' for table view, https://github.com/simonw/datasette/issues/782#issuecomment-782765665,https://api.github.com/repos/simonw/datasette/issues/782,782765665,MDEyOklzc3VlQ29tbWVudDc4Mjc2NTY2NQ==,9599,simonw,2021-02-20T23:34:41Z,2021-02-20T23:34:41Z,OWNER,"OK, I'm back to the ""top level object as the default"" side of things now - it's pretty much unanimous at this point, and it's certainly true that it's not a decision you'll ever regret.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",627794879,Redesign default .json format, 
https://github.com/simonw/datasette/issues/1101#issuecomment-755133937,https://api.github.com/repos/simonw/datasette/issues/1101,755133937,MDEyOklzc3VlQ29tbWVudDc1NTEzMzkzNw==,9599,simonw,2021-01-06T07:25:48Z,2021-01-06T07:26:43Z,OWNER,"Idea: instead of returning a dictionary, `register_output_renderer` could return an object. The object could have the following properties: - `.extension` - the extension to use - `.can_render(...)` - says if it can render this - `.can_stream(...)` - says if streaming is supported - `async .stream_rows(rows_iterator, send)` - method that loops through all rows and uses `send` to send them to the response in the correct format I can then deprecate the existing `dict` return type for 1.0.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,register_output_renderer() should support streaming data, https://github.com/simonw/datasette/issues/417#issuecomment-751504136,https://api.github.com/repos/simonw/datasette/issues/417,751504136,MDEyOklzc3VlQ29tbWVudDc1MTUwNDEzNg==,212369,drewda,2020-12-27T19:02:06Z,2020-12-27T19:02:06Z,NONE,"Very much looking forward to seeing this functionality come together. This is probably out-of-scope for an initial release, but in the future it could be useful to also think of how to run this in a containerized context. For example, an immutable datasette container that points to an S3 bucket of SQLite DBs or CSVs. 
Or an immutable datasette container pointing to a NFS volume elsewhere on a Kubernetes cluster.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,Datasette Library, https://github.com/simonw/datasette/issues/749#issuecomment-737563699,https://api.github.com/repos/simonw/datasette/issues/749,737563699,MDEyOklzc3VlQ29tbWVudDczNzU2MzY5OQ==,9599,simonw,2020-12-02T23:45:42Z,2020-12-02T23:45:42Z,OWNER,"I asked about this on Twitter - https://twitter.com/steren/status/1334281184965140483 > You simply need to send the `Transfer-Encoding: chunked` header.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",610829227,Cloud Run fails to serve database files larger than 32MB, https://github.com/simonw/datasette/issues/670#issuecomment-696163452,https://api.github.com/repos/simonw/datasette/issues/670,696163452,MDEyOklzc3VlQ29tbWVudDY5NjE2MzQ1Mg==,652285,snth,2020-09-21T14:46:10Z,2020-09-21T14:46:10Z,NONE,I'm currently using PostgREST to serve OpenAPI APIs off Postgresql databases. 
I would like to try out datasette once this becomes available on Postgres.,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",564833696,Prototoype for Datasette on PostgreSQL, https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615932007,https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4,615932007,MDEyOklzc3VlQ29tbWVudDYxNTkzMjAwNw==,9599,simonw,2020-04-18T19:27:55Z,2020-04-18T19:27:55Z,MEMBER,"Research thread: https://twitter.com/simonw/status/1249049694984011776 > I want to build some software that lets people store their own data in their own S3 bucket, but if possible I'd like not to have to teach people the incantations needed to get their bucket setup and minimum-permission credentials figures out https://testdriven.io/blog/storing-django-static-and-media-files-on-amazon-s3/ looks useful","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",602533539,Upload all my photos to a secure S3 bucket, https://github.com/simonw/sqlite-utils/issues/86#issuecomment-586729798,https://api.github.com/repos/simonw/sqlite-utils/issues/86,586729798,MDEyOklzc3VlQ29tbWVudDU4NjcyOTc5OA==,9599,simonw,2020-02-16T17:11:02Z,2020-02-16T17:11:02Z,OWNER,I filed a bug in the Python issue tracker here: https://bugs.python.org/issue39652,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",564579430,Problem with square bracket in CSV column name, https://github.com/simonw/datasette/issues/662#issuecomment-580028669,https://api.github.com/repos/simonw/datasette/issues/662,580028669,MDEyOklzc3VlQ29tbWVudDU4MDAyODY2OQ==,9599,simonw,2020-01-30T00:30:19Z,2020-01-30T00:30:19Z,OWNER,I just shipped 0.34: https://datasette.readthedocs.io/en/stable/changelog.html#v0-34,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",556814876,Escape_fts5_query-hookimplementation does not work with queries to standard tables, https://github.com/simonw/datasette/issues/417#issuecomment-473312514,https://api.github.com/repos/simonw/datasette/issues/417,473312514,MDEyOklzc3VlQ29tbWVudDQ3MzMxMjUxNA==,9599,simonw,2019-03-15T14:42:07Z,2019-03-17T22:12:30Z,OWNER,"A neat ability of Datasette Library would be if it can work against other files that have been dropped into the folder. In particular: if a user drops a CSV file into the folder, how about automatically converting that CSV file to SQLite using [sqlite-utils](https://github.com/simonw/sqlite-utils)?","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,Datasette Library,