issue_comments
563 rows where author_association = "CONTRIBUTOR" sorted by issue
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | issue ▼ | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
541118904 | https://github.com/simonw/datasette/issues/507#issuecomment-541118904 | https://api.github.com/repos/simonw/datasette/issues/507 | MDEyOklzc3VlQ29tbWVudDU0MTExODkwNA== | rixx 2657547 | 2019-10-11T15:48:49Z | 2019-10-11T15:48:49Z | CONTRIBUTOR | Headless Chrome and Firefox via Selenium are a solid choice in my experience. You may be interested in how pretix and pretalx solve this problem: They use pytest to create those screenshots on release to make sure they are up to date. See [this writeup](https://behind.pretix.eu/2018/11/15/automated-screenshots/) and [this repo](https://github.com/pretix/pretix-screenshots). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Every datasette plugin on the ecosystem page should have a screenshot 455852801 | |
510730200 | https://github.com/simonw/datasette/issues/511#issuecomment-510730200 | https://api.github.com/repos/simonw/datasette/issues/511 | MDEyOklzc3VlQ29tbWVudDUxMDczMDIwMA== | abdusco 3243482 | 2019-07-12T03:23:22Z | 2019-07-12T03:23:22Z | CONTRIBUTOR | @simonw yes it works fine on Windows, but test suite doesn't run properly, for that I had to use WSL | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Get Datasette tests passing on Windows in GitHub Actions 456578474 | |
541119038 | https://github.com/simonw/datasette/issues/512#issuecomment-541119038 | https://api.github.com/repos/simonw/datasette/issues/512 | MDEyOklzc3VlQ29tbWVudDU0MTExOTAzOA== | rixx 2657547 | 2019-10-11T15:49:13Z | 2019-10-11T15:49:13Z | CONTRIBUTOR | How open are you to changing the config variable names (with appropriate deprecation, of course)? `"about_url_text", "license_url_text"` etc might be better suited to convey that these are just meant as basically URL titles. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | "about" parameter in metadata does not appear when alone 457147936 | |
504662904 | https://github.com/simonw/datasette/issues/514#issuecomment-504662904 | https://api.github.com/repos/simonw/datasette/issues/514 | MDEyOklzc3VlQ29tbWVudDUwNDY2MjkwNA== | russss 45057 | 2019-06-22T12:45:21Z | 2019-06-22T12:45:39Z | CONTRIBUTOR | On most modern Linux distros, systemd is the easiest answer. Example systemd unit file (save to `/etc/systemd/system/datasette.service`): ``` [Unit] Description=Datasette After=network.target [Service] Type=simple User=<username> WorkingDirectory=/path/to/data ExecStart=/path/to/datasette serve -h 0.0.0.0 ./my.db Restart=on-failure [Install] WantedBy=multi-user.target ``` Activate it with: ```bash $ sudo systemctl daemon-reload $ sudo systemctl enable datasette $ sudo systemctl start datasette ``` Logs are best viewed using `journalctl -u datasette -f`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Documentation with recommendations on running Datasette in production without using Docker 459397625 | |
504663766 | https://github.com/simonw/datasette/issues/514#issuecomment-504663766 | https://api.github.com/repos/simonw/datasette/issues/514 | MDEyOklzc3VlQ29tbWVudDUwNDY2Mzc2Ng== | russss 45057 | 2019-06-22T12:57:59Z | 2019-06-22T12:57:59Z | CONTRIBUTOR | > This example is useful to - I like how it has a Makefile that knows how to set up systemd: https://github.com/pikesley/Queube I wasn't even aware it was possible to add a systemd service at an arbitrary path, but it seems a little messy to me. Maybe worth noting that systemd does support [per-user services](https://wiki.archlinux.org/index.php/Systemd/User) which don't require root access. Cool but probably overkill for most people (especially when you're going to need root to listen on port 80 anyway, directly or via a reverse proxy). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Documentation with recommendations on running Datasette in production without using Docker 459397625 | |
504684831 | https://github.com/simonw/datasette/issues/514#issuecomment-504684831 | https://api.github.com/repos/simonw/datasette/issues/514 | MDEyOklzc3VlQ29tbWVudDUwNDY4NDgzMQ== | russss 45057 | 2019-06-22T17:38:23Z | 2019-06-22T17:38:23Z | CONTRIBUTOR | > > WorkingDirectory=/path/to/data > > @russss, Which directory does this represent? It's the working directory (cwd) of the spawned process. In this case if you set it to the directory your data is in, you can use relative paths to the db (and metadata/templates/etc) in the `ExecStart` command. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Documentation with recommendations on running Datasette in production without using Docker 459397625 | |
504690927 | https://github.com/simonw/datasette/issues/514#issuecomment-504690927 | https://api.github.com/repos/simonw/datasette/issues/514 | MDEyOklzc3VlQ29tbWVudDUwNDY5MDkyNw== | russss 45057 | 2019-06-22T19:06:07Z | 2019-06-22T19:06:07Z | CONTRIBUTOR | I'd rather not turn this into a systemd support thread, but you're trying to execute the package directory there. Your datasette executable is probably at `/home/chris/Env/datasette/bin/datasette`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Documentation with recommendations on running Datasette in production without using Docker 459397625 | |
504809397 | https://github.com/simonw/datasette/issues/523#issuecomment-504809397 | https://api.github.com/repos/simonw/datasette/issues/523 | MDEyOklzc3VlQ29tbWVudDUwNDgwOTM5Nw== | rixx 2657547 | 2019-06-24T01:38:14Z | 2019-06-24T01:38:14Z | CONTRIBUTOR | Ah, apologies – I had found and read those issues, but I was under the impression that they referred only to the filtered row count, not the unfiltered total row count. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Show total/unfiltered row count when filtering 459627549 | |
992971072 | https://github.com/simonw/datasette/issues/526#issuecomment-992971072 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c47L4lA | fgregg 536941 | 2021-12-13T22:29:34Z | 2021-12-13T22:29:34Z | CONTRIBUTOR | just came by to open this issue. would make my data analysis in observable a lot better! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
993078038 | https://github.com/simonw/datasette/issues/526#issuecomment-993078038 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c47MSsW | fgregg 536941 | 2021-12-14T01:46:52Z | 2021-12-14T01:46:52Z | CONTRIBUTOR | the nested query idea is very nice, and i stole it for [my client side paginator](https://observablehq.com/d/1d5da3a3c3f2f347#DatasetteClient). However, it won't do the right thing if the original query orders by random(). If you go the nested query route, maybe raise a 4XX status code if the query has such a clause? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1254064260 | https://github.com/simonw/datasette/issues/526#issuecomment-1254064260 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5Kv4CE | fgregg 536941 | 2022-09-21T18:17:04Z | 2022-09-21T18:18:01Z | CONTRIBUTOR | hi @simonw, this is becoming more of a bother for my [labor data warehouse](https://labordata.bunkum.us/). Is there any research or a spike i could do that would help you investigate this issue? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258167564 | https://github.com/simonw/datasette/issues/526#issuecomment-1258167564 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5K_h0M | fgregg 536941 | 2022-09-26T14:57:44Z | 2022-09-26T15:08:36Z | CONTRIBUTOR | reading the database execute method i have a few questions. https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242 --- unless i'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protections against running very expensive queries. It's not like adding a `LIMIT max_rows` argument. it makes sense that it isn't, because the query could already have a `LIMIT` argument. Doing something like `select * from (query) limit {max_returned_rows}` **might** be protective but wouldn't always. Instead the code executes the full original query, and if it still has time it fetches out the first `max_rows + 1` rows. this *does* offer some protection against memory exhaustion, as you won't hydrate a huge result set into python (however, there are [data flow patterns](https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113) that could avoid that too) given the current architecture, i don't see how creating a new connection would be of use? --- If we just removed the `max_return_rows` limitation, then i think most things would be fine **except** for the QueryViews. Right now, rendering just [5000 rows takes a lot of client-side memory](https://github.com/simonw/datasette/issues/1655) so some form of pagination would be required. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258337011 | https://github.com/simonw/datasette/issues/526#issuecomment-1258337011 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LALLz | fgregg 536941 | 2022-09-26T16:49:48Z | 2022-09-26T16:49:48Z | CONTRIBUTOR | i think the smallest change that gets close to what i want is to change the behavior so that `max_returned_rows` is not applied in the `execute` method when we are asking for a csv of a query. there are some infelicities for that approach, but i'll make a PR to make it easier to discuss. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258849766 | https://github.com/simonw/datasette/issues/526#issuecomment-1258849766 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LCIXm | fgregg 536941 | 2022-09-27T01:27:03Z | 2022-09-27T01:27:03Z | CONTRIBUTOR | i agree with that concern! but if i'm understanding the code correctly, `maximum_returned_rows` does not protect against long-running queries in any way. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258871525 | https://github.com/simonw/datasette/issues/526#issuecomment-1258871525 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LCNrl | fgregg 536941 | 2022-09-27T02:09:32Z | 2022-09-27T02:14:53Z | CONTRIBUTOR | thanks @simonw, i learned something i didn't know about sqlite's execution model! > Imagine if Datasette CSVs did allow unlimited retrievals. Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever. why wouldn't the `sqlite_timelimit` guard prevent that? --- on my local version which has the code to [turn off truncations for query csv](#1820), `sqlite_timelimit` does protect me. ![Screenshot 2022-09-26 at 22-14-31 Error 500](https://user-images.githubusercontent.com/536941/192415680-94b32b7f-868f-4b89-8194-5752d45f6009.png) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258878311 | https://github.com/simonw/datasette/issues/526#issuecomment-1258878311 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LCPVn | fgregg 536941 | 2022-09-27T02:19:48Z | 2022-09-27T02:19:48Z | CONTRIBUTOR | this sql query doesn't trip up `maximum_returned_rows` but does timeout ```sql with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter LIMIT 10 OFFSET 100000000 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1258910228 | https://github.com/simonw/datasette/issues/526#issuecomment-1258910228 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LCXIU | fgregg 536941 | 2022-09-27T03:11:07Z | 2022-09-27T03:11:07Z | CONTRIBUTOR | i think this feature would be safe, as it's really only the time limit that can, and imo, should protect against long running queries, as it is pretty easy to make very expensive queries that don't return many rows. moving away from `max_returned_rows` will require some thinking about: 1. memory usage and data flows to handle potentially very large result sets 2. how to avoid rendering tens or hundreds of thousands of [html rows](#1655). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
1259718517 | https://github.com/simonw/datasette/issues/526#issuecomment-1259718517 | https://api.github.com/repos/simonw/datasette/issues/526 | IC_kwDOBm6k_c5LFcd1 | fgregg 536941 | 2022-09-27T16:02:51Z | 2022-09-27T16:04:46Z | CONTRIBUTOR | i think that `max_returned_rows` **is** a defense mechanism, just not for connection exhaustion. `max_returned_rows` is a defense mechanism against **memory bombs**. if you are potentially yielding out hundreds of thousands or even millions of rows, you need to be quite careful about data flow to not run out of memory on the server, or on the client. you have a lot of places in your code that are protective of that right now, but `max_returned_rows` acts as the final backstop. so, given that, it makes sense to have removing `max_returned_rows` altogether be a non-goal, but instead allow specific codepaths (like streaming csv's) to bypass it. that could dramatically lower the surface area for a memory-bomb attack. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Stream all results for arbitrary SQL and canned queries 459882902 | |
509618339 | https://github.com/simonw/datasette/pull/554#issuecomment-509618339 | https://api.github.com/repos/simonw/datasette/issues/554 | MDEyOklzc3VlQ29tbWVudDUwOTYxODMzOQ== | abdusco 3243482 | 2019-07-09T12:16:32Z | 2019-07-09T12:16:32Z | CONTRIBUTOR | I've also added another fix for using static mounts with absolute paths on Windows. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Fix static mounts using relative paths and prevent traversal exploits 465728430 | |
509629331 | https://github.com/simonw/datasette/pull/554#issuecomment-509629331 | https://api.github.com/repos/simonw/datasette/issues/554 | MDEyOklzc3VlQ29tbWVudDUwOTYyOTMzMQ== | abdusco 3243482 | 2019-07-09T12:51:35Z | 2019-07-09T12:51:35Z | CONTRIBUTOR | I wanted to add a test for it too, but I've realized it's impossible to test a server process as we cannot get its exit code. ```python # tests/test_cli.py def test_static_mounts_on_windows(): if sys.platform != "win32": return runner = CliRunner() result = runner.invoke( cli, ["serve", "--static", r"s:C:\\"] ) assert result.exit_code == 0 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Fix static mounts using relative paths and prevent traversal exploits 465728430 | |
1303660293 | https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293 | https://api.github.com/repos/simonw/sqlite-utils/issues/50 | IC_kwDOCGYnMM5NtEcF | chapmanjacobd 7908073 | 2022-11-04T14:38:36Z | 2022-11-04T14:38:36Z | CONTRIBUTOR | where did you see the limit as 999? I believe the limit has been 32766 for quite some time. If you could detect which one this could speed up batch insert of some types of data significantly | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | "Too many SQL variables" on large inserts 473083260 | |
1420941334 | https://github.com/simonw/datasette/pull/564#issuecomment-1420941334 | https://api.github.com/repos/simonw/datasette/issues/564 | IC_kwDOBm6k_c5UsdgW | psychemedia 82988 | 2023-02-07T15:14:10Z | 2023-02-07T15:14:10Z | CONTRIBUTOR | Is this feature covered by any more recent updates to `datasette`, or via any plugins that you're aware of? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | First proof-of-concept of Datasette Library 473288428 | |
527209840 | https://github.com/simonw/sqlite-utils/pull/56#issuecomment-527209840 | https://api.github.com/repos/simonw/sqlite-utils/issues/56 | MDEyOklzc3VlQ29tbWVudDUyNzIwOTg0MA== | amjith 49260 | 2019-09-02T17:23:21Z | 2019-09-02T17:23:21Z | CONTRIBUTOR | I have updated the other PR with the changes from this one and added tests. I have also changed the escaping from double quotes to brackets. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Escape the table name in populate_fts and search. 487847945 | |
527211047 | https://github.com/simonw/sqlite-utils/pull/57#issuecomment-527211047 | https://api.github.com/repos/simonw/sqlite-utils/issues/57 | MDEyOklzc3VlQ29tbWVudDUyNzIxMTA0Nw== | amjith 49260 | 2019-09-02T17:30:43Z | 2019-09-02T17:30:43Z | CONTRIBUTOR | I have merged the other PR (#56) into this one. I have incorporated your suggestions. Cheers! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add triggers while enabling FTS 487987958 | |
533818697 | https://github.com/simonw/sqlite-utils/issues/61#issuecomment-533818697 | https://api.github.com/repos/simonw/sqlite-utils/issues/61 | MDEyOklzc3VlQ29tbWVudDUzMzgxODY5Nw== | amjith 49260 | 2019-09-21T18:09:01Z | 2019-09-21T18:09:28Z | CONTRIBUTOR | @witeshadow The library version doesn't have helpers around CSV (at least not from what I can see in the code). But here's a snippet that makes it easy to insert from CSV using the library. ``` import csv from sqlite_utils import Database # CSV Reader csv_file = open("filename.csv") # open the csv file. reader = csv.reader(csv_file) # Create a CSV reader headers = next(reader) # First line is the header docs = (dict(zip(headers, row)) for row in reader) # Now you can use the `sqlite_utils` library. db = Database("my_database.db") db["table_name"].insert_all(docs) ``` This snippet is adapted from reading the CLI source code on how it implements the csv option. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | importing CSV to SQLite as library 491219910 | |
559632608 | https://github.com/simonw/datasette/issues/573#issuecomment-559632608 | https://api.github.com/repos/simonw/datasette/issues/573 | MDEyOklzc3VlQ29tbWVudDU1OTYzMjYwOA== | psychemedia 82988 | 2019-11-29T01:43:38Z | 2019-11-29T01:43:38Z | CONTRIBUTOR | In passing, it looks like a start was made on a datasette Jupyter server extension in https://github.com/lucasdurand/jupyter-datasette although the build fails in MyBinder. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Exposing Datasette via Jupyter-server-proxy 492153532 | |
593026413 | https://github.com/simonw/datasette/issues/573#issuecomment-593026413 | https://api.github.com/repos/simonw/datasette/issues/573 | MDEyOklzc3VlQ29tbWVudDU5MzAyNjQxMw== | wragge 127565 | 2020-03-01T01:24:45Z | 2020-03-01T01:24:45Z | CONTRIBUTOR | Did you manage to find an answer to this? I've got a notebook to help people generate datasets on the fly from an API, so it would be cool if they could flick it to Datasette for initial exploration. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Exposing Datasette via Jupyter-server-proxy 492153532 | |
604328163 | https://github.com/simonw/datasette/issues/573#issuecomment-604328163 | https://api.github.com/repos/simonw/datasette/issues/573 | MDEyOklzc3VlQ29tbWVudDYwNDMyODE2Mw== | psychemedia 82988 | 2020-03-26T09:41:30Z | 2020-03-26T09:41:30Z | CONTRIBUTOR | Fixed by @simonw; example here: https://github.com/simonw/jupyterserverproxy-datasette-demo | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Exposing Datasette via Jupyter-server-proxy 492153532 | |
541052329 | https://github.com/simonw/datasette/issues/585#issuecomment-541052329 | https://api.github.com/repos/simonw/datasette/issues/585 | MDEyOklzc3VlQ29tbWVudDU0MTA1MjMyOQ== | rixx 2657547 | 2019-10-11T12:53:51Z | 2019-10-11T12:53:51Z | CONTRIBUTOR | I think this would be good, yeah – currently, databases are explicitly sorted by name in the IndexView, we could just remove that part (and use an `OrderedDict` for consistency, I suppose)? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Databases on index page should display in order they were passed to "datasette serve"? 503217375 | |
541562581 | https://github.com/simonw/datasette/pull/590#issuecomment-541562581 | https://api.github.com/repos/simonw/datasette/issues/590 | MDEyOklzc3VlQ29tbWVudDU0MTU2MjU4MQ== | rixx 2657547 | 2019-10-14T08:57:46Z | 2019-10-14T08:57:46Z | CONTRIBUTOR | Ah, thank you – I saw the need for unit tests but wasn't sure what the best way to add one would be. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Handle spaces in DB names 505818256 | |
541587823 | https://github.com/simonw/datasette/pull/590#issuecomment-541587823 | https://api.github.com/repos/simonw/datasette/issues/590 | MDEyOklzc3VlQ29tbWVudDU0MTU4NzgyMw== | rixx 2657547 | 2019-10-14T09:58:23Z | 2019-10-14T09:58:23Z | CONTRIBUTOR | Added tests. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Handle spaces in DB names 505818256 | |
544008463 | https://github.com/simonw/datasette/pull/601#issuecomment-544008463 | https://api.github.com/repos/simonw/datasette/issues/601 | MDEyOklzc3VlQ29tbWVudDU0NDAwODQ2Mw== | rixx 2657547 | 2019-10-18T23:39:21Z | 2019-10-18T23:39:21Z | CONTRIBUTOR | That looks right, and I completely agree with the intent. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Don't auto-format SQL on page load 509340359 | |
544008944 | https://github.com/simonw/datasette/pull/601#issuecomment-544008944 | https://api.github.com/repos/simonw/datasette/issues/601 | MDEyOklzc3VlQ29tbWVudDU0NDAwODk0NA== | rixx 2657547 | 2019-10-18T23:40:48Z | 2019-10-18T23:40:48Z | CONTRIBUTOR | The only negative impact that comes to mind is that now you have no way to get the read-only query to be formatted nicely, I think, so maybe a second PR adding the formatting functionality even to the read-only page would be good? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Don't auto-format SQL on page load 509340359 | |
544214418 | https://github.com/simonw/datasette/pull/601#issuecomment-544214418 | https://api.github.com/repos/simonw/datasette/issues/601 | MDEyOklzc3VlQ29tbWVudDU0NDIxNDQxOA== | rixx 2657547 | 2019-10-20T02:29:49Z | 2019-10-20T02:29:49Z | CONTRIBUTOR | Submitted in #602! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Don't auto-format SQL on page load 509340359 | |
549246007 | https://github.com/simonw/datasette/pull/602#issuecomment-549246007 | https://api.github.com/repos/simonw/datasette/issues/602 | MDEyOklzc3VlQ29tbWVudDU0OTI0NjAwNw== | rixx 2657547 | 2019-11-04T07:29:33Z | 2019-11-04T07:29:33Z | CONTRIBUTOR | Not sure – I'm always a bit weirded out when elements that I clicked disappear on me. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Offer to format readonly SQL 509535510 | |
552134876 | https://github.com/dogsheep/twitter-to-sqlite/issues/29#issuecomment-552134876 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/29 | MDEyOklzc3VlQ29tbWVudDU1MjEzNDg3Ng== | jacobian 21148 | 2019-11-09T20:33:38Z | 2019-11-09T20:33:38Z | CONTRIBUTOR | ❤️ thanks! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | `import` command fails on empty files 518725064 | |
558687342 | https://github.com/simonw/datasette/issues/639#issuecomment-558687342 | https://api.github.com/repos/simonw/datasette/issues/639 | MDEyOklzc3VlQ29tbWVudDU1ODY4NzM0Mg== | jacobian 21148 | 2019-11-26T15:40:00Z | 2019-11-26T15:40:00Z | CONTRIBUTOR | A bit of background: the reason `heroku git:clone` brings down an empty directory is because `datasette publish heroku` uses the [builds API](https://devcenter.heroku.com/articles/build-and-release-using-the-api), rather than a `git push`, to release the app. I originally did this because it seemed like a lower bar than having a working `git`, but the downside is, as you found out, that tweaking the created app is hard. So there's one option -- change `datasette publish heroku` to use `git push` instead of `heroku builds:create`. @pkoppstein - what you suggested seems like it ought to work (you don't need maintenance mode, though). I'm not sure why it doesn't. You could also look into using the [slugs API](https://devcenter.heroku.com/articles/platform-api-deploying-slugs) to download the slug, change `metadata.json`, re-pack and re-upload the slug. Ultimately though I think @simonw's idea of reading `metadata.json` from an external source might be better (#357). Reading from an alternate URL would be fine, or you could also just stuff the whole `metadata.json` into a Heroku config var, and write a plugin to read it from there. Hope this helps a bit! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | updating metadata.json without recreating the app 527670799 | |
559207224 | https://github.com/simonw/datasette/issues/642#issuecomment-559207224 | https://api.github.com/repos/simonw/datasette/issues/642 | MDEyOklzc3VlQ29tbWVudDU1OTIwNzIyNA== | psychemedia 82988 | 2019-11-27T18:40:57Z | 2019-11-27T18:41:07Z | CONTRIBUTOR | Would cookie cutter approaches also work for creating various flavours of customised templates? I need to try to create a couple of sites for myself to get a feel for what sorts of thing are easily doable, and what cribbable cookie cutter items might be. I'm guessing https://simonwillison.net/2019/Nov/25/niche-museums/ is a good place to start from? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Provide a cookiecutter template for creating new plugins 529429214 | |
565755208 | https://github.com/simonw/datasette/pull/644#issuecomment-565755208 | https://api.github.com/repos/simonw/datasette/issues/644 | MDEyOklzc3VlQ29tbWVudDU2NTc1NTIwOA== | chris48s 6025893 | 2019-12-14T21:33:31Z | 2019-12-14T21:33:31Z | CONTRIBUTOR | Hi @simonw Have you had a chance to look at this at all? I'm going to have a chunk of time free next week so if there is additional work needed on this, that would be a particularly convenient time for me to revisit this. Cheers | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Validate metadata json on startup 530513784 | |
582105810 | https://github.com/simonw/datasette/pull/653#issuecomment-582105810 | https://api.github.com/repos/simonw/datasette/issues/653 | MDEyOklzc3VlQ29tbWVudDU4MjEwNTgxMA== | jaywgraves 418191 | 2020-02-04T20:43:01Z | 2020-02-04T20:43:01Z | CONTRIBUTOR | I *think* the existing code will be OK even if I strip the lines in the middle of a new line delimited string. It's only used for the validation, SQLite handles the `--` just fine and the whole SQL textarea still gets sent once it passes validation. I can add your test case to my branch later this evening though. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | allow leading comments in SQL input field 541331755 | |
582106085 | https://github.com/simonw/datasette/pull/653#issuecomment-582106085 | https://api.github.com/repos/simonw/datasette/issues/653 | MDEyOklzc3VlQ29tbWVudDU4MjEwNjA4NQ== | jaywgraves 418191 | 2020-02-04T20:43:43Z | 2020-02-04T20:43:43Z | CONTRIBUTOR | but this also doesn't have to land at all if it doesn't match your use case. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | allow leading comments in SQL input field 541331755 | |
573388052 | https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573388052 | https://api.github.com/repos/simonw/sqlite-utils/issues/74 | MDEyOklzc3VlQ29tbWVudDU3MzM4ODA1Mg== | jayvdb 15092 | 2020-01-12T06:51:30Z | 2020-01-12T06:51:30Z | CONTRIBUTOR | Thanks. That showed me that there was a click cli runner error, and setting `export LANG=en_US.UTF-8` fixed it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column 546073980 | |
573389669 | https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573389669 | https://api.github.com/repos/simonw/sqlite-utils/issues/74 | MDEyOklzc3VlQ29tbWVudDU3MzM4OTY2OQ== | jayvdb 15092 | 2020-01-12T07:21:17Z | 2020-01-12T07:21:17Z | CONTRIBUTOR | I guess there is some extra flag for ` CliRunner.invoke` to check exitcode and raise the exception, or that should be an extra assert added. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column 546073980 | |
576293773 | https://github.com/simonw/datasette/issues/656#issuecomment-576293773 | https://api.github.com/repos/simonw/datasette/issues/656 | MDEyOklzc3VlQ29tbWVudDU3NjI5Mzc3Mw== | JBPressac 6371750 | 2020-01-20T14:17:11Z | 2020-01-20T14:17:11Z | CONTRIBUTOR | It seems that headers and definitions simply have to be filled in as an HTML table in the description field of metadata.json. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Display of the column definitions 546961357 | |
1012158895 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012158895 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM48VFGv | eyeseast 25778 | 2022-01-13T13:55:59Z | 2022-01-13T13:55:59Z | CONTRIBUTOR | Came here to add this. I might pick it up. Would also add a utility to create (and update and delete?) a spatial index. It's not much code but I have to look it up every time. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
1012230212 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012230212 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM48VWhE | eyeseast 25778 | 2022-01-13T15:15:13Z | 2022-01-13T15:15:13Z | CONTRIBUTOR | Some proposals I'd add to sqlite-utils: Some version of this, from [geojson-to-sqlite](https://github.com/simonw/geojson-to-sqlite/blob/main/geojson_to_sqlite/utils.py#L124-L130): ```python def init_spatialite(db, lib): db.conn.enable_load_extension(True) db.conn.load_extension(lib) # Initialize SpatiaLite if not yet initialized if "spatial_ref_sys" in db.table_names(): return db.conn.execute("select InitSpatialMetadata(1)") ``` Also a function for creating a spatial index: ```python db.conn.execute("select CreateSpatialIndex(?, ?)", [table, "geometry"]) ``` I don't know the nuances of updating a spatial index, or checking if one already exists. This could be a CLI method like: ```sh sqlite-utils spatial-index spatial.db table-name column-name ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
1012253198 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012253198 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM48VcIO | eyeseast 25778 | 2022-01-13T15:39:14Z | 2022-01-13T15:39:14Z | CONTRIBUTOR | Other thing: If there get to be enough utils, I think it's worth moving all the spatialite stuff into its own file (`gis.py` or something) just so it's easier to find later. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
1012413729 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012413729 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM48WDUh | eyeseast 25778 | 2022-01-13T18:50:00Z | 2022-01-13T18:50:00Z | CONTRIBUTOR | One more thing I'm going to add: A method to add a geometry column, which I'll need to do to create a spatial index on a table. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
1013698557 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1013698557 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM48a8_9 | eyeseast 25778 | 2022-01-15T15:15:22Z | 2022-01-15T15:15:22Z | CONTRIBUTOR | @simonw I have a PR here https://github.com/simonw/sqlite-utils/pull/385 that adds Spatialite helpers on the Python side. Please let me know how it looks. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
1029317527 | https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029317527 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | IC_kwDOCGYnMM49WiOX | eyeseast 25778 | 2022-02-03T19:18:02Z | 2022-02-03T19:18:02Z | CONTRIBUTOR | Taking part of the conversation from #385 here. > Would sqlite-utils add-geometry-column ... be a good CLI enhancement. for example? Yes. And also `sqlite-utils create-spatial-index` would be great to have. My plan would be to add those once the Python API is settled. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Helper methods for working with SpatiaLite 557842245 | |
590022164 | https://github.com/simonw/datasette/pull/666#issuecomment-590022164 | https://api.github.com/repos/simonw/datasette/issues/666 | MDEyOklzc3VlQ29tbWVudDU5MDAyMjE2NA== | kevindkeogh 13896256 | 2020-02-23T03:26:00Z | 2020-02-23T03:26:00Z | CONTRIBUTOR | It was very helpful for me, using it for a 15M row table. Added a test, happy to amend though! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Use inspect-file, if possible, for total row count 562085508 | |
643709037 | https://github.com/simonw/datasette/issues/691#issuecomment-643709037 | https://api.github.com/repos/simonw/datasette/issues/691 | MDEyOklzc3VlQ29tbWVudDY0MzcwOTAzNw== | amjith 49260 | 2020-06-14T02:35:16Z | 2020-06-14T02:35:16Z | CONTRIBUTOR | The server should reload in the `config_dir` mode. Ref: #848 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | --reload should reload server if code in --plugins-dir changes 574021194 | |
604225034 | https://github.com/simonw/datasette/issues/712#issuecomment-604225034 | https://api.github.com/repos/simonw/datasette/issues/712 | MDEyOklzc3VlQ29tbWVudDYwNDIyNTAzNA== | wragge 127565 | 2020-03-26T04:40:08Z | 2020-03-26T04:40:08Z | CONTRIBUTOR | Great! Yes, can confirm that this works on Binder. However, when I try to run the same code locally, I get an Internal Server Error when I try to access Datasette. ``` ERROR: Exception in ASGI application Traceback (most recent call last): File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi result = await app(self.scope, self.receive, self.send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__ return await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette_debug_asgi.py", line 24, in wrapped_app await app(scope, recieve, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/utils/asgi.py", line 174, in __call__ await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/tracer.py", line 75, in __call__ await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/app.py", line 746, in __call__ raw_path = dict(scope["headers"])[path_from_header.encode("utf8")].split(b"?")[0] KeyError: b'x-original-uri' INFO: 127.0.0.1:49320 - "GET / HTTP/1.1" 500 Internal Server Error ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | base_url doesn't entirely work for running Datasette inside Binder 588108428 | |
604249402 | https://github.com/simonw/datasette/issues/712#issuecomment-604249402 | https://api.github.com/repos/simonw/datasette/issues/712 | MDEyOklzc3VlQ29tbWVudDYwNDI0OTQwMg== | wragge 127565 | 2020-03-26T06:11:44Z | 2020-03-26T06:11:44Z | CONTRIBUTOR | Following on from @betatim's suggestion on Twitter, I've changed the proxy url to include 'absolute'. ``` python proxy_url = f'{base_url}proxy/absolute/8001/' ``` This works both on Binder and locally, without using the `path_from_header` option. I've updated the demo repository. Sorry @simonw if I've led you down the wrong path! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | base_url doesn't entirely work for running Datasette inside Binder 588108428 | |
934372104 | https://github.com/dogsheep/dogsheep-photos/issues/3#issuecomment-934372104 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/3 | IC_kwDOD079W843sWMI | RhetTbull 41546558 | 2021-10-05T12:38:24Z | 2021-10-05T12:38:24Z | CONTRIBUTOR | As dogsheep-photos already uses [osxphotos](https://github.com/RhetTbull/osxphotos) to load photos you can access the EXIF data via osxphotos. Apple Photos imports a small subset of EXIF data at the time the photo is imported and osxphotos provides this via the [exif_info](https://github.com/RhetTbull/osxphotos#exifinfo) property. If you want the full EXIF data, osxphotos also provides a wrapper around [exiftool](https://github.com/RhetTbull/osxphotos#exiftool). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import EXIF data into SQLite - lens used, ISO, aperture etc 602533481 | |
623463200 | https://github.com/simonw/datasette/pull/730#issuecomment-623463200 | https://api.github.com/repos/simonw/datasette/issues/730 | MDEyOklzc3VlQ29tbWVudDYyMzQ2MzIwMA== | dependabot-preview[bot] 27856297 | 2020-05-04T13:27:22Z | 2020-05-04T13:27:22Z | CONTRIBUTOR | Superseded by #753. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Update pytest-asyncio requirement from ~=0.10.0 to >=0.10,<0.12 604001627 | |
618126449 | https://github.com/simonw/datasette/issues/731#issuecomment-618126449 | https://api.github.com/repos/simonw/datasette/issues/731 | MDEyOklzc3VlQ29tbWVudDYxODEyNjQ0OQ== | eyeseast 25778 | 2020-04-23T01:38:55Z | 2020-04-23T01:38:55Z | CONTRIBUTOR | I've almost suggested this same thing a couple times. I tend to have a Makefile (because I'm doing other `make` stuff anyway to get data prepped), and I end up putting all those CLI options in something like `make run`. But it would be way easier to just have all those typical options -- plugins, templates, metadata -- be defaults. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Option to automatically configure based on directory layout 605110015 | |
618758326 | https://github.com/simonw/datasette/issues/731#issuecomment-618758326 | https://api.github.com/repos/simonw/datasette/issues/731 | MDEyOklzc3VlQ29tbWVudDYxODc1ODMyNg== | eyeseast 25778 | 2020-04-24T01:55:00Z | 2020-04-24T01:55:00Z | CONTRIBUTOR | Mounting `./static` at `/static` seems the simplest way. Saves you the trouble of deciding what else (`img` for example) gets special treatment. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Option to automatically configure based on directory layout 605110015 | |
1125342229 | https://github.com/simonw/datasette/issues/741#issuecomment-1125342229 | https://api.github.com/repos/simonw/datasette/issues/741 | IC_kwDOBm6k_c5DE1wV | eyeseast 25778 | 2022-05-12T19:21:16Z | 2022-05-12T19:21:16Z | CONTRIBUTOR | Came here to check if this had been flagged already. Was helping a colleague get something on Cloud Run and had to dig to find `--extra-options="--setting sql_time_limit_ms 2500"`. If I get some time next week, maybe I'll try to tackle it. Would definitely make things easier to be able to do something like this: ```sh datasette publish cloudrun something.db --setting sql_time_limit_ms 2500 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Replace "datasette publish --extra-options" with "--setting" 607223136 | |
622599528 | https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622599528 | https://api.github.com/repos/simonw/sqlite-utils/issues/103 | MDEyOklzc3VlQ29tbWVudDYyMjU5OTUyOA== | b0b5h4rp13 32605365 | 2020-05-01T22:49:12Z | 2020-05-02T11:15:44Z | CONTRIBUTOR | With SQLITE_MAX_VARS = 999, or even 899, This hits the problem with the batch rows causing a overflow (works fine if SQLITE_MAX_VARS = 799). p.s. I have tried a few list of dicts to sqlite modules and this was the easiest to use/understand ------------- file begins ------------------ import sqlite_utils as su data = [ {'tickerId': 913324382, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'CONSTELLATION B', 'symbol': 'STZ B', 'disSymbol': 'STZ-B', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '163.13', 'change': '6.46', 'changeRatio': '0.0412', 'marketValue': '31180699895.63', 'volume': '417', 'turnoverRate': '0.0000'}, {'tickerId': 913323791, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Molina Health', 'symbol': 'MOH', 'disSymbol': 'MOH', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '173.25', 'change': '9.28', 'changeRatio': '0.0566', 'pPrice': '173.25', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '10520341695.50', 'volume': '1281557', 'turnoverRate': '0.0202'}, {'tickerId': 913257501, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Seattle Genetics', 'symbol': 'SGEN', 'disSymbol': 'SGEN', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '145.64', 'change': '8.41', 'changeRatio': '0.0613', 'pPrice': '146.45', 'pChange': '0.8100', 'pChRatio': '0.0056', 'marketValue': 
'25117961347.60', 'volume': '2791411', 'turnoverRate': '0.0162'}, {'tickerId': 925381971, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bandwidth', 'symbol': 'BAND', 'disSymbol': 'BAND', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', '… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns 610517472 | |
748436779 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436779 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDc0ODQzNjc3OQ== | RhetTbull 41546558 | 2020-12-19T07:49:00Z | 2020-12-19T07:49:00Z | CONTRIBUTOR | @nickvazz ZGENERICASSET changed to ZASSET in Big Sur. Here's a list of other changes to the schema in Big Sur: https://github.com/RhetTbull/osxphotos/wiki/Changes-in-Photos-6---Big-Sur | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
748562288 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748562288 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDc0ODU2MjI4OA== | RhetTbull 41546558 | 2020-12-20T04:44:22Z | 2020-12-20T04:44:22Z | CONTRIBUTOR | @nickvazz @simonw I opened a [PR](https://github.com/dogsheep/dogsheep-photos/pull/31) that replaces the SQL for `ZCOMPUTEDASSETATTRIBUTES` to use osxphotos which now exposes all this data and has been updated for Big Sur. I did regression tests to confirm the extracted data is identical, with one exception which should not affect operation: the old code pulled data from `ZCOMPUTEDASSETATTRIBUTES` for missing photos while the main loop ignores missing photos and does not add them to `apple_photos`. The new code does not add rows to the `apple_photos_scores` table for missing photos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
623845014 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623845014 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg0NTAxNA== | RhetTbull 41546558 | 2020-05-05T03:55:14Z | 2020-05-05T03:56:24Z | CONTRIBUTOR | I'm traveling w/o access to my Mac so can't help with any code right now. I suspected ZSCENEIDENTIFIER was a foreign key into one of these psi.sqlite tables. But looks like you're on to something connecting groups to assets. As for the UUID, I think there are two ints because each is 64 bits but UUIDs are 128 bits. Thus they need to be combined to get the 128-bit UUID. You might be able to use Apple's [NSUUID](https://developer.apple.com/documentation/foundation/nsuuid?language=objc), for example, by wrapping with PyObjC. Here's one [example](https://github.com/ronaldoussoren/pyobjc/blob/881c82a7ba90f193934b52b44143360c80dce5e5/pyobjc-framework-Cocoa/PyObjCTest/test_nsuuid.py) of using this in PyObjC's test suite. Interesting it's stored this way instead of a UUIDString as in Photos.sqlite. Perhaps it's for faster indexing. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
624284539 | https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624284539 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 | MDEyOklzc3VlQ29tbWVudDYyNDI4NDUzOQ== | RhetTbull 41546558 | 2020-05-05T20:20:05Z | 2020-05-05T20:20:05Z | CONTRIBUTOR | FYI, I've got an [issue](https://github.com/RhetTbull/osxphotos/issues/25) to make osxphotos cross-platform but it's low on my priority list. About 90% of the functionality could be done cross-platform but right now the MacOS specific stuff is embedded throughout and would take some work. Though I try to minimize it, there's sprinklings of ObjC & Applescript throughout osxphotos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Only install osxphotos if running on macOS 612860531 | |
626390317 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626390317 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5MDMxNw== | RhetTbull 41546558 | 2020-05-10T21:11:24Z | 2020-05-10T21:50:58Z | CONTRIBUTOR | Ugh....Yeah, I think easiest is to catch the exception and return no place as you suggest. This particular bit of code involves un-archiving a serialized NSKeyedArchiver which uses an object table and it is certainly possible to create a circular reference that way. Because this is happening in the decode, the circular reference must be in the original data. Does Photos show valid reverse geolocation info for the photo in question? If so, Photos may be doing something beyond a simple decode of the binary plist. For now, I'll push a patch to catch the exception. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626395507 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395507 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTUwNw== | RhetTbull 41546558 | 2020-05-10T21:54:45Z | 2020-05-10T21:54:45Z | CONTRIBUTOR | @simonw does Photos show valid reverse geolocation info? Are you sure you're using [bpylist2](https://github.com/xa4a/bpylist2) and not bpylist? They're both unfortunately imported as "bpylist" so if you somehow got the wrong (original bpylist) version installed, it could be the issue. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626395641 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395641 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTY0MQ== | RhetTbull 41546558 | 2020-05-10T21:55:54Z | 2020-05-10T21:55:54Z | CONTRIBUTOR | Did removing old bpylist solve the original problem or do you still have a photo that throws circular reference? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626396379 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626396379 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NjM3OQ== | RhetTbull 41546558 | 2020-05-10T22:01:48Z | 2020-05-10T22:01:48Z | CONTRIBUTOR | Frustrates me when package authors create a "drop in" replacement with the same import name...this kind of thing has bitten me more than once! Would've been nicer I think for bpylist2 to do "import bpylist2 as bpylist" | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626667235 | https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626667235 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | MDEyOklzc3VlQ29tbWVudDYyNjY2NzIzNQ== | RhetTbull 41546558 | 2020-05-11T12:20:34Z | 2020-05-11T12:20:34Z | CONTRIBUTOR | @simonw FYI, osxphotos includes a built in ExifTool class that uses [exiftool](https://exiftool.org/) to read and write exif data. It's not exposed yet in the docs because I really only use it right now in the osxphotos command line interface to write tags when exporting. In v0.28.16 (just pushed) I added an ExifTool.as_dict() method which will give you a dict with all the exif tags in a file. For example: ```python import osxphotos photos = osxphotos.PhotosDB().photos() exiftool = osxphotos.exiftool.ExifTool(photos[0].path) exifdata = exiftool.as_dict() tags = exifdata["IPTC:Keywords"] ``` Not as elegant perhaps as a python only implementation because ExifTool has to make subprocess calls to an external tool but exiftool is by far the best tool available for reading and writing EXIF data and it does support HEIC. As for implementation, ExifTool uses a singleton pattern so the first time you instantiate it, it spawns an IPC to exiftool but then keeps it open and uses the same process for any subsequent calls (even on different files). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Try out ExifReader 615626118 | |
627007458 | https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-627007458 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | MDEyOklzc3VlQ29tbWVudDYyNzAwNzQ1OA== | RhetTbull 41546558 | 2020-05-11T22:51:52Z | 2020-05-11T22:52:26Z | CONTRIBUTOR | I'm not familiar with `ExifReader`. I wrote my own wrapper around `exiftool` because I wanted a simple way to write EXIF data when exporting photos (e.g. writing out to PersonInImage and keywords to IPTC:Keywords) and the existing python packages like [pyexiftool](https://github.com/smarnach/pyexiftool) didn't do quite what I wanted. If all you're after is the camera and shot info, that's available in `ZEXTENDEDATTRIBUTES` table. I've got an open issue [#11](https://github.com/RhetTbull/osxphotos/issues/11) to add this to osxphotos but it hasn't bubbled to the top of my backlog yet. osxphotos will give you the location info: `PhotoInfo.location` returns a tuple of (lat, lon) though this info is in ZEXTENDEDATTRIBUTES too (though it might not be correct as I believe Photos creates this table at import and the user might have changed the location of a photo, e.g. if camera didn't have GPS). ```sql CREATE TABLE ZEXTENDEDATTRIBUTES ( Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER, Z_OPT INTEGER, ZFLASHFIRED INTEGER, ZISO INTEGER, ZMETERINGMODE INTEGER, ZSAMPLERATE INTEGER, ZTRACKFORMAT INTEGER, ZWHITEBALANCE INTEGER, ZASSET INTEGER, ZAPERTURE FLOAT, ZBITRATE FLOAT, ZDURATION FLOAT, ZEXPOSUREBIAS FLOAT, ZFOCALLENGTH FLOAT, ZFPS FLOAT, ZLATITUDE FLOAT, ZLONGITUDE FLOAT, ZSHUTTERSPEED FLOAT, ZCAMERAMAKE VARCHAR, ZCAMERAMODEL VARCHAR, ZCODEC VARCHAR, ZLENSMODEL VARCHAR ); ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Try out ExifReader 615626118 | |
628405453 | https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-628405453 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | MDEyOklzc3VlQ29tbWVudDYyODQwNTQ1Mw== | RhetTbull 41546558 | 2020-05-14T05:59:53Z | 2020-05-14T05:59:53Z | CONTRIBUTOR | I've added support for the above exif data to [v0.28.17](https://github.com/RhetTbull/osxphotos/releases/tag/v0.28.17) of osxphotos. `PhotoInfo.exif_info` will return an `ExifInfo` [dataclass](https://docs.python.org/3/library/dataclasses.html) object with the following properties: ```python flash_fired: bool iso: int metering_mode: int sample_rate: int track_format: int white_balance: int aperture: float bit_rate: float duration: float exposure_bias: float focal_length: float fps: float latitude: float longitude: float shutter_speed: float camera_make: str camera_model: str codec: str lens_model: str ``` It's not all the EXIF data available in most files but is the data Photos deems important to save. Of course, you can get all the exif_data Note: this only works in Photos 5. As best as I can tell, EXIF data is not stored in the database for earlier versions. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Try out ExifReader 615626118 | |
791509910 | https://github.com/simonw/datasette/issues/766#issuecomment-791509910 | https://api.github.com/repos/simonw/datasette/issues/766 | MDEyOklzc3VlQ29tbWVudDc5MTUwOTkxMA== | JBPressac 6371750 | 2021-03-05T15:57:35Z | 2021-03-05T16:35:21Z | CONTRIBUTOR | Hello, I have the same wildcards search problems with an instance of Datasette. http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwerz&_sort=rowid is OK but http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwe* is not (FTS is activated on "Reference" "IntituleAnalyse" "NomDuProducteur" "PresentationDuContenu" "Notes"). Notice that a SQL query as below launched directly from SQLite in the server's shell, retrieves results. `select * from fonds_auguste_dupouy_1872_1967_fts where IntituleAnalyse MATCH "gwe*";` Thanks, | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Enable wildcard-searches by default 617323873 | |
632555800 | https://github.com/simonw/datasette/issues/767#issuecomment-632555800 | https://api.github.com/repos/simonw/datasette/issues/767 | MDEyOklzc3VlQ29tbWVudDYzMjU1NTgwMA== | rixx 2657547 | 2020-05-22T08:00:23Z | 2020-05-22T08:00:23Z | CONTRIBUTOR | That would be perfect! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Allow to specify a URL fragment for canned queries 620969465 | |
817414881 | https://github.com/simonw/datasette/issues/830#issuecomment-817414881 | https://api.github.com/repos/simonw/datasette/issues/830 | MDEyOklzc3VlQ29tbWVudDgxNzQxNDg4MQ== | mroswell 192568 | 2021-04-12T01:06:34Z | 2021-04-12T01:07:27Z | CONTRIBUTOR | Related: #1285, including arguments for natural breaks, equal interval, etc. modeled after choropleth map legends. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Redesign register_facet_classes plugin hook 636511683 | |
716123598 | https://github.com/simonw/datasette/issues/838#issuecomment-716123598 | https://api.github.com/repos/simonw/datasette/issues/838 | MDEyOklzc3VlQ29tbWVudDcxNjEyMzU5OA== | psychemedia 82988 | 2020-10-25T10:20:12Z | 2020-10-25T10:53:24Z | CONTRIBUTOR | I'm trying to [run something behind a MyBinder proxy](https://github.com/ouseful-testing/nbsearch), but seem to have something set up incorrectly and not sure what the fix is? I'm starting datasette with jupyter-server-proxy setup: ``` # __init__.py def setup_nbsearch(): return { "command": [ "datasette", "serve", f"{_NBSEARCH_DB_PATH}", "-p", "{port}", "--config", "base_url:{base_url}nbsearch/" ], "absolute_url": True, # The following needs a the labextension installing. # eg in postBuild: jupyter labextension install jupyterlab-server-proxy "launcher_entry": { "enabled": True, "title": "nbsearch", }, } ``` where the `base_url` gets automatically populated by the server-proxy. I define the loaders as: ``` # __init__.py from datasette import hookimpl @hookimpl def extra_css_urls(database, table, columns, view_name, datasette): return [ "/-/static-plugins/nbsearch/prism.css", "/-/static-plugins/nbsearch/nbsearch.css", ] ``` but these seem to also need a base_url prefix set somehow? Currently, the generated HTML loads properly but internal links are incorrect; eg they take the form `<link rel="stylesheet" href="/-/static-plugins/nbsearch/prism.css">` which resolves to eg `https://notebooks.gesis.org/hub/-/static-plugins/nbsearch/prism.css` rather than required URL of form `https://notebooks.gesis.org/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static-plugins/nbsearch/prism.css`. 
The main css is loaded correctly: `<link rel="stylesheet" href="/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static/app.css?404439">` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Incorrect URLs when served behind a proxy with base_url set 637395097 | |
720354227 | https://github.com/simonw/datasette/issues/838#issuecomment-720354227 | https://api.github.com/repos/simonw/datasette/issues/838 | MDEyOklzc3VlQ29tbWVudDcyMDM1NDIyNw== | psychemedia 82988 | 2020-11-02T09:33:58Z | 2020-11-02T09:33:58Z | CONTRIBUTOR | Thanks; just a note that the `datasette.urls.static(path)` and `datasette.urls.static_plugins(plugin_name, path)` items both seem to be repeated and appear in the docs twice? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Incorrect URLs when served behind a proxy with base_url set 637395097 | |
645293374 | https://github.com/simonw/datasette/issues/851#issuecomment-645293374 | https://api.github.com/repos/simonw/datasette/issues/851 | MDEyOklzc3VlQ29tbWVudDY0NTI5MzM3NA== | abdusco 3243482 | 2020-06-17T10:32:02Z | 2020-06-17T10:32:28Z | CONTRIBUTOR | Welp, I'm an idiot. Turns out I had a sneaky comma `,` after `sql` key: ``` ... (:name, :url), ``` which tells sqlite to expect another `values(...)` list. Correcting the SQL solved the issue. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Having trouble getting writable canned queries to work 640330278 | |
647135713 | https://github.com/simonw/datasette/issues/859#issuecomment-647135713 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzEzNTcxMw== | abdusco 3243482 | 2020-06-21T14:30:02Z | 2020-06-21T14:30:02Z | CONTRIBUTOR | Oops, the same method is called from both index and database pages. But removing select count queries speeds up the page load quite a bit. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647194131 | https://github.com/simonw/datasette/issues/859#issuecomment-647194131 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzE5NDEzMQ== | abdusco 3243482 | 2020-06-21T23:15:54Z | 2020-06-21T23:26:09Z | CONTRIBUTOR | I'm not sure if table counts are to blame. There shouldn't be a ~3 orders of magnitude difference. ```fish user@klein /a/w/scrapyard (master)> set sql "select count(*) from table_1; select count(*) from table_2; select count(*) from table_3;" user@klein /a/w/scrapyard (master)> time sqlite3 scrapyard.db "$sql" 187489 46492 2229 ________________________________________________________ Executed in 25.57 millis fish external usr time 3.55 millis 0.00 micros 3.55 millis sys time 22.42 millis 1123.00 micros 21.30 millis ``` but not letting datasette count the tables definitely helps. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647922203 | https://github.com/simonw/datasette/issues/859#issuecomment-647922203 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyMjIwMw== | abdusco 3243482 | 2020-06-23T05:44:58Z | 2021-01-05T08:22:43Z | CONTRIBUTOR | I'm seeing the problem on database page. Index page and table page runs quite fast. - Tables have <10 columns (`id`, `url`, `title`, `body_html`, `date`, `author`, `meta` (for keeping unstructured json)). I've added index on `date` columns (using `sqlite-utils`) in addition to the index present on `id` columns. - All tables have FTS enabled on `text` and `varchar` columns (`title`, `body_html` etc) to speed up searching. - There are couple of tables related with foreign keys (think a thread in a forum and posts in that thread, related with `thread_id`) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647923666 | https://github.com/simonw/datasette/issues/859#issuecomment-647923666 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyMzY2Ng== | abdusco 3243482 | 2020-06-23T05:49:31Z | 2020-06-23T05:49:31Z | CONTRIBUTOR | I think I should mention that having FTS on all tables means I have 5 visible and 25 hidden (FTS) tables displayed on the database page. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647925594 | https://github.com/simonw/datasette/issues/859#issuecomment-647925594 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyNTU5NA== | abdusco 3243482 | 2020-06-23T05:55:21Z | 2020-06-23T06:28:29Z | CONTRIBUTOR | Hmm, not seeing the problem now. I've removed the commented out sections in `database.py` and restarted the process. Database page now loads in <250ms. I have couple of workers that check some pages regularly and scrape new content and save to the DB. Could it be that datasette tries to recount tables every time database size changes? Normally it keeps a count cache, but as DB gets updated so often (new content every 5 min or so) it's practically recounting every time I go to the database page? EDIT: It turns out it doesn't hold cache with mutable databases. I'll update the issue with more findings and a better way to reproduce the problem if I encounter it again. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647935300 | https://github.com/simonw/datasette/issues/859#issuecomment-647935300 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkzNTMwMA== | abdusco 3243482 | 2020-06-23T06:23:01Z | 2020-06-23T06:23:01Z | CONTRIBUTOR | > You said "200k+, 50+ rows in a couple of tables" - does that mean 50+ columns? I'll try with larger numbers of columns and see what difference that makes. Ah that was a typo, I meant 50k. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647936117 | https://github.com/simonw/datasette/issues/859#issuecomment-647936117 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkzNjExNw== | abdusco 3243482 | 2020-06-23T06:25:17Z | 2020-06-23T06:25:17Z | CONTRIBUTOR | > > > ``` > sqlite-generate many-cols.db --tables 2 --rows 200000 --columns 50 > ``` > > Looks like that will take 35 minutes to run (it's not a particularly fast tool). Try chunking write operations into batches every 1000 records or so. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
648232645 | https://github.com/simonw/datasette/issues/859#issuecomment-648232645 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0ODIzMjY0NQ== | abdusco 3243482 | 2020-06-23T15:19:53Z | 2020-06-23T15:19:53Z | CONTRIBUTOR | The issue seems to appear sporadically, like when I return to the database page after a while, during which some records have been added to the database. I've just visited the database page: the first visit took ~10s, consecutive visits took 0.3s. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
648669523 | https://github.com/simonw/datasette/issues/859#issuecomment-648669523 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0ODY2OTUyMw== | abdusco 3243482 | 2020-06-24T08:13:23Z | 2020-06-24T10:30:36Z | CONTRIBUTOR | I tried setting `cache_size_kb=0` then `cache_size_kb=100000`, still getting this behavior. I even changed `Database::table_counts` and lowered time limit to 1 ```py table_count = ( await self.execute( "select count(*) from [{}]".format(table), custom_time_limit=1, ) ).rows[0][0] counts[table] = table_count ``` I feel like 10 seconds is a magic number, like a processing timeout and datasette gives up and returns the page. Index page loads instantly, table page, query page, as well. But when I return to database page after some time, it loads in 10s. EDIT: It's always like 10 + 0.3s, like 10s wait and timeout then 300ms to render the page | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
652160909 | https://github.com/simonw/datasette/issues/859#issuecomment-652160909 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY1MjE2MDkwOQ== | abdusco 3243482 | 2020-07-01T03:09:32Z | 2020-07-01T03:10:21Z | CONTRIBUTOR | I've just realized Datasette tries to count hidden tables too. There are 5 visible tables and 25 hidden tables, whose effect I hadn't considered earlier. I've turned off counting for hidden tables to see if it has any effect. What's the point of counting FTS tables? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
904982056 | https://github.com/simonw/datasette/issues/859#issuecomment-904982056 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c418O4o | brandonrobertz 2670795 | 2021-08-24T21:15:04Z | 2021-08-24T21:15:30Z | CONTRIBUTOR | I'm running into issues with this as well. All other pages seem to work with lots of DBs except the home page, which absolutely tanks. Would be willing to put some work into this, if there's been any kind of progress on concepts on how this ought to work. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
905899177 | https://github.com/simonw/datasette/issues/859#issuecomment-905899177 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c41_uyp | brandonrobertz 2670795 | 2021-08-25T21:48:00Z | 2021-08-25T21:48:00Z | CONTRIBUTOR | Upon first stab, there's two issues here: - DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check. - Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
905904540 | https://github.com/simonw/datasette/issues/859#issuecomment-905904540 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c41_wGc | brandonrobertz 2670795 | 2021-08-25T21:59:14Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | I did two tests: one with 1000 5-30mb DBs and a second with 20 multi gig DBs. For the second, I created them like so: `for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done` This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins. By simply persisting the `_internal` DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the `datasette/internal_db.py:init_internal_db` creates to if not exists, and changed the `_internal` DB instantiation in `datasette/app.py:Datasette.__init__` to a path with `is_mutable=True`.) Super rough, but the pages now load so I can continue testing ideas. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Database page loads too slowly with many large tables (due to table counts) 642572841 | |
652166115 | https://github.com/simonw/datasette/issues/877#issuecomment-652166115 | https://api.github.com/repos/simonw/datasette/issues/877 | MDEyOklzc3VlQ29tbWVudDY1MjE2NjExNQ== | abdusco 3243482 | 2020-07-01T03:28:07Z | 2020-07-01T03:28:07Z | CONTRIBUTOR | Does this mean custom routes get to expose endpoints accepting POST requests? I've tried earlier to add some POST endpoints, but requests were being rejected by Datasette due to CSRF | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Consider dropping explicit CSRF protection entirely? 648421105 | |
652255960 | https://github.com/simonw/datasette/issues/877#issuecomment-652255960 | https://api.github.com/repos/simonw/datasette/issues/877 | MDEyOklzc3VlQ29tbWVudDY1MjI1NTk2MA== | abdusco 3243482 | 2020-07-01T07:52:25Z | 2020-07-01T08:10:00Z | CONTRIBUTOR | I am calling the API from another origin, so injecting CSRF token into templates wouldn't work. EDIT: I'll try the new version, it sounds promising | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Consider dropping explicit CSRF protection entirely? 648421105 | |
652261382 | https://github.com/simonw/datasette/issues/877#issuecomment-652261382 | https://api.github.com/repos/simonw/datasette/issues/877 | MDEyOklzc3VlQ29tbWVudDY1MjI2MTM4Mg== | abdusco 3243482 | 2020-07-01T08:03:17Z | 2020-07-01T08:03:23Z | CONTRIBUTOR | Bearer tokens sound interesting. Where do tokens come from? An auth provider of my choosing? How do they get verified? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Consider dropping explicit CSRF protection entirely? 648421105 | |
652297139 | https://github.com/simonw/datasette/pull/883#issuecomment-652297139 | https://api.github.com/repos/simonw/datasette/issues/883 | MDEyOklzc3VlQ29tbWVudDY1MjI5NzEzOQ== | abdusco 3243482 | 2020-07-01T09:11:29Z | 2020-07-01T09:11:29Z | CONTRIBUTOR | Turns out we should include hidden tables in the result dict, or we're breaking tests. I've committed a refactor https://github.com/simonw/datasette/pull/883/commits/4f06e1bf6fbe4b73be770b87f610bf7c0e6e3ea7 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Skip counting hidden tables 648749062 | |
652394742 | https://github.com/simonw/datasette/pull/883#issuecomment-652394742 | https://api.github.com/repos/simonw/datasette/issues/883 | MDEyOklzc3VlQ29tbWVudDY1MjM5NDc0Mg== | abdusco 3243482 | 2020-07-01T12:41:13Z | 2020-07-01T12:41:13Z | CONTRIBUTOR | Well tests need to be updated. I need to get tests working on Windows. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Skip counting hidden tables 648749062 | |
652990131 | https://github.com/simonw/datasette/issues/889#issuecomment-652990131 | https://api.github.com/repos/simonw/datasette/issues/889 | MDEyOklzc3VlQ29tbWVudDY1Mjk5MDEzMQ== | amjith 49260 | 2020-07-02T12:58:11Z | 2020-07-02T13:00:18Z | CONTRIBUTOR | FWIW, this error does NOT happen in datasette 0.45a4. It only started on 0.45a5 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | asgi_wrapper plugin hook is crashing at startup 649907676 | |
653002499 | https://github.com/simonw/datasette/issues/889#issuecomment-653002499 | https://api.github.com/repos/simonw/datasette/issues/889 | MDEyOklzc3VlQ29tbWVudDY1MzAwMjQ5OQ== | amjith 49260 | 2020-07-02T13:22:13Z | 2020-07-02T13:22:13Z | CONTRIBUTOR | I was able to narrow this down to the fact that lifespan protocol is turned on. I see the workaround you've used here: https://github.com/simonw/datasette-debug-asgi/commit/72d568d32a3159c763ce908c0b269736935c6987 If so, maybe it's time to update some of the asgi_wrapper [plugins](https://datasette.readthedocs.io/en/stable/plugin_hooks.html#asgi-wrapper-datasette). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | asgi_wrapper plugin hook is crashing at startup 649907676 | |
655018966 | https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng== | tsibley 79913 | 2020-07-07T17:41:06Z | 2020-07-07T17:41:06Z | CONTRIBUTOR | Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add insert --truncate option 651844316 | |
655052451 | https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ== | tsibley 79913 | 2020-07-07T18:45:23Z | 2020-07-07T18:45:23Z | CONTRIBUTOR | Ah, I see the problem. The truncate is inside a loop I didn't realize was there. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add insert --truncate option 651844316 | |
655239728 | https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA== | tsibley 79913 | 2020-07-08T02:16:42Z | 2020-07-08T02:16:42Z | CONTRIBUTOR | I fixed my original oops by moving the `DELETE FROM $table` out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next. I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making `Table.insert_all` (or other operations) consistent and robust in the face of errors. For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted): ```diff diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index d6b9ecf..4107ceb 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -1028,6 +1028,11 @@ class Table(Queryable): batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns)) self.last_rowid = None self.last_pk = None + with self.db.conn: + # Explicit BEGIN is necessary because Python's sqlite3 doesn't + # issue implicit BEGINs for DDL, only DML. We mix DDL and DML + # below and might execute DDL first, e.g. for table creation. + self.db.conn.execute("BEGIN") if truncate and self.exists(): self.db.conn.execute("DELETE FROM [{}];".format(self.name)) for chunk in chunks(itertools.chain([first_record], records), batch_size): @@ -1038,7 +1043,11 @@ class Table(Queryable): # Use the first batch to derive the table names column_types = suggest_column_types(chunk) column_types.update(columns or {}) - self.create( + # Not self.create() because that is wrapped in its own + # transaction and Python's sqlite3 doesn't support + # nested transactions. 
+ self.db.create_table( + … | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add insert --truncate option 651844316 |
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
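The schema above, together with the page's filter (`author_association = "CONTRIBUTOR"`, sorted by issue), can be reproduced locally. The following is a minimal sketch using Python's standard `sqlite3` module against an in-memory database. The two CONTRIBUTOR rows reuse ids, user ids, and issue ids shown in the table above; the third row (an `OWNER` comment) is made up purely to show the filter working, and the helper name `contributor_comments` is hypothetical.

```python
import sqlite3

# Schema copied from the page. The referenced [users] and [issues] tables are
# not created here; SQLite accepts this because foreign keys are not enforced
# unless PRAGMA foreign_keys is switched on.
SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT,
   [body] TEXT, [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# (id, user, created_at, author_association, issue)
sample_rows = [
    (647135713, 3243482, "2020-06-21T14:30:02Z", "CONTRIBUTOR", 642572841),
    (645293374, 3243482, "2020-06-17T10:32:02Z", "CONTRIBUTOR", 640330278),
    # Hypothetical row, not from the page: exists only to be filtered out.
    (1, 1, "2020-01-01T00:00:00Z", "OWNER", 1),
]
conn.executemany(
    "INSERT INTO issue_comments ([id], [user], [created_at], "
    "[author_association], [issue]) VALUES (?, ?, ?, ?, ?)",
    sample_rows,
)


def contributor_comments(connection):
    """Return (id, user, issue) for CONTRIBUTOR comments, sorted by issue."""
    return connection.execute(
        "SELECT [id], [user], [issue] FROM issue_comments "
        "WHERE author_association = ? ORDER BY [issue]",
        ("CONTRIBUTOR",),
    ).fetchall()


result = contributor_comments(conn)
for row in result:
    print(row)
```

Sorting on `[issue]` can use the `idx_issue_comments_issue` index defined in the schema, although with a handful of rows the query planner's choice makes no practical difference.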
updated_at (date) >30 ✖