github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/507#issuecomment-541118904 | https://api.github.com/repos/simonw/datasette/issues/507 | 541118904 | MDEyOklzc3VlQ29tbWVudDU0MTExODkwNA== | 2657547 | 2019-10-11T15:48:49Z | 2019-10-11T15:48:49Z | CONTRIBUTOR | Headless Chrome and Firefox via Selenium are a solid choice in my experience. You may be interested in how pretix and pretalx solve this problem: They use pytest to create those screenshots on release to make sure they are up to date. See [this writeup](https://behind.pretix.eu/2018/11/15/automated-screenshots/) and [this repo](https://github.com/pretix/pretix-screenshots). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 455852801 | |
https://github.com/simonw/datasette/issues/511#issuecomment-510730200 | https://api.github.com/repos/simonw/datasette/issues/511 | 510730200 | MDEyOklzc3VlQ29tbWVudDUxMDczMDIwMA== | 3243482 | 2019-07-12T03:23:22Z | 2019-07-12T03:23:22Z | CONTRIBUTOR | @simonw yes it works fine on Windows, but test suite doesn't run properly, for that I had to use WSL | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 456578474 | |
https://github.com/simonw/datasette/issues/512#issuecomment-541119038 | https://api.github.com/repos/simonw/datasette/issues/512 | 541119038 | MDEyOklzc3VlQ29tbWVudDU0MTExOTAzOA== | 2657547 | 2019-10-11T15:49:13Z | 2019-10-11T15:49:13Z | CONTRIBUTOR | How open are you to changing the config variable names (with appropriate deprecation, of course)? `"about_url_text", "license_url_text"` etc might be better suited to convey that these are just meant as basically URL titles. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 457147936 | |
https://github.com/simonw/datasette/issues/514#issuecomment-504662904 | https://api.github.com/repos/simonw/datasette/issues/514 | 504662904 | MDEyOklzc3VlQ29tbWVudDUwNDY2MjkwNA== | 45057 | 2019-06-22T12:45:21Z | 2019-06-22T12:45:39Z | CONTRIBUTOR | On most modern Linux distros, systemd is the easiest answer. Example systemd unit file (save to `/etc/systemd/system/datasette.service`): ``` [Unit] Description=Datasette After=network.target [Service] Type=simple User=<username> WorkingDirectory=/path/to/data ExecStart=/path/to/datasette serve -h 0.0.0.0 ./my.db Restart=on-failure [Install] WantedBy=multi-user.target ``` Activate it with: ```bash $ sudo systemctl daemon-reload $ sudo systemctl enable datasette $ sudo systemctl start datasette ``` Logs are best viewed using `journalctl -u datasette -f`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459397625 | |
https://github.com/simonw/datasette/issues/514#issuecomment-504663766 | https://api.github.com/repos/simonw/datasette/issues/514 | 504663766 | MDEyOklzc3VlQ29tbWVudDUwNDY2Mzc2Ng== | 45057 | 2019-06-22T12:57:59Z | 2019-06-22T12:57:59Z | CONTRIBUTOR | > This example is useful to - I like how it has a Makefile that knows how to set up systemd: https://github.com/pikesley/Queube I wasn't even aware it was possible to add a systemd service at an arbitrary path, but it seems a little messy to me. Maybe worth noting that systemd does support [per-user services](https://wiki.archlinux.org/index.php/Systemd/User) which don't require root access. Cool but probably overkill for most people (especially when you're going to need root to listen on port 80 anyway, directly or via a reverse proxy). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459397625 | |
https://github.com/simonw/datasette/issues/514#issuecomment-504684831 | https://api.github.com/repos/simonw/datasette/issues/514 | 504684831 | MDEyOklzc3VlQ29tbWVudDUwNDY4NDgzMQ== | 45057 | 2019-06-22T17:38:23Z | 2019-06-22T17:38:23Z | CONTRIBUTOR | > > WorkingDirectory=/path/to/data > > @russss, Which directory does this represent? It's the working directory (cwd) of the spawned process. In this case if you set it to the directory your data is in, you can use relative paths to the db (and metadata/templates/etc) in the `ExecStart` command. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459397625 | |
https://github.com/simonw/datasette/issues/514#issuecomment-504690927 | https://api.github.com/repos/simonw/datasette/issues/514 | 504690927 | MDEyOklzc3VlQ29tbWVudDUwNDY5MDkyNw== | 45057 | 2019-06-22T19:06:07Z | 2019-06-22T19:06:07Z | CONTRIBUTOR | I'd rather not turn this into a systemd support thread, but you're trying to execute the package directory there. Your datasette executable is probably at `/home/chris/Env/datasette/bin/datasette`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459397625 | |
https://github.com/simonw/datasette/issues/523#issuecomment-504809397 | https://api.github.com/repos/simonw/datasette/issues/523 | 504809397 | MDEyOklzc3VlQ29tbWVudDUwNDgwOTM5Nw== | 2657547 | 2019-06-24T01:38:14Z | 2019-06-24T01:38:14Z | CONTRIBUTOR | Ah, apologies – I had found and read those issues, but I was under the impression that they refered only to the filtered row count, not the unfiltered total row count. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459627549 | |
https://github.com/simonw/datasette/issues/526#issuecomment-992971072 | https://api.github.com/repos/simonw/datasette/issues/526 | 992971072 | IC_kwDOBm6k_c47L4lA | 536941 | 2021-12-13T22:29:34Z | 2021-12-13T22:29:34Z | CONTRIBUTOR | just came by to open this issue. would make my data analysis in observable a lot better! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-993078038 | https://api.github.com/repos/simonw/datasette/issues/526 | 993078038 | IC_kwDOBm6k_c47MSsW | 536941 | 2021-12-14T01:46:52Z | 2021-12-14T01:46:52Z | CONTRIBUTOR | the nested query idea is very nice, and i stole if for [my client side paginator](https://observablehq.com/d/1d5da3a3c3f2f347#DatasetteClient). However, it won't do the right thing if the original query orders by random(). If you go the nested query route, maybe raise a 4XX status code if the query has such a clause? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1254064260 | https://api.github.com/repos/simonw/datasette/issues/526 | 1254064260 | IC_kwDOBm6k_c5Kv4CE | 536941 | 2022-09-21T18:17:04Z | 2022-09-21T18:18:01Z | CONTRIBUTOR | hi @simonw, this is becoming more of a bother for my [labor data warehouse](https://labordata.bunkum.us/). Is there any research or a spike i could do that would help you investigate this issue? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258167564 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258167564 | IC_kwDOBm6k_c5K_h0M | 536941 | 2022-09-26T14:57:44Z | 2022-09-26T15:08:36Z | CONTRIBUTOR | reading the database execute method i have a few questions. https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242 --- unless i'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protections against running very expensive queries. It's not like adding a `LIMIT max_rows` argument. it make sense that it isn't because, the query could already have an `LIMIT` argument. Doing something like `select * from (query) limit {max_returned_rows}` **might** be protective but wouldn't always. Instead the code executes the full original query, and if still has time it fetches out the first `max_rows + 1` rows. this *does* offer some protection of memory exhaustion, as you won't hydrate a huge result set into python (however, there are [data flow patterns](https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113) that could avoid that too) given the current architecture, i don't see how creating a new connection would be use? --- If we just removed the `max_return_rows` limitation, then i think most things would be fine **except** for the QueryViews. Right now rendering, just [5000 rows takes a lot of client-side memory](https://github.com/simonw/datasette/issues/1655) so some form of pagination would be required. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258337011 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258337011 | IC_kwDOBm6k_c5LALLz | 536941 | 2022-09-26T16:49:48Z | 2022-09-26T16:49:48Z | CONTRIBUTOR | i think the smallest change that gets close to what i want is to change the behavior so that `max_returned_rows` is not applied in the `execute` method when we are are asking for a csv of query. there are some infelicities for that approach, but i'll make a PR to make it easier to discuss. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258849766 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258849766 | IC_kwDOBm6k_c5LCIXm | 536941 | 2022-09-27T01:27:03Z | 2022-09-27T01:27:03Z | CONTRIBUTOR | i agree with that concern! but if i'm understanding the code correctly, `maximum_returned_rows` does not protect against long-running queries in any way. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258871525 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258871525 | IC_kwDOBm6k_c5LCNrl | 536941 | 2022-09-27T02:09:32Z | 2022-09-27T02:14:53Z | CONTRIBUTOR | thanks @simonw, i learned something i didn't know about sqlite's execution model! > Imagine if Datasette CSVs did allow unlimited retrievals. Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever. why wouldn't the `sqlite_timelimit` guard prevent that? --- on my local version which has the code to [turn off truncations for query csv](#1820), `sqlite_timelimit` does protect me. ![Screenshot 2022-09-26 at 22-14-31 Error 500](https://user-images.githubusercontent.com/536941/192415680-94b32b7f-868f-4b89-8194-5752d45f6009.png) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258878311 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258878311 | IC_kwDOBm6k_c5LCPVn | 536941 | 2022-09-27T02:19:48Z | 2022-09-27T02:19:48Z | CONTRIBUTOR | this sql query doesn't trip up `maximum_returned_rows` but does timeout ```sql with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter LIMIT 10 OFFSET 100000000 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1258910228 | https://api.github.com/repos/simonw/datasette/issues/526 | 1258910228 | IC_kwDOBm6k_c5LCXIU | 536941 | 2022-09-27T03:11:07Z | 2022-09-27T03:11:07Z | CONTRIBUTOR | i think this feature would be safe, as its really only the time limit that can, and imo, should protect against long running queries, as it is pretty easy to make very expensive queries that don't return many rows. moving away from `max_returned_rows` will requires some thinking about: 1. memory usage and data flows to handle potentially very large result sets 2. how to avoid rendering tens or hundreds of thousands of [html rows](#1655). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/issues/526#issuecomment-1259718517 | https://api.github.com/repos/simonw/datasette/issues/526 | 1259718517 | IC_kwDOBm6k_c5LFcd1 | 536941 | 2022-09-27T16:02:51Z | 2022-09-27T16:04:46Z | CONTRIBUTOR | i think that `max_returned_rows` **is** a defense mechanism, just not for connection exhaustion. `max_returned_rows` is a defense mechanism against **memory bombs**. if you are potentially yielding out hundreds of thousands or even millions of rows, you need to be quite careful about data flow to not run out of memory on the server, or on the client. you have a lot of places in your code that are protective of that right now, but `max_returned_rows` acts as the final backstop. so, given that, it makes sense to have removing `max_returned_rows` altogether be a non-goal, but instead allow for for specific codepaths (like streaming csv's) be able to bypass. that could dramatically lower the surface area for a memory-bomb attack. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 459882902 | |
https://github.com/simonw/datasette/pull/554#issuecomment-509618339 | https://api.github.com/repos/simonw/datasette/issues/554 | 509618339 | MDEyOklzc3VlQ29tbWVudDUwOTYxODMzOQ== | 3243482 | 2019-07-09T12:16:32Z | 2019-07-09T12:16:32Z | CONTRIBUTOR | I've also added another fix for using static mounts with absolute paths on Windows. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 465728430 | |
https://github.com/simonw/datasette/pull/554#issuecomment-509629331 | https://api.github.com/repos/simonw/datasette/issues/554 | 509629331 | MDEyOklzc3VlQ29tbWVudDUwOTYyOTMzMQ== | 3243482 | 2019-07-09T12:51:35Z | 2019-07-09T12:51:35Z | CONTRIBUTOR | I wanted to add a test for it too, but I've realized it's impossible to test a server process as we cannot get its exit code. ```python # tests/test_cli.py def test_static_mounts_on_windows(): if sys.platform != "win32": return runner = CliRunner() result = runner.invoke( cli, ["serve", "--static", r"s:C:\\"] ) assert result.exit_code == 0 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 465728430 | |
https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293 | https://api.github.com/repos/simonw/sqlite-utils/issues/50 | 1303660293 | IC_kwDOCGYnMM5NtEcF | 7908073 | 2022-11-04T14:38:36Z | 2022-11-04T14:38:36Z | CONTRIBUTOR | where did you see the limit as 999? I believe the limit has been 32766 for quite some time. If you could detect which one this could speed up batch insert of some types of data significantly | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 473083260 | |
https://github.com/simonw/datasette/pull/564#issuecomment-1420941334 | https://api.github.com/repos/simonw/datasette/issues/564 | 1420941334 | IC_kwDOBm6k_c5UsdgW | 82988 | 2023-02-07T15:14:10Z | 2023-02-07T15:14:10Z | CONTRIBUTOR | Is this feature covered by any more recent updates to `datasette`, or via any plugins that you're aware of? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 473288428 | |
https://github.com/simonw/sqlite-utils/pull/56#issuecomment-527209840 | https://api.github.com/repos/simonw/sqlite-utils/issues/56 | 527209840 | MDEyOklzc3VlQ29tbWVudDUyNzIwOTg0MA== | 49260 | 2019-09-02T17:23:21Z | 2019-09-02T17:23:21Z | CONTRIBUTOR | I have updated the other PR with the changes from this one and added tests. I have also changed the escaping from double quotes to brackets. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 487847945 | |
https://github.com/simonw/sqlite-utils/pull/57#issuecomment-527211047 | https://api.github.com/repos/simonw/sqlite-utils/issues/57 | 527211047 | MDEyOklzc3VlQ29tbWVudDUyNzIxMTA0Nw== | 49260 | 2019-09-02T17:30:43Z | 2019-09-02T17:30:43Z | CONTRIBUTOR | I have merged the other PR (#56) into this one. I have incorporated your suggestions. Cheers! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 487987958 | |
https://github.com/simonw/sqlite-utils/issues/61#issuecomment-533818697 | https://api.github.com/repos/simonw/sqlite-utils/issues/61 | 533818697 | MDEyOklzc3VlQ29tbWVudDUzMzgxODY5Nw== | 49260 | 2019-09-21T18:09:01Z | 2019-09-21T18:09:28Z | CONTRIBUTOR | @witeshadow The library version doesn't have helpers around CSV (at least not from what I can see in the code). But here's a snippet that makes it easy to insert from CSV using the library. ``` import csv from sqlite_utils import Database # CSV Reader csv_file = open("filename.csv") # open the csv file. reader = csv.reader(csv_file) # Create a CSV reader headers = next(reader) # First line is the header docs = (dict(zip(headers, row)) for row in reader) # Now you can use the `sqlite_utils` library. db = Database("my_database.db") db["table_name"].insert_all(docs) ``` This snippet is adapted from reading the CLI source code on how it implements the csv option. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 491219910 | |
https://github.com/simonw/datasette/issues/573#issuecomment-559632608 | https://api.github.com/repos/simonw/datasette/issues/573 | 559632608 | MDEyOklzc3VlQ29tbWVudDU1OTYzMjYwOA== | 82988 | 2019-11-29T01:43:38Z | 2019-11-29T01:43:38Z | CONTRIBUTOR | In passing, it looks like a start was made on a datasette Jupyter server extension in https://github.com/lucasdurand/jupyter-datasette although the build fails in MyBinder. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 492153532 | |
https://github.com/simonw/datasette/issues/573#issuecomment-593026413 | https://api.github.com/repos/simonw/datasette/issues/573 | 593026413 | MDEyOklzc3VlQ29tbWVudDU5MzAyNjQxMw== | 127565 | 2020-03-01T01:24:45Z | 2020-03-01T01:24:45Z | CONTRIBUTOR | Did you manage to find an answer to this? I've got a notebook to help people generate datasets on the fly from an API, so it would be cool if they flick it to Datasette for initial exploration. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 492153532 | |
https://github.com/simonw/datasette/issues/573#issuecomment-604328163 | https://api.github.com/repos/simonw/datasette/issues/573 | 604328163 | MDEyOklzc3VlQ29tbWVudDYwNDMyODE2Mw== | 82988 | 2020-03-26T09:41:30Z | 2020-03-26T09:41:30Z | CONTRIBUTOR | Fixed by @simonw; example here: https://github.com/simonw/jupyterserverproxy-datasette-demo | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 492153532 | |
https://github.com/simonw/datasette/issues/585#issuecomment-541052329 | https://api.github.com/repos/simonw/datasette/issues/585 | 541052329 | MDEyOklzc3VlQ29tbWVudDU0MTA1MjMyOQ== | 2657547 | 2019-10-11T12:53:51Z | 2019-10-11T12:53:51Z | CONTRIBUTOR | I think this would be good, yeah – currently, databases are explicitly sorted by name in the IndexView, we could just remove that part (and use an `OrderedDict` for consistency, I suppose)? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 503217375 | |
https://github.com/simonw/datasette/pull/590#issuecomment-541562581 | https://api.github.com/repos/simonw/datasette/issues/590 | 541562581 | MDEyOklzc3VlQ29tbWVudDU0MTU2MjU4MQ== | 2657547 | 2019-10-14T08:57:46Z | 2019-10-14T08:57:46Z | CONTRIBUTOR | Ah, thank you – I saw the need for unit tests but wasn't sure what the best way to add one would be. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 505818256 | |
https://github.com/simonw/datasette/pull/590#issuecomment-541587823 | https://api.github.com/repos/simonw/datasette/issues/590 | 541587823 | MDEyOklzc3VlQ29tbWVudDU0MTU4NzgyMw== | 2657547 | 2019-10-14T09:58:23Z | 2019-10-14T09:58:23Z | CONTRIBUTOR | Added tests. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 505818256 | |
https://github.com/simonw/datasette/pull/601#issuecomment-544008463 | https://api.github.com/repos/simonw/datasette/issues/601 | 544008463 | MDEyOklzc3VlQ29tbWVudDU0NDAwODQ2Mw== | 2657547 | 2019-10-18T23:39:21Z | 2019-10-18T23:39:21Z | CONTRIBUTOR | That looks right, and I completely agree with the intent. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 509340359 | |
https://github.com/simonw/datasette/pull/601#issuecomment-544008944 | https://api.github.com/repos/simonw/datasette/issues/601 | 544008944 | MDEyOklzc3VlQ29tbWVudDU0NDAwODk0NA== | 2657547 | 2019-10-18T23:40:48Z | 2019-10-18T23:40:48Z | CONTRIBUTOR | The only negative impact that comes to mind is that now you have no way to get the read-only query to be formatted nicely, I think, so maybe a second PR adding the formatting functionality even to the read-only page would be good? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 509340359 | |
https://github.com/simonw/datasette/pull/601#issuecomment-544214418 | https://api.github.com/repos/simonw/datasette/issues/601 | 544214418 | MDEyOklzc3VlQ29tbWVudDU0NDIxNDQxOA== | 2657547 | 2019-10-20T02:29:49Z | 2019-10-20T02:29:49Z | CONTRIBUTOR | Submitted in #602! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 509340359 | |
https://github.com/simonw/datasette/pull/602#issuecomment-549246007 | https://api.github.com/repos/simonw/datasette/issues/602 | 549246007 | MDEyOklzc3VlQ29tbWVudDU0OTI0NjAwNw== | 2657547 | 2019-11-04T07:29:33Z | 2019-11-04T07:29:33Z | CONTRIBUTOR | Not sure – I'm always a bit weirded out when elements that I clicked disappear on me. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 509535510 | |
https://github.com/dogsheep/twitter-to-sqlite/issues/29#issuecomment-552134876 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/29 | 552134876 | MDEyOklzc3VlQ29tbWVudDU1MjEzNDg3Ng== | 21148 | 2019-11-09T20:33:38Z | 2019-11-09T20:33:38Z | CONTRIBUTOR | ❤️ thanks! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 518725064 | |
https://github.com/simonw/datasette/issues/639#issuecomment-558687342 | https://api.github.com/repos/simonw/datasette/issues/639 | 558687342 | MDEyOklzc3VlQ29tbWVudDU1ODY4NzM0Mg== | 21148 | 2019-11-26T15:40:00Z | 2019-11-26T15:40:00Z | CONTRIBUTOR | A bit of background: the reason `heroku git:clone` brings down an empty directory is because `datasette publish heroku` uses the [builds API](https://devcenter.heroku.com/articles/build-and-release-using-the-api), rather than a `git push`, to release the app. I originally did this because it seemed like a lower bar than having a working `git`, but the downside is, as you found out, that tweaking the created app is hard. So there's one option -- change `datasette publish heroku` to use `git push` instead of `heroku builds:create`. @pkoppstein - what you suggested seems like it ought to work (you don't need maintenance mode, though). I'm not sure why it doesn't. You could also look into using the [slugs API](https://devcenter.heroku.com/articles/platform-api-deploying-slugs) to download the slug, change `metadata.json`, re-pack and re-upload the slug. Ultimately though I think I think @simonw's idea of reading `metadata.json` from an external source might be better (#357). Reading from an alternate URL would be fine, or you could also just stuff the whole `metadata.json` into a Heroku config var, and write a plugin to read it from there. Hope this helps a bit! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 527670799 | |
https://github.com/simonw/datasette/issues/642#issuecomment-559207224 | https://api.github.com/repos/simonw/datasette/issues/642 | 559207224 | MDEyOklzc3VlQ29tbWVudDU1OTIwNzIyNA== | 82988 | 2019-11-27T18:40:57Z | 2019-11-27T18:41:07Z | CONTRIBUTOR | Would cookie cutter approaches also work for creating various flavours of customised templates? I need to try to create a couple of sites for myself to get a feel for what sorts of thing are easily doable, and what cribbable cookie cutter items might be. I'm guessing https://simonwillison.net/2019/Nov/25/niche-museums/ is a good place to start from? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 529429214 | |
https://github.com/simonw/datasette/pull/644#issuecomment-565755208 | https://api.github.com/repos/simonw/datasette/issues/644 | 565755208 | MDEyOklzc3VlQ29tbWVudDU2NTc1NTIwOA== | 6025893 | 2019-12-14T21:33:31Z | 2019-12-14T21:33:31Z | CONTRIBUTOR | Hi @simonw Have you had a chance to look at this at all? I'm going to have a chunk of time free next week so if there is additional work needed on this, that would be a particularly convenient time for me to revisit this. Cheers | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 530513784 | |
https://github.com/simonw/datasette/pull/653#issuecomment-582105810 | https://api.github.com/repos/simonw/datasette/issues/653 | 582105810 | MDEyOklzc3VlQ29tbWVudDU4MjEwNTgxMA== | 418191 | 2020-02-04T20:43:01Z | 2020-02-04T20:43:01Z | CONTRIBUTOR | I *think* the existing code will be OK even if I strip the lines in the middle of a new line delimited string. It's only used for the validation, SQLite handles the `--` just fine and the whole SQL textarea still gets sent once it passes validation. I can add your test case to my branch later this evening though. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 541331755 | |
https://github.com/simonw/datasette/pull/653#issuecomment-582106085 | https://api.github.com/repos/simonw/datasette/issues/653 | 582106085 | MDEyOklzc3VlQ29tbWVudDU4MjEwNjA4NQ== | 418191 | 2020-02-04T20:43:43Z | 2020-02-04T20:43:43Z | CONTRIBUTOR | but this also doesn't have to land at all if it doesn't match your use case. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 541331755 | |
https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573388052 | https://api.github.com/repos/simonw/sqlite-utils/issues/74 | 573388052 | MDEyOklzc3VlQ29tbWVudDU3MzM4ODA1Mg== | 15092 | 2020-01-12T06:51:30Z | 2020-01-12T06:51:30Z | CONTRIBUTOR | Thanks. That showed me that there was a click cli runner error, and setting `export LANG=en_US.UTF-8` fixed it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 546073980 | |
https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573389669 | https://api.github.com/repos/simonw/sqlite-utils/issues/74 | 573389669 | MDEyOklzc3VlQ29tbWVudDU3MzM4OTY2OQ== | 15092 | 2020-01-12T07:21:17Z | 2020-01-12T07:21:17Z | CONTRIBUTOR | I guess there is some extra flag for ` CliRunner.invoke` to check exitcode and raise the exception, or that should be an extra assert added. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 546073980 | |
https://github.com/simonw/datasette/issues/656#issuecomment-576293773 | https://api.github.com/repos/simonw/datasette/issues/656 | 576293773 | MDEyOklzc3VlQ29tbWVudDU3NjI5Mzc3Mw== | 6371750 | 2020-01-20T14:17:11Z | 2020-01-20T14:17:11Z | CONTRIBUTOR | Seems that headers and definitions has simply to be filled as an HTML table in the description field of matadata.json. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 546961357 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012158895 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1012158895 | IC_kwDOCGYnMM48VFGv | 25778 | 2022-01-13T13:55:59Z | 2022-01-13T13:55:59Z | CONTRIBUTOR | Came here to add this. I might pick it up. Would also add a utility to create (and update and delete?) a spatial index. It's not much code but I have to look it up every time. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012230212 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1012230212 | IC_kwDOCGYnMM48VWhE | 25778 | 2022-01-13T15:15:13Z | 2022-01-13T15:15:13Z | CONTRIBUTOR | Some proposals I'd add to sqlite-utils: Some version of this, from [geojson-to-sqlite](https://github.com/simonw/geojson-to-sqlite/blob/main/geojson_to_sqlite/utils.py#L124-L130): ```python def init_spatialite(db, lib): db.conn.enable_load_extension(True) db.conn.load_extension(lib) # Initialize SpatiaLite if not yet initialized if "spatial_ref_sys" in db.table_names(): return db.conn.execute("select InitSpatialMetadata(1)") ``` Also a function for creating a spatial index: ```python db.conn.execute("select CreateSpatialIndex(?, ?)", [table, "geometry"]) ``` I don't know the nuances of updating a spatial index, or checking if one already exists. This could be a CLI method like: ```sh sqlite-utils spatial-index spatial.db table-name column-name ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012253198 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1012253198 | IC_kwDOCGYnMM48VcIO | 25778 | 2022-01-13T15:39:14Z | 2022-01-13T15:39:14Z | CONTRIBUTOR | Other thing: If there get to be enough utils, I think it's worth moving all the spatialite stuff into its own file (`gis.py` or something) just so it's easier to find later. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012413729 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1012413729 | IC_kwDOCGYnMM48WDUh | 25778 | 2022-01-13T18:50:00Z | 2022-01-13T18:50:00Z | CONTRIBUTOR | One more thing I'm going to add: A method to add a geometry column, which I'll need to do to create a spatial index on a table. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1013698557 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1013698557 | IC_kwDOCGYnMM48a8_9 | 25778 | 2022-01-15T15:15:22Z | 2022-01-15T15:15:22Z | CONTRIBUTOR | @simonw I have a PR here https://github.com/simonw/sqlite-utils/pull/385 that adds Spatialite helpers on the Python side. Please let me know how it looks. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029317527 | https://api.github.com/repos/simonw/sqlite-utils/issues/79 | 1029317527 | IC_kwDOCGYnMM49WiOX | 25778 | 2022-02-03T19:18:02Z | 2022-02-03T19:18:02Z | CONTRIBUTOR | Taking part of the conversation from #385 here. > Would sqlite-utils add-geometry-column ... be a good CLI enhancement. for example? Yes. And also `sqlite-utils create-spatial-index` would be great to have. My plan would be to add those once the Python API is settled. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 557842245 | |
https://github.com/simonw/datasette/pull/666#issuecomment-590022164 | https://api.github.com/repos/simonw/datasette/issues/666 | 590022164 | MDEyOklzc3VlQ29tbWVudDU5MDAyMjE2NA== | 13896256 | 2020-02-23T03:26:00Z | 2020-02-23T03:26:00Z | CONTRIBUTOR | It was very helpful for me, using it for a 15M row table. Added a test, happy to amend though! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 562085508 | |
https://github.com/simonw/datasette/issues/691#issuecomment-643709037 | https://api.github.com/repos/simonw/datasette/issues/691 | 643709037 | MDEyOklzc3VlQ29tbWVudDY0MzcwOTAzNw== | 49260 | 2020-06-14T02:35:16Z | 2020-06-14T02:35:16Z | CONTRIBUTOR | The server should reload in the `config_dir` mode. Ref: #848 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 574021194 | |
https://github.com/simonw/datasette/issues/712#issuecomment-604225034 | https://api.github.com/repos/simonw/datasette/issues/712 | 604225034 | MDEyOklzc3VlQ29tbWVudDYwNDIyNTAzNA== | 127565 | 2020-03-26T04:40:08Z | 2020-03-26T04:40:08Z | CONTRIBUTOR | Great! Yes, can confirm that this works on Binder. However, when I try to run the same code locally, I get an Internal Server Error when I try to access Datasette. ``` ERROR: Exception in ASGI application Traceback (most recent call last): File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi result = await app(self.scope, self.receive, self.send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__ return await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette_debug_asgi.py", line 24, in wrapped_app await app(scope, recieve, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/utils/asgi.py", line 174, in __call__ await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/tracer.py", line 75, in __call__ await self.app(scope, receive, send) File "/Volumes/Workspace/mycode/datasette-test/lib/python3.7/site-packages/datasette/app.py", line 746, in __call__ raw_path = dict(scope["headers"])[path_from_header.encode("utf8")].split(b"?")[0] KeyError: b'x-original-uri' INFO: 127.0.0.1:49320 - "GET / HTTP/1.1" 500 Internal Server Error ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 588108428 | |
https://github.com/simonw/datasette/issues/712#issuecomment-604249402 | https://api.github.com/repos/simonw/datasette/issues/712 | 604249402 | MDEyOklzc3VlQ29tbWVudDYwNDI0OTQwMg== | 127565 | 2020-03-26T06:11:44Z | 2020-03-26T06:11:44Z | CONTRIBUTOR | Following on from @betatim's suggestion on Twitter, I've changed the proxy url to include 'absolute'. ``` python proxy_url = f'{base_url}proxy/absolute/8001/' ``` This works both on Binder and locally, without using the `path_from_header` option. I've updated the demo repository. Sorry @simonw if I've led you down the wrong path! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 588108428 | |
https://github.com/dogsheep/dogsheep-photos/issues/3#issuecomment-934372104 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/3 | 934372104 | IC_kwDOD079W843sWMI | 41546558 | 2021-10-05T12:38:24Z | 2021-10-05T12:38:24Z | CONTRIBUTOR | As dogsheep-photos already uses [osxphotos](https://github.com/RhetTbull/osxphotos) to load photos you can access the EXIF data via osxphotos. Apple Photos imports a small subset of EXIF data at the time the photo is imported and osxphotos provides this via the [exif_info](https://github.com/RhetTbull/osxphotos#exifinfo) property. If you want the full EXIF data, osxphotos also provides a wrapper around [exiftool](https://github.com/RhetTbull/osxphotos#exiftool). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 602533481 | |
https://github.com/simonw/datasette/pull/730#issuecomment-623463200 | https://api.github.com/repos/simonw/datasette/issues/730 | 623463200 | MDEyOklzc3VlQ29tbWVudDYyMzQ2MzIwMA== | 27856297 | 2020-05-04T13:27:22Z | 2020-05-04T13:27:22Z | CONTRIBUTOR | Superseded by #753. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 604001627 | |
https://github.com/simonw/datasette/issues/731#issuecomment-618126449 | https://api.github.com/repos/simonw/datasette/issues/731 | 618126449 | MDEyOklzc3VlQ29tbWVudDYxODEyNjQ0OQ== | 25778 | 2020-04-23T01:38:55Z | 2020-04-23T01:38:55Z | CONTRIBUTOR | I've almost suggested this same thing a couple times. I tend to have Makefile (because I'm doing other `make` stuff anyway to get data prepped), and I end up putting all those CLI options in something like `make run`. But it would be way easier to just have all those typical options -- plugins, templates, metadata -- be defaults. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 605110015 | |
https://github.com/simonw/datasette/issues/731#issuecomment-618758326 | https://api.github.com/repos/simonw/datasette/issues/731 | 618758326 | MDEyOklzc3VlQ29tbWVudDYxODc1ODMyNg== | 25778 | 2020-04-24T01:55:00Z | 2020-04-24T01:55:00Z | CONTRIBUTOR | Mounting `./static` at `/static` seems the simplest way. Saves you the trouble of deciding what else (`img` for example) gets special treatment. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 605110015 | |
https://github.com/simonw/datasette/issues/741#issuecomment-1125342229 | https://api.github.com/repos/simonw/datasette/issues/741 | 1125342229 | IC_kwDOBm6k_c5DE1wV | 25778 | 2022-05-12T19:21:16Z | 2022-05-12T19:21:16Z | CONTRIBUTOR | Came here to check if this had been flagged already. Was helping a colleague get something on Cloud Run and had to dig to find `--extra-options="--setting sql_time_limit_ms 2500"`. If I get some time next week, maybe I'll try to tackle it. Would definitely make things easier to be able to do something like this: ```sh datasette publish cloudrun something.db --setting sql_time_limit_ms 2500 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 607223136 | |
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622599528 | https://api.github.com/repos/simonw/sqlite-utils/issues/103 | 622599528 | MDEyOklzc3VlQ29tbWVudDYyMjU5OTUyOA== | 32605365 | 2020-05-01T22:49:12Z | 2020-05-02T11:15:44Z | CONTRIBUTOR | With SQLITE_MAX_VARS = 999, or even 899, This hits the problem with the batch rows causing a overflow (works fine if SQLITE_MAX_VARS = 799). p.s. I have tried a few list of dicts to sqlite modules and this was the easiest to use/understand ------------- file begins ------------------ import sqlite_utils as su data = [ {'tickerId': 913324382, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'CONSTELLATION B', 'symbol': 'STZ B', 'disSymbol': 'STZ-B', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '163.13', 'change': '6.46', 'changeRatio': '0.0412', 'marketValue': '31180699895.63', 'volume': '417', 'turnoverRate': '0.0000'}, {'tickerId': 913323791, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Molina Health', 'symbol': 'MOH', 'disSymbol': 'MOH', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '173.25', 'change': '9.28', 'changeRatio': '0.0566', 'pPrice': '173.25', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '10520341695.50', 'volume': '1281557', 'turnoverRate': '0.0202'}, {'tickerId': 913257501, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Seattle Genetics', 'symbol': 'SGEN', 'disSymbol': 'SGEN', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '145.64', 'change': '8.41', 'changeRatio': '0.0613', 'pPrice': '146.45', 'pChange': '0.8100', 'pChRatio': '0.0056', 'marketValue': '25117961347.60', 'volume': '2791411', 'turnoverRate': '0.0162'}, {'tickerId': 925381971, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bandwidth', 'symbol': 'BAND', 'disSymbol': 'BAND', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', '… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 610517472 | |
https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436779 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | 748436779 | MDEyOklzc3VlQ29tbWVudDc0ODQzNjc3OQ== | 41546558 | 2020-12-19T07:49:00Z | 2020-12-19T07:49:00Z | CONTRIBUTOR | @nickvazz ZGENERICASSET changed to ZASSET in Big Sur. Here's a list of other changes to the schema in Big Sur: https://github.com/RhetTbull/osxphotos/wiki/Changes-in-Photos-6---Big-Sur | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 612151767 | |
https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748562288 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | 748562288 | MDEyOklzc3VlQ29tbWVudDc0ODU2MjI4OA== | 41546558 | 2020-12-20T04:44:22Z | 2020-12-20T04:44:22Z | CONTRIBUTOR | @nickvazz @simonw I opened a [PR](https://github.com/dogsheep/dogsheep-photos/pull/31) that replaces the SQL for `ZCOMPUTEDASSETATTRIBUTES` to use osxphotos which now exposes all this data and has been updated for Big Sur. I did regression tests to confirm the extracted data is identical, with one exception which should not affect operation: the old code pulled data from `ZCOMPUTEDASSETATTRIBUTES` for missing photos while the main loop ignores missing photos and does not add them to `apple_photos`. The new code does not add rows to the `apple_photos_scores` table for missing photos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 612151767 | |
https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623845014 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | 623845014 | MDEyOklzc3VlQ29tbWVudDYyMzg0NTAxNA== | 41546558 | 2020-05-05T03:55:14Z | 2020-05-05T03:56:24Z | CONTRIBUTOR | I'm traveling w/o access to my Mac so can't help with any code right now. I suspected ZSCENEIDENTIFIER was a foreign key into one of these psi.sqlite tables. But looks like you're on to something connecting groups to assets. As for the UUID, I think there's two ints because each is 64-bits but UUIDs are 128-bits. Thus they need to be combined to get the 128 bit UUID. You might be able to use Apple's [NSUUID](https://developer.apple.com/documentation/foundation/nsuuid?language=objc), for example, by wrapping with pyObjC. Here's one [example](https://github.com/ronaldoussoren/pyobjc/blob/881c82a7ba90f193934b52b44143360c80dce5e5/pyobjc-framework-Cocoa/PyObjCTest/test_nsuuid.py) of using this in PyObjC's test suite. Interesting it's stored this way instead of a UUIDString as in Photos.sqlite. Perhaps it for faster indexing. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 612287234 | |
https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624284539 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 | 624284539 | MDEyOklzc3VlQ29tbWVudDYyNDI4NDUzOQ== | 41546558 | 2020-05-05T20:20:05Z | 2020-05-05T20:20:05Z | CONTRIBUTOR | FYI, I've got an [issue](https://github.com/RhetTbull/osxphotos/issues/25) to make osxphotos cross-platform but it's low on my priority list. About 90% of the functionality could be done cross-platform but right now the MacOS specific stuff is embedded throughout and would take some work. Though I try to minimize it, there's sprinklings of ObjC & Applescript throughout osxphotos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 612860531 | |
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626390317 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | 626390317 | MDEyOklzc3VlQ29tbWVudDYyNjM5MDMxNw== | 41546558 | 2020-05-10T21:11:24Z | 2020-05-10T21:50:58Z | CONTRIBUTOR | Ugh....Yeah, I think easiest is to catch the exception and return no place as you suggest. This particular bit of code involves un-archiving a serialized NSKeyedArchiver which uses an object table and it is certainly possible to create a circular reference that way. Because this is happening in the decode, the circular reference must be in the original data. Does Photos show valid reverse geolocation info for the photo in question? If so, Photos may be doing something beyond a simple decode of the binary plist. For now, I'll push a patch to catch the exception. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615474990 | |
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395507 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | 626395507 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTUwNw== | 41546558 | 2020-05-10T21:54:45Z | 2020-05-10T21:54:45Z | CONTRIBUTOR | @simonw does Photos show valid reverse geolocation info? Are you sure you're using [bpylist2](https://github.com/xa4a/bpylist2) and not bpylist? They're both unfortunately imported as "bpylist" so if you somehow got the wrong (original bpylist) version installed, it could be the issue. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615474990 | |
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395641 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | 626395641 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTY0MQ== | 41546558 | 2020-05-10T21:55:54Z | 2020-05-10T21:55:54Z | CONTRIBUTOR | Did removing old bpylist solve the original problem or do you still have a photo that throws circular reference? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615474990 | |
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626396379 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | 626396379 | MDEyOklzc3VlQ29tbWVudDYyNjM5NjM3OQ== | 41546558 | 2020-05-10T22:01:48Z | 2020-05-10T22:01:48Z | CONTRIBUTOR | Frustrates me when package authors create a "drop in" replacement with the same import name...this kind of thing has bitten me more than once! Would've been nicer I think for bpylist2 to do "import bpylist2 as bpylist" | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615474990 | |
https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626667235 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | 626667235 | MDEyOklzc3VlQ29tbWVudDYyNjY2NzIzNQ== | 41546558 | 2020-05-11T12:20:34Z | 2020-05-11T12:20:34Z | CONTRIBUTOR | @simonw FYI, osxphotos includes a built in ExifTool class that uses [exiftool](https://exiftool.org/) to read and write exif data. It's not exposed yet in the docs because I really only use it right now in the osphotos command line interface to write tags when exporting. In v0.28.16 (just pushed) I added an ExifTool.as_dict() method which will give you a dict with all the exif tags in a file. For example: ```python import osxphotos photos = osxphotos.PhotosDB().photos() exiftool = osxphotos.exiftool.ExifTool(photos[0].path) exifdata = exiftool.as_dict() tags = exifdata["IPTC:Keywords"] ``` Not as elegant perhaps as a python only implementation because ExifTool has to make subprocess calls to an external tool but exiftool is by far the best tool available for reading and writing EXIF data and it does support HEIC. As for implementation, ExifTool uses a singleton pattern so the first time you instantiate it, it spawns an IPC to exiftool but then keeps it open and uses the same process for any subsequent calls (even on different files). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615626118 | |
https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-627007458 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | 627007458 | MDEyOklzc3VlQ29tbWVudDYyNzAwNzQ1OA== | 41546558 | 2020-05-11T22:51:52Z | 2020-05-11T22:52:26Z | CONTRIBUTOR | I'm not familiar with `ExifReader`. I wrote my own wrapper around `exiftool` because I wanted a simple way to write EXIF data when exporting photos (e.g. writing out to PersonInImage and keywords to IPTC:Keywords) and the existing python packages like [pyexiftool](https://github.com/smarnach/pyexiftool) didn't do quite what I wanted. If all you're after is the camera and shot info, that's available in `ZEXTENDEDATTRIBUTES` table. I've got an open issue [#11](https://github.com/RhetTbull/osxphotos/issues/11) to add this to osxphotos but it hasn't bubbled to the top of my backlog yet. osxphotos will give you the location info: `PhotoInfo.location` returns a tuple of (lat, lon) though this info is in ZEXTENDEDATTRIBUTES too (though it might not be correct as I believe Photos creates this table at import and the user might have changed the location of a photo, e.g. if camera didn't have GPS). ```sql CREATE TABLE ZEXTENDEDATTRIBUTES ( Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER, Z_OPT INTEGER, ZFLASHFIRED INTEGER, ZISO INTEGER, ZMETERINGMODE INTEGER, ZSAMPLERATE INTEGER, ZTRACKFORMAT INTEGER, ZWHITEBALANCE INTEGER, ZASSET INTEGER, ZAPERTURE FLOAT, ZBITRATE FLOAT, ZDURATION FLOAT, ZEXPOSUREBIAS FLOAT, ZFOCALLENGTH FLOAT, ZFPS FLOAT, ZLATITUDE FLOAT, ZLONGITUDE FLOAT, ZSHUTTERSPEED FLOAT, ZCAMERAMAKE VARCHAR, ZCAMERAMODEL VARCHAR, ZCODEC VARCHAR, ZLENSMODEL VARCHAR ); ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615626118 | |
https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-628405453 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | 628405453 | MDEyOklzc3VlQ29tbWVudDYyODQwNTQ1Mw== | 41546558 | 2020-05-14T05:59:53Z | 2020-05-14T05:59:53Z | CONTRIBUTOR | I've added support for the above exif data to [v0.28.17](https://github.com/RhetTbull/osxphotos/releases/tag/v0.28.17) of osxphotos. `PhotoInfo.exif_info` will return an `ExifInfo` [dataclass](https://docs.python.org/3/library/dataclasses.html) object with the following properties: ```python flash_fired: bool iso: int metering_mode: int sample_rate: int track_format: int white_balance: int aperture: float bit_rate: float duration: float exposure_bias: float focal_length: float fps: float latitude: float longitude: float shutter_speed: float camera_make: str camera_model: str codec: str lens_model: str ``` It's not all the EXIF data available in most files but is the data Photos deems important to save. Of course, you can get all the exif_data Note: this only works in Photos 5. As best as I can tell, EXIF data is not stored in the database for earlier versions. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 615626118 | |
https://github.com/simonw/datasette/issues/766#issuecomment-791509910 | https://api.github.com/repos/simonw/datasette/issues/766 | 791509910 | MDEyOklzc3VlQ29tbWVudDc5MTUwOTkxMA== | 6371750 | 2021-03-05T15:57:35Z | 2021-03-05T16:35:21Z | CONTRIBUTOR | Hello, I have the same wildcards search problems with an instance of Datasette. http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwerz&_sort=rowid is OK but http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwe* is not (FTS is activated on "Reference" "IntituleAnalyse" "NomDuProducteur" "PresentationDuContenu" "Notes"). Notice that a SQL query as below launched directly from SQLite in the server's shell, retrieves results. `select * from fonds_auguste_dupouy_1872_1967_fts where IntituleAnalyse MATCH "gwe*";` Thanks, | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 617323873 | |
https://github.com/simonw/datasette/issues/767#issuecomment-632555800 | https://api.github.com/repos/simonw/datasette/issues/767 | 632555800 | MDEyOklzc3VlQ29tbWVudDYzMjU1NTgwMA== | 2657547 | 2020-05-22T08:00:23Z | 2020-05-22T08:00:23Z | CONTRIBUTOR | That would be perfect! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 620969465 | |
https://github.com/simonw/datasette/issues/830#issuecomment-817414881 | https://api.github.com/repos/simonw/datasette/issues/830 | 817414881 | MDEyOklzc3VlQ29tbWVudDgxNzQxNDg4MQ== | 192568 | 2021-04-12T01:06:34Z | 2021-04-12T01:07:27Z | CONTRIBUTOR | Related: #1285, including arguments for natural breaks, equal interval, etc. modeled after choropleth map legends. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 636511683 | |
https://github.com/simonw/datasette/issues/838#issuecomment-716123598 | https://api.github.com/repos/simonw/datasette/issues/838 | 716123598 | MDEyOklzc3VlQ29tbWVudDcxNjEyMzU5OA== | 82988 | 2020-10-25T10:20:12Z | 2020-10-25T10:53:24Z | CONTRIBUTOR | I'm trying to [run something behind a MyBinder proxy](https://github.com/ouseful-testing/nbsearch), but seem to have something set up incorrectly and not sure what the fix is? I'm starting datasette with jupyter-server-proxy setup: ``` # __init__.py def setup_nbsearch(): return { "command": [ "datasette", "serve", f"{_NBSEARCH_DB_PATH}", "-p", "{port}", "--config", "base_url:{base_url}nbsearch/" ], "absolute_url": True, # The following needs a the labextension installing. # eg in postBuild: jupyter labextension install jupyterlab-server-proxy "launcher_entry": { "enabled": True, "title": "nbsearch", }, } ``` where the `base_url` gets automatically populated by the server-proxy. I define the loaders as: ``` # __init__.py from datasette import hookimpl @hookimpl def extra_css_urls(database, table, columns, view_name, datasette): return [ "/-/static-plugins/nbsearch/prism.css", "/-/static-plugins/nbsearch/nbsearch.css", ] ``` but these seem to also need a base_url prefix set somehow? Currently, the generated HTML loads properly but internal links are incorrect; eg they take the form `<link rel="stylesheet" href="/-/static-plugins/nbsearch/prism.css">` which resolves to eg `https://notebooks.gesis.org/hub/-/static-plugins/nbsearch/prism.css` rather than required URL of form `https://notebooks.gesis.org/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static-plugins/nbsearch/prism.css`. The main css is loaded correctly: `<link rel="stylesheet" href="/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static/app.css?404439">` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 637395097 | |
https://github.com/simonw/datasette/issues/838#issuecomment-720354227 | https://api.github.com/repos/simonw/datasette/issues/838 | 720354227 | MDEyOklzc3VlQ29tbWVudDcyMDM1NDIyNw== | 82988 | 2020-11-02T09:33:58Z | 2020-11-02T09:33:58Z | CONTRIBUTOR | Thanks; just a note that the `datasette.urls.static(path)` and `datasette.urls.static_plugins(plugin_name, path)` items both seem to be repeated and appear in the docs twice? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 637395097 | |
https://github.com/simonw/datasette/issues/851#issuecomment-645293374 | https://api.github.com/repos/simonw/datasette/issues/851 | 645293374 | MDEyOklzc3VlQ29tbWVudDY0NTI5MzM3NA== | 3243482 | 2020-06-17T10:32:02Z | 2020-06-17T10:32:28Z | CONTRIBUTOR | Welp, I'm an idiot. Turns out I had a sneaky comma `,` after `sql` key: ``` ... (:name, :url), ``` which tells sqlite to expect another `values(...)` list. Correcting the SQL solved the issue. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 640330278 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647135713 | https://api.github.com/repos/simonw/datasette/issues/859 | 647135713 | MDEyOklzc3VlQ29tbWVudDY0NzEzNTcxMw== | 3243482 | 2020-06-21T14:30:02Z | 2020-06-21T14:30:02Z | CONTRIBUTOR | Oops, the same method is called from both index and database pages. But removing select count queries speed up the page load quite a bit. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647194131 | https://api.github.com/repos/simonw/datasette/issues/859 | 647194131 | MDEyOklzc3VlQ29tbWVudDY0NzE5NDEzMQ== | 3243482 | 2020-06-21T23:15:54Z | 2020-06-21T23:26:09Z | CONTRIBUTOR | I'm not sure if table counts are to blame. There shouldn't be a ~3 orders of magnitude difference. ```fish user@klein /a/w/scrapyard (master)> set sql "select count(*) from table_1; select count(*) from table_2; select count(*) from table_3;" user@klein /a/w/scrapyard (master)> time sqlite3 scrapyard.db "$sql" 187489 46492 2229 ________________________________________________________ Executed in 25.57 millis fish external usr time 3.55 millis 0.00 micros 3.55 millis sys time 22.42 millis 1123.00 micros 21.30 millis ``` but not letting datasette count the tables definitely helps. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647922203 | https://api.github.com/repos/simonw/datasette/issues/859 | 647922203 | MDEyOklzc3VlQ29tbWVudDY0NzkyMjIwMw== | 3243482 | 2020-06-23T05:44:58Z | 2021-01-05T08:22:43Z | CONTRIBUTOR | I'm seeing the problem on database page. Index page and table page runs quite fast. - Tables have <10 columns (`id`, `url`, `title`, `body_html`, `date`, `author`, `meta` (for keeping unstructured json)). I've added index on `date` columns (using `sqlite-utils`) in addition to the index present on `id` columns. - All tables have FTS enabled on `text` and `varchar` columns (`title`, `body_html` etc) to speed up searching. - There are couple of tables related with foreign keys (think a thread in a forum and posts in that thread, related with `thread_id`) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647923666 | https://api.github.com/repos/simonw/datasette/issues/859 | 647923666 | MDEyOklzc3VlQ29tbWVudDY0NzkyMzY2Ng== | 3243482 | 2020-06-23T05:49:31Z | 2020-06-23T05:49:31Z | CONTRIBUTOR | I think I should mention that having FTS on all tables mean I have 5 visible, 25 hidden (FTS) tables displayed on database page. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647925594 | https://api.github.com/repos/simonw/datasette/issues/859 | 647925594 | MDEyOklzc3VlQ29tbWVudDY0NzkyNTU5NA== | 3243482 | 2020-06-23T05:55:21Z | 2020-06-23T06:28:29Z | CONTRIBUTOR | Hmm, not seeing the problem now. I've removed the commented out sections in `database.py` and restarted the process. Database page now loads in <250ms. I have couple of workers that check some pages regularly and scrape new content and save to the DB. Could it be that datasette tries to recount tables every time database size changes? Normally it keeps a count cache, but as DB gets updated so often (new content every 5 min or so) it's practically recounting every time I go to the database page? EDIT: It turns out it doesn't hold cache with mutable databases. I'll update the issue with more findings and a better way to reproduce the problem if I encounter it again. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647935300 | https://api.github.com/repos/simonw/datasette/issues/859 | 647935300 | MDEyOklzc3VlQ29tbWVudDY0NzkzNTMwMA== | 3243482 | 2020-06-23T06:23:01Z | 2020-06-23T06:23:01Z | CONTRIBUTOR | > You said "200k+, 50+ rows in a couple of tables" - does that mean 50+ columns? I'll try with larger numbers of columns and see what difference that makes. Ah that was a typo, I meant 50k. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-647936117 | https://api.github.com/repos/simonw/datasette/issues/859 | 647936117 | MDEyOklzc3VlQ29tbWVudDY0NzkzNjExNw== | 3243482 | 2020-06-23T06:25:17Z | 2020-06-23T06:25:17Z | CONTRIBUTOR | > > > ``` > sqlite-generate many-cols.db --tables 2 --rows 200000 --columns 50 > ``` > > Looks like that will take 35 minutes to run (it's not a particularly fast tool). Try chunking write operations into batches every 1000 records or so. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-648232645 | https://api.github.com/repos/simonw/datasette/issues/859 | 648232645 | MDEyOklzc3VlQ29tbWVudDY0ODIzMjY0NQ== | 3243482 | 2020-06-23T15:19:53Z | 2020-06-23T15:19:53Z | CONTRIBUTOR | The issue seems to appear sporadically, like when I return to database page after a while, during which some records have been added to the database. I've just visited database, page first visit took ~10s, consecutive visits took 0.3s. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-648669523 | https://api.github.com/repos/simonw/datasette/issues/859 | 648669523 | MDEyOklzc3VlQ29tbWVudDY0ODY2OTUyMw== | 3243482 | 2020-06-24T08:13:23Z | 2020-06-24T10:30:36Z | CONTRIBUTOR | I tried setting `cache_size_kb=0` then `cache_size_kb=100000`, still getting this behavior. I even changed `Database::table_counts` and lowered time limit to 1 ```py table_count = ( await self.execute( "select count(*) from [{}]".format(table), custom_time_limit=1, ) ).rows[0][0] counts[table] = table_count ``` I feel like 10 seconds is a magic number, like a processing timeout and datasette gives up and returns the page. Index page loads instantly, table page, query page, as well. But when I return to database page after some time, it loads in 10s. EDIT: It's always like 10 + 0.3s, like 10s wait and timeout then 300ms to render the page | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-652160909 | https://api.github.com/repos/simonw/datasette/issues/859 | 652160909 | MDEyOklzc3VlQ29tbWVudDY1MjE2MDkwOQ== | 3243482 | 2020-07-01T03:09:32Z | 2020-07-01T03:10:21Z | CONTRIBUTOR | I've just realized Datasette tries to count hidden tables too. There are 5 visible tables, 25 hidden tables, which I haven't realize earlier to consider their effect. I've turned off counting for hidden tables to see if it has any effect. What's the point of counting FTS tables? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-904982056 | https://api.github.com/repos/simonw/datasette/issues/859 | 904982056 | IC_kwDOBm6k_c418O4o | 2670795 | 2021-08-24T21:15:04Z | 2021-08-24T21:15:30Z | CONTRIBUTOR | I'm running into issues with this as well. All other pages seem to work with lots of DBs except the home page, which absolutely tanks. Would be willing to put some work into this, if there's been any kind of progress on concepts on how this ought to work. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-905899177 | https://api.github.com/repos/simonw/datasette/issues/859 | 905899177 | IC_kwDOBm6k_c41_uyp | 2670795 | 2021-08-25T21:48:00Z | 2021-08-25T21:48:00Z | CONTRIBUTOR | Upon first stab, there's two issues here: - DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check. - Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-905904540 | https://api.github.com/repos/simonw/datasette/issues/859 | 905904540 | IC_kwDOBm6k_c41_wGc | 2670795 | 2021-08-25T21:59:14Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | I did two tests: one with 1000 5-30mb DBs and a second with 20 multi gig DBs. For the second, I created them like so: `for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done` This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins. By simply persisting the `_internal` DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the `datasette/internal_db.py:init_internal_db` creates to if not exists, and changed the `_internal` DB instantiation in `datasette/app.py:Datasette.__init__` to a path with `is_mutable=True`.) Super rough, but the pages now load so I can continue testing ideas. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 642572841 | |
https://github.com/simonw/datasette/issues/877#issuecomment-652166115 | https://api.github.com/repos/simonw/datasette/issues/877 | 652166115 | MDEyOklzc3VlQ29tbWVudDY1MjE2NjExNQ== | 3243482 | 2020-07-01T03:28:07Z | 2020-07-01T03:28:07Z | CONTRIBUTOR | Does this mean custom routes get to expose endpoints accepting POST requests? I've tried earlier to add some POST endpoints, but requests were being rejected by Datasette due to CSRF | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 648421105 | |
https://github.com/simonw/datasette/issues/877#issuecomment-652255960 | https://api.github.com/repos/simonw/datasette/issues/877 | 652255960 | MDEyOklzc3VlQ29tbWVudDY1MjI1NTk2MA== | 3243482 | 2020-07-01T07:52:25Z | 2020-07-01T08:10:00Z | CONTRIBUTOR | I am calling the API from another origin, so injecting CSRF token into templates wouldn't work. EDIT: I'll try the new version, it sounds promising | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 648421105 | |
https://github.com/simonw/datasette/issues/877#issuecomment-652261382 | https://api.github.com/repos/simonw/datasette/issues/877 | 652261382 | MDEyOklzc3VlQ29tbWVudDY1MjI2MTM4Mg== | 3243482 | 2020-07-01T08:03:17Z | 2020-07-01T08:03:23Z | CONTRIBUTOR | Bearer tokens sound interesting. Where do tokens come from? An auth provider of my choosing? How do they get verified? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 648421105 | |
https://github.com/simonw/datasette/pull/883#issuecomment-652297139 | https://api.github.com/repos/simonw/datasette/issues/883 | 652297139 | MDEyOklzc3VlQ29tbWVudDY1MjI5NzEzOQ== | 3243482 | 2020-07-01T09:11:29Z | 2020-07-01T09:11:29Z | CONTRIBUTOR | Turns out we should include hidden tables in the result dict, or we're breaking tests. I've committed a refactor https://github.com/simonw/datasette/pull/883/commits/4f06e1bf6fbe4b73be770b87f610bf7c0e6e3ea7 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 648749062 | |
https://github.com/simonw/datasette/pull/883#issuecomment-652394742 | https://api.github.com/repos/simonw/datasette/issues/883 | 652394742 | MDEyOklzc3VlQ29tbWVudDY1MjM5NDc0Mg== | 3243482 | 2020-07-01T12:41:13Z | 2020-07-01T12:41:13Z | CONTRIBUTOR | Well tests need to be updated. I need to get tests working on Windows. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 648749062 | |
https://github.com/simonw/datasette/issues/889#issuecomment-652990131 | https://api.github.com/repos/simonw/datasette/issues/889 | 652990131 | MDEyOklzc3VlQ29tbWVudDY1Mjk5MDEzMQ== | 49260 | 2020-07-02T12:58:11Z | 2020-07-02T13:00:18Z | CONTRIBUTOR | FWIW, this error does NOT happen in datasette 0.45a4. It only started on 0.45a5 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 649907676 | |
https://github.com/simonw/datasette/issues/889#issuecomment-653002499 | https://api.github.com/repos/simonw/datasette/issues/889 | 653002499 | MDEyOklzc3VlQ29tbWVudDY1MzAwMjQ5OQ== | 49260 | 2020-07-02T13:22:13Z | 2020-07-02T13:22:13Z | CONTRIBUTOR | I was able to narrow this down to the fact that lifespan protocol is turned on. I see the workaround you've used here: https://github.com/simonw/datasette-debug-asgi/commit/72d568d32a3159c763ce908c0b269736935c6987 If so, maybe it's time to update some of the asg_wrapper [plugins](https://datasette.readthedocs.io/en/stable/plugin_hooks.html#asgi-wrapper-datasette). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 649907676 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655018966 | MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng== | 79913 | 2020-07-07T17:41:06Z | 2020-07-07T17:41:06Z | CONTRIBUTOR | Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655052451 | MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ== | 79913 | 2020-07-07T18:45:23Z | 2020-07-07T18:45:23Z | CONTRIBUTOR | Ah, I see the problem. The truncate is inside a loop I didn't realize was there. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655239728 | MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA== | 79913 | 2020-07-08T02:16:42Z | 2020-07-08T02:16:42Z | CONTRIBUTOR | I fixed my original oops by moving the `DELETE FROM $table` out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next. I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making `Table.insert_all` (or other operations) consistent and robust in the face of errors. For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted): ```diff diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index d6b9ecf..4107ceb 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -1028,6 +1028,11 @@ class Table(Queryable): batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns)) self.last_rowid = None self.last_pk = None + with self.db.conn: + # Explicit BEGIN is necessary because Python's sqlite3 doesn't + # issue implicit BEGINs for DDL, only DML. We mix DDL and DML + # below and might execute DDL first, e.g. for table creation. + self.db.conn.execute("BEGIN") if truncate and self.exists(): self.db.conn.execute("DELETE FROM [{}];".format(self.name)) for chunk in chunks(itertools.chain([first_record], records), batch_size): @@ -1038,7 +1043,11 @@ class Table(Queryable): # Use the first batch to derive the table names column_types = suggest_column_types(chunk) column_types.update(columns or {}) - self.create( + # Not self.create() because that is wrapped in its own + # transaction and Python's sqlite3 doesn't support + # nested transactions. + self.db.create_table( + … | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655643078 | MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA== | 79913 | 2020-07-08T17:05:59Z | 2020-07-08T17:05:59Z | CONTRIBUTOR | > The only thing missing from this PR is updates to the documentation. Ah, yes, thanks for this reminder! I've repushed with doc bits added. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 651844316 |