html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007633376,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1007633376,IC_kwDOCGYnMM48D0Pg,9599,2022-01-07T18:24:07Z,2022-01-07T18:24:07Z,OWNER,Relevant documentation: https://www.sqlite.org/lang_analyze.html,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007634999,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1007634999,IC_kwDOCGYnMM48D0o3,9599,2022-01-07T18:26:22Z,2022-01-07T18:26:22Z,OWNER,"I've not used the `ANALYZE` feature in SQLite at all before. Should probably add Python library methods for it.
Annoyingly I use the word ""analyze"" to mean something else in the CLI - for these features:
- #207
- #320","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007636709,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1007636709,IC_kwDOCGYnMM48D1Dl,536941,2022-01-07T18:28:33Z,2022-01-07T18:29:43Z,CONTRIBUTOR,"i added an index to one table with sqlite-utils, and then a query that used to take about 1 second started taking hundreds of seconds.
running analyze got me back to sub second speed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007642831,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1007642831,IC_kwDOCGYnMM48D2jP,9599,2022-01-07T18:37:18Z,2022-01-07T18:37:18Z,OWNER,"After implementing #366 I can make it so `sqlite-utils create-index` automatically runs `db.analyze(index_name)` afterwards, maybe with a `--no-analyze` option in case anyone wants to opt out of that for specific performance reasons.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007643254,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1007643254,IC_kwDOCGYnMM48D2p2,9599,2022-01-07T18:37:56Z,2022-01-07T18:37:56Z,OWNER,Or I could leave off `--no-analyze` and tell people that if they want to add an index without running analyze they can execute the `CREATE INDEX` themselves.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008158357,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008158357,IC_kwDOCGYnMM48F0aV,9599,2022-01-08T21:33:07Z,2022-01-08T21:33:07Z,OWNER,"The one thing that worries me a little bit about doing this by default is that it adds a surprising new table to the database - it may be confusing to users if they run `create-index` and their database suddenly has a new `sqlite_stat1` table, see https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132
Options here are:
- Do it anyway. People can tolerate a surprise table appearing when they create an index.
- Only run `ANALYZE` if the user says `sqlite-utils create-index ... --analyze`
- Use the `--analyze` option, but also automatically run `ANALYZE` if they create an index and the database they are working with already has a `sqlite_stat1` table
I'm currently leading towards that third option - @fgregg any thoughts?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008161965,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008161965,IC_kwDOCGYnMM48F1St,536941,2022-01-08T22:02:56Z,2022-01-08T22:02:56Z,CONTRIBUTOR,"for options 2 and 3, i would worry about discoverablity.
in other db’s it is not necessary to explicitly call analyze for most indices. ie for postgres
> The system regularly collects statistics on all of a table's columns. Newly-created non-expression indexes can immediately use these statistics to determine an index's usefulness.
i suppose i would propose raising a warning if the stats table is created that explains what is going on and informs users about a —no-analyze argument.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163050,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008163050,IC_kwDOCGYnMM48F1jq,9599,2022-01-08T22:10:51Z,2022-01-08T22:10:51Z,OWNER,"Is there a downside to having a `sqlite_stat1` table if it has wildly incorrect statistics in it?
Imagine the following sequence of events:
- User imports a few records, creating the table, using `sqlite-utils insert`
- User runs `sqlite-utils create-index ...` which also creates and populates the `sqlite_stat1` table
- User runs `insert` again to populate several million new records
The user now has a database file with several million records and a statistics table that is wildly out of date, having been populated when they only had a few.
Will this result in surprisingly bad query performance compared to it that statistics table did not exist at all?
If so, I lean much harder towards `ANALYZE` as a strictly opt-in optimization, maybe with the `--analyze` option added to `sqlite-utils insert` top to help users opt in to updating their statistics after running big inserts.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163585,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008163585,IC_kwDOCGYnMM48F1sB,9599,2022-01-08T22:14:39Z,2022-01-09T03:03:07Z,OWNER,"The reason I'm hesitating on this is that I've not actually used ANALYZE at all in nearly five years of messing around with SQLite! So I'm nervous that there are surprise downsides I haven't thought of.
My hunch is that ANALYZE is only worth worrying about on much larger databases, in which case I'm OK supporting it as a thoroughly documented power-user feature rather than a default.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164116,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008164116,IC_kwDOCGYnMM48F10U,536941,2022-01-08T22:18:57Z,2022-01-08T22:18:57Z,CONTRIBUTOR,"the table with the query ran so bad was about 50k.
i think the scenario should not be worse than no stats.
i also did not know that sqlite was so different from postgres and needed an explicit analyze call.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164786,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008164786,IC_kwDOCGYnMM48F1-y,536941,2022-01-08T22:24:19Z,2022-01-08T22:24:19Z,CONTRIBUTOR,the out-of-date scenario you describe could be addressed by automatically adding an analyze to the insert or convert commands if they implicate an index,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008166084,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008166084,IC_kwDOCGYnMM48F2TE,536941,2022-01-08T22:32:47Z,2022-01-08T22:32:47Z,CONTRIBUTOR,or using “ pragma optimize”,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008229839,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008229839,IC_kwDOCGYnMM48GF3P,9599,2022-01-09T04:51:44Z,2022-01-09T04:51:44Z,OWNER,"Found one report on Stack Overflow from 9 years ago of someone seeing broken performance after running `ANALYZE`, hard to say that's a trend and not a single weird edge-case though! https://stackoverflow.com/q/12947214/6083","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008275546,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1008275546,IC_kwDOCGYnMM48GRBa,536941,2022-01-09T11:01:15Z,2022-01-09T13:37:51Z,CONTRIBUTOR,"i don’t want to be such a partisan for analyze, but the query planner deciding *not* to use an index based on information collected by analyze is not necessarily a bug, but could be the correct choice.
the original poster in that stack overflow doesn’t say there’s a performance regression ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009521921,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1009521921,IC_kwDOCGYnMM48LBUB,9599,2022-01-11T01:37:53Z,2022-01-11T01:37:53Z,OWNER,"I decided to go with making this opt-in, mainly for consistency with the other places where I added this feature - see:
- #379
- #366
You can now run the following:
sqlite-utils create-index mydb.db mytable mycolumn --analyze
And ``ANALYZE`` will be run on the index once it has been created.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009548580,https://api.github.com/repos/simonw/sqlite-utils/issues/365,1009548580,IC_kwDOCGYnMM48LH0k,536941,2022-01-11T02:43:34Z,2022-01-11T02:43:34Z,CONTRIBUTOR,thanks so much! always a pleasure to see how you work through these things,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096558279,