issues: 512996469

This data as json

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	pull_request	body	repo	type	active_lock_reason	performed_via_github_app	reactions	draft	state_reason
512996469	MDU6SXNzdWU1MTI5OTY0Njk=	607	Ways to improve fuzzy search speed on larger data sets?	8431341	closed	0			6	2019-10-27T17:31:37Z	2019-11-07T03:38:10Z	2019-11-07T03:38:10Z	NONE		I have an sqlite table with 16 million rows in it. Having read @simonw article "[Fast Autocomplete Search for Your Website](https://24ways.org/2018/fast-autocomplete-search-for-your-website/)" I was curious to try datasette to see what kind of query performance I could get out of it. In truth I don't need to do full text search since all I would like to do is give my users a way to search for the names of investors such as "Warren Buffet", or "Tim Cook" (who's names are in a single column). On the first search, Datasette takes over 20 seconds to return all records associated with `elon musk`: > ![image](https://user-images.githubusercontent.com/8431341/67638889-a86e1100-f8b7-11e9-9f7e-a9d13a42e988.png) > ![image](https://user-images.githubusercontent.com/8431341/67638825-ed457800-f8b6-11e9-94d1-b44f1a40ee8c.png) If I rerun the same search, it then takes almost 9 seconds: > ![image](https://user-images.githubusercontent.com/8431341/67638908-e4a17180-f8b7-11e9-9d00-748c80ef1f21.png) That's far to slow to implement an autocomplete feature. I could reduce the latency by making a special table of only unique investor names, thereby reducing the search space to less than a million rows (then I'd need to implement a way to add only new investor names to the table as I received new data.. about 4,000 rows a day). If I did that, I'm still concerned the new table wouldn't be lean enough to lookup investor names quickly. Plus, even if I can implement the autocomplete feature, I would still finally have to lookup records for that investors which would take between 8 - 20 seconds. Are there any tricks for speeding this up? Here's my hardware: > ![image](https://user-images.githubusercontent.com/8431341/67638861-55945980-f8b7-11e9-96a8-ca76c7c68c5d.png)	107914493	issue			{"url": "https://api.github.com/repos/simonw/datasette/issues/607/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}		completed

Links from other tables

0 rows from issues_id in issues_labels
6 rows from issue in issue_comments