{"html_url": "https://github.com/simonw/datasette/issues/607#issuecomment-546722281", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/607", "id": 546722281, "node_id": "MDEyOklzc3VlQ29tbWVudDU0NjcyMjI4MQ==", "user": {"value": 8431341, "label": "zeluspudding"}, "created_at": "2019-10-27T18:46:29Z", "updated_at": "2019-10-27T19:00:40Z", "author_association": "NONE", "body": "Update: I've created a table of only unique names. This reduces the search space from over 16 million rows to just about 640,000. Interestingly, it takes less than 2 seconds to create this table using Python. Performing the same search that we did earlier for `elon musk` takes nearly a second - much faster than before, but still not fast enough for an autocomplete feature (which usually needs to return results within 100 ms to feel \"real time\"). \r\n\r\nAny ideas for cutting the search time nearly 10-fold?\r\n\r\n> ![image](https://user-images.githubusercontent.com/8431341/67639587-b6c02b00-f8bf-11e9-9344-1d8667cad395.png)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 512996469, "label": "Ways to improve fuzzy search speed on larger data sets?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/607#issuecomment-546723302", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/607", "id": 546723302, "node_id": "MDEyOklzc3VlQ29tbWVudDU0NjcyMzMwMg==", "user": {"value": 8431341, "label": "zeluspudding"}, "created_at": "2019-10-27T18:59:55Z", "updated_at": "2019-10-27T19:00:48Z", "author_association": "NONE", "body": "Ultimately, I need to serve searches like this to multiple users (at times concurrently). Given the size of the database I'm working with, can anyone comment on whether I should store this in something like MySQL or Postgres rather than SQLite? 
I know there's been much [defense of SQLite's performance](https://www.sqlite.org/whentouse.html), but I wonder if those arguments break down as the database size increases.\r\n\r\nFor example, if I scroll to the bottom of that linked page, where it says **Checklist For Choosing The Right Database Engine**, here's how I answer those questions:\r\n\r\n- Is the data separated from the application by a network? \u2192 choose client/server\r\n __Yes__\r\n- Many concurrent writers? \u2192 choose client/server\r\n __Not exactly. I may have many concurrent readers but almost no concurrent writers.__\r\n- Big data? \u2192 choose client/server\r\n __No, my database is less than 40 GB and won't approach a terabyte in the next decade.__\r\n\r\nSo is SQLite still a good idea here?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 512996469, "label": "Ways to improve fuzzy search speed on larger data sets?"}, "performed_via_github_app": null}