{"html_url": "https://github.com/simonw/datasette/issues/417#issuecomment-586047525", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/417", "id": 586047525, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA0NzUyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T01:03:43Z", "updated_at": "2020-02-14T01:59:02Z", "author_association": "OWNER", "body": "OK, I have a plan. I'm going to try and implement this as a core Datasette feature (no plugins) with the following design:\r\n\r\n- You can tell Datasette \"load any databases you find in this directory\" by passing the `--dir=path/to/dir` option to `datasette`. Datasette will look for files in that directory that are valid SQLite files and attach them\r\n- Every 10 seconds Datasette will re-scan those directories to see if any new files have been added\r\n- That 10s will be the default for a new `--config directory_scan_s:10` config option. You can set this to `0` to disable scanning entirely, at which point Datasette will only run the scan once on startup.\r\n\r\nTo check if a file is valid SQLite, Datasette will first check if the first few bytes of the file are `b\"SQLite format 3\\x00\"`. If they are, it will open a connection to the file and attempt to run `select * from sqlite_master` against it. 
If that runs without any errors it will assume the file is usable and connect it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421546944, "label": "Datasette Library"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/417#issuecomment-586047995", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/417", "id": 586047995, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA0Nzk5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T01:05:20Z", "updated_at": "2020-02-14T01:05:20Z", "author_association": "OWNER", "body": "I'm going to add two methods to the Datasette class to help support this work (and to enable exciting new plugin opportunities in the future):\r\n\r\n- `datasette.add_database(name, db)` - adds a new named database to the list of connected databases. `db` will be a `Database()` object, which may prove useful in the future for things like #670 and could also allow some plugins to provide in-memory SQLite databases.\r\n- `datasette.remove_database(name)`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421546944, "label": "Datasette Library"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/417#issuecomment-586065843", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/417", "id": 586065843, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA2NTg0Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T02:20:53Z", "updated_at": "2020-02-14T02:20:53Z", "author_association": "OWNER", "body": "MVP for this feature: just do it once on startup, don't scan for new files every X seconds.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, 
\"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421546944, "label": "Datasette Library"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/417#issuecomment-586066798", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/417", "id": 586066798, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA2Njc5OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T02:24:54Z", "updated_at": "2020-02-14T02:24:54Z", "author_association": "OWNER", "body": "I'm going to move this over to a draft pull request.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421546944, "label": "Datasette Library"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/576#issuecomment-586053947", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/576", "id": 586053947, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA1Mzk0Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T01:29:48Z", "updated_at": "2020-05-30T13:22:09Z", "author_association": "OWNER", "body": "OK, I've made a start on this now in 3ffb8f3b98252531d11897fd431711e9b8045ace - still plenty more methods to document. 
More importantly that class has a LOT of junk methods on that no-one should ever call from a plugin, so I need to decide what to do about those.\r\n\r\nhttps://datasette.readthedocs.io/en/latest/internals.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 497170355, "label": "Documented internals API for use in plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/671#issuecomment-586054154", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/671", "id": 586054154, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA1NDE1NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T01:30:35Z", "updated_at": "2020-02-14T01:30:35Z", "author_association": "OWNER", "body": "Documented here: https://datasette.readthedocs.io/en/latest/datasette.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565041624, "label": "datasette.add_database(name, db) and datasette.remove_database(name) methods"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586442292", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586442292, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0MjI5Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:36:37Z", "updated_at": "2020-02-14T19:36:37Z", "author_association": "OWNER", "body": "This can be a function in `utils/__init__.py`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/673#issuecomment-586442978", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586442978, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0Mjk3OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:38:19Z", "updated_at": "2020-02-14T19:38:19Z", "author_association": "OWNER", "body": "Amazingly, I get 0 search results on Google for `RidList_VirtualReaderModule`! I guess no-one has reverse engineered the Apple Photos SQLite database at that level yet.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586443837", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586443837, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0MzgzNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:40:42Z", "updated_at": "2020-02-14T19:41:56Z", "author_association": "OWNER", "body": "Here's how to test if the `rtree` virtual table is supported:\r\n```\r\n>>> import sqlite3\r\n>>> c = sqlite3.connect(\":memory:\")\r\n>>> c.execute(\"create virtual table blah using rtree (a, b, c)\")\r\n\r\n>>> c.execute(\"create virtual table blah2 using rtree2 (a, b, c)\")\r\nTraceback (most recent call last):\r\n File \"\", line 1, in \r\nsqlite3.OperationalError: table blah already exists\r\n```\r\nAlso:\r\n```\r\n>>> c.execute('''CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()''')\r\nTraceback (most recent call last):\r\n File \"\", line 1, in \r\nsqlite3.OperationalError: no such module: VirtualSpatialIndex\r\n>>> c.enable_load_extension(\r\n... 
True)\r\n>>> \r\n>>> c.load_extension(\"/usr/local/lib/mod_spatialite.dylib\")\r\n>>> c.execute('''CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()''')\r\n\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586444835", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586444835, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0NDgzNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:43:27Z", "updated_at": "2020-02-14T19:43:27Z", "author_association": "OWNER", "body": "I can extend this function (maybe also rename it):\r\n\r\nhttps://github.com/simonw/datasette/blob/52ba34701cdbf510236de87d35b0e6df330626d1/datasette/utils/__init__.py#L595-L610", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586444970", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586444970, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0NDk3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:43:46Z", "updated_at": "2020-02-14T19:43:46Z", "author_association": "OWNER", "body": "`is_openable_sqlite` perhaps?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586445210", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586445210, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0NTIxMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:44:27Z", "updated_at": "2020-02-14T19:44:27Z", "author_association": "OWNER", "body": "For the unit tests I think I'm going to have to create minimal binary SQLite file examples and include them in the repo.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586448292", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586448292, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0ODI5Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:53:05Z", "updated_at": "2020-02-14T19:53:05Z", "author_association": "OWNER", "body": "I may be re-inventing this code at the moment:\r\nhttps://github.com/simonw/datasette/blob/3ffb8f3b98252531d11897fd431711e9b8045ace/datasette/app.py#L219-L237", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586449286", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586449286, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0OTI4Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:56:00Z", "updated_at": 
"2020-02-14T19:57:17Z", "author_association": "OWNER", "body": "I tried to make the smallest SpatiaLite database file I could (to use for the tests), but it ended up over 5MB!\r\n```\r\n$ echo '{\"type\":\"Feature\",\"properties\":{\"name\":\"Hearst Castle\"},\"geometry\":{\"type\":\"Point\",\"coordinates\":[-121.1686,35.685]}}' | geojson-to-sqlite /tmp/hearst.db places - --spatialite\r\n$ ls -lah /tmp/hearst.db \r\n-rw-r--r-- 1 simonw wheel 5.3M Feb 14 11:54 /tmp/hearst.db\r\n```\r\nI imagine that's because of these tables:\r\n\"tiny\"\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586450571", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586450571, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ1MDU3MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:59:41Z", "updated_at": "2020-02-14T20:01:14Z", "author_association": "OWNER", "body": "This helped:\r\n```\r\n$ sqlite3 /tmp/hearst.db \r\nSQLite version 3.24.0 2018-06-04 14:10:15\r\nEnter \".help\" for usage hints.\r\nsqlite> delete from spatial_ref_sys where srid != 4326;\r\nsqlitte> delete from spatial_ref_sys_aux where srid != 4326;\r\nsqlite> vacuum;\r\nsqlite> ^D\r\n$ ls -lah /tmp/hearst.db \r\n-rw-r--r-- 1 simonw wheel 216K Feb 14 12:01 /tmp/hearst.db\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586454371", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/673", "id": 586454371, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ1NDM3MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T20:11:02Z", "updated_at": "2020-02-14T20:11:02Z", "author_association": "OWNER", "body": "The technique from `run_sanity_checks` of running `PRAGMA table_info({})` for every table seems to work just fine. It failed for the Apple Photos database for example:\r\n```\r\nsqlite> pragma table_info(RKSceneInVersion_VirtualBufferReader);\r\nError: no such module: VirtualBufferReaderModule\r\n```\r\nSo I think the solution to this ticket is going to be moving that logic into a new utility function.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/673#issuecomment-586455321", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/673", "id": 586455321, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ1NTMyMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T20:13:59Z", "updated_at": "2020-02-14T20:13:59Z", "author_association": "OWNER", "body": "Closing this in favour of rethinking how sanity checks work.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565518772, "label": "Mechanism for checking if a SQLite database file is safe to open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586067794", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586067794, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA2Nzc5NA==", "user": {"value": 9599, "label": "simonw"}, 
"created_at": "2020-02-14T02:29:16Z", "updated_at": "2020-02-14T02:29:16Z", "author_association": "OWNER", "body": "One design issue: how to pick neat unique names for database files in a file hierarchy?\r\n\r\nHere's what I have so far:\r\n\r\nhttps://github.com/simonw/datasette/blob/fe6f9e6a7397cab2e4bc57745a8da9d824dad218/datasette/app.py#L231-L237\r\n\r\nFor these files:\r\n```\r\n../travel-old.db\r\n../sf-tree-history/trees.db\r\n../library-of-congress/records-from-df.db\r\n```\r\nIt made these names:\r\n```\r\ntravel-old\r\nsf-tree-history_trees\r\nlibrary-of-congress_records-from-df\r\n```\r\nMaybe this is good enough? Needs some tests.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586068095", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586068095, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA2ODA5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T02:30:37Z", "updated_at": "2020-02-14T02:30:46Z", "author_association": "OWNER", "body": "This can take a LONG time to run, and at the moment it's blocking and prevents Datasette from starting up.\r\n\r\nIt would be much better if this ran in a thread, or an asyncio task. 
Probably have to be a thread because there's no easy `async` version of `pathlib.Path.glob()` that I've seen.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586069529", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586069529, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjA2OTUyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T02:37:17Z", "updated_at": "2020-02-14T02:37:17Z", "author_association": "OWNER", "body": "Another problem: if any of the found databases use SpatiaLite then Datasette will fail to start at all.\r\n\r\nIt should skip them instead.\r\n\r\nThe `select * from sqlite_master` check apparently isn't quite enough to catch this case.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586107989", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586107989, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjEwNzk4OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T05:45:12Z", "updated_at": "2020-02-14T05:45:12Z", "author_association": "OWNER", "body": "I tried running the `scan_dirs()` method in a thread and got an interesting error while trying to load the homepage: `RuntimeError: OrderedDict mutated during iteration`\r\n\r\nMakes sense - I had a thread that added an item to that dictionary right while the homepage was attempting to run this 
code:\r\n\r\nhttps://github.com/simonw/datasette/blob/efa54b439fd0394440c302602b919255047b59c5/datasette/views/index.py#L24-L27\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586109032", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586109032, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjEwOTAzMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T05:50:15Z", "updated_at": "2020-02-14T05:50:15Z", "author_association": "OWNER", "body": "So I need to ensure the `ds.databases` data structure is manipulated in a thread-safe manner.\r\n\r\nMainly I need to ensure that it is locked during iterations over it, then unlocked at the end.\r\n\r\nTrickiest part is probably ensuring there is a test that proves this is working - I feel like I got lucky encountering that `RuntimeError` as early as I did.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586109238", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586109238, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjEwOTIzOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T05:51:12Z", "updated_at": "2020-02-14T05:51:12Z", "author_association": "OWNER", "body": "... or maybe I can cheat and wrap the access to `self.ds.databases.items()` in `list()`, so I'm iterating over an atomically-created list of those things instead? 
I'll try that first.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586109784", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586109784, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjEwOTc4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T05:53:50Z", "updated_at": "2020-02-14T05:54:21Z", "author_association": "OWNER", "body": "... cheating like this seems to work:\r\n```\r\nfor name, db in list(self.ds.databases.items()):\r\n```\r\nPython built-in operations are supposedly threadsafe, so in this case I can grab a copy of the list atomically (I think) and then safely iterate over it.\r\n\r\nSeems to work in my testing. Wish I could prove it with a unit test though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586111102", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586111102, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjExMTEwMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T05:59:24Z", "updated_at": "2020-02-14T06:00:36Z", "author_association": "OWNER", "body": "Interesting new problem: hitting Ctrl+C no longer terminates the program if the `scan_dirs()` thread is still running.\r\n\r\nhttps://stackoverflow.com/questions/49992329/the-workers-in-threadpoolexecutor-is-not-really-daemon has clues. 
The workers are only meant to exit when their worker queues are empty.\r\n\r\nBut... I want to run the worker every 10 seconds. How do I do that without having it loop forever and hence never quit?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586111619", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586111619, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjExMTYxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T06:01:24Z", "updated_at": "2020-02-14T06:01:24Z", "author_association": "OWNER", "body": "https://gist.github.com/clchiou/f2608cbe54403edb0b13 might work.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586112662", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586112662, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjExMjY2Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T06:05:27Z", "updated_at": "2020-02-14T06:05:27Z", "author_association": "OWNER", "body": "I think the fix is to use an old-fashioned `threading` module daemon thread directly. 
That should exit cleanly when the program exits.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/672#issuecomment-586441484", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/672", "id": 586441484, "node_id": "MDEyOklzc3VlQ29tbWVudDU4NjQ0MTQ4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-02-14T19:34:25Z", "updated_at": "2020-02-14T19:34:25Z", "author_association": "OWNER", "body": "I've figured out how to tell if a database is safe to open or not:\r\n```sql\r\nselect sql from sqlite_master where sql like 'CREATE VIRTUAL TABLE%';\r\n```\r\nThis returns the SQL definitions for virtual tables. The bit after `using` tells you what they need.\r\n\r\nRun this against a SpatiaLite database and you get the following:\r\n```sql\r\nCREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()\r\nCREATE VIRTUAL TABLE ElementaryGeometries USING VirtualElementary()\r\n```\r\nRun it against an Apple Photos `photos.db` file (found with `find ~/Library | grep photos.db`) and you get this (partial list):\r\n```sql\r\nCREATE VIRTUAL TABLE RidList_VirtualReader using RidList_VirtualReaderModule\r\nCREATE VIRTUAL TABLE Array_VirtualReader using Array_VirtualReaderModule\r\nCREATE VIRTUAL TABLE LiGlobals_VirtualBufferReader using VirtualBufferReaderModule\r\nCREATE VIRTUAL TABLE RKPlace_RTree using rtree (modelId,minLongitude,maxLongitude,minLatitude,maxLatitude)\r\n```\r\nFor a database with FTS4 you get:\r\n```sql\r\nCREATE VIRTUAL TABLE \"docs_fts\" USING FTS4 (\r\n [title], [content], content=\"docs\"\r\n)\r\n```\r\nFTS5:\r\n```sql\r\nCREATE VIRTUAL TABLE [FARA_All_Registrants_fts] USING FTS5 (\r\n [Name], [Address_1], [Address_2],\r\n 
 content=[FARA_All_Registrants]\r\n )\r\n```\r\nSo I can use this to figure out all of the `using` pieces and then compare them to a list of known supported ones.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 565064079, "label": "--dirs option for scanning directories for SQLite databases"}, "performed_via_github_app": null}