{"id": 1199158210, "node_id": "I_kwDOCGYnMM5HebPC", "number": 423, "title": ".extract() doesn't set foreign key when extracted columns contain NULL value", "user": {"value": 37447552, "label": "jlieth"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-04-10T20:05:30Z", "updated_at": "2022-08-27T14:45:04Z", "closed_at": "2022-08-27T14:45:04Z", "author_association": "NONE", "pull_request": null, "body": "I've run into an issue with `extract` and I don't believe this is the intended behaviour.\r\n\r\nI'm working with a database with music listening information. Currently it has one large table `listens` that contains all information. I'm trying to normalize the database by extracting relevant columns to separate tables (`artists`, `tracks`, `albums`). Not every track has an album.\r\n\r\nA simplified demonstration with just `track_title` and `album_title` columns:\r\n```ipython\r\nIn [1]: import sqlite_utils\r\n\r\nIn [2]: db = sqlite_utils.Database(memory=True)\r\n\r\nIn [3]: db[\"listens\"].insert_all([\r\n ...: {\"id\": 1, \"track_title\": \"foo\", \"album_title\": \"bar\"},\r\n ...: {\"id\": 2, \"track_title\": \"baz\", \"album_title\": None}\r\n ...: ], pk=\"id\")\r\nOut[3]: \r\n```\r\n\r\nThe track in the first row has an album; the second track doesn't. Now I extract album information into a separate table:\r\n```ipython\r\nIn [4]: db[\"listens\"].extract(columns=[\"album_title\"], table=\"albums\", fk_column=\"album_id\")\r\nOut[4]: 
\r\n\r\nIn [5]: list(db[\"albums\"].rows)\r\nOut[5]: [{'id': 1, 'album_title': 'bar'}, {'id': 2, 'album_title': None}]\r\n\r\nIn [6]: list(db[\"listens\"].rows)\r\nOut[6]: \r\n[{'id': 1, 'track_title': 'foo', 'album_id': 1},\r\n {'id': 2, 'track_title': 'baz', 'album_id': None}]\r\n```\r\n\r\nThis behaves as expected -- the `albums` table contains entries for both the existing album and the NULL album. The `listens` table has a foreign key only for the first row (since the album in the second row was empty).\r\n\r\nNow I want to extract the track information as well. Album information belongs to the track so I want to extract both columns to a new table.\r\n```ipython\r\nIn [7]: db[\"listens\"].extract(columns=[\"track_title\", \"album_id\"], table=\"tracks\", fk_column=\"track_id\")\r\nOut[7]: 
\r\n\r\nIn [8]: list(db[\"tracks\"].rows)\r\nOut[8]: \r\n[{'id': 1, 'track_title': 'foo', 'album_id': 1},\r\n {'id': 2, 'track_title': 'baz', 'album_id': None}]\r\n\r\nIn [9]: list(db[\"listens\"].rows)\r\nOut[9]: [{'id': 1, 'track_id': 1}, {'id': 2, 'track_id': None}]\r\n```\r\n\r\nExtracting to the `tracks` table worked fine (both tracks are present with correct columns). However, the `listens` table only has a foreign key to the newly created tracks for the first row, the foreign key in the second row is NULL.\r\n\r\nChanging the order of extracts doesn't help.\r\n\r\nI poked around in the source a bit and I believe [this line](https://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L1737) (essentially comparing `NULL = NULL`) is the problem, but I don't know enough about SQL to create a reliable fix myself.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/423/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1309542173, "node_id": "PR_kwDOCGYnMM47pwAb", "number": 455, "title": "in extract code, check equality with IS instead of = for nulls", "user": {"value": 536941, "label": "fgregg"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-07-19T13:40:25Z", "updated_at": "2022-08-27T14:45:03Z", "closed_at": "2022-08-27T14:45:03Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/455", "body": "sqlite \"IS\" is equivalent to SQL \"IS NOT DISTINCT FROM\"\r\n\r\ncloses #423", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": 
"{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/455/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1348169997, "node_id": "I_kwDOCGYnMM5QW3EN", "number": 467, "title": "Mechanism for ensuring a table has all the columns", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 13, "created_at": "2022-08-23T15:50:23Z", "updated_at": "2022-08-27T23:19:41Z", "closed_at": "2022-08-27T23:17:56Z", "author_association": "OWNER", "pull_request": null, "body": "Suggested by @jefftriplett on Discord: https://discord.com/channels/823971286308356157/997738192360964156/1011655389063958600", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/467/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1348294436, "node_id": "PR_kwDOCGYnMM49qP2V", "number": 468, "title": "db[table].create(..., transform=True) and create-table --transform", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 6, "created_at": "2022-08-23T17:27:58Z", "updated_at": "2022-08-27T23:17:55Z", "closed_at": "2022-08-27T23:17:55Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/468", "body": "Work in progress. 
Still needs documentation and tests (and to cover more cases of things that might have changed).\r\n\r\nRefs:\r\n- #467\r\n\r\n\r\n----\r\n:books: Documentation preview :books:: https://sqlite-utils--468.org.readthedocs.build/en/468/\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/468/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1352931464, "node_id": "I_kwDOCGYnMM5QpBiI", "number": 469, "title": "sqlite-utils rows --order option", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 1, "created_at": "2022-08-27T03:49:51Z", "updated_at": "2022-08-27T04:30:49Z", "closed_at": "2022-08-27T04:10:32Z", "author_association": "OWNER", "pull_request": null, "body": "For consistency with `search`: https://sqlite-utils.datasette.io/en/stable/cli-reference.html#search\r\n\r\n```\r\n -o, --order TEXT Order by ('column' or 'column desc')\r\n```\r\n\r\nI wanted to run `sqlite-utils rows db.db mytable --order 'rowid desc'` to see the most recently imported rows.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/469/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352932038, "node_id": "I_kwDOCGYnMM5QpBrG", "number": 470, "title": "Upgrade `--load-extension` to accept entrypoints like Datasette", "user": {"value": 9599, "label": "simonw"}, "state": "closed", 
"locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 6, "created_at": "2022-08-27T03:53:20Z", "updated_at": "2022-08-27T05:55:49Z", "closed_at": "2022-08-27T05:55:48Z", "author_association": "OWNER", "pull_request": null, "body": "Imitate:\r\n- https://github.com/simonw/datasette/pull/1789\r\n```\r\n# would load default entrypoint like before\r\ndatasette data.db --load-extension ext\r\n\r\n# loads the extensions with the \"sqlite3_foo_init\" entrpoint\r\ndatasette data.db --load-extension ext:sqlite3_foo_init\r\n\r\n# loads the extensions with the \"sqlite3_bar_init\" entrpoint\r\ndatasette data.db --load-extension ext:sqlite3_bar_init\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/470/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352932716, "node_id": "I_kwDOCGYnMM5QpB1s", "number": 471, "title": "sqlite-utils query --functions mechanism for registering extra functions", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 12, "created_at": "2022-08-27T03:57:53Z", "updated_at": "2022-09-07T03:46:26Z", "closed_at": "2022-08-27T05:10:57Z", "author_association": "OWNER", "pull_request": null, "body": "It would be really cool if you could register additional custom SQL functions for use with the `sqlite-utils query` command - something like this:\r\n\r\n```\r\nsqlite-utils data.db 'update images set domain = extract_domain(url)' --functions '\r\nfrom urllib.parse import urlparse\r\n\r\ndef extract_domain(url):\r\n return urlparse(url).netloc\r\n'\r\n```\r\nEvery function defined in that code block would 
be registered with the connection, unless the name began with an underscore.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/471/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352946135, "node_id": "I_kwDOCGYnMM5QpFHX", "number": 472, "title": "Reuse the locals/globals fix from --functions for other code accepting options", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 2, "created_at": "2022-08-27T05:12:05Z", "updated_at": "2022-08-27T05:20:12Z", "closed_at": "2022-08-27T05:20:12Z", "author_association": "OWNER", "pull_request": null, "body": "I figured out a workaround for the ugly `global x` hack here:\r\n- https://github.com/simonw/sqlite-utils/issues/471#issuecomment-1229120653", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/472/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352953535, "node_id": "PR_kwDOCGYnMM4950Az", "number": 473, "title": "Support entrypoints for `--load-extension`", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-08-27T05:53:59Z", "updated_at": "2022-08-27T05:55:52Z", "closed_at": "2022-08-27T05:55:47Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/473", "body": "Refs 
#470\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--473.org.readthedocs.build/en/473/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/473/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1353189941, "node_id": "I_kwDOCGYnMM5QqAo1", "number": 475, "title": "table.default_values introspection property", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 1, "created_at": "2022-08-27T22:33:31Z", "updated_at": "2022-08-27T22:44:46Z", "closed_at": "2022-08-27T22:43:02Z", "author_association": "OWNER", "pull_request": null, "body": "> Interesting challenge with `default_value`: I need to be able to tell if the default values passed to `.create()` differ from those in the database already.\r\n>\r\n> Introspecting that is a bit tricky:\r\n>\r\n> ```pycon\r\n> >>> import sqlite_utils\r\n> >>> db = sqlite_utils.Database(memory=True)\r\n> >>> db[\"blah\"].create({\"id\": int, \"name\": str}, not_null=(\"name\",), defaults={\"name\": \"bob\"})\r\n>
\r\n> >>> db[\"blah\"].columns\r\n> [Column(cid=0, name='id', type='INTEGER', notnull=0, default_value=None, is_pk=0), Column(cid=1, name='name', type='TEXT', notnull=1, default_value=\"'bob'\", is_pk=0)]\r\n> ```\r\n> Note how a default value of the Python string `bob` is represented in the results of `PRAGMA table_info()` as `default_value=\"'bob'\"` - it's got single quotes added to it!\r\n> \r\n> So comparing default values from introspecting the database needs me to first parse that syntax. This may require a new table introspection method.\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/468#issuecomment-1229279539_", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/475/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}