{"id": 1044267332, "node_id": "I_kwDOCGYnMM4-PkFE", "number": 336, "title": "sqlite-util tranform --column-order mangles columns of type \"timestamp\"", "user": {"value": 536941, "label": "fgregg"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-11-04T01:15:38Z", "updated_at": "2023-05-08T21:13:38Z", "closed_at": "2023-05-08T21:13:38Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Reproducible code below:\r\n\r\n```bash\r\n> echo 'create table bar (baz text, created_at timestamp default CURRENT_TIMESTAMP)' | sqlite3 foo.db\r\n> sqlite3 foo.db\r\nSQLite version 3.36.0 2021-06-18 18:36:39\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE bar (baz text, created_at timestamp default CURRENT_TIMESTAMP);\r\nsqlite> .exit\r\n> sqlite-utils transform foo.db bar --column-order baz\r\nsqlite3 foo.db\r\nSQLite version 3.36.0 2021-06-18 18:36:39\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE IF NOT EXISTS \"bar\" (\r\n [baz] TEXT,\r\n [created_at] FLOAT DEFAULT 'CURRENT_TIMESTAMP'\r\n);\r\nsqlite> .exit\r\n> sqlite-utils transform foo.db bar --column-order baz\r\n> sqlite3 foo.db\r\nSQLite version 3.36.0 2021-06-18 18:36:39\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE IF NOT EXISTS \"bar\" (\r\n [baz] TEXT,\r\n [created_at] FLOAT DEFAULT '''CURRENT_TIMESTAMP'''\r\n);\r\n```\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/336/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1279144769, "node_id": "I_kwDOCGYnMM5MPjNB", "number": 448, "title": "Reading rows from a file => AttributeError: 
'_io.StringIO' object has no attribute 'readinto'", "user": {"value": 236907, "label": "mungewell"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-06-21T21:48:27Z", "updated_at": "2023-05-08T22:01:00Z", "closed_at": "2023-05-08T22:01:00Z", "author_association": "NONE", "pull_request": null, "body": "Attempting to run the example given here (without extra bracket ;-):\r\nhttps://sqlite-utils.datasette.io/en/stable/python-api.html#reading-rows-from-a-file\r\n```\r\nfrom sqlite_utils.utils import rows_from_file\r\nimport io\r\n\r\nrows, format = rows_from_file(io.StringIO(\"id,name\\n1,Cleo\"))\r\nprint(list(rows), format)\r\n# Outputs [{'id': '1', 'name': 'Cleo'}] Format.CSV\r\n```\r\n\r\nGives error\r\n```\r\n>\"c:\\Program Files\\Python37\\python.exe\" test2.py\r\nTraceback (most recent call last):\r\n File \"test2.py\", line 4, in <module>\r\n rows, format = rows_from_file(io.StringIO(\"id,name\\n1,Cleo\"))\r\n File \"C:\\Users\\swood\\Downloads\\sqlite-utils-main-20220621\\sqlite-utils-main\\sqlite_utils\\utils.py\", line 300, in rows_from_file\r\n first_bytes = buffered.peek(2048).strip()\r\nAttributeError: '_io.StringIO' object has no attribute 'readinto'\r\n```\r\n\r\nI am running Python on Windows.\r\n```\r\n>\"c:\\Program Files\\Python37\\python.exe\"\r\nPython 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32\r\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/448/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1432377191, "node_id": "I_kwDOCGYnMM5VYFdn", "number": 509, 
"title": "`sqlite-utils transform` breaks DEFAULT string values and STRFTIME()", "user": {"value": 2199875, "label": "kennysong"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-11-02T02:32:23Z", "updated_at": "2023-05-08T21:13:38Z", "closed_at": "2023-05-08T21:13:38Z", "author_association": "NONE", "pull_request": null, "body": "Very nice library! Our team found sqlite-utils through @simonw's [comment on the \"Simple declarative schema migration for SQLite\" article](https://news.ycombinator.com/item?id=31249823), and we were excited to use it, but unfortunately `sqlite-utils transform` seems to break our DB. \r\n\r\nRunning `sqlite-utils transform` to modify a column mangles their DEFAULT values:\r\n\r\n- Default string values are wrapped in extra single quotes\r\n- Function expressions such as [`STRFTIME()`](https://www.sqlite.org/lang_datefunc.html) are turned into strings!\r\n\r\n------\r\n\r\nHere are steps to reproduce:\r\n\r\n**Original database**\r\n\r\n```\r\n$ sqlite3 test.db << EOF\r\nCREATE TABLE mytable (\r\n col1 TEXT DEFAULT 'foo',\r\n col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))\r\n)\r\nEOF\r\n\r\n$ sqlite3 test.db \"SELECT sql FROM sqlite_master WHERE name = 'mytable';\"\r\nCREATE TABLE mytable (\r\n col1 TEXT DEFAULT 'foo',\r\n col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))\r\n)\r\n```\r\n\r\n**Modified database after sqlite-utils**\r\n\r\n```\r\n$ sqlite3 test.db \"INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;\"\r\nfoo|2022-11-02 02:26:58.038\r\n\r\n$ sqlite-utils transform test.db mytable --rename col1 renamedcol1\r\n\r\n$ sqlite3 test.db \"SELECT sql FROM sqlite_master WHERE name = 'mytable';\"\r\nCREATE TABLE \"mytable\" (\r\n [renamedcol1] TEXT DEFAULT '''foo''',\r\n [col2] TEXT DEFAULT 'STRFTIME(''%Y-%m-%d %H:%M:%f'', ''NOW'')'\r\n)\r\n\r\n$ sqlite3 test.db \"INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;\"\r\nfoo|2022-11-02 
02:26:58.038\r\n'foo'|STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')\r\n```\r\n\r\n(Related: #336)", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/509/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1465194249, "node_id": "I_kwDOCGYnMM5XVRcJ", "number": 514, "title": "upsert of new row with check constraints fails", "user": {"value": 193185, "label": "cldellow"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-11-26T16:12:23Z", "updated_at": "2023-05-08T21:50:52Z", "closed_at": "2023-05-08T21:50:51Z", "author_association": "NONE", "pull_request": null, "body": "(I originally opened this in https://github.com/simonw/datasette-insert/issues/20, but I see that that library depends on sqlite-utils)\r\n\r\nIn the case of a new row, upsert first adds the row, specifying only its pkeys: https://github.com/simonw/sqlite-utils/blob/965ca0d5f5bffe06cc02cd7741344d1ddddf9d56/sqlite_utils/db.py#L2783-L2787\r\n\r\nThis means that a table with NOT NULL (or other constraint) columns that aren't part of the pkey can't have new rows upserted.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/514/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1465194930, "node_id": "PR_kwDOCGYnMM5DvZxa", "number": 515, "title": "upsert new rows with constraints, fixes #514", "user": {"value": 193185, "label": "cldellow"}, "state": 
"closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-11-26T16:15:21Z", "updated_at": "2023-05-08T21:27:11Z", "closed_at": "2023-05-08T21:27:10Z", "author_association": "NONE", "pull_request": "simonw/sqlite-utils/pulls/515", "body": "This fixes #514 by making the initial insert for upserts include all columns, so that new rows can be added to tables with non-pkey columns that have constraints.\r\n\r\n(aside: I'm not a python programmer. `pip`? `pipenv`? `venv`? These are mystical incantations to me. The process to set up this repo for local development and testing was _so easy_. Thank you for the excellent contributing documentation!)\r\n\r\n\r\n----\r\n:books: Documentation preview :books:: https://sqlite-utils--515.org.readthedocs.build/en/515/\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/515/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1505568103, "node_id": "PR_kwDOCGYnMM5F609a", "number": 519, "title": "Fixes breaking DEFAULT values", "user": {"value": 13819005, "label": "rhoboro"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-12-21T01:27:52Z", "updated_at": "2023-05-08T21:13:37Z", "closed_at": "2023-05-08T21:13:37Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/519", "body": "Fixes #509, Fixes #336\r\n\r\nThanks for the great library!\r\nI fixed a bug that `sqlite-utils transform` breaks DEFAULT values.\r\nAll tests already present passed with no changes, and I added some tests for this PR.\r\n\r\nIn #509 case, fixed here.\r\n\r\n```shell\r\n$ sqlite3 test.db << EOF\r\nCREATE TABLE mytable (\r\n col1 TEXT 
DEFAULT 'foo',\r\n col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))\r\n)\r\nEOF\r\n\r\n$ sqlite3 test.db \"SELECT sql FROM sqlite_master WHERE name = 'mytable';\"\r\nCREATE TABLE mytable (\r\n col1 TEXT DEFAULT 'foo',\r\n col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))\r\n)\r\n\r\n$ sqlite3 test.db \"INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;\"\r\nfoo|2022-12-21 01:15:39.669\r\n\r\n$ sqlite-utils transform test.db mytable --rename col1 renamedcol1\r\n$ sqlite3 test.db \"SELECT sql FROM sqlite_master WHERE name = 'mytable';\"\r\nCREATE TABLE \"mytable\" (\r\n [renamedcol1] TEXT DEFAULT 'foo',\r\n [col2] TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')) # \u2190 Non-String Value\r\n)\r\n\r\n$ sqlite3 test.db \"INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;\"\r\nfoo|2022-12-21 01:15:39.669\r\nfoo|2022-12-21 01:15:56.432\r\n```\r\n\r\nAnd #336 case also fixed.\r\nSpecial values are described [here](https://www.sqlite.org/lang_createtable.html).\r\n\r\n> 3.2. The DEFAULT clause\r\n> ... 
A default value may also be one of the special case-independent keywords CURRENT_TIME, CURRENT_DATE or CURRENT_TIMESTAMP.\r\n\r\n```shell\r\n$ echo 'create table bar (baz text, created_at timestamp default CURRENT_TIMESTAMP)' | sqlite3 foo.db\r\n$ sqlite3 foo.db\r\nSQLite version 3.39.5 2022-10-14 20:58:05\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE bar (baz text, created_at timestamp default CURRENT_TIMESTAMP);\r\nsqlite> .exit\r\n\r\n$ sqlite-utils transform foo.db bar --column-order baz\r\n$ sqlite3 foo.db\r\nSQLite version 3.39.5 2022-10-14 20:58:05\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE IF NOT EXISTS \"bar\" (\r\n [baz] TEXT,\r\n [created_at] FLOAT DEFAULT CURRENT_TIMESTAMP\r\n);\r\nsqlite> .exit\r\n\r\n$ sqlite-utils transform foo.db bar --column-order baz\r\n$ sqlite3 foo.db\r\nSQLite version 3.39.5 2022-10-14 20:58:05\r\nEnter \".help\" for usage hints.\r\nsqlite> .schema bar\r\nCREATE TABLE IF NOT EXISTS \"bar\" (\r\n [baz] TEXT,\r\n [created_at] FLOAT DEFAULT CURRENT_TIMESTAMP # \u2190 Non-String Value\r\n);\r\n```\r\n\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--519.org.readthedocs.build/en/519/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/519/reactions\", \"total_count\": 3, \"+1\": 3, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1516644980, "node_id": "I_kwDOCGYnMM5aZip0", "number": 520, "title": "rows_from_file() raises confusing error if file-like object is not in binary mode", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-01-02T19:00:14Z", "updated_at": "2023-05-08T22:08:07Z", 
"closed_at": "2023-05-08T22:08:07Z", "author_association": "OWNER", "pull_request": null, "body": "I got this error:\r\n\r\n```\r\n File \"/Users/simon/Dropbox/Development/openai-to-sqlite/openai_to_sqlite/cli.py\", line 27, in embeddings\r\n rows, _ = rows_from_file(input)\r\n ^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/simon/.local/share/virtualenvs/openai-to-sqlite-jt4obeb2/lib/python3.11/site-packages/sqlite_utils/utils.py\", line 305, in rows_from_file\r\n first_bytes = buffered.peek(2048).strip()\r\n ^^^^^^^^^^^^^^^^^^^\r\n```\r\nFrom this code:\r\n```python\r\n\r\n@cli.command()\r\n@click.argument(\r\n \"db_path\",\r\n type=click.Path(file_okay=True, dir_okay=False, allow_dash=False),\r\n)\r\n@click.option(\r\n \"-i\",\r\n \"--input\",\r\n type=click.File(\"r\"),\r\n default=\"-\",\r\n)\r\ndef embeddings(db_path, input):\r\n \"Store embeddings for one or more text documents\"\r\n click.echo(\"Here is some output\")\r\n db = sqlite_utils.Database(db_path)\r\n rows, _ = rows_from_file(input)\r\n print(list(rows))\r\n```\r\nThe error went away when I changed it to `type=click.File(\"rb\")`.\r\n\r\nThis should either be called out in the documentation or `rows_from_file()` should be fixed to handle text-mode files in addition to binary files.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/520/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1575131737, "node_id": "I_kwDOCGYnMM5d4ppZ", "number": 525, "title": "Repeated calls to `Table.convert()` fail", "user": {"value": 167893, "label": "mcarpenter"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2023-02-07T22:40:47Z", "updated_at": 
"2023-05-08T21:59:41Z", "closed_at": "2023-05-08T21:54:02Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "## Summary\r\nWhen using the API, repeated calls to `Table.convert()` do not work correctly since all conversions quietly use the callable (function, lambda) from the first call to `convert()` only. Subsequent invocations with different callables use the callable from the first invocation only.\r\n\r\n## Example\r\n```python\r\nfrom sqlite_utils import Database\r\n\r\ndb = Database(memory=True)\r\ntable = db['table']\r\ncol = 'x'\r\ntable.insert_all([{col: 1}])\r\nprint(table.get(1))\r\n\r\ntable.convert(col, lambda x: x*2)\r\nprint(table.get(1))\r\n\r\ndef zeroize(x):\r\n return 0\r\n#zeroize = lambda x: 0\r\n#zeroize.__name__ = 'zeroize'\r\ntable.convert(col, zeroize)\r\nprint(table.get(1))\r\n```\r\n\r\nOutput:\r\n```\r\n{'x': 1}\r\n{'x': 2}\r\n{'x': 4}\r\n```\r\nExpected:\r\n```\r\n{'x': 1}\r\n{'x': 2}\r\n{'x': 0}\r\n```\r\n\r\n## Explanation\r\nThis is some relevant [documentation](https://github.com/simonw/sqlite-utils/blob/1491b66dd7439dd87cd5cd4c4684f46eb3c5751b/docs/python-api.rst#registering-custom-sql-functions:~:text=By%20default%20registering%20a%20function%20with%20the%20same%20name%20and%20number%20of%20arguments%20will%20have%20no%20effect).\r\n\r\n * `Table.convert()` takes a `Callable` to perform data conversion on a column\r\n * The `Callable` is passed to `Database.register_function()`\r\n * `Database.register_function()` uses the callable's `__name__` attribute for registration\r\n * (Aside: all lambdas have a `__name__` of ``: I thought this was the problem, and it was close, but not quite)\r\n * However `convert()` first wraps the callable by local function [`convert_value()`](https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L2661)\r\n * Consequently `register_function()` sees name `convert_value` for all invocations from `convert()`\r\n * `register_function()` 
silently ignores registrations using the same name, retaining only the first such registration\r\n\r\nThere's a mismatch between the comments and the code: https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L404\r\n\r\nbut actually the existing function is returned/used instead (as the \"registering custom sql functions\" doc I linked above says too). Seems like this can be rectified to match the comment?\r\n\r\n## Suggested fix\r\nI think there are four things:\r\n1. The call to `register_function()` from `convert()` should have an explicit `name=` parameter (to continue using `convert_value()` and the progress bar).\r\n2. For functions, this name can be the real function name. (I understand the SQLite API needs a name, and it's nice if those are recognizable names where possible). For lambdas would `'lambda-{uuid}'` or similar be acceptable? \r\n3. `register_function()` really should throw an error on repeated attempts to register a duplicate (function, arity)-pair.\r\n4. A test? 
I haven't looked at the test framework here but seems this should be testable.\r\n\r\n## See also \r\n- #458 ", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/525/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1576990618, "node_id": "PR_kwDOCGYnMM5JkkED", "number": 526, "title": "Fix repeated calls to `Table.convert()`", "user": {"value": 167893, "label": "mcarpenter"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-02-09T00:14:49Z", "updated_at": "2023-05-08T21:56:05Z", "closed_at": "2023-05-08T21:53:58Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/526", "body": "Fixes #525. All tests pass.\r\n\r\nThere's perhaps a better way to name lambdas? There could be a collision if a caller passes a function with name like `lambda_123456`.\r\n\r\nSQLite [documentation](https://www.sqlite.org/appfunc.html) is a little, ah, lite on function name specs. If there is a character that can be used in place of underscore in a SQLite function name that is not permitted in a Python function identifier then that could be a good way to prevent accidental collisions. 
(I tried dash, colon, dot, no joy).\r\n\r\nOtherwise, there is little chance of this happening and if it should happen the risk is mitigated by now throwing an exception in the case of a (name, arity) collision without `replace=True`.\r\n\r\n\r\n----\r\n:books: Documentation preview :books:: https://sqlite-utils--526.org.readthedocs.build/en/526/\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/526/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1578790070, "node_id": "I_kwDOCGYnMM5eGmy2", "number": 527, "title": "`Table.convert()` skips falsey values", "user": {"value": 167893, "label": "mcarpenter"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2023-02-10T00:00:52Z", "updated_at": "2023-05-09T21:15:05Z", "closed_at": "2023-05-08T21:03:24Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "# Summary\r\n\r\nBy design, `Table.convert()` does [not attempt](https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L2663) conversion of falsey values (`None`, `\"\"`, `0`, ...). This is surprising (directly contradicts the docstring) and `convert()` may quietly skip cells where the user assumed a conversion would take place. 
\r\n\r\n# Example\r\nIncrement a column of integers by one\r\n\r\n``` python\r\nfrom sqlite_utils import Database\r\n\r\ndb = Database(memory=True)\r\ntable = db['table']\r\ncol = 'x'\r\ntable.insert_all([{col: 0}, {col:1}])\r\nprint(table.get(1)) # 0\r\nprint(table.get(2)) # 1\r\nprint()\r\n\r\ntable.convert(col, lambda x: x+1)\r\nprint(table.get(1)) # got 0, expected 1 \u26a0\u26a0\u26a0\r\nprint(table.get(2)) # got 2, expected 2\r\n```\r\n\r\nAnother example might be, say, transforming cells containing empty string to `NULL`.\r\n\r\n# Discussion\r\n\r\nThis was, I think, a pragmatic choice so that consumers can skip writing guard clauses for these falsey values (particularly from the CLI). But this surprising undocumented behavior can lead to incorrect data. I don't think this is a good trade-off between convenience and correctness.\r\n\r\nIn the absence of this convenience users will either have to write guard clauses into their conversion expressions (or adapt the called function to do the same), so: \r\n``` python\r\n fn(value) if value else value\r\n```\r\ninstead of:\r\n``` python\r\n fn(value)\r\n```\r\nThis is more typing and sometimes I will forget, and there will be errors. (But they will be noisy errors, which is a good thing).\r\n\r\nSuch a change will certainly inconvenience some existing consumers; there will be some breakage. 
But I think this is worth it to avoid quietly not converting some values by default, which can lead to quietly bad data.\r\n\r\nI have a PR that I will attach, please take a look and see what you think.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/527/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1578793661, "node_id": "PR_kwDOCGYnMM5Jqn1u", "number": 528, "title": "Enable `Table.convert()` on falsey values", "user": {"value": 167893, "label": "mcarpenter"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-02-10T00:04:09Z", "updated_at": "2023-05-08T21:08:23Z", "closed_at": "2023-05-08T21:08:23Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/528", "body": "Fixes #527\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--528.org.readthedocs.build/en/528/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/528/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1620254998, "node_id": "I_kwDOCGYnMM5gkyEW", "number": 532, "title": "Show more information when JSON can't be imported with sqlite-utils insert", "user": {"value": 83080728, "label": "voltagex"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-03-12T06:41:44Z", "updated_at": "2023-05-08T20:32:16Z", "closed_at": 
"2023-05-08T20:32:02Z", "author_association": "NONE", "pull_request": null, "body": "I am currently trying to import the [JSON export of my data from Discord](https://support.discord.com/hc/en-us/articles/360004027692-Requesting-a-Copy-of-your-Data), specifically `activity/reporting/events-*.json`\r\n\r\n```\r\nsqlite-utils.exe insert test.db reporting events-2023-00000-of-00001.json\r\n [###################################-] 99% 00:00:00\r\nError: Invalid JSON - use --csv for CSV or --tsv for TSV files\r\n```\r\n\r\nPlease show more information as to *why* this is invalid, if possible.\r\n\r\nI am using version 3.30 with Python 3.10 on Windows 11.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/532/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1622640374, "node_id": "I_kwDOCGYnMM5gt4b2", "number": 534, "title": " ResourceWarning: unclosed file", "user": {"value": 1244826, "label": "djhenderson"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-03-14T03:02:18Z", "updated_at": "2023-05-08T19:56:29Z", "closed_at": "2023-05-08T19:56:29Z", "author_association": "NONE", "pull_request": null, "body": "Issuing either\r\n\r\n```\r\npy -Wdefault -m sqlite_utils insert dogs.db dogs dogs0.csv --csv\r\n [#############-----------------------] 36%\r\n [####################################] 100%C:\\Users\\Doug\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\sqlite_utils\\cli.py:1187: ResourceWarning: unclosed file <_io.TextIOWrapper name='dogs0.csv' encoding='utf-8-sig'>\r\n insert_upsert_implementation(\r\nResourceWarning: Enable tracemalloc to get the object allocation 
traceback\r\n```\r\nor\r\n```\r\nset pythonwarnings=default\r\nsqlite-utils insert dogs.db dogs dogs0.csv --csv\r\n [#############-----------------------] 36%\r\n [####################################] 100%C:\\Users\\Doug\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\sqlite_utils\\cli.py:1187: ResourceWarning: unclosed file <_io.TextIOWrapper name='dogs0.csv' encoding='utf-8-sig'>\r\n insert_upsert_implementation(\r\nResourceWarning: Enable tracemalloc to get the object allocation traceback\r\n```\r\n\r\nexhibits a ResourceWarning indicating that the CSV file being loaded is not closed.\r\n\r\nsqlite-utils --version\r\nsqlite-utils, version 3.30\r\npy --version\r\nPython 3.11.2\r\nWindows Version 10.0.19045 Build 19045\r\nSQLite version 3.41.0 2023-02-21 18:09:37\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/534/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1665200812, "node_id": "PR_kwDOCGYnMM5OKveS", "number": 537, "title": "Support self-referencing FKs in `Table.create`", "user": {"value": 544011, "label": "numist"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-04-12T20:26:59Z", "updated_at": "2023-05-08T22:45:33Z", "closed_at": "2023-05-08T21:10:01Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/537", "body": "\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--537.org.readthedocs.build/en/537/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": 
\"https://api.github.com/repos/simonw/sqlite-utils/issues/537/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1695428235, "node_id": "I_kwDOCGYnMM5lDi6L", "number": 538, "title": "`table.upsert_all` fails to write rows when `not_null` is present", "user": {"value": 1231935, "label": "xavdid"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 9, "created_at": "2023-05-04T07:30:38Z", "updated_at": "2023-05-08T20:06:35Z", "closed_at": "2023-05-08T19:27:02Z", "author_association": "NONE", "pull_request": null, "body": "I found an odd bug today, where calls to `table.upsert_all` don't write rows if you include the `not_null` kwarg.\r\n\r\n## Repro Example\r\n\r\n```py\r\nfrom sqlite_utils import Database\r\n\r\ndb = Database(\"upsert-test.db\")\r\n\r\ndb[\"comments\"].upsert_all(\r\n [{\"id\": 1, \"name\": \"david\"}],\r\n pk=\"id\",\r\n not_null=[\"name\"],\r\n)\r\n\r\nassert list(db[\"comments\"].rows) # err!\r\n```\r\n\r\nThe schema is correctly created:\r\n\r\n```sql\r\nCREATE TABLE [comments] (\r\n [id] INTEGER PRIMARY KEY,\r\n [name] TEXT NOT NULL\r\n)\r\n```\r\n\r\nBut no rows are created. 
Removing the `not_null` kwarg works as expected, as does an `insert_all` call.\r\n\r\n## Version Info\r\n\r\n- Python: `3.11.0`\r\n- sqlite-utils: `3.30`\r\n- sqlite: `3.39.5 2022-10-14`", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/538/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1701018909, "node_id": "I_kwDOCGYnMM5lY30d", "number": 543, "title": "Tests broken on Windows due to new convert() lambda names", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-05-08T22:11:29Z", "updated_at": "2023-05-08T22:19:04Z", "closed_at": "2023-05-08T22:19:04Z", "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils/actions/runs/4920084038/jobs/8788501314\r\n```python\r\nsql = 'update [example] set [dt] = lambda_-9223371942137158589([dt]);'\r\n```\r\nFrom:\r\n- #526", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/543/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}