id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 807174161,MDU6SXNzdWU4MDcxNzQxNjE=,227,Error reading csv files with large column data,295329,closed,0,,,4,2021-02-12T11:51:47Z,2021-02-16T11:48:03Z,2021-02-14T21:17:19Z,NONE,,"*Feel free to close this issue - I mostly added it for reference for future folks that run into this :)* I have a CSV file with one column that has very long strings. When i try to import this file via the `insert` command I get the following error: ``` sqlite-utils insert database.db table_name file_with_large_column.csv Traceback (most recent call last): File ""/usr/local/bin/sqlite-utils"", line 10, in sys.exit(cli()) File ""/usr/local/lib/python3.7/site-packages/click/core.py"", line 829, in __call__ return self.main(*args, **kwargs) File ""/usr/local/lib/python3.7/site-packages/click/core.py"", line 782, in main rv = self.invoke(ctx) File ""/usr/local/lib/python3.7/site-packages/click/core.py"", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/usr/local/lib/python3.7/site-packages/click/core.py"", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/usr/local/lib/python3.7/site-packages/click/core.py"", line 610, in invoke return callback(*args, **kwargs) File ""/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py"", line 774, in insert default=default, File ""/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py"", line 705, in insert_upsert_implementation docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs File ""/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py"", line 1852, in insert_all first_record = next(records) File ""/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py"", line 703, in docs = (decode_base64_values(doc) for doc in docs) File ""/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py"", line 681, in docs = (dict(zip(headers, row)) for row in reader) _csv.Error: field larger than field limit (131072) ``` Built with the docker image `datasetteproject/datasette:0.54` with the following versions: ``` # sqlite-utils --version sqlite-utils, version 3.4.1 # datasette --version datasette, version 0.54 ``` It appears this is a [known issue](https://stackoverflow.com/a/54517228/2761423) reading in csv files in python and [doesn't look to be modifiable](https://github.com/python/cpython/blob/ea46579067fd2d4e164d6605719ffec690c4d621/Modules/_csv.c#L1685) through system / env vars (i may be very wrong on this). Noting that using sqlite3 `import` command work without error (not using the python csv reader) ``` sqlite3 database.db sqlite> .mode csv sqlite> .import file_with_large_column.csv table_name ``` Sadly I couldn't see an easy way around this while using the cli as it appears this value needs to be changed in python code. FWIW I've switched to using https://datasette.io/tools/csvs-to-sqlite for importing csv data and it's working well. Finally, I'm loving https://datasette.io/ thank you very much for an amazing tool and data ecosytem 🙇‍♀️ ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/227/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 807433181,MDU6SXNzdWU4MDc0MzMxODE=,1224,can't start immutable databases from configuration dir mode,295329,closed,0,,,0,2021-02-12T17:50:13Z,2021-03-29T00:17:31Z,2021-03-29T00:17:31Z,CONTRIBUTOR,,"Say I have a `/databases/` directory with multiple sqlite db files in that dir (`1.db` & `2.db`) and an `inspect-data.json` file. If I start datasette via `datasette -h 0.0.0.0 /databases/` then the resulting databases are set to `is_mutable: true` as inspected via http://127.0.0.1:8001/-/databases.json I don't want to have to list out the databases by name, e.g. `datasette -i /databases/1.db -i /databases/2.db` as i want the system to autodetect the sqlite dbs i have in the configuration directory According to the docs outlined in https://docs.datasette.io/en/latest/settings.html?highlight=immutable#configuration-directory-mode this should be possible > `inspect-data.json` the result of running datasette inspect - any database files listed here will be treated as immutable, so they should not be changed while Datasette is running I believe that if the `inspect-json.json` file present, then in theory the databases will be automatically set to immutable via this code https://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/app.py#L211-L216 However it appears the Click Multiple Options will return a tuple via https://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/cli.py#L311-L317 The resulting tuple is passed to the Datasette app via `kwargs` and overrides the behaviour to set the databases to immutable via this arg https://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/app.py#L182 If you think this is a bug and needs fixing, I am willing to make a PR to check for the empty `immutable` tuple before calling the Datasette class initializer as I think leaving that class interface alone is the best path here. Thoughts? Also - i'm loving Datasette, it truly is a wonderful tool, thank you :)",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1224/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 810507413,MDExOlB1bGxSZXF1ZXN0NTc1MTg3NDU3,1229,ensure immutable databses when starting in configuration directory mode with,295329,closed,0,,,3,2021-02-17T20:18:26Z,2022-04-22T13:16:36Z,2021-03-29T00:17:32Z,CONTRIBUTOR,simonw/datasette/pulls/1229,"fixes #1224 This PR ensures all databases found in a configuration directory that match the files in `inspect-data.json` will be set to `immutable` as outlined in https://docs.datasette.io/en/latest/settings.html#configuration-directory-mode specifically on building the `datasette` instance it checks: - if `immutables` is an empty tuple - as passed by the cli code - if `immutables` is the default function value `None` - when it's not explicitly set And correctly builds the immutable database list from the `inspect-data[file]` keys. Note for this to work the `inspect-data.json` file must contain `file` paths which are relative to the configuration directory otherwise the file paths won't match and the dbs won't be set to immutable. I couldn't find an easy way to test this due to the way `make_app_client` works, happy to take directions on adding a test for this. I've updated the relevant docs as well, i.e. use the `inspect` cli cmd from the config directory path to create the relevant file ``` cd $config_dir datasette inspect *.db --inspect-file=inspect-data.json ``` https://docs.datasette.io/en/latest/performance.html#using-datasette-inspect",107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1229/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0,