github: issues: 6 rows where user = 96218

6 rows where user = 96218

Search:

descending

id ▼	node_id	number	title	user	state	comments	created_at	updated_at	closed_at	author_association	pull_request	body	repo	type	reactions	draft	state_reason
686978131	MDU6SXNzdWU2ODY5NzgxMzE=	139	insert_all(..., alter=True) should work for new columns introduced after the first 100 records	simonwiles 96218	closed	7	2020-08-27T06:25:25Z	2020-08-28T22:48:51Z	2020-08-28T22:30:14Z	CONTRIBUTOR		Is there a way to make `.insert_all()` work properly when new columns are introduced outside the first 100 records (with or without the `alter=True` argument)? I'm using `.insert_all()` to bulk insert ~3-4k records at a time and it is common for records to need to introduce new columns. However, if new columns are introduced after the first 100 records, `sqlite_utils` doesn't even raise the `OperationalError: table ... has no column named ...` exception; it just silently drops the extra data and moves on. It took me a while to find this little snippet in the [documentation for `.insert_all()`](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts) (it's not mentioned under [Adding columns automatically on insert/update](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts)): > The column types used in the CREATE TABLE statement are automatically derived from the types of data in that first batch of rows. _Any additional or missing columns in subsequent batches will be ignored._ I tried changing the `batch_size` argument to the total number of records, but it seems only to effect the number of rows that are committed at a time, and has no influence on this problem. Is there a way around this that you would suggest? It seems like it should raise an exception at least.	sqlite-utils 140912432	issue	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/139/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}		completed
688386219	MDExOlB1bGxSZXF1ZXN0NDc1NjY1OTg0	142	insert_all(..., alter=True) should work for new columns introduced after the first 100 records	simonwiles 96218	closed	3	2020-08-28T22:22:57Z	2020-08-30T07:28:23Z	2020-08-28T22:30:14Z	CONTRIBUTOR	simonw/sqlite-utils/pulls/142	Closes #139.	sqlite-utils 140912432	pull	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/142/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	0
688659182	MDU6SXNzdWU2ODg2NTkxODI=	145	Bug when first record contains fewer columns than subsequent records	simonwiles 96218	closed	2	2020-08-30T05:44:44Z	2020-09-08T23:21:23Z	2020-09-08T23:21:23Z	CONTRIBUTOR		`insert_all()` selects the maximum batch size based on the number of fields in the first record. If the first record has fewer fields than subsequent records (and `alter=True` is passed), this can result in SQL statements with more than the maximum permitted number of host parameters. This situation is perhaps unlikely to occur, but could happen if the first record had, say, 10 columns, such that `batch_size` (based on `SQLITE_MAX_VARIABLE_NUMBER = 999`) would be 99. If the next 98 rows had 11 columns, the resulting SQL statement for the first batch would have `10 * 1 + 11 * 98 = 1088` host parameters (and subsequent batches, if the data were consistent from thereon out, would have `99 * 11 = 1089`). I suspect that this bug is masked somewhat by the fact that while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with `SQLITE_MAX_VARIABLE_NUMBER` set to 250,000, and I believe this is the case for homebrew installations too. A test for this issue might look like this: ```python def test_columns_not_in_first_record_should_not_cause_batch_to_be_too_large(fresh_db): # sqlite on homebrew and Debian/Ubuntu etc. is typically compiled with # SQLITE_MAX_VARIABLE_NUMBER set to 250,000, so we need to exceed this value to # trigger the error on these systems. THRESHOLD = 250000 extra_columns = 1 + (THRESHOLD - 1) // 99 records = [ {"c0": "first record"}, # one column in first record -> batch_size = 100 # fill out the batch with 99 records with enough columns to exceed THRESHOLD *[ dict([("c{}".format(i), j) for i in range(extra_columns)]) for j in range(99) ] ] try: fresh_db["too_many_columns"].insert_all(records, a…	sqlite-utils 140912432	issue	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/145/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}		completed
688668680	MDExOlB1bGxSZXF1ZXN0NDc1ODc0NDkz	146	Handle case where subsequent records (after first batch) include extra columns	simonwiles 96218	closed	5	2020-08-30T07:13:58Z	2020-09-08T23:20:37Z	2020-09-08T23:20:37Z	CONTRIBUTOR	simonw/sqlite-utils/pulls/146	Addresses #145. I think this should do the job. If it meets with your approval I'll update this PR to include an update to the documentation -- I came across this bug while preparing a PR to update the documentation around `batch_size` in any event.	sqlite-utils 140912432	pull	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/146/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	0
688670158	MDU6SXNzdWU2ODg2NzAxNTg=	147	SQLITE_MAX_VARS maybe hard-coded too low	simonwiles 96218	open	7	2020-08-30T07:26:45Z	2021-02-15T21:27:55Z		CONTRIBUTOR		I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo "" \| sqlite3 -cmd ".limits variable_number"` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.	sqlite-utils 140912432	issue	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
697030843	MDExOlB1bGxSZXF1ZXN0NDgzMDI3NTg3	156	Typos in tests	simonwiles 96218	closed	1	2020-09-09T18:00:58Z	2020-09-09T18:24:50Z	2020-09-09T18:21:23Z	CONTRIBUTOR	simonw/sqlite-utils/pulls/156	One of these is my fault, and the other is one I just happened to come across. They're harmless, but might as well be fixed.	sqlite-utils 140912432	pull	{"url": "https://api.github.com/repos/simonw/sqlite-utils/issues/156/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	0

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);