issue_comments
18 rows where created_at is on date 2020-05-01, sorted by id descending
**622599528** | b0b5h4rp13 (32605365) | CONTRIBUTOR | 2020-05-01T22:49:12Z (updated 2020-05-02T11:15:44Z)
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622599528

With SQLITE_MAX_VARS = 999, or even 899, this hits the problem with the batch rows causing an overflow (it works fine if SQLITE_MAX_VARS = 799).

P.S. I have tried a few list-of-dicts-to-SQLite modules and this was the easiest to use/understand.

```python
import sqlite_utils as su

data = [
    {'tickerId': 913324382, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'CONSTELLATION B', 'symbol': 'STZ B', 'disSymbol': 'STZ-B', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '163.13', 'change': '6.46', 'changeRatio': '0.0412', 'marketValue': '31180699895.63', 'volume': '417', 'turnoverRate': '0.0000'},
    {'tickerId': 913323791, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Molina Health', 'symbol': 'MOH', 'disSymbol': 'MOH', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '173.25', 'change': '9.28', 'changeRatio': '0.0566', 'pPrice': '173.25', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '10520341695.50', 'volume': '1281557', 'turnoverRate': '0.0202'},
    {'tickerId': 913257501, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Seattle Genetics', 'symbol': 'SGEN', 'disSymbol': 'SGEN', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '145.64', 'change': '8.41', 'changeRatio': '0.0613', 'pPrice': '146.45', 'pChange': '0.8100', 'pChRatio': '0.0056', 'marketValue': '25117961347.60', 'volume': '2791411', 'turnoverRate': '0.0162'},
    {'tickerId': 925381971, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bandwidth', 'symbol': 'BAND', 'disSymbol': 'BAND', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', '…
```

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
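The overflow described in the comment above can be sketched numerically. The column counts here are illustrative stand-ins, not measured from the actual data: `insert_all` sizes each batch from the first row's column count, but every row in a batch contributes one `?` placeholder per column in the final INSERT, so a wider later row can blow past the limit.

```python
# Back-of-the-envelope sketch of the failure mode (assumed from this
# issue's discussion; the column counts are made up for illustration).
SQLITE_MAX_VARS = 999

first_row_columns = 1    # e.g. a minimal first row like {'name': 0}
widest_row_columns = 21  # e.g. one of the wide ticker dicts

# batch_size is computed from the FIRST row only:
batch_size = max(1, min(100, SQLITE_MAX_VARS // first_row_columns))

# ...but a batch full of the widest rows needs this many placeholders:
placeholders = batch_size * widest_row_columns
print(batch_size, placeholders)  # 100 2100 -> far over the 999 limit
```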
**622587177** | simonw (9599) | OWNER | 2020-05-01T22:07:51Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622587177

This is my failed attempt to recreate the bug (plus some extra debugging output):

```diff
% git diff
diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index dd49d5c..ea42aea 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -1013,7 +1013,11 @@ class Table(Queryable):
         assert (
             num_columns <= SQLITE_MAX_VARS
         ), "Rows can have a maximum of {} columns".format(SQLITE_MAX_VARS)
+        print("default batch_size = ", batch_size)
         batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
+        print("new batch_size = {},num_columns = {}, MAX_VARS // num_columns = {}".format(
+            batch_size, num_columns, SQLITE_MAX_VARS // num_columns
+        ))
         self.last_rowid = None
         self.last_pk = None
         for chunk in chunks(itertools.chain([first_record], records), batch_size):
@@ -1124,6 +1128,9 @@ class Table(Queryable):
         )
         flat_values = list(itertools.chain(*values))
         queries_and_params = [(sql, flat_values)]
+        print(sql.count("?"), len(flat_values))
+
+        # print(json.dumps(queries_and_params, indent=4))

         with self.db.conn:
             for query, params in queries_and_params:
diff --git a/tests/test_create.py b/tests/test_create.py
index 5290cd8..52940df 100644
--- a/tests/test_create.py
+++ b/tests/test_create.py
@@ -853,3 +853,33 @@ def test_create_with_nested_bytes(fresh_db):
     record = {"id": 1, "data": {"foo": b"bytes"}}
     fresh_db["t"].insert(record)
     assert [{"id": 1, "data": '{"foo": "b\'bytes\'"}'}] == list(fresh_db["t"].rows)
+
+
+def test_create_throws_useful_error_with_increasing_number_of_columns(fresh_db):
+    # https://github.com/simonw/sqlite-utils/issues/103
+    def rows():
+        yield {"name": 0}
+        for i in range(1, 1001):
+            yield {
+                "name": i,
+                "age": i,
+                "size": i,
+                "name2": i,
…
```

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622584433** | simonw (9599) | OWNER | 2020-05-01T21:57:52Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622584433

@b0b5h4rp13 I'm having trouble creating a test that triggers this bug. Could you share a chunk of code that replicates what you're seeing here?

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622565276** | simonw (9599) | OWNER | 2020-05-01T20:57:16Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622565276

I'm reconsidering this: I think this is going to happen ANY time someone has at least one row that is wider than the first row. So at the very least I should show a more understandable error message.

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622563188** | simonw (9599) | OWNER | 2020-05-01T20:51:24Z (updated 2020-05-01T20:51:29Z)
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622563188

Hopefully anyone who runs into this problem in the future will search for and find this issue thread!

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622563059** | simonw (9599) | OWNER | 2020-05-01T20:51:01Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622563059

I'm not sure what to do about this. I was thinking the solution would be to look at ALL of the rows in a batch before deciding on the maximum number of columns, but that doesn't work because we calculate batch size based on the number of columns!

I think my recommendation here is to manually pass a `batch_size=` argument to `.insert_all()` if you run into this error.

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
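The workaround described in the comment above means choosing `batch_size=` yourself. A minimal sketch of picking one safely, sizing the batch from the widest row rather than the first; the helper name is hypothetical, and `SQLITE_MAX_VARS = 999` matches the limit assumed in this thread:

```python
SQLITE_MAX_VARS = 999  # SQLite's default variable limit, as assumed in this thread

def safe_batch_size(rows, default=100):
    # Hypothetical helper: scan ALL rows so the widest one, not the
    # first, bounds how many rows fit under the variable limit.
    widest = max(len(row) for row in rows)
    return max(1, min(default, SQLITE_MAX_VARS // widest))

rows = [{"name": 0}] + [{"name": i, "age": i, "size": i} for i in range(1, 1001)]
print(safe_batch_size(rows))  # 100 -> 999 // 3 = 333, capped at the default
```

Note this requires the rows to be materialized in a list up front; `insert_all` accepts generators and streams them in chunks, which is exactly why it cannot do this full scan itself.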
**622561944** | simonw (9599) | OWNER | 2020-05-01T20:47:51Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622561944

Yup, we only take the number of columns in the first record into account at the moment: https://github.com/simonw/sqlite-utils/blob/d56029549acae0b0ea94c5a0f783e3b3895d9218/sqlite_utils/db.py#L1007-L1016

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622561585** | simonw (9599) | OWNER | 2020-05-01T20:46:50Z
https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622561585

The varying number of columns thing is interesting. I don't think the tests cover that case much, if at all.

Issue: sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns (610517472)
**622558889** | simonw (9599) | OWNER | 2020-05-01T20:40:06Z
https://github.com/simonw/sqlite-utils/issues/105#issuecomment-622558889

Documentation: https://sqlite-utils.readthedocs.io/en/latest/cli.html#listing-views

Issue: "sqlite-utils views" command (610853576)
**622461948** | simonw (9599) | MEMBER | 2020-05-01T16:36:42Z
https://github.com/dogsheep/github-to-sqlite/issues/37#issuecomment-622461948

It should only create views if the underlying tables exist.

Issue: Mechanism for creating views if they don't yet exist (610843136)
**622461537** | simonw (9599) | MEMBER | 2020-05-01T16:35:40Z
https://github.com/dogsheep/github-to-sqlite/issues/37#issuecomment-622461537

This will check if the view exists and has the exact same matching definition as the one we want. If it doesn't, we will drop it (if it exists) and recreate it.

Issue: Mechanism for creating views if they don't yet exist (610843136)
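The check-then-drop-and-recreate logic described above can be sketched against SQLite's `sqlite_master` table, which stores each view's `CREATE VIEW` statement verbatim. The function name and exact comparison are assumptions for illustration, not github-to-sqlite's actual implementation:

```python
import sqlite3

def ensure_view(conn, name, select_sql):
    # Hypothetical sketch: recreate the view only when its stored
    # definition differs from the one we want.
    desired = "CREATE VIEW {} AS {}".format(name, select_sql)
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = ?", (name,)
    ).fetchone()
    if row is not None and row[0] == desired:
        return  # definition already matches, nothing to do
    if row is not None:
        conn.execute("DROP VIEW {}".format(name))
    conn.execute(desired)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE releases (id INTEGER PRIMARY KEY, created_at TEXT)")
# Safe to call repeatedly: the second call is a no-op, and a changed
# select_sql would drop and recreate the view.
ensure_view(conn, "recent_releases",
            "SELECT * FROM releases ORDER BY created_at DESC")
ensure_view(conn, "recent_releases",
            "SELECT * FROM releases ORDER BY created_at DESC")
```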
**622461223** | simonw (9599) | MEMBER | 2020-05-01T16:34:52Z
https://github.com/dogsheep/github-to-sqlite/issues/12#issuecomment-622461223

Blocked on #37

Issue: Add this view for seeing new releases (520756546)

**622461122** | simonw (9599) | MEMBER | 2020-05-01T16:34:39Z
https://github.com/dogsheep/github-to-sqlite/issues/10#issuecomment-622461122

Blocked on #37

Issue: Add this repos_starred view (516967682)

**622461025** | simonw (9599) | MEMBER | 2020-05-01T16:34:24Z
https://github.com/dogsheep/github-to-sqlite/issues/36#issuecomment-622461025

Blocked on #37

Issue: Add view for better display of dependent repos (610842926)
**622450636** | simonw (9599) | OWNER | 2020-05-01T16:08:46Z
https://github.com/simonw/datasette/issues/749#issuecomment-622450636

Proposed solution: on Cloud Run, don't show the "download database" link if the database file is larger than 32MB.

I can do this with a new config setting, `max_db_mb`, which is automatically set by the `publish cloudrun` command. This is consistent with the existing `max_csv_mb` setting: https://datasette.readthedocs.io/en/stable/config.html#max-csv-mb

I should set `max_csv_mb` to 32MB on Cloud Run deploys as well.

Issue: Cloud Run fails to serve database files larger than 32MB (610829227)
**622279374** | garethr (2029) | NONE | 2020-05-01T07:12:47Z
https://github.com/dogsheep/github-to-sqlite/issues/33#issuecomment-622279374

I also got it working with:

```yaml
run: echo ${{ secrets.github_token }} | github-to-sqlite auth
```

Reactions: +1 (1)

Issue: Fall back to authentication via ENV (609950090)
**622214262** | simonw (9599) | MEMBER | 2020-05-01T02:10:32Z (updated 2020-05-01T02:11:19Z)
https://github.com/dogsheep/github-to-sqlite/issues/35#issuecomment-622214262

This sped that query up even more, down to 4ms.

```sql
create index issue_comments_issue on issue_comments(issue);
```

Issue: Create index on issue_comments(user) and other foreign keys (610511450)
**622213950** | simonw (9599) | MEMBER | 2020-05-01T02:09:04Z
https://github.com/dogsheep/github-to-sqlite/issues/35#issuecomment-622213950

It sped up this query a lot, from 2.5s down to 300ms:

```sql
select
  repos.full_name,
  json_object(
    'href', 'https://github.com/' || repos.full_name || '/issues/' || issues.number,
    'label', '#' || issues.number
  ) as issue,
  issues.title,
  users.login,
  users.id,
  issues.state,
  issues.locked,
  issues.assignee,
  issues.milestone,
  issues.comments,
  issues.created_at,
  issues.updated_at,
  issues.closed_at,
  issues.author_association,
  issues.pull_request,
  issues.repo,
  issues.type
from
  issues
  join repos on repos.id = issues.repo
  join users on issues.user = users.id
where
  issues.state = 'open'
  and issues.user not in (9599, 27856297)
  and not exists (
    select id from issue_comments
    where issue_comments.user = 9599
    and issues.id = issue_comments.issue
  )
order by issues.updated_at desc;
```

Issue: Create index on issue_comments(user) and other foreign keys (610511450)
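The effect of that index can be reproduced in miniature with the stdlib `sqlite3` module. The toy table and row counts below are illustrative stand-ins for the real data; `EXPLAIN QUERY PLAN` confirms that the `issue = ?` lookup used by the `not exists` subquery now goes through the index instead of scanning the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issue_comments (id INTEGER PRIMARY KEY, [user] INTEGER, issue INTEGER)"
)
# Toy data: 10,000 comments spread across 200 issues and 50 users.
conn.executemany(
    "INSERT INTO issue_comments (id, [user], issue) VALUES (?, ?, ?)",
    [(i, i % 50, i % 200) for i in range(10_000)],
)
conn.execute("CREATE INDEX issue_comments_issue ON issue_comments(issue)")

# Without the index, this lookup pattern forces a full table scan for
# every candidate issue in the outer query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM issue_comments WHERE issue = ? AND [user] = ?",
    (7, 9599),
).fetchall()
print(plan)  # the plan's detail column should name issue_comments_issue
```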
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```