{"html_url": "https://github.com/dogsheep/dogsheep-photos/issues/32#issuecomment-791053721", "issue_url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/32", "id": 791053721, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MTA1MzcyMQ==", "user": {"value": 6213, "label": "dsisnero"}, "created_at": "2021-03-05T00:31:27Z", "updated_at": "2021-03-05T00:31:27Z", "author_association": "NONE", "body": "I am getting the same thing for US West (N. California) us-west-1", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 803333769, "label": "KeyError: 'Contents' on running upload"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791089881", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 791089881, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MTA4OTg4MQ==", "user": {"value": 28565, "label": "maxhawkins"}, "created_at": "2021-03-05T02:03:19Z", "updated_at": "2021-03-05T02:03:19Z", "author_association": "NONE", "body": "I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout.\r\n\r\nIs it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/766#issuecomment-791509910", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/766", "id": 791509910, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MTUwOTkxMA==", "user": {"value": 6371750, "label": "JBPressac"}, "created_at": "2021-03-05T15:57:35Z", "updated_at": "2021-03-05T16:35:21Z", "author_association": "CONTRIBUTOR", "body": "Hello, \r\nI have the same wildcards search problems with an instance of Datasette. http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwerz&_sort=rowid is OK but http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwe* is not (FTS is activated on \"Reference\" \"IntituleAnalyse\" \"NomDuProducteur\" \"PresentationDuContenu\" \"Notes\"). \r\n\r\nNotice that a SQL query as below launched directly from SQLite in the server's shell, retrieves results.\r\n\r\n`select * from fonds_auguste_dupouy_1872_1967_fts where IntituleAnalyse MATCH \"gwe*\";`\r\n\r\nThanks,", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 617323873, "label": "Enable wildcard-searches by default"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791530093", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 791530093, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MTUzMDA5Mw==", "user": {"value": 306240, "label": "UtahDave"}, "created_at": "2021-03-05T16:28:07Z", "updated_at": "2021-03-05T16:28:07Z", "author_association": "NONE", "body": "> I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout.\r\n> \r\n> Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?\r\n\r\n@maxhawkins a limitation of the python mbox module is it loads the entire mbox into memory. I did find another approach to this problem that didn't use the builtin python mbox module and created a generator so that it didn't have to load the whole mbox into memory. I was hoping to use standard library modules, but this might be a good reason to investigate that approach a bit more. My worry is making sure a custom processor handles all the ins and outs of the mbox format correctly.\r\n\r\nHm. As I'm writing this, I thought of something. I think I can parse each message one at a time, and then use an mbox function to load each message using the python mbox module. That way the mbox module can still deal with the specifics of the mbox format, but I can use a generator.\r\n\r\nI'll give that a try. Thanks for the feedback @maxhawkins and @simonw. I'll give that a try.\r\n\r\n@simonw can we hold off on merging this until I can test this new approach?", "reactions": "{\"total_count\": 3, \"+1\": 3, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}