issue_comments
525 rows where author_association = "MEMBER" sorted by node_id descending
This data as json, CSV (advanced)
Suggested facets: reactions
created_at (date) >30 ✖
- 2020-04-18 23
- 2020-03-23 22
- 2023-03-09 21
- 2020-05-02 15
- 2020-05-05 15
- 2020-09-17 14
- 2020-05-20 13
- 2020-09-03 13
- 2021-03-04 13
- 2020-10-11 10
- 2019-10-16 9
- 2019-10-17 9
- 2020-03-20 9
- 2020-07-18 9
- 2019-10-11 8
- 2020-04-30 8
- 2020-05-03 8
- 2020-09-02 8
- 2020-09-21 8
- 2019-11-03 7
- 2019-11-09 7
- 2020-04-01 7
- 2020-04-21 7
- 2020-05-01 7
- 2020-10-17 7
- 2021-07-25 7
- 2020-04-16 6
- 2020-04-28 6
- 2020-05-10 6
- 2020-07-06 6
- …
id | html_url | issue_url | node_id ▲ | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
879477586 | https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-879477586 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12 | MDEyOklzc3VlQ29tbWVudDg3OTQ3NzU4Ng== | simonw 9599 | 2021-07-13T23:50:06Z | 2021-07-13T23:50:06Z | MEMBER | Unfortunately I don't think updating the database is practical, because the export doesn't include unique identifiers which can be used to update existing records and create new ones. Recreating from scratch works around that limitation. I've not explored workouts with SpatiaLite but that's a really good idea. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Some workout columns should be float, not text 727848625 | |
861042050 | https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861042050 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64 | MDEyOklzc3VlQ29tbWVudDg2MTA0MjA1MA== | simonw 9599 | 2021-06-14T22:45:42Z | 2021-06-14T22:45:42Z | MEMBER | I'm definitely interested in supporting events in this tool - see #14. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | feature: support "events" 920636216 | |
861041597 | https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861041597 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64 | MDEyOklzc3VlQ29tbWVudDg2MTA0MTU5Nw== | simonw 9599 | 2021-06-14T22:44:54Z | 2021-06-14T22:44:54Z | MEMBER | Have you found a way to access events in GraphQL? I can only see way to access a timeline of events for a single issue or a single pull request. See also https://github.community/t/get-event-equivalent-for-v4/13600/2 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | feature: support "events" 920636216 | |
844250232 | https://github.com/dogsheep/github-to-sqlite/pull/59#issuecomment-844250232 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/59 | MDEyOklzc3VlQ29tbWVudDg0NDI1MDIzMg== | simonw 9599 | 2021-05-19T16:08:10Z | 2021-05-19T16:08:10Z | MEMBER | Thanks for catching this. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Remove unneeded exists=True for -a/--auth flag. 771872303 | |
844249385 | https://github.com/dogsheep/github-to-sqlite/pull/61#issuecomment-844249385 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/61 | MDEyOklzc3VlQ29tbWVudDg0NDI0OTM4NQ== | simonw 9599 | 2021-05-19T16:07:06Z | 2021-05-19T16:07:06Z | MEMBER | Thanks! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | fixing typo in get cli help text 797108702 | |
739058820 | https://github.com/dogsheep/dogsheep-photos/pull/29#issuecomment-739058820 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/29 | MDEyOklzc3VlQ29tbWVudDczOTA1ODgyMA== | simonw 9599 | 2020-12-04T22:32:35Z | 2020-12-04T22:32:35Z | MEMBER | Thanks for this! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Fixed bug in SQL query for photo scores 638375985 | |
735485677 | https://github.com/dogsheep/github-to-sqlite/issues/53#issuecomment-735485677 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/53 | MDEyOklzc3VlQ29tbWVudDczNTQ4NTY3Nw== | simonw 9599 | 2020-11-30T00:36:09Z | 2020-11-30T00:36:09Z | MEMBER | Given rate limits (see #51) this command might be better implemented by running a `git clone` into a temporary directory - doing so would retrieve all of the files in one go. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Command for fetching file contents 753000405 | |
735484186 | https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-735484186 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51 | MDEyOklzc3VlQ29tbWVudDczNTQ4NDE4Ng== | simonw 9599 | 2020-11-30T00:29:31Z | 2020-11-30T00:29:31Z | MEMBER | This just caused a failure in deploying the demo: https://github.com/dogsheep/github-to-sqlite/runs/1471304407?check_suite_focus=true ``` File "/opt/hostedtoolcache/Python/3.8.6/x64/bin/github-to-sqlite", line 33, in <module> sys.exit(load_entry_point('github-to-sqlite', 'console_scripts', 'github-to-sqlite')()) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 142, in issue_comments for comment in utils.fetch_issue_comments(repo, token, issue): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 380, in fetch_issue_comments for comments in paginate(url, headers): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 472, in paginate raise GitHubError.from_response(response) github_to_sqlite.utils.GitHubError: ('API rate limit exceeded for user ID 9599.', 403) Error: Process completed with exit code 1. ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | github-to-sqlite should handle rate limits better 703246031 | |
735483820 | https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483820 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 | MDEyOklzc3VlQ29tbWVudDczNTQ4MzgyMA== | simonw 9599 | 2020-11-30T00:27:47Z | 2020-11-30T00:27:47Z | MEMBER | So it looks like anything that pulls reviews needs to pull each review, then for each one pull the comments. I'm going to consider this blocked on smarter rate limit handling in #51. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Feature: pull request reviews and comments 664485022 | |
735483604 | https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483604 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 | MDEyOklzc3VlQ29tbWVudDczNTQ4MzYwNA== | simonw 9599 | 2020-11-30T00:26:50Z | 2020-11-30T00:26:50Z | MEMBER | It seems like there's a lot missing from that - those aren't particularly interesting given the data that is returned. From the docs at https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#reviews it looks like each review consists of multiple comments, and the comments are where the useful material is - https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#list-comments-for-a-pull-request-review `github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews/503368921/comments --accept 'application/vnd.github.v3+json'` ```json [ { "id": 500603838, "node_id": "MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDUwMDYwMzgzOA==", "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500603838", "pull_request_review_id": 503368921, "diff_hunk": "@@ -0,0 +1,370 @@\n+[\n+ {\n+ \"url\": \"https://api.github.com/repos/simonw/datasette/pulls/571\",\n+ \"id\": 313384926,\n+ \"node_id\": \"MDExOlB1bGxSZXF1ZXN0MzEzMzg0OTI2\",\n+ \"html_url\": \"https://github.com/simonw/datasette/pull/571\",\n+ \"diff_url\": \"https://github.com/simonw/datasette/pull/571.diff\",\n+ \"patch_url\": \"https://github.com/simonw/datasette/pull/571.patch\",\n+ \"issue_url\": \"https://api.github.com/repos/simonw/datasette/issues/571\",\n+ \"number\": 571,\n+ \"state\": \"closed\",\n+ \"locked\": false,\n+ \"title\": \"detect_fts now works with alternative table escaping\",\n+ \"user\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_ur… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Feature: pull request reviews and comments 664485022 | |
735482546 | https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482546 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 | MDEyOklzc3VlQ29tbWVudDczNTQ4MjU0Ng== | simonw 9599 | 2020-11-30T00:22:02Z | 2020-11-30T00:22:02Z | MEMBER | As for reviews... here's the output of `github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews --accept 'application/vnd.github.v3+json'` ```json [ { "id": 503368921, "node_id": "MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NTAzMzY4OTIx", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "", "state": "CHANGES_REQUESTED", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921" }, "pull_request": { "href": "https://api.github.c… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Feature: pull request reviews and comments 664485022 | |
735482187 | https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482187 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 | MDEyOklzc3VlQ29tbWVudDczNTQ4MjE4Nw== | simonw 9599 | 2020-11-30T00:20:11Z | 2020-11-30T00:20:11Z | MEMBER | Pull request are now added, thanks to @adamjonas. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Feature: pull request reviews and comments 664485022 | |
735465708 | https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735465708 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54 | MDEyOklzc3VlQ29tbWVudDczNTQ2NTcwOA== | simonw 9599 | 2020-11-29T22:08:46Z | 2020-11-29T22:08:46Z | MEMBER | Demo: - https://github-to-sqlite.dogsheep.net/github/steps?_facet=repo - https://github-to-sqlite.dogsheep.net/github/workflows - https://github-to-sqlite.dogsheep.net/github/jobs | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | github-to-sqlite workflows command 753026003 | |
735464438 | https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735464438 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54 | MDEyOklzc3VlQ29tbWVudDczNTQ2NDQzOA== | simonw 9599 | 2020-11-29T21:57:08Z | 2020-11-29T21:57:08Z | MEMBER | Inspired by this tweet from Michael Heap https://twitter.com/mheap/status/1333108608817631238 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | github-to-sqlite workflows command 753026003 | |
735464493 | https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735464493 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54 | MDEyOklzc3VlQ29tbWVudDczNTQ2NDQ5Mw== | simonw 9599 | 2020-11-29T21:57:32Z | 2020-11-29T21:57:32Z | MEMBER | `$ github-to-sqlite workflows github.db simonw/datasette dogsheep/github-to-sqlite` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | github-to-sqlite workflows command 753026003 | |
727692413 | https://github.com/dogsheep/swarm-to-sqlite/issues/11#issuecomment-727692413 | https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcyNzY5MjQxMw== | simonw 9599 | 2020-11-16T02:15:22Z | 2020-11-16T02:15:22Z | MEMBER | Thanks, I'll look into this. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Error thrown: sqlite3.OperationalError: table users has no column named lastName 743400216 | |
712266834 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-712266834 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDcxMjI2NjgzNA== | simonw 9599 | 2020-10-19T16:01:23Z | 2020-10-19T16:01:23Z | MEMBER | Might just be a documented pattern for how to configure this in YAML templates. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
711569063 | https://github.com/dogsheep/github-to-sqlite/issues/50#issuecomment-711569063 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50 | MDEyOklzc3VlQ29tbWVudDcxMTU2OTA2Mw== | simonw 9599 | 2020-10-19T05:01:29Z | 2020-10-19T05:01:29Z | MEMBER | Demo of `--accept`: github-to-sqlite get /repos/simonw/datasette/readme --accept 'application/vnd.github.VERSION.html' | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Commands for making authenticated API calls 703218756 | |
711089647 | https://github.com/dogsheep/dogsheep-beta/issues/28#issuecomment-711089647 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/28 | MDEyOklzc3VlQ29tbWVudDcxMTA4OTY0Nw== | simonw 9599 | 2020-10-17T22:43:13Z | 2020-10-17T22:43:13Z | MEMBER | Since my personal Dogsheep uses Datasette authentication, I'm going to need to pass through cookies. https://github.com/simonw/datasette/issues/1020 will solve that in the future but for now I need to solve it explicitly. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Switch to using datasette.client 723861683 | |
711081703 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711081703 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA4MTcwMw== | simonw 9599 | 2020-10-17T21:18:35Z | 2020-10-17T21:18:35Z | MEMBER | OK, if you upgrade to the just-released 1.0 this should work (it worked against my Spanish export). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
711079760 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711079760 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA3OTc2MA== | simonw 9599 | 2020-10-17T21:00:05Z | 2020-10-17T21:00:05Z | MEMBER | Checking for either `<!DOCTYPE HealthData` or `<HealthData` in the first 1000 bytes should do it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
711079056 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711079056 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA3OTA1Ng== | simonw 9599 | 2020-10-17T20:53:00Z | 2020-10-17T20:53:00Z | MEMBER | I think the safest thing is to sniff the first few lines of the file. Those should be the same no matter the language that was used: ```xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE HealthData [ ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
711078917 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711078917 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA3ODkxNw== | simonw 9599 | 2020-10-17T20:51:55Z | 2020-10-17T20:52:03Z | MEMBER | I switched my phone to Spanish and ran an export - I got a file called `exportar.zip`. Unzipped I still got a `apple_ health_export` folder but the root contained: ``` electrocardiograms/ export_cda.xml exportar.xml workout-routes/ ``` It looks like `export_cda.xml` does not have a translated name, so maybe I can ignore it and look for the _other_ `.xml` file in that directory. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
711074306 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711074306 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA3NDMwNg== | simonw 9599 | 2020-10-17T20:16:22Z | 2020-10-17T20:16:22Z | MEMBER | The "first XML file in the root" solution is probably easier though! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
711074031 | https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711074031 | https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDcxMTA3NDAzMQ== | simonw 9599 | 2020-10-17T20:14:01Z | 2020-10-17T20:14:01Z | MEMBER | I'd be happy to teach the tool to look for `export.xml` or `eksport.xml` - and then expand that list to other languages. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | export.xml file name varies with different language settings 723838331 | |
707332912 | https://github.com/dogsheep/swarm-to-sqlite/issues/8#issuecomment-707332912 | https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/8 | MDEyOklzc3VlQ29tbWVudDcwNzMzMjkxMg== | simonw 9599 | 2020-10-12T20:35:06Z | 2020-10-12T20:35:06Z | MEMBER | Shipped a fix for this in [swarm-to-sqlite 0.3.2](https://github.com/dogsheep/swarm-to-sqlite/releases/tag/0.3.2). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Error thrown: table photos has no column named hasSticker 648245071 | |
706834800 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706834800 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjgzNDgwMA== | simonw 9599 | 2020-10-12T03:24:57Z | 2020-10-16T20:16:28Z | MEMBER | Here's my first attempt at a plugin for this: ```python from datasette import hookimpl import jinja2 START = "<en-note" END = "</en-note>" TEMPLATE = """ <div style="max-width: 500px; white-space: normal; overflow-wrap: break-word;">{}</div> """.strip() EN_MEDIA_SCRIPT = """ Array.from(document.querySelectorAll('en-media')).forEach(el => { let hash = el.getAttribute('hash'); let type = el.getAttribute('type'); let path = `/evernote/resources_data/${hash}.json?_shape=array`; fetch(path).then(r => r.json()).then(rows => { let b64 = rows[0].data.encoded; let data = `data:${type};base64,${b64}`; el.innerHTML = `<img style="max-width: 300px" src="${data}">`; }); }); """ @hookimpl def render_cell(value, table): if not table: # Don't render content from arbitrary SQL queries, could be XSS hole return if not value or not isinstance(value, str): return value = value.strip() if value.startswith(START) and value.endswith(END): trimmed = value[len(START) : -len(END)] trimmed = trimmed.split(">", 1)[1] # Replace those horrible double newlines trimmed = trimmed.replace("<div><br /></div>", "<br>") return jinja2.Markup(TEMPLATE.format(trimmed)) @hookimpl def extra_body_script(): return EN_MEDIA_SCRIPT ``` It works! It does however demonstrate that Evernote's "clip this webpage" feature means there is a LOT of weird HTML that can get into a note. It looks like they've filtered out the scripts but I wouldn't bet on it - they certainly don't filter out many of the inline styles. So running Bleach is almost certainly a good idea. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706786548 | https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706786548 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4 | MDEyOklzc3VlQ29tbWVudDcwNjc4NjU0OA== | simonw 9599 | 2020-10-11T23:39:46Z | 2020-10-11T23:39:46Z | MEMBER | Should have used porter stemming for this. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Configure FTS + add an index on the date columns 718938508 | |
706785201 | https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785201 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6 | MDEyOklzc3VlQ29tbWVudDcwNjc4NTIwMQ== | simonw 9599 | 2020-10-11T23:29:39Z | 2020-10-11T23:29:39Z | MEMBER | It looks to me like each of those `<item>` blocks has a number of guesses in order of confidence: ```xml <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">wonders.</t> </item> ``` So maybe the best approach here is to just take the first `t` element within each `item`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Better handling of OCR data 718949182 | |
706785086 | https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785086 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6 | MDEyOklzc3VlQ29tbWVudDcwNjc4NTA4Ng== | simonw 9599 | 2020-10-11T23:28:50Z | 2020-10-11T23:28:50Z | MEMBER | The XML for the OCR stuff is a bit weird. Currently I'm doing this to it: https://github.com/dogsheep/evernote-to-sqlite/blob/c33d7b043a45eb3e88676e5fa3ce31755199d9f8/evernote_to_sqlite/utils.py#L70-L78 This can produce some odd results, for example: > Sure 'Sure, 'Sure. Sure, Sure. sure sure. sure ? If you If Yau [you live jive In m 1n an area devoid of natural wonders, wanders, wonders ? wonders wonders. your mind will be blown, blown' blown. blown ? -e i ? ,1 IL it ? at ? KY ? fl ft bat at Which came from this image: ![image](https://user-images.githubusercontent.com/9599/95692952-5dd7c880-0bde-11eb-939a-d10b800a4105.png) The XML for that is: ```xml <recoIndex docType="unknown" objType="image" objID="05ffb72b307bf495f064243c7099d94f" engineVersion="6.5.17.7" recoType="service" lang="en" objWidth="1000" objHeight="1504"> <item x="68" y="75" w="104" h="37"> <t w="60">Sure</t> <t w="52">'Sure,</t> <t w="47">'Sure.</t> <t w="33">Sure,</t> <t w="26">Sure.</t> </item> <item x="182" y="83" w="92" h="26"> <t w="62">sure</t> <t w="58">sure.</t> <t w="46">sure ?</t> </item> <item x="69" y="132" w="107" h="45"> <t w="81">If you</t> <t w="64">If Yau</t> <t w="31">[you</t> </item> <item x="186" y="132" w="67" h="35"> <t w="85">live</t> <t w="51">jive</t> </item> <item x="263" y="140" w="36" h="27"> <t w="82">In</t> <t w="56">m</t> <t w="53">1n</t> </item> <item x="309" y="140" w="53" h="27"> <t w="82">an</t> </item> <item x="372" y="141" w="90" h="26"> <t w="94">area</t> </item> <item x="472" y="132" w="138" h="35"> <t w="85">devoid</t> </item> <item x="620" y="132" w="43" h="35"> <t w="82">of</t> </item> <item x="68" y="190" w="137" h="35"> <t w="87">natural</t> </item> <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">won… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Better handling of OCR data 718949182 | |
706784028 | https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706784028 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4 | MDEyOklzc3VlQ29tbWVudDcwNjc4NDAyOA== | simonw 9599 | 2020-10-11T23:20:32Z | 2020-10-11T23:20:32Z | MEMBER | I haven't done the FTS on OCR yet. I'm going to move that to another ticket because it requires more thought. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Configure FTS + add an index on the date columns 718938508 | |
706776808 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776808 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjc3NjgwOA== | simonw 9599 | 2020-10-11T22:23:14Z | 2020-10-11T22:23:14Z | MEMBER | ... but it's still important to be able to get to the rendered note directly from the browse notes `/evernote/notes` page. Maybe use a simple `render_cell()` hook that just knows how to generate the link to the rendered note page? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706776680 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776680 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjc3NjY4MA== | simonw 9599 | 2020-10-11T22:22:16Z | 2020-10-11T22:22:16Z | MEMBER | Maybe the best way do this is with a custom route, `/-/evernote/note-id` - that way I can clean the HTML and resolve the other things in the `<en-note>` structure without using `render_cell()` and the like. My concern about using `render_cell()` is that it could lead to weird security problems when combined with `?sql=` queries. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706776447 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776447 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjc3NjQ0Nw== | simonw 9599 | 2020-10-11T22:20:32Z | 2020-10-11T22:20:32Z | MEMBER | Or... I could do this client-side. JavaScript that looks for `<en-media>` tags and fetches the data using `fetch()` wouldn't be too hard to write. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706776242 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776242 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjc3NjI0Mg== | simonw 9599 | 2020-10-11T22:18:30Z | 2020-10-11T22:19:48Z | MEMBER | Alternatively, rather than relying on `datasette-media` this could base64-embed the images. `evernote-to-sqlite` could register itself as a Datasette plugin that knows how to do this. Maybe rename the column to `evernote_content` and register a render cell hook that knows how to rewrite those note bodies so that they are visible? Might need to feed them through Bleach too, just in case any nasty code can get into them. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706776180 | https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776180 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDcwNjc3NjE4MA== | simonw 9599 | 2020-10-11T22:17:55Z | 2020-10-11T22:17:55Z | MEMBER | We could even do server-side thumbnailing for some of these images, but I'm inclined to serve up the full size ones and set a width on the image element based on the `width` attribute on `<en-media>`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out how to display images from <en-media> tags inline in Datasette 718938889 | |
706775706 | https://github.com/dogsheep/evernote-to-sqlite/issues/1#issuecomment-706775706 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/1 | MDEyOklzc3VlQ29tbWVudDcwNjc3NTcwNg== | simonw 9599 | 2020-10-11T22:14:00Z | 2020-10-11T22:14:00Z | MEMBER | A live demo would be good too. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Documentation on how to use this with Datasette 718934942 | |
704553385 | https://github.com/dogsheep/github-to-sqlite/pull/48#issuecomment-704553385 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/48 | MDEyOklzc3VlQ29tbWVudDcwNDU1MzM4NQ== | simonw 9599 | 2020-10-06T21:07:44Z | 2020-10-06T21:07:44Z | MEMBER | Sorry for not looking at this sooner, trying it out now - pull request looks great! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add pull requests 681228542 | |
790695126 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg== | simonw 9599 | 2021-03-04T15:20:42Z | 2021-03-04T15:20:42Z | MEMBER | I'm not sure why but my most recent import, when displayed in Datasette, looks like this: <img width="574" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109985836-0ab00080-7cba-11eb-97d5-0631a0835b61.png"> Sorting by `id` in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790693674 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA== | simonw 9599 | 2021-03-04T15:18:36Z | 2021-03-04T15:18:36Z | MEMBER | I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! This is fantastic. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790669767 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw== | simonw 9599 | 2021-03-04T14:46:06Z | 2021-03-04T14:46:06Z | MEMBER | Solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790668263 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw== | simonw 9599 | 2021-03-04T14:43:58Z | 2021-03-04T14:43:58Z | MEMBER | I added this code to output a message ID on errors: ```diff print("Errors: {}".format(num_errors)) print(traceback.format_exc()) + print("Message-Id: {}".format(email.get("Message-Id", "None"))) continue ``` Having found a message ID that had an error, I ran this command to see the context: rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox This was for the following error: ``` File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 102, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 178, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` Here's what I spotted in the `ripgrep` output: ``` 177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI> 177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit) 177133572-X-Mailer: IncrediMail (5002253) ``` So it could it be that `_parsedate_tz` is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit)` string. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790312268 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA== | simonw 9599 | 2021-03-04T05:48:16Z | 2021-03-04T05:48:16Z | MEMBER | Wow, my mbox is a 10.35 GB download! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790384087 | https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6 | MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw== | simonw 9599 | 2021-03-04T07:22:51Z | 2021-03-04T07:22:51Z | MEMBER | #3 also mentions the conflicting version with other tools. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Upgrade to latest sqlite-utils 821841046 | |
790380839 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ== | simonw 9599 | 2021-03-04T07:17:05Z | 2021-03-04T07:17:05Z | MEMBER | Looks like you're doing this: ```python elif message.get_content_type() == "text/plain": body = message.get_payload(decode=True) ``` So presumably that decodes to a unicode string? I imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790379629 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ== | simonw 9599 | 2021-03-04T07:14:41Z | 2021-03-04T07:14:41Z | MEMBER | Confirmed: removing the `len()` call does not speed things up, so it's reading through the entire file for some other purpose too. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790378658 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA== | simonw 9599 | 2021-03-04T07:12:48Z | 2021-03-04T07:12:48Z | MEMBER | It looks like the `body` is being loaded into a BLOB column - so in Datasette default it looks like this: <img width="1650" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924808-b4b96980-7c75-11eb-8c9e-307f2ae32d5a.png"> If I `datasette install datasette-render-binary` and then try again I get this: <img width="1487" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924944-ea5e5280-7c75-11eb-9a32-404f3d68455f.png"> It would be great if we could store the `body` as unicode text instead. May have to do something clever to decode it based on some kind of charset header? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790373024 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA== | simonw 9599 | 2021-03-04T07:01:58Z | 2021-03-04T07:04:06Z | MEMBER | I got 9 warnings that look like this: ``` Errors: 1 Traceback (most recent call last): File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 103, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 167, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` It would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the `mbox` and see what was going on. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790372621 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ== | simonw 9599 | 2021-03-04T07:01:18Z | 2021-03-04T07:01:18Z | MEMBER | I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in `healthkit-to-sqlite` - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file. https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the `progress_callback()` bit) is where that happens. It can be a bit of a convoluted pattern, and I'm not at all sure it would work for `mbox` files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. So I imagine this would not work here. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790370485 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ== | simonw 9599 | 2021-03-04T06:57:25Z | 2021-03-04T06:57:48Z | MEMBER | The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count: https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67 I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
790369076 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng== | simonw 9599 | 2021-03-04T06:54:46Z | 2021-03-04T06:54:46Z | MEMBER | The Rich-powered progress bar is pretty: ![rich](https://user-images.githubusercontent.com/9599/109923307-71f69200-7c73-11eb-9ee2-8f0a240f3994.gif) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
786925280 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA== | simonw 9599 | 2021-02-26T22:23:10Z | 2021-02-26T22:23:10Z | MEMBER | Thanks! I requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR. | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | WIP: Add Gmail takeout mbox import 813880401 | |
777839351 | https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10 | MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ== | simonw 9599 | 2021-02-11T22:37:55Z | 2021-02-11T22:37:55Z | MEMBER | I've merged these changes by hand now, thanks! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | BugFix for encoding and not update info. 770712149 | |
777827396 | https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7 | MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng== | simonw 9599 | 2021-02-11T22:13:14Z | 2021-02-11T22:13:14Z | MEMBER | My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace' 743297582 | |
777821383 | https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9 | MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw== | simonw 9599 | 2021-02-11T22:01:28Z | 2021-02-11T22:01:28Z | MEMBER | Aha! I think I've figured out what's going on here. The CData blocks containing the notes look like this: `<![CDATA[<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><div>This note includes two images.</div><div><br /></div>...` The DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities: ``` <!--=========== External character mnemonic entities ===================--> <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"> %HTMLlat1; <!ENTITY % HTMLsymbol PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"> %HTMLsymbol; <!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"> %HTMLspecial; ``` So I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this: ```python {'Aacute': 'Á', 'aacute': 'á', 'Aacute;': 'Á', 'aacute;': 'á', 'Abreve;': 'Ă', 'abreve;': 'ă', 'ac;': '∾', 'acd;': '∿', # ... } ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | ParseError: undefined entity š 748372469 | |
777798330 | https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330 | https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA== | simonw 9599 | 2021-02-11T21:18:58Z | 2021-02-11T21:18:58Z | MEMBER | Thanks for the fix! | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | XML parse error 792851444 | |
770071568 | https://github.com/dogsheep/github-to-sqlite/issues/60#issuecomment-770071568 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60 | MDEyOklzc3VlQ29tbWVudDc3MDA3MTU2OA== | simonw 9599 | 2021-01-29T21:56:15Z | 2021-01-29T21:56:15Z | MEMBER | I really like the way you're using pipes here - really smart. It's similar to how I build the demo database in this GitHub Actions workflow: https://github.com/dogsheep/github-to-sqlite/blob/62dfd3bc4014b108200001ef4bc746feb6f33b45/.github/workflows/deploy-demo.yml#L52-L82 `twitter-to-sqlite` actually has a mechanism for doing this kind of thing, documented at https://github.com/dogsheep/twitter-to-sqlite#providing-input-from-a-sql-query-with---sql-and---attach It lets you do things like: ``` $ twitter-to-sqlite users-lookup my.db --sql="select follower_id from following" --ids ``` Maybe I should add something similar to `github-to-sqlite`? Feels like it could be really useful. | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Use Data from SQLite in other commands 797097140 | |
769957751 | https://github.com/dogsheep/twitter-to-sqlite/issues/56#issuecomment-769957751 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56 | MDEyOklzc3VlQ29tbWVudDc2OTk1Nzc1MQ== | simonw 9599 | 2021-01-29T17:59:40Z | 2021-01-29T17:59:40Z | MEMBER | This is interesting - how did you create that initial table? Was this using the `twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip` command, or something else? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Not all quoted statuses get fetched? 796736607 | |
761967094 | https://github.com/dogsheep/swarm-to-sqlite/issues/11#issuecomment-761967094 | https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/11 | MDEyOklzc3VlQ29tbWVudDc2MTk2NzA5NA== | simonw 9599 | 2021-01-18T04:11:13Z | 2021-01-18T04:11:13Z | MEMBER | I just got a similar error: ``` File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/swarm_to_sqlite/utils.py", line 79, in save_checkin checkins_table.m2m("users", user, m2m_table="with", pk="id") File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 2048, in m2m id = other_table.insert(record, pk=pk, replace=True).last_pk File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1781, in insert return self.insert_all( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1899, in insert_all self.insert_chunk( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1709, in insert_chunk result = self.db.execute(query, params) File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 226, in execute return self.conn.execute(sql, parameters) pysqlite3.dbapi2.OperationalError: table users has no column named countryCode ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Error thrown: sqlite3.OperationalError: table users has no column named lastName 743400216 | |
748426877 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426877 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjg3Nw== | simonw 9599 | 2020-12-19T06:16:11Z | 2020-12-19T06:16:11Z | MEMBER | Here's why: if "fts5" in str(e): But the error being raised here is: sqlite3.OperationalError: no such column: to I'm going to attempt the escaped on on every error. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426663 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426663 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjY2Mw== | simonw 9599 | 2020-12-19T06:14:06Z | 2020-12-19T06:14:06Z | MEMBER | Looks like I already do that here: https://github.com/dogsheep/dogsheep-beta/blob/9ba4401017ac24ffa3bc1db38e0910ea49de7616/dogsheep_beta/__init__.py#L141-L146 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426501 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426501 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjUwMQ== | simonw 9599 | 2020-12-19T06:12:22Z | 2020-12-19T06:12:22Z | MEMBER | I deliberately added support for advanced FTS in https://github.com/dogsheep/dogsheep-beta/commit/cbb2491b85d7ff416d6d429b60109e6c2d6d50b9 for #13 but that's the cause of this bug. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426581 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426581 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjU4MQ== | simonw 9599 | 2020-12-19T06:13:17Z | 2020-12-19T06:13:17Z | MEMBER | One fix for this could be to try running the raw query, but if it throws an error run it again with the query escaped. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
747126777 | https://github.com/dogsheep/google-takeout-to-sqlite/issues/2#issuecomment-747126777 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/2 | MDEyOklzc3VlQ29tbWVudDc0NzEyNjc3Nw== | simonw 9599 | 2020-12-17T00:36:52Z | 2020-12-17T00:36:52Z | MEMBER | The memory profiler tricks I used in https://github.com/dogsheep/healthkit-to-sqlite/issues/7 could help figure out what's going on here. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | killed by oomkiller on large location-history 769376447 | |
747034481 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747034481 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzNDQ4MQ== | simonw 9599 | 2020-12-16T21:17:05Z | 2020-12-16T21:17:05Z | MEMBER | I'm just going to add `q` for the moment. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747031608 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747031608 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzMTYwOA== | simonw 9599 | 2020-12-16T21:15:18Z | 2020-12-16T21:15:18Z | MEMBER | Should I pass any other details to the `display_sql` here as well? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747030964 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747030964 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzMDk2NA== | simonw 9599 | 2020-12-16T21:14:54Z | 2020-12-16T21:14:54Z | MEMBER | To do this I'll need the search term to be passed to the `display_sql` SQL query: https://github.com/dogsheep/dogsheep-beta/blob/4890ec87b5e2ec48940f32c9ad1f5aae25c75a4d/dogsheep_beta/__init__.py#L164-L171 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747029636 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747029636 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAyOTYzNg== | simonw 9599 | 2020-12-16T21:14:03Z | 2020-12-16T21:14:03Z | MEMBER | I think I can do this as a cunning trick in `display_sql`. Consider this example query: https://til.simonwillison.net/tils?sql=select%0D%0A++path%2C%0D%0A++snippet%28til_fts%2C+-1%2C+%27b4de2a49c8%27%2C+%278c94a2ed4b%27%2C+%27...%27%2C+60%29+as+snippet%0D%0Afrom%0D%0A++til%0D%0A++join+til_fts+on+til.rowid+%3D+til_fts.rowid%0D%0Awhere%0D%0A++til_fts+match+escape_fts%28%3Aq%29%0D%0A++and+path+%3D+%27asgi_lifespan-test-httpx.md%27%0D%0A&q=pytest ```sql select path, snippet(til_fts, -1, 'b4de2a49c8', '8c94a2ed4b', '...', 60) as snippet from til join til_fts on til.rowid = til_fts.rowid where til_fts match escape_fts(:q) and path = 'asgi_lifespan-test-httpx.md' ``` The `and path = 'asgi_lifespan-test-httpx.md'` bit means we only get back a specific document - but the snippet highlighting is applied to it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
746735889 | https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746735889 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58 | MDEyOklzc3VlQ29tbWVudDc0NjczNTg4OQ== | simonw 9599 | 2020-12-16T17:59:50Z | 2020-12-16T17:59:50Z | MEMBER | I don't want to add a full HTML parser (like BeautifulSoup) as a dependency for this feature. Since the HTML comes from a single, trusted source (GitHub) I could probably handle this using [regular expressions](https://stackoverflow.com/a/1732454). | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Readme HTML has broken internal links 769150394 | |
746734412 | https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746734412 | https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58 | MDEyOklzc3VlQ29tbWVudDc0NjczNDQxMg== | simonw 9599 | 2020-12-16T17:58:56Z | 2020-12-16T17:58:56Z | MEMBER | I'm going to rewrite those `<a href="#filtering-tables">` links to `<a href="#user-content-filtering-tables">` - but only if a corresponding `id="user-content-filtering-tables"` element exists. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Readme HTML has broken internal links 769150394 | |
633704127 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633704127 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYzMzcwNDEyNw== | simonw 9599 | 2020-05-25T20:14:22Z | 2020-05-25T20:14:22Z | MEMBER | https://github.com/dogsheep/dogsheep-photos/blob/0.4.1/README.md#serving-photos-locally-with-datasette-media | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
633629944 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633629944 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYzMzYyOTk0NA== | simonw 9599 | 2020-05-25T15:47:42Z | 2020-05-25T15:47:42Z | MEMBER | I'll add a proper section to the README, but for the moment here's how I do this. First, install `datasette` and the `datasette-media` plugin. Create a `metadata.yaml` file with the following content: ```yaml plugins: datasette-media: photo: sql: |- select path as filepath, 200 as resize_height from apple_photos where uuid = :key photo-big: sql: |- select path as filepath, 1024 as resize_height from apple_photos where uuid = :key ``` Now run `datasette -m metadata.yaml photos.db` - thumbnails will be served at http://127.0.0.1:8001/-/media/photo/F4469918-13F3-43D8-9EC1-734C0E6B60AD and larger sizes of the image at http://127.0.0.1:8001/-/media/photo-big/A8B02C7D-365E-448B-9510-69F80C26304D I also made myself two custom pages, one showing recent images and one showing random images. To do this, install the `datasette-template-sql` plugin and then create a `templates/pages` directory and add these files: `recent-photos.html` ```html <h1>Recent photos</h1> <div> {% for photo in sql("select * from apple_photos order by date desc limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` `random-photos.html` ```html <h1>Random photos</h1> <div> {% for photo in sql("with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` Now run `datasette -m metadata.yaml photos.db --template-dir=templates/` Visit http://127.0.0.1:8001/random-photos to see some random photos or http://127.0.0.1:8002/recent-photos for recent photos. This is using this mechanism: https://datasette.readthedocs.io/en/stable/custom_templates.html#custom-pages | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
633626741 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633626741 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYzMzYyNjc0MQ== | simonw 9599 | 2020-05-25T15:38:55Z | 2020-05-25T15:38:55Z | MEMBER | Sure, I should absolutely document this! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
633644225 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633644225 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYzMzY0NDIyNQ== | simonw 9599 | 2020-05-25T16:30:44Z | 2020-05-25T16:30:44Z | MEMBER | I'll add docs on using `datasette-json-html` too. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
633643921 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633643921 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYzMzY0MzkyMQ== | simonw 9599 | 2020-05-25T16:29:44Z | 2020-05-25T16:29:44Z | MEMBER | https://github.com/dogsheep/dogsheep-photos/blob/dc43fa8653cb9c7238a36f52239b91d1ec916d5c/README.md#serving-photos-locally-with-datasette-media | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
631229409 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229409 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyOTQwOQ== | simonw 9599 | 2020-05-20T04:30:40Z | 2020-05-20T04:30:40Z | MEMBER | https://pypi.org/project/photos-to-sqlite/ now links to dogsheep-photos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631229485 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229485 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyOTQ4NQ== | simonw 9599 | 2020-05-20T04:31:02Z | 2020-05-20T04:31:02Z | MEMBER | https://pypi.org/project/dogsheep-photos/ is live. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631227245 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227245 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNzI0NQ== | simonw 9599 | 2020-05-20T04:21:38Z | 2020-05-20T04:21:38Z | MEMBER | I'm going to release 0.4 now. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631227105 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227105 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNzEwNQ== | simonw 9599 | 2020-05-20T04:21:06Z | 2020-05-20T04:21:06Z | MEMBER | Then I just need to push a final photos-to-sqlite release that updates the README to tell people about the name change. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631227020 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227020 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNzAyMA== | simonw 9599 | 2020-05-20T04:20:48Z | 2020-05-20T04:21:16Z | MEMBER | Next time I push a release it will create `dogsheep-photos` on PyPI. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631226953 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226953 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNjk1Mw== | simonw 9599 | 2020-05-20T04:20:34Z | 2020-05-20T04:20:34Z | MEMBER | Huh, it looks like Circle CI picked up the name change automatically. https://app.circleci.com/pipelines/github/dogsheep/dogsheep-photos | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631226572 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226572 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNjU3Mg== | simonw 9599 | 2020-05-20T04:18:52Z | 2020-05-20T04:18:52Z | MEMBER | Need to reconfigure Circle CI. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631226481 | https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226481 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26 | MDEyOklzc3VlQ29tbWVudDYzMTIyNjQ4MQ== | simonw 9599 | 2020-05-20T04:18:29Z | 2020-05-20T04:18:29Z | MEMBER | I just renamed the repository. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename project to dogsheep-photos 621444763 | |
631255206 | https://github.com/dogsheep/dogsheep-photos/issues/24#issuecomment-631255206 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/24 | MDEyOklzc3VlQ29tbWVudDYzMTI1NTIwNg== | simonw 9599 | 2020-05-20T06:00:25Z | 2020-05-20T06:00:25Z | MEMBER | This needs documentation. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Configurable URL for images 621323348 | |
631253852 | https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253852 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25 | MDEyOklzc3VlQ29tbWVudDYzMTI1Mzg1Mg== | simonw 9599 | 2020-05-20T05:56:17Z | 2020-05-21T22:26:16Z | MEMBER | I have a `deploy-demo.sh` script now: ```bash #!/bin/bash if [ -f public.db ]; then rm public.db fi pipenv run dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" pipenv run sqlite-utils create-view public.db photos_on_a_map \ "select date, latitude, longitude, apple_photos.sha256, uploads.ext, json_object( 'title', 'Taken on ' || date, 'image', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=400', 'link', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=1200' ) as popup from apple_photos join uploads on apple_photos.sha256 = uploads.sha256 where latitude is not null order by date desc" \ --replace pipenv run datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-pretty-json \ --install=datasette-cluster-map>=0.10 \ --title "Dogsheep Photos demo" ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a public demo 621332242 | |
631253248 | https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253248 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25 | MDEyOklzc3VlQ29tbWVudDYzMTI1MzI0OA== | simonw 9599 | 2020-05-20T05:54:18Z | 2020-05-20T05:54:18Z | MEMBER | https://dogsheep-photos.dogsheep.net/ | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a public demo 621332242 | |
631253136 | https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253136 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25 | MDEyOklzc3VlQ29tbWVudDYzMTI1MzEzNg== | simonw 9599 | 2020-05-20T05:53:58Z | 2020-05-20T05:53:58Z | MEMBER | Updated deploy command: ``` datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map \ --title "Dogsheep Photos demo" ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a public demo 621332242 | |
631251707 | https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631251707 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25 | MDEyOklzc3VlQ29tbWVudDYzMTI1MTcwNw== | simonw 9599 | 2020-05-20T05:49:27Z | 2020-05-21T15:58:42Z | MEMBER | Renaming this demo to `dogsheep-photos.dogsheep.net` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a public demo 621332242 | |
631127454 | https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631127454 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25 | MDEyOklzc3VlQ29tbWVudDYzMTEyNzQ1NA== | simonw 9599 | 2020-05-19T22:48:00Z | 2020-05-21T15:58:32Z | MEMBER | I built #23 to help with this. $ dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" And publish with Vercel: $ datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a public demo 621332242 | |
631120771 | https://github.com/dogsheep/dogsheep-photos/issues/23#issuecomment-631120771 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/23 | MDEyOklzc3VlQ29tbWVudDYzMTEyMDc3MQ== | simonw 9599 | 2020-05-19T22:32:48Z | 2020-05-19T22:32:48Z | MEMBER | Documentation: https://github.com/dogsheep/photos-to-sqlite/blob/e2fab012551eed05278040b5d57e7373a1b9a0bf/README.md#creating-a-subset-database | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | create-subset command for creating a publishable subset of a photos database 621280529 | |
626941278 | https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626941278 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 | MDEyOklzc3VlQ29tbWVudDYyNjk0MTI3OA== | simonw 9599 | 2020-05-11T20:25:58Z | 2020-05-11T20:25:58Z | MEMBER | Interesting - do you know if there's anything the `exiftool` process handles that `ExifReader` doesn't? I'm actually just going to extract a subset of the EXIF data at first - since the original photo files will always be available I don't feel the need to get everything out for the first step. My plan is to use EXIF to help support photo collections that aren't in Apple Photos - I'm going to build a database table keyed by the `sha256` of each photo that extracts the camera make, lens, a few settings (ISO, aperture etc) and the GPS lat/lon. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Try out ExifReader 615626118 | |
626395781 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395781 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTc4MQ== | simonw 9599 | 2020-05-10T21:57:09Z | 2020-05-10T21:57:09Z | MEMBER | Yes, I just recreated my virtual environment from scratch and the error went away. The problem occurred when I ran `pip install datasette-bplist` in the same virtual environment - https://github.com/simonw/datasette-bplist/blob/master/setup.py depends on `bpylist` which is incompatible with `bpylist2`. | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626395209 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395209 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTIwOQ== | simonw 9599 | 2020-05-10T21:52:42Z | 2020-05-10T21:52:42Z | MEMBER | Aha! It looks like I accidentally installed the old bplist into the same environment: ``` $ pip freeze | grep bpylist bpylist==0.1.4 bpylist2==3.0.0 ``` | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626395103 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395103 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NTEwMw== | simonw 9599 | 2020-05-10T21:51:36Z | 2020-05-10T21:51:36Z | MEMBER | @RhetTbull I tried that workaround and it turns out I'm getting this error on ALL of my photos now! It's weird: a few day ago this wasn't happening. Now it's happening to everything. I'm not sure what I might have changed. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626394989 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626394989 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM5NDk4OQ== | simonw 9599 | 2020-05-10T21:50:36Z | 2020-05-10T21:50:36Z | MEMBER | https://github.com/Marketcircle/bpylist/pull/2 looks relevant here. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626388837 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388837 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM4ODgzNw== | simonw 9599 | 2020-05-10T20:59:32Z | 2020-05-10T20:59:32Z | MEMBER | So it appears it's possible for `photo.place` to raise that exception. A workaround could be to catch that and treat those photos as not having a place. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
626388764 | https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388764 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 | MDEyOklzc3VlQ29tbWVudDYyNjM4ODc2NA== | simonw 9599 | 2020-05-10T20:58:52Z | 2020-05-10T20:58:52Z | MEMBER | More from the debugger: ``` > /Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py(614)place() -> self._place = PlaceInfo5(self._info["reverse_geolocation"]) ``` And: ``` > /Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/utils.py(91)osxphoto_to_row() -> place = photo.place ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990 | |
625947133 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-625947133 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYyNTk0NzEzMw== | simonw 9599 | 2020-05-08T18:13:06Z | 2020-05-08T18:13:06Z | MEMBER | `datasette-media` will be able to handle this once I implement https://github.com/simonw/datasette-media/issues/3 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
624408738 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408738 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYyNDQwODczOA== | simonw 9599 | 2020-05-06T02:21:05Z | 2020-05-06T02:21:32Z | MEMBER | Here's rendering code from my hacked-together not-yet-released S3 image proxy: ```python from starlette.responses import Response from PIL import Image, ExifTags import pyheif for ORIENTATION_TAG in ExifTags.TAGS.keys(): if ExifTags.TAGS[ORIENTATION_TAG] == "Orientation": break ... # Load it into Pillow if ext == "heic": heic = pyheif.read_heif(image_response.content) image = Image.frombytes(mode=heic.mode, size=heic.size, data=heic.data) else: image = Image.open(io.BytesIO(image_response.content)) # Does EXIF tell us to rotate it? try: exif = dict(image._getexif().items()) if exif[ORIENTATION_TAG] == 3: image = image.rotate(180, expand=True) elif exif[ORIENTATION_TAG] == 6: image = image.rotate(270, expand=True) elif exif[ORIENTATION_TAG] == 8: image = image.rotate(90, expand=True) except (AttributeError, KeyError, IndexError): pass # Resize based on ?w= and ?h=, if set width, height = image.size w = request.query_params.get("w") h = request.query_params.get("h") if w is not None or h is not None: if h is None: # Set h based on w w = int(w) h = int((float(height) / width) * w) elif w is None: h = int(h) # Set w based on h w = int((float(width) / height) * h) w = int(w) h = int(h) image.thumbnail((w, h)) # ?bw= converts to black and white if request.query_params.get("bw"): image = image.convert("L") # ?q= sets the quality - defaults to 75 quality = 75 q = request.query_params.get("q") if q and q.isdigit() and 1 <= int(q) <= 100: quality = int(q) # Output as JPEG or PNG output_image = io.BytesIO() image_type = "JPEG" kwargs = {"quality": quality} if image.format == "PNG": image_type = "PNG" kwargs = {} … | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 | |
624408370 | https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408370 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20 | MDEyOklzc3VlQ29tbWVudDYyNDQwODM3MA== | simonw 9599 | 2020-05-06T02:19:27Z | 2020-05-06T02:19:27Z | MEMBER | The plugin can be generalized: it can be configured to know how to take the URL path, look it up in ANY table (via a custom SQL query) to get a path on disk and then serve that. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Ability to serve thumbnailed Apple Photo from its place on disk 613006393 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
updated_at (date) >30 ✖