github: issue_comments: 525 rows where author_association = "MEMBER" sorted by id descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
622171097	https://github.com/dogsheep/github-to-sqlite/issues/33#issuecomment-622171097	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/33	MDEyOklzc3VlQ29tbWVudDYyMjE3MTA5Nw==	simonw 9599	2020-04-30T23:22:45Z	2020-04-30T23:23:57Z	MEMBER	The `auth.json` mechanism this uses is standard across all of the other Dogsheep tools - it's actually designed so you can have one `auth.json` with a bunch of different credentials for different tools: ```json { "goodreads_personal_token": "...", "goodreads_user_id": "...", "github_personal_token": "...", "pocket_consumer_key": "...", "pocket_username": "...", "pocket_access_token": "..." } ``` But... `github-to-sqlite` does feel like it deserves a special case here, since it's such a good fit for running inside of GitHub Actions - which even provide a `GITHUB_TOKEN` for you to use! So I don't think it will harm the family of tools too much if this has an environment variable alternative to the `-a` file.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Fall back to authentication via ENV 609950090
622169728	https://github.com/dogsheep/github-to-sqlite/issues/33#issuecomment-622169728	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/33	MDEyOklzc3VlQ29tbWVudDYyMjE2OTcyOA==	simonw 9599	2020-04-30T23:18:51Z	2020-04-30T23:18:51Z	MEMBER	Sure, that sounds fine to me.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Fall back to authentication via ENV 609950090
622162835	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622162835	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjE2MjgzNQ==	simonw 9599	2020-04-30T22:59:26Z	2020-04-30T22:59:26Z	MEMBER	Documentation: https://github.com/dogsheep/github-to-sqlite/blob/c9f48404481882e8b3af06f35e4801a80ac79ed6/README.md#scraping-dependents-for-a-repository	{"total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 2, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
622136585	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622136585	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjEzNjU4NQ==	simonw 9599	2020-04-30T21:55:51Z	2020-04-30T21:55:51Z	MEMBER	And to find the "Next" pagination link: ```python soup.select(".paginate-container")[0].find("a", text="Next") ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
622135654	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622135654	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjEzNTY1NA==	simonw 9599	2020-04-30T21:53:44Z	2020-04-30T21:56:06Z	MEMBER	I think this is the neatest scraping pattern: ```python [a["href"].lstrip("/") for a in soup.select("a[data-hovercard-type=repository]")] ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
622133775	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622133775	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjEzMzc3NQ==	simonw 9599	2020-04-30T21:49:27Z	2020-04-30T21:49:27Z	MEMBER	Proposed command: github-to-sqlite scrape-dependents github.db simonw/datasette I'll pull full details of the scraped repos from the regular API. I'll also record when they were "first seen" by the command.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
622133422	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622133422	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjEzMzQyMg==	simonw 9599	2020-04-30T21:48:39Z	2020-04-30T21:48:39Z	MEMBER	It looks like the only option is to scrape them. I'll do that and then replace with an API as soon as one becomes available.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
622133298	https://github.com/dogsheep/github-to-sqlite/issues/34#issuecomment-622133298	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/34	MDEyOklzc3VlQ29tbWVudDYyMjEzMzI5OA==	simonw 9599	2020-04-30T21:48:24Z	2020-04-30T21:48:24Z	MEMBER	Unfortunately it's not available through any GitHub API - I managed to figure out how to get dependencies, but I need dependents. https://github.com/simonw/til/blob/master/github/dependencies-graphql-api.md	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for retrieving dependents for a repo 610408908
620774507	https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620774507	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14	MDEyOklzc3VlQ29tbWVudDYyMDc3NDUwNw==	simonw 9599	2020-04-28T18:19:06Z	2020-04-28T18:19:06Z	MEMBER	The default timeout is a bit aggressive and sometimes failed for me if my resizing proxy took too long to fetch and resize the image. `client.annotate_image(..., timeout=3.0)` may be worth trying.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Annotate photos using the Google Cloud Vision API 608512747
620772190	https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620772190	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14	MDEyOklzc3VlQ29tbWVudDYyMDc3MjE5MA==	simonw 9599	2020-04-28T18:14:43Z	2020-04-28T18:14:43Z	MEMBER	Database schema for this will require some thought. Just dumping the output into a JSON column isn't going to be flexible enough - I want to be able to FTS against labels and OCR text, and potentially query against other characteristics too.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Annotate photos using the Google Cloud Vision API 608512747
620771698	https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771698	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14	MDEyOklzc3VlQ29tbWVudDYyMDc3MTY5OA==	simonw 9599	2020-04-28T18:13:48Z	2020-04-28T18:13:48Z	MEMBER	For face detection: ``` {"type": vision.enums.Feature.Type.FACE_DETECTION} ``` For OCR: ``` {"type": vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION} ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Annotate photos using the Google Cloud Vision API 608512747
620771067	https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771067	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14	MDEyOklzc3VlQ29tbWVudDYyMDc3MTA2Nw==	simonw 9599	2020-04-28T18:12:34Z	2020-04-28T18:15:38Z	MEMBER	Python library docs: https://googleapis.dev/python/vision/latest/index.html I'm creating a new project for this called simonwillison-photos: https://console.cloud.google.com/projectcreate https://console.cloud.google.com/home/dashboard?project=simonwillison-photos Then I enabled the Vision API. The direct link to https://console.cloud.google.com/flows/enableapi?apiid=vision-json.googleapis.com which they provided in the docs didn't work - it gave me a "You don't have sufficient permissions to use the requested API" error - but starting at the "Enable APIs" page and searching for it worked fine. I created a new service account as an "owner" of that project: https://console.cloud.google.com/apis/credentials/serviceaccountkey (and complained about it on Twitter and through their feedback form) `pip install google-cloud-vision` ```python from google.cloud import vision client = vision.ImageAnnotatorClient.from_service_account_file("simonwillison-photos-18c570b301fe.json") # Photo of a lemur response = client.annotate_image( { "image": { "source": { "image_uri": "https://photos.simonwillison.net/i/1b3414ee9ade67ce04ade9042e6d4b433d1e523c9a16af17f490e2c0a619755b.jpeg" } }, "features": [ {"type": vision.enums.Feature.Type.IMAGE_PROPERTIES}, {"type": vision.enums.Feature.Type.OBJECT_LOCALIZATION}, {"type": vision.enums.Feature.Type.LABEL_DETECTION}, ], } ) response ``` Output is: ``` label_annotations { mid: "/m/09686" description: "Vertebrate" score: 0.9851104021072388 topicality: 0.9851104021072388 } label_annotations { mid: "/m/04rky" description: "Mammal" score: 0.975814163684845 topicality: 0.975814163684845 } label_annotations { mid: "/m/01280g" description: "Wildlife" score: 0.8973650336265564 topicality: 
0.8973650336265564 } label_annotations { mid: "/m/02f9pk" description: "Lemur" score: 0.8270352482795715 …	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Annotate photos using the Google Cloud Vision API 608512747
620769348	https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620769348	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14	MDEyOklzc3VlQ29tbWVudDYyMDc2OTM0OA==	simonw 9599	2020-04-28T18:09:21Z	2020-04-28T18:09:21Z	MEMBER	Pricing is pretty good: free for first 1,000 calls per month, then $1.50 per thousand after that.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Annotate photos using the Google Cloud Vision API 608512747
620309185	https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620309185	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13	MDEyOklzc3VlQ29tbWVudDYyMDMwOTE4NQ==	simonw 9599	2020-04-28T00:39:45Z	2020-04-28T00:39:45Z	MEMBER	I'm going to leave this until I have the mechanism for associating a live photo video with the photo.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Also upload movie files 607888367
620273692	https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620273692	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13	MDEyOklzc3VlQ29tbWVudDYyMDI3MzY5Mg==	simonw 9599	2020-04-27T22:42:50Z	2020-04-27T22:42:50Z	MEMBER	``` >>> def ext_counts(directory): ... counts = {} ... for path in pathlib.Path(directory).glob("**/*"): ... ext = path.suffix ... counts[ext] = counts.get(ext, 0) + 1 ... return counts ... >>> >>> ext_counts("/Users/simon/Pictures/Photos Library.photoslibrary/originals") {'': 16, '.heic': 15478, '.jpeg': 21691, '.mov': 946, '.png': 2262, '.gif': 38, '.mp4': 116, '.aae': 2} ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Also upload movie files 607888367
618796564	https://github.com/dogsheep/dogsheep-photos/issues/12#issuecomment-618796564	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/12	MDEyOklzc3VlQ29tbWVudDYxODc5NjU2NA==	simonw 9599	2020-04-24T04:35:25Z	2020-04-24T04:35:25Z	MEMBER	Code: https://github.com/dogsheep/photos-to-sqlite/blob/a388cf1f1b6b67752d669466cda8b171b6582171/photos_to_sqlite/cli.py#L109-L114	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	If less than 500MB, show size in MB not GB 606033104
618725155	https://github.com/dogsheep/dogsheep-photos/issues/9#issuecomment-618725155	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/9	MDEyOklzc3VlQ29tbWVudDYxODcyNTE1NQ==	simonw 9599	2020-04-23T23:39:14Z	2020-04-23T23:39:14Z	MEMBER	A few minutes later... ``` Fetching existing keys from S3... Got 22,446 existing keys Calculating hashes [####################################] 100% 22,441 hashed files, 610 are not yet in S3 Uploading 0.99 GB Uploading 610 photos [------------------------------------] 1/610 03:10:35 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	upload command should be resumable, should only upload photos not already uploaded 605938063
618724149	https://github.com/dogsheep/dogsheep-photos/issues/9#issuecomment-618724149	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/9	MDEyOklzc3VlQ29tbWVudDYxODcyNDE0OQ==	simonw 9599	2020-04-23T23:35:29Z	2020-04-23T23:35:29Z	MEMBER	``` % photos-to-sqlite upload photos.db ~/Pictures/Photos\ Library.photoslibrary/originals Fetching existing keys from S3... Got 22,446 existing keys Calculating hashes [####--------------------------------] 13% 00:04:14 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	upload command should be resumable, should only upload photos not already uploaded 605938063
618100658	https://github.com/dogsheep/dogsheep-photos/issues/8#issuecomment-618100658	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/8	MDEyOklzc3VlQ29tbWVudDYxODEwMDY1OA==	simonw 9599	2020-04-23T00:03:35Z	2020-04-23T00:03:35Z	MEMBER	Also MD5 isn't guaranteed for the ETag: > If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Should I have used MD5 instead of SHA256? 605147638
618100434	https://github.com/dogsheep/dogsheep-photos/issues/8#issuecomment-618100434	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/8	MDEyOklzc3VlQ29tbWVudDYxODEwMDQzNA==	simonw 9599	2020-04-23T00:02:53Z	2020-04-23T00:02:53Z	MEMBER	I don't think it matters one way or the other - I'm storing the sha256 in the filename, so the fact that I could read the MD5 back from the list bucket operation doesn't give me any benefits.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Should I have used MD5 instead of SHA256? 605147638
617491607	https://github.com/dogsheep/github-to-sqlite/issues/31#issuecomment-617491607	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/31	MDEyOklzc3VlQ29tbWVudDYxNzQ5MTYwNw==	simonw 9599	2020-04-22T01:20:19Z	2020-04-22T01:20:19Z	MEMBER	https://github-to-sqlite.dogsheep.net/github/milestones now link to repo: <img width="1095" alt="Screen Shot 2020-04-21 at 6 19 03 PM" src="https://user-images.githubusercontent.com/9599/79930145-956a7280-83fc-11ea-9b47-b0c5674a0d4f.png"> And so do issues: https://github-to-sqlite.dogsheep.net/github/issues <img width="1095" alt="Screen Shot 2020-04-21 at 6 19 53 PM" src="https://user-images.githubusercontent.com/9599/79930200-bd59d600-83fc-11ea-8fb8-b40772140409.png">	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue and milestone should have foreign key to repo 603624862
617490914	https://github.com/dogsheep/github-to-sqlite/issues/32#issuecomment-617490914	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/32	MDEyOklzc3VlQ29tbWVudDYxNzQ5MDkxNA==	simonw 9599	2020-04-22T01:17:44Z	2020-04-22T01:17:44Z	MEMBER	https://github-to-sqlite.dogsheep.net/github?sql=select+html_url%2C+id%2C+issue+from+issue_comments+order+by+updated_at+desc+limit+101 now shows issues. And https://github-to-sqlite.dogsheep.net/github/issue_comments links to them: <img width="395" alt="Screen Shot 2020-04-21 at 6 17 33 PM" src="https://user-images.githubusercontent.com/9599/79930066-6227e380-83fc-11ea-8416-8b9d55660cb5.png">	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue comments don't appear to populate issues foreign key 604222295
617369247	https://github.com/dogsheep/github-to-sqlite/issues/32#issuecomment-617369247	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/32	MDEyOklzc3VlQ29tbWVudDYxNzM2OTI0Nw==	simonw 9599	2020-04-21T19:33:03Z	2020-04-21T19:33:03Z	MEMBER	Caused by #31.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue comments don't appear to populate issues foreign key 604222295
617364956	https://github.com/dogsheep/github-to-sqlite/issues/32#issuecomment-617364956	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/32	MDEyOklzc3VlQ29tbWVudDYxNzM2NDk1Ng==	simonw 9599	2020-04-21T19:24:45Z	2020-04-21T19:24:45Z	MEMBER	That's because I just broke this code: https://github.com/dogsheep/github-to-sqlite/blob/2cf75a0a036719eb7e57fdc7c5c2ea0f4c26978a/github_to_sqlite/utils.py#L131-L139 It expects the `repo` column to be `simonw/datasette` but it's now an ID instead. I should add a test for this as part of the fix.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue comments don't appear to populate issues foreign key 604222295
617348174	https://github.com/dogsheep/github-to-sqlite/issues/31#issuecomment-617348174	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/31	MDEyOklzc3VlQ29tbWVudDYxNzM0ODE3NA==	simonw 9599	2020-04-21T18:50:29Z	2020-04-21T18:50:29Z	MEMBER	Since this represents a breaking schema change for anyone running SQL queries against these tables, I'm going to do a major version bump to 2.0 when I release this.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue and milestone should have foreign key to repo 603624862
616884647	https://github.com/dogsheep/github-to-sqlite/issues/31#issuecomment-616884647	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/31	MDEyOklzc3VlQ29tbWVudDYxNjg4NDY0Nw==	simonw 9599	2020-04-21T00:49:16Z	2020-04-21T00:50:20Z	MEMBER	The API just gives us the `repository_url`: https://api.github.com/repos/simonw/datasette/issues ![Mozilla_Firefox_and_Topic__Week_2__Discussion__Submit_your_six_story_points_here](https://user-images.githubusercontent.com/9599/79812950-283cdb80-832f-11ea-8759-9633087d1e7e.png) We currently turn that into a `simonw/datasette` string here: https://github.com/dogsheep/github-to-sqlite/blob/e0e8d8caa9657b04bfb8a2cf16c9b580f38b1805/github_to_sqlite/utils.py#L43-L46	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issue and milestone should have foreign key to repo 603624862
616883726	https://github.com/dogsheep/github-to-sqlite/issues/30#issuecomment-616883726	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/30	MDEyOklzc3VlQ29tbWVudDYxNjg4MzcyNg==	simonw 9599	2020-04-21T00:45:23Z	2020-04-21T00:45:23Z	MEMBER	Demo of fix: https://github-to-sqlite.dogsheep.net/github/issues?assignee__notblank=1&milestone__notblank=1 ![github__issues__4_rows_where_where_assignee_is_not_blank_and_milestone_is_not_blank_sorted_by_updated_at_descending](https://user-images.githubusercontent.com/9599/79812758-b49ace80-832e-11ea-81db-bdf993b872cc.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issues milestone column is the wrong type 603618244
616883275	https://github.com/dogsheep/github-to-sqlite/issues/29#issuecomment-616883275	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/29	MDEyOklzc3VlQ29tbWVudDYxNjg4MzI3NQ==	simonw 9599	2020-04-21T00:43:28Z	2020-04-21T00:43:28Z	MEMBER	I'm copying repo from issue, which surprisingly is a string, not an integer ID.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Milestones should have foreign key to creator and repo 603617013
616879753	https://github.com/dogsheep/github-to-sqlite/issues/30#issuecomment-616879753	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/30	MDEyOklzc3VlQ29tbWVudDYxNjg3OTc1Mw==	simonw 9599	2020-04-21T00:29:29Z	2020-04-21T00:29:29Z	MEMBER	`assignee` looks like it's the incorrect type too.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Issues milestone column is the wrong type 603618244
616029262	https://github.com/dogsheep/twitter-to-sqlite/issues/45#issuecomment-616029262	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/45	MDEyOklzc3VlQ29tbWVudDYxNjAyOTI2Mg==	simonw 9599	2020-04-19T04:39:21Z	2020-04-19T04:39:21Z	MEMBER	![44714E00-8CC5-46CD-9E48-1F4DD148FCC8](https://user-images.githubusercontent.com/9599/79679696-09b6d300-81bd-11ea-80e4-0653d92e4f58.jpeg)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Use raise_for_status() everywhere 602619330
615993178	https://github.com/dogsheep/dogsheep-photos/issues/7#issuecomment-615993178	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/7	MDEyOklzc3VlQ29tbWVudDYxNTk5MzE3OA==	simonw 9599	2020-04-19T00:37:08Z	2020-04-19T00:37:08Z	MEMBER	https://pypi.org/project/ImageHash/ Is one option.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Integrate image content hashing 602585497
615983393	https://github.com/dogsheep/dogsheep-photos/issues/6#issuecomment-615983393	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/6	MDEyOklzc3VlQ29tbWVudDYxNTk4MzM5Mw==	simonw 9599	2020-04-18T23:53:10Z	2020-04-18T23:53:10Z	MEMBER	``` $ photos-to-sqlite upload photos3.db ~/Pictures/Photos\ Library.photoslibrary/Masters/2020 Uploading 2.09 GB [##----------------------------------] 6% 00:36:37 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add progress bar to upload command 602575575
615979923	https://github.com/dogsheep/dogsheep-photos/issues/6#issuecomment-615979923	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/6	MDEyOklzc3VlQ29tbWVudDYxNTk3OTkyMw==	simonw 9599	2020-04-18T23:36:02Z	2020-04-18T23:36:02Z	MEMBER	I'll use a Click progress bar. To do this I need to first calculate the sum number of bytes in the photos that are going to be uploaded, then run the upload.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add progress bar to upload command 602575575
615957385	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615957385	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk1NzM4NQ==	simonw 9599	2020-04-18T21:56:16Z	2020-04-18T21:58:11Z	MEMBER	Got this working! I'll do EXIF in a separate ticket #3.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615949574	https://github.com/dogsheep/dogsheep-photos/issues/5#issuecomment-615949574	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/5	MDEyOklzc3VlQ29tbWVudDYxNTk0OTU3NA==	simonw 9599	2020-04-18T21:06:07Z	2020-04-18T21:06:07Z	MEMBER	``` $ photos-to-sqlite s3-auth Create S3 credentials and paste them here: Access key ID: xxx Secret access key: yyy $ cat auth.json { "access_key_id": "xxx", "secret_access_key": "yyy" } ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	photos-to-sqlite s3-auth command 602551638
615948102	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615948102	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0ODEwMg==	simonw 9599	2020-04-18T20:56:59Z	2020-04-18T20:56:59Z	MEMBER	I'm going to start with this: `photos-to-sqlite upload photos.db ~/path/to/directory` This will scan the provided directory (and all sub-directories) for image files. It will then: * Calculate a sha256 of the contents of that file * Upload the file to a key that's `sha256.jpg` or `.heic` * Upload a `sha256.json` file with the original path to the image * Add that image to a `uploads` table in `photos.db` Stretch goal: grab the EXIF data and include that in the `.json` upload AND the `uploads` database table.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615947370	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615947370	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NzM3MA==	simonw 9599	2020-04-18T20:52:13Z	2020-04-18T20:52:13Z	MEMBER	This is great! I now have a key that can upload photos, and a separate key that can download photos OR generate signed URLs to access those photos. Next step: a script that starts uploading my photos.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615947229	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615947229	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NzIyOQ==	simonw 9599	2020-04-18T20:51:26Z	2020-04-18T20:51:26Z	MEMBER	Running the upload again like this resulted in the correct content-type: ```python client.upload_file( "/Users/simonw/Desktop/this_is_fine.jpg", "dogsheep-photos-simon", "this_is_fine.jpg", ExtraArgs={ "ContentType": "image/jpeg" } ) ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615946537	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615946537	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NjUzNw==	simonw 9599	2020-04-18T20:48:13Z	2020-04-18T20:48:13Z	MEMBER	How about generating a signed URL? ```python read_client.generate_presigned_url( "get_object", Params={ "Bucket": "dogsheep-photos-simon", "Key": "this_is_fine.jpg", }, ExpiresIn=600 ) ``` Gave me https://dogsheep-photos-simon.s3.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398 Which does this: ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398' HTTP/1.1 307 Temporary Redirect x-amz-bucket-region: us-west-1 x-amz-request-id: E78CD859AEE21D33 x-amz-id-2: 648mx+1+YSGga7NDOU7Q6isfsKnEPWOLC+DI4+x2o9FCc6pSCdIaoHJUbFMI8Vsuh1ADtx46ymU= Location: https://dogsheep-photos-simon.s3-us-west-1.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398 Content-Type: application/xml Transfer-Encoding: chunked Date: Sat, 18 Apr 2020 20:47:21 GMT Server: AmazonS3 <?xml version="1.0" encoding="UTF-8"?> <Error><Code>TemporaryRedirect</Code><Message>Please re-send this request to the specified temporary endpoint. Continue to use the original request endpoint for future requests.</Message><Endpoint>dogsheep-photos-simon.s3-us-west-1.amazonaws.com</Endpoint><Bucket>dogsheep-photos-simon</Bucket><RequestId>E78CD859AEE21D33</RequestId><HostId>648mx+1+YSGga7NDOU7Q6isfsKnEPWOLC+DI4+x2o9FCc6pSCdIaoHJUbFMI8Vsuh1ADtx46ymU=</HostId></Error>~ $ ``` So it redirects to another URL... 
which returns this: ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3-us-west-1.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398' HTTP/1.1 200 OK x-amz-id-2: XafOl6mswj3yz0GJC9+Ptot1ll5sROVwqsMc10CUUfgpaUANTdIx2GhnONb5d1GVFJ6wlS2j3UY= x-amz-request-id: 258387C180411AFE Date: Sat, 18 Apr 2020 20:47:52 GMT Last-Modified: Sat, 18 Apr 2020 20:37:35 GMT E…	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615945056	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615945056	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NTA1Ng==	simonw 9599	2020-04-18T20:42:41Z	2020-04-18T20:42:41Z	MEMBER	But... `list_objects` failed for both of my keys (read and write): ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670798-75c41780-817a-11ea-9907-2cbc4a2e497c.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615944806	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615944806	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NDgwNg==	simonw 9599	2020-04-18T20:41:39Z	2020-04-18T20:41:39Z	MEMBER	This worked! ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670712-d868e380-8179-11ea-82a5-5dfd17356113.png) And this worked: ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670777-50370e00-817a-11ea-83cd-18ebf5702878.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615942116	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615942116	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0MjExNg==	simonw 9599	2020-04-18T20:30:56Z	2020-04-18T20:30:56Z	MEMBER	Next step: attempt a programmatic upload using the `dogsheep-photos-simon-read-write` credentials from a Jupyter notebook. Also attempt a programmatic bucket listing and read using `dogsheep-photos-simon-read` credentials.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615941746	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615941746	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0MTc0Ng==	simonw 9599	2020-04-18T20:29:36Z	2020-04-18T20:29:36Z	MEMBER	I'm going to create another user just for Transmit, with full S3 access. name: `dogsheep-photos-simon-s3-all-access` Rather than creating a group for that user, I'm trying the "Attach existing policies directly" option: ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79670182-03513880-8176-11ea-811a-c80aefb4538a.png) That user DID work with Transmit. I uploaded a test HEIC image. I used Transmit to copy a signed URL for it. ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3.us-west-1.amazonaws.com/IMG_7195.HEIC?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAWXFXAI...' \| head -n 100 % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0HTTP/1.1 200 OK x-amz-id-2: gBOCYqZfbNAnv0R/uJ++qm2NbW5SgD4TapgF9RQjzzeDIThcCz/BkKU+YoxlG4NJHlcmMgAHyh4= x-amz-request-id: C2FE7FCC3BD53A84 Date: Sat, 18 Apr 2020 20:28:54 GMT Last-Modified: Sat, 18 Apr 2020 20:13:49 GMT ETag: "fe3e081239a123ef745517878c53b854" Accept-Ranges: bytes Content-Type: image/heic Content-Length: 1913097 Server: AmazonS3 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615936880	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615936880	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzNjg4MA==	simonw 9599	2020-04-18T20:04:31Z	2020-04-18T20:04:31Z	MEMBER	Next step: create two IAM users, one for each of those groups. https://console.aws.amazon.com/iam/home#/users$new?step=details ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669931-1bc05380-8174-11ea-9657-0e0c6a692d42.png) ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669941-27137f00-8174-11ea-8ce7-249f0d4f96f6.png) I copied the keys into a secure note in 1password. Couldn't get into Transmit with them though! https://library.panic.com/transmit/transmit5/iam-roles/ may help.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615935577	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615935577	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzNTU3Nw==	simonw 9599	2020-04-18T19:54:59Z	2020-04-18T19:55:30Z	MEMBER	Creating IAM groups called `dogsheep-photos-simon-read-write` and `dogsheep-photos-simon-read`: https://console.aws.amazon.com/iam/home#/groups - I created them with no attached policies. Now I can attach an "inline policy" to each one. For the read-write group I go here: https://console.aws.amazon.com/iam/home#/groups/dogsheep-photos-simon-read-write ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669703-2d086080-8172-11ea-9597-83e0b155193e.png) Example policies are here: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html For the read-write one I went with: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "s3:*", "Resource": [ "arn:aws:s3:::dogsheep-photos-simon/*" ] } ] } ``` For the read-only policy I'm going to guess that this is appropriate: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::dogsheep-photos-simon/*" ] } ] } ``` I tried the policy simulator to test this out: https://policysim.aws.amazon.com/home/index.jsp?#groups/dogsheep-photos-simon-read - this worked: ![IAM_Policy_Simulator](https://user-images.githubusercontent.com/9599/79669893-cd12b980-8173-11ea-8dfb-5660ce3652da.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615933273	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615933273	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMzI3Mw==	simonw 9599	2020-04-18T19:37:33Z	2020-04-18T19:37:33Z	MEMBER	https://console.aws.amazon.com/s3/bucket/create?region=us-west-1 ![S3_Management_Console](https://user-images.githubusercontent.com/9599/79669552-33e2a380-8171-11ea-9ab5-5785d34f652a.png) I created it with no public read-write access. I plan to use signed URLs via a transforming proxy to access images for display on the web.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615932204	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615932204	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMjIwNA==	simonw 9599	2020-04-18T19:29:22Z	2020-04-18T19:34:44Z	MEMBER	I'm going to call my bucket `dogsheep-photos-simon`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615932007	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615932007	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMjAwNw==	simonw 9599	2020-04-18T19:27:55Z	2020-04-18T19:27:55Z	MEMBER	Research thread: https://twitter.com/simonw/status/1249049694984011776 > I want to build some software that lets people store their own data in their own S3 bucket, but if possible I'd like not to have to teach people the incantations needed to get their bucket setup and minimum-permission credentials figures out https://testdriven.io/blog/storing-django-static-and-media-files-on-amazon-s3/ looks useful	{"total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615931488	https://github.com/dogsheep/dogsheep-photos/issues/2#issuecomment-615931488	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/2	MDEyOklzc3VlQ29tbWVudDYxNTkzMTQ4OA==	simonw 9599	2020-04-18T19:24:02Z	2020-04-18T19:24:02Z	MEMBER	I made a start on this last week with a https://github.com/simonw/heic-to-jpeg proxy.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to convert HEIC images to JPEG 602533352
615886206	https://github.com/dogsheep/github-to-sqlite/issues/28#issuecomment-615886206	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/28	MDEyOklzc3VlQ29tbWVudDYxNTg4NjIwNg==	simonw 9599	2020-04-18T15:04:59Z	2020-04-18T15:04:59Z	MEMBER	Demo: https://github-to-sqlite.dogsheep.net/github/contributors Documentation: https://github.com/dogsheep/github-to-sqlite/blob/13f8868fb5efa01c263b24f6dd91c617e6e938e1/README.md#fetching-contributors-to-a-repository	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Pull repository contributors 601333634
615883687	https://github.com/dogsheep/github-to-sqlite/issues/28#issuecomment-615883687	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/28	MDEyOklzc3VlQ29tbWVudDYxNTg4MzY4Nw==	simonw 9599	2020-04-18T14:49:58Z	2020-04-18T14:49:58Z	MEMBER	That happened trying to pull contributors for `dogsheep/beta` - an empty repository. Turns out it was returning a `204 no content`: ``` ~ $ curl -i 'https://api.github.com/repos/dogsheep/beta/contributors' HTTP/1.1 204 No Content ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Pull repository contributors 601333634
615883040	https://github.com/dogsheep/github-to-sqlite/issues/28#issuecomment-615883040	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/28	MDEyOklzc3VlQ29tbWVudDYxNTg4MzA0MA==	simonw 9599	2020-04-18T14:45:38Z	2020-04-18T14:45:38Z	MEMBER	``` File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 219, in contributors utils.save_contributors(db, contributors, repo_full["id"]) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 354, in save_contributors for contributor in contributors: File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 228, in fetch_contributors for contributors in paginate(url, headers): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 286, in paginate data = response.json() File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/requests/models.py", line 898, in json return complexjson.loads(self.text, **kwargs) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/json/__init__.py", line 357, in loads return _default_decoder.decode(s) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from N…	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Pull repository contributors 601333634
615519409	https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-615519409	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/27	MDEyOklzc3VlQ29tbWVudDYxNTUxOTQwOQ==	simonw 9599	2020-04-18T00:19:16Z	2020-04-18T00:19:16Z	MEMBER	``` $ github-to-sqlite repos b.db dogsheep $ sqlite3 b.db '.schema repos' CREATE TABLE [repos] ( [id] INTEGER PRIMARY KEY, ... [permissions] TEXT, [organization] INTEGER REFERENCES [users]([id]), FOREIGN KEY(license) REFERENCES licenses(key) ); ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Repos have a big blob of JSON in the organization column 601330277
615518606	https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-615518606	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/27	MDEyOklzc3VlQ29tbWVudDYxNTUxODYwNg==	simonw 9599	2020-04-18T00:14:32Z	2020-04-18T00:14:32Z	MEMBER	https://github.com/simonw/sqlite-utils/issues/100 is done and released in sqlite-utils 2.7.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Repos have a big blob of JSON in the organization column 601330277
615513491	https://github.com/dogsheep/twitter-to-sqlite/issues/43#issuecomment-615513491	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/43	MDEyOklzc3VlQ29tbWVudDYxNTUxMzQ5MQ==	simonw 9599	2020-04-17T23:48:28Z	2020-04-17T23:48:28Z	MEMBER	Released in 0.21.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	"twitter-to-sqlite lists" command for retrieving a user's owned lists 602176870
615510361	https://github.com/dogsheep/twitter-to-sqlite/issues/37#issuecomment-615510361	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/37	MDEyOklzc3VlQ29tbWVudDYxNTUxMDM2MQ==	simonw 9599	2020-04-17T23:38:27Z	2020-04-17T23:38:27Z	MEMBER	That's a bit tricky since I'd have to rewrite the internals of a bunch of other commands. For the moment I'll exit the script with an error but at least it will be a decent error!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Handle "User not found" error 585353598
615509803	https://github.com/dogsheep/twitter-to-sqlite/issues/37#issuecomment-615509803	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/37	MDEyOklzc3VlQ29tbWVudDYxNTUwOTgwMw==	simonw 9599	2020-04-17T23:36:40Z	2020-04-17T23:36:40Z	MEMBER	I'm going to print a warning to stderr, skip and continue - because if you have 100 screen names and only one of them is invalid you should still execute for the other 99.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Handle "User not found" error 585353598
615509578	https://github.com/dogsheep/twitter-to-sqlite/issues/37#issuecomment-615509578	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/37	MDEyOklzc3VlQ29tbWVudDYxNTUwOTU3OA==	simonw 9599	2020-04-17T23:36:00Z	2020-04-17T23:36:00Z	MEMBER	``` $ twitter-to-sqlite user-timeline doggo.db doggoenthuonetuh Traceback (most recent call last): ... File "/Users/simonw/Dropbox/Development/twitter-to-sqlite/twitter_to_sqlite/utils.py", line 272, in transform_user user["created_at"] = parser.parse(user["created_at"]) KeyError: 'created_at' ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Handle "User not found" error 585353598
614843406	https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-614843406	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/27	MDEyOklzc3VlQ29tbWVudDYxNDg0MzQwNg==	simonw 9599	2020-04-16T19:11:53Z	2020-04-16T19:20:23Z	MEMBER	This didn't quite work: the column type is incorrect, so the foreign key relationship isn't sticking: https://github-to-sqlite.dogsheep.net/github/repos?organization=53015001 `[organization] TEXT REFERENCES [users]([id])` - should be `INTEGER`. The problem is that if the first repo inserted has no organization it's set to `null`, which `sqlite-utils` derives as a `TEXT` column. One solution would be to create the column explicitly with a type, but this could get messy. I think I want a new sqlite-utils feature for this instead.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Repos have a big blob of JSON in the organization column 601330277
614831842	https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-614831842	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/27	MDEyOklzc3VlQ29tbWVudDYxNDgzMTg0Mg==	simonw 9599	2020-04-16T18:48:18Z	2020-04-16T18:48:18Z	MEMBER	I'm going to make `organization` another foreign key to the `users` table just in case it IS possible (maybe with GitHub Enterprise or similar?)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Repos have a big blob of JSON in the organization column 601330277
614831451	https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-614831451	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/27	MDEyOklzc3VlQ29tbWVudDYxNDgzMTQ1MQ==	simonw 9599	2020-04-16T18:47:25Z	2020-04-16T18:47:25Z	MEMBER	Is it possible for a repo to have an `owner` that differs from its `organization`?	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Repos have a big blob of JSON in the organization column 601330277
614810417	https://github.com/dogsheep/github-to-sqlite/issues/25#issuecomment-614810417	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/25	MDEyOklzc3VlQ29tbWVudDYxNDgxMDQxNw==	simonw 9599	2020-04-16T18:07:11Z	2020-04-16T18:07:11Z	MEMBER	Turns out the main problem was #26 - now fixed.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Improvements to demo instance 601265023
614795712	https://github.com/dogsheep/github-to-sqlite/issues/26#issuecomment-614795712	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/26	MDEyOklzc3VlQ29tbWVudDYxNDc5NTcxMg==	simonw 9599	2020-04-16T17:40:27Z	2020-04-16T17:40:27Z	MEMBER	Aha! it was missing from the `fetch_repo()` function.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Topics are missing from repositories 601271612
614794739	https://github.com/dogsheep/github-to-sqlite/issues/26#issuecomment-614794739	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/26	MDEyOklzc3VlQ29tbWVudDYxNDc5NDczOQ==	simonw 9599	2020-04-16T17:38:28Z	2020-04-16T17:38:28Z	MEMBER	I'm already doing this here: https://github.com/dogsheep/github-to-sqlite/blob/c4aaa50e167cfa9021c7c94260bc3e89e10947bf/github_to_sqlite/utils.py#L246-L250	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Topics are missing from repositories 601271612
613641947	https://github.com/dogsheep/github-to-sqlite/issues/14#issuecomment-613641947	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/14	MDEyOklzc3VlQ29tbWVudDYxMzY0MTk0Nw==	simonw 9599	2020-04-14T19:38:24Z	2020-04-14T19:38:34Z	MEMBER	Since events include payloads with full object representations in them (for issues, repos and more) running this command every few minutes may be all it takes to keep a constant copy of everything updated in a very rate-limit friendly manner (thanks to the ETags).	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Command for importing events 530491074
613611455	https://github.com/dogsheep/github-to-sqlite/issues/16#issuecomment-613611455	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/16	MDEyOklzc3VlQ29tbWVudDYxMzYxMTQ1NQ==	simonw 9599	2020-04-14T18:37:21Z	2020-04-14T18:37:21Z	MEMBER	This should have been fixed by #20 and #23 @jayvdb I'm definitely interested in this tool working as a library - it's purely designed as a CLI tool at the moment, but cleaning it up to work better as a dependency is totally in-scope for the project. https://sqlite-utils.readthedocs.io/ is an example of a tool I've built that works for both. Feel free to open a new issue here with some notes on what you would need for this to work as a library for your project!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Exception running first command: IndexError: list index out of range 546051181
607019151	https://github.com/dogsheep/twitter-to-sqlite/issues/40#issuecomment-607019151	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/40	MDEyOklzc3VlQ29tbWVudDYwNzAxOTE1MQ==	simonw 9599	2020-04-01T04:11:10Z	2020-04-01T04:11:10Z	MEMBER	In testing this collects a LOT of data. I'm going to skip tracking favourites_count and statuses_count and just track followers, friends and listed instead.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Feature: record history of follower counts 590669793
607011972	https://github.com/dogsheep/twitter-to-sqlite/issues/40#issuecomment-607011972	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/40	MDEyOklzc3VlQ29tbWVudDYwNzAxMTk3Mg==	simonw 9599	2020-04-01T03:49:02Z	2020-04-01T03:50:01Z	MEMBER	I want the datetime value to look like `2020-04-01T03:34:58+00:00` (the format returned by the Twitter API which I am storing in other tables at the moment). ``` >>> datetime.utcnow().isoformat().split('.')[0] + '+00:00' '2020-04-01T03:49:52+00:00' ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Feature: record history of follower counts 590669793
607011421	https://github.com/dogsheep/twitter-to-sqlite/issues/40#issuecomment-607011421	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/40	MDEyOklzc3VlQ29tbWVudDYwNzAxMTQyMQ==	simonw 9599	2020-04-01T03:47:37Z	2020-04-01T03:55:08Z	MEMBER	Actually a single table with a `type` integer ID referencing a `count_history_types` table would better match the way I implemented the `since_ids` table: https://github.com/dogsheep/twitter-to-sqlite/blob/4b6c8d8c1cc6fefdb566ec8506157133f47c569a/twitter_to_sqlite/utils.py#L331-L341 In which case the compound primary key would be `type`, `user`, `datetime`	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Feature: record history of follower counts 590669793
607010791	https://github.com/dogsheep/twitter-to-sqlite/issues/10#issuecomment-607010791	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/10	MDEyOklzc3VlQ29tbWVudDYwNzAxMDc5MQ==	simonw 9599	2020-04-01T03:45:48Z	2020-04-01T03:45:48Z	MEMBER	I'm happy with the recent work I did on this.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rethink progress bars for various commands 492297930
607010634	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-607010634	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNzAxMDYzNA==	simonw 9599	2020-04-01T03:45:16Z	2020-04-01T03:45:16Z	MEMBER	OK, fix is applied to everything now.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
607003655	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-607003655	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNzAwMzY1NQ==	simonw 9599	2020-04-01T03:18:00Z	2020-04-01T03:18:00Z	MEMBER	I've got this working for the `user-timeline` command.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606998669	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606998669	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjk5ODY2OQ==	simonw 9599	2020-04-01T02:57:36Z	2020-04-01T02:57:36Z	MEMBER	The tricky thing here is thinking about the interaction between the recorded since_id and a desire to run the initial import. The first time you run `twitter-to-sqlite user-timeline db.db username` we want to fetch as many tweets from that user as possible - probably around 3,200 before the API limitations cut us off. We need to record the maximum ID from those as the `since_id` - which we will see on the very first page we paginate through. That way next time we run the command with `--since` we will only fetch new tweets. But what happens if our initial import is cancelled after only a few tweets? We risk never pulling in the rest of the tweets. Not sure if I need to solve this at all or if I should instead trust users to run the command a second time without `--since` if they think they didn't retrieve anything the first time. I had considered letting `--stop_after=` over-ride `--since` but that doesn't actually make sense - if you send a since_id to the Twitter API you'll never get back more tweets than exist after that ID, so the `--stop_after` would not make a meaningful difference.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606850453	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606850453	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjg1MDQ1Mw==	simonw 9599	2020-03-31T20:14:58Z	2020-04-01T03:03:50Z	MEMBER	Actually I'll hard-code the population of `since_id_types` to get known ID constants.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606850008	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606850008	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjg1MDAwOA==	simonw 9599	2020-03-31T20:13:59Z	2020-04-01T00:23:00Z	MEMBER	Table design for `since_ids` table: type | key | since_id --- | --- | --- 1 | 124324 | 2347239847293 2 | 99ff9cefff5cbfd804f7cd43e2b27ced8addbe8d | 2125947927344 Primary compound key on `(category, key)` `type` is also a foreign key to a `since_id_types` table with `id` and `name` columns (probably created using https://sqlite-utils.readthedocs.io/en/stable/python-api.html#working-with-lookup-tables )	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606844521	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606844521	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjg0NDUyMQ==	simonw 9599	2020-03-31T20:01:39Z	2020-03-31T20:01:39Z	MEMBER	I think `utils.fetch_timeline()` grows a new argument, `since_key`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606843224	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606843224	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjg0MzIyNA==	simonw 9599	2020-03-31T19:59:11Z	2020-03-31T20:06:32Z	MEMBER	Or... have a single `since_ids` table to track since values, and have its primary key be a string that looks something like this: `user:123145` `home:23441` `mentions:23425` `search:99ff9cefff5cbfd804f7cd43e2b27ced8addbe8d` That last example would use the hash generated here: https://github.com/dogsheep/twitter-to-sqlite/blob/810cb2af5a175837204389fd7f4b5721f8b325ab/twitter_to_sqlite/cli.py#L792-L808	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606824992	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606824992	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjgyNDk5Mg==	simonw 9599	2020-03-31T19:24:23Z	2020-03-31T19:24:23Z	MEMBER	The `--since` option is actually used by four commands: * `user-timeline` * `home-timeline` * `mentions-timeline` * `search` All of them use the same `fetch_timeline()` utility function under the hood. I should move the logic that looks up the last `since_id` into that shared function. Question: should I have a table for each of those four methods or a single table that is used by them all? I'm leaning towards four separate tables.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606309165	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606309165	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjMwOTE2NQ==	simonw 9599	2020-03-30T23:41:31Z	2020-03-30T23:41:31Z	MEMBER	I like the separate `user_timeline_since` table solution.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606307376	https://github.com/dogsheep/twitter-to-sqlite/issues/40#issuecomment-606307376	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/40	MDEyOklzc3VlQ29tbWVudDYwNjMwNzM3Ng==	simonw 9599	2020-03-30T23:35:40Z	2020-03-30T23:39:15Z	MEMBER	I think five separate tables: * followers_count_history * friends_count_history * listed_count_history * favourites_count_history * statuses_count_history Each with the following structure: * datetime (ISO UTC) * user (ID, foreign key to users) * count (integer) I'm tempted to have a compound primary key here - user, datetime	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Feature: record history of follower counts 590669793
606307019	https://github.com/dogsheep/twitter-to-sqlite/issues/40#issuecomment-606307019	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/40	MDEyOklzc3VlQ29tbWVudDYwNjMwNzAxOQ==	simonw 9599	2020-03-30T23:34:27Z	2020-03-30T23:34:27Z	MEMBER	The count properties available for a user are: * followers_count * friends_count * listed_count * favourites_count * statuses_count May as well track history for all of them? Should be pretty cheap to store.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Feature: record history of follower counts 590669793
606305701	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606305701	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjMwNTcwMQ==	simonw 9599	2020-03-30T23:30:27Z	2020-03-30T23:30:27Z	MEMBER	A better alternative would be to maintain a separate table with the last seen since value for when we ran `user-timeline` for any specific user.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
606304837	https://github.com/dogsheep/twitter-to-sqlite/issues/39#issuecomment-606304837	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/39	MDEyOklzc3VlQ29tbWVudDYwNjMwNDgzNw==	simonw 9599	2020-03-30T23:27:50Z	2020-03-30T23:29:31Z	MEMBER	One option would be something like this: ```sql select max(id) from tweets where user = ? and not exists (select id from tweets where retweeted_status = id) and not exists (select id from tweets where quoted_status = id) and not exists (select id from tweets where in_reply_to_status_id = id) ``` Might be a good idea to index those columns (after confirming that doing so would indeed speed up the query).	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	--since feature can be confused by retweets 590666760
605382373	https://github.com/dogsheep/swarm-to-sqlite/pull/6#issuecomment-605382373	https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/6	MDEyOklzc3VlQ29tbWVudDYwNTM4MjM3Mw==	simonw 9599	2020-03-28T02:27:32Z	2020-03-28T02:27:32Z	MEMBER	Thanks!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	don't break if source is missing 543355051
605338322	https://github.com/dogsheep/pocket-to-sqlite/issues/2#issuecomment-605338322	https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/2	MDEyOklzc3VlQ29tbWVudDYwNTMzODMyMg==	simonw 9599	2020-03-27T22:18:02Z	2020-03-27T22:18:02Z	MEMBER	Just needs documentation now.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Track and use the 'since' value 503234169
605337941	https://github.com/dogsheep/pocket-to-sqlite/issues/2#issuecomment-605337941	https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/2	MDEyOklzc3VlQ29tbWVudDYwNTMzNzk0MQ==	simonw 9599	2020-03-27T22:16:32Z	2020-03-27T22:16:32Z	MEMBER	Need to test this. I have 7,394 items in my database right now. I'm going to save a new thing. Then I ran this: ``` pocket-to-sqlite fetch pocket-simon.db ``` And it worked!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Track and use the 'since' value 503234169
605327655	https://github.com/dogsheep/pocket-to-sqlite/issues/1#issuecomment-605327655	https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDYwNTMyNzY1NQ==	simonw 9599	2020-03-27T21:42:49Z	2020-03-27T21:42:49Z	MEMBER	Or maybe it was because of the current Google Cloud outage? https://news.ycombinator.com/item?id=22706677	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Use better pagination (and implement progress bar) 503233021
605325897	https://github.com/dogsheep/pocket-to-sqlite/issues/1#issuecomment-605325897	https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDYwNTMyNTg5Nw==	simonw 9599	2020-03-27T21:37:26Z	2020-03-27T21:38:37Z	MEMBER	I keep getting 503 errors even though I appear to be staying within the rate limit: ``` {'Date': 'Fri, 27 Mar 2020 21:35:57 GMT', 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Server': 'Apache/2.4.25 (Debian)', 'Content-Location': 'get.php', 'Vary': 'negotiate', 'TCN': 'choice', 'Set-Cookie': '...; httponly', 'X-Frame-Options': 'SAMEORIGIN', 'Status': '200 OK', 'X-Limit-Key-Limit': '10000', 'X-Limit-Key-Remaining': '9960', 'X-Limit-Key-Reset': '282', 'X-Source': 'Pocket', 'P3P': 'policyref="/w3c/p3p.xml", CP="ALL CURa ADMa DEVa OUR IND UNI COM NAV INT STA PRE"'} [##----------------------------------] 6% 06:49:27 {'Date': 'Fri, 27 Mar 2020 21:36:06 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Content-Length': '23', 'Connection': 'keep-alive', 'Server': 'Apache/2.4.25 (Debian)', 'Content-Location': 'get.php', 'Vary': 'negotiate', 'TCN': 'choice', 'Set-Cookie': '...', 'X-Frame-Options': 'SAMEORIGIN', 'X-Error': 'Pocket is currently under heavy load. Please wait a moment and try again.', 'X-Error-Code': '199', 'Status': '503 Service Unavailable', 'X-Source': 'Pocket', 'P3P': 'policyref="/w3c/p3p.xml", CP="ALL CURa ADMa DEVa OUR IND UNI COM NAV INT STA PRE"'} ``` I'm going to try doing a few automatic retries any time I see a 503 error.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Use better pagination (and implement progress bar) 503233021
605316146	https://github.com/dogsheep/pocket-to-sqlite/issues/1#issuecomment-605316146	https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDYwNTMxNjE0Ng==	simonw 9599	2020-03-27T21:09:15Z	2020-03-27T21:09:22Z	MEMBER	For a progress bar I need to know how many total items there are. I found an undocumented API for this! `/v3/stats` which returns: ```json { "count_list": 7394, "count_read": 1016, "count_unread": 6378, "status": 1 } ``` I guessed this based on the documented v2 API: https://getpocket.com/api/v2_docs/#stats	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Use better pagination (and implement progress bar) 503233021
602928533	https://github.com/dogsheep/github-to-sqlite/issues/23#issuecomment-602928533	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/23	MDEyOklzc3VlQ29tbWVudDYwMjkyODUzMw==	simonw 9599	2020-03-24T00:15:49Z	2020-03-24T00:15:49Z	MEMBER	https://github.com/dogsheep/github-to-sqlite/releases/tag/1.0	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Release 1.0 586595839
602924714	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602924714	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjkyNDcxNA==	simonw 9599	2020-03-24T00:03:25Z	2020-03-24T00:03:25Z	MEMBER	This is good enough for the 1.0 release.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602920163	https://github.com/dogsheep/github-to-sqlite/issues/21#issuecomment-602920163	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/21	MDEyOklzc3VlQ29tbWVudDYwMjkyMDE2Mw==	simonw 9599	2020-03-23T23:48:22Z	2020-03-23T23:48:22Z	MEMBER	I'm happy with this pattern: https://github.com/dogsheep/github-to-sqlite/blob/f78c4e9baaf0970ffab266ba780df7240aae9f32/github_to_sqlite/utils.py#L4-L18	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Turn GitHub API errors into exceptions 586561727
602919058	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602919058	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjkxOTA1OA==	simonw 9599	2020-03-23T23:44:48Z	2020-03-23T23:44:48Z	MEMBER	Next step: use a `metadata.json` file to add some extras. And add the `datasette-render-markdown` plugin as soon as I ship https://github.com/simonw/datasette-render-markdown/issues/2 (GFM support).	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602918689	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602918689	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjkxODY4OQ==	simonw 9599	2020-03-23T23:43:39Z	2020-03-23T23:47:50Z	MEMBER	I pointed https://github-to-sqlite.dogsheep.net/ at it. May take a few minutes for the new certificate to provision though.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602917713	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602917713	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjkxNzcxMw==	simonw 9599	2020-03-23T23:40:29Z	2020-03-23T23:40:29Z	MEMBER	Most recently updated issues across all Dogsheep repos, with faceting: https://github-to-sqlite-j7hipcg4aq-uc.a.run.app/github/issues?_facet=repo&_facet=user&_facet=state&_facet=author_association&_facet=type&_sort_desc=updated_at	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602916947	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602916947	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjkxNjk0Nw==	simonw 9599	2020-03-23T23:38:06Z	2020-03-23T23:38:06Z	MEMBER	Woohoo! https://github-to-sqlite-j7hipcg4aq-uc.a.run.app/	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602896434	https://github.com/dogsheep/github-to-sqlite/issues/21#issuecomment-602896434	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/21	MDEyOklzc3VlQ29tbWVudDYwMjg5NjQzNA==	simonw 9599	2020-03-23T22:43:37Z	2020-03-23T22:43:37Z	MEMBER	I'm going to do this now to help figure out the latest error in #13: ``` Traceback (most recent call last): File "/opt/hostedtoolcache/Python/3.8.2/x64/bin/github-to-sqlite", line 11, in <module> load_entry_point('github-to-sqlite', 'console_scripts', 'github-to-sqlite')() File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 237, in commits utils.save_commits(db, commits, repo_full["id"]) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 345, in save_commits for commit in commits: File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 207, in fetch_commits if stop_when(commit): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 224, in stop_when db["commits"].get(commit["sha"]) TypeError: string indices must be integers ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Turn GitHub API errors into exceptions 586561727
602895896	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602895896	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjg5NTg5Ng==	simonw 9599	2020-03-23T22:42:25Z	2020-03-23T22:42:25Z	MEMBER	Urgh this is such a mess! I should have done this on a branch / pull request to avoid polluting my main master history, but never mind.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602862967	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602862967	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjg2Mjk2Nw==	simonw 9599	2020-03-23T21:22:04Z	2020-03-23T21:22:04Z	MEMBER	Following these instructions: https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281
602862236	https://github.com/dogsheep/github-to-sqlite/issues/13#issuecomment-602862236	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/13	MDEyOklzc3VlQ29tbWVudDYwMjg2MjIzNg==	simonw 9599	2020-03-23T21:20:26Z	2020-03-23T21:20:26Z	MEMBER	I'll run the `commits` and `issues` and `issue-comments` commands in addition to the `releases` command.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Set up a live demo Datasette instance 521275281


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
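
The listing above corresponds to a filter on `author_association = "MEMBER"` sorted by `id` descending. A minimal sketch of that query against the schema shown, using Python's standard `sqlite3` module, might look like the following. The function name `member_comments` and the in-memory database are assumptions made for illustration, and the third sample row (id 1, association `NONE`) is invented only to show the filter excluding something; the other two rows are copied from the table above.

```python
import sqlite3

# Schema as shown above, tidied onto one statement. SQLite accepts the
# REFERENCES clauses even though [users] and [issues] are not created here.
SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
"""


def member_comments(conn, association="MEMBER"):
    """Return (id, body) pairs for one author_association, newest id first.

    Hypothetical helper, not part of github-to-sqlite.
    """
    cursor = conn.execute(
        "SELECT id, body FROM issue_comments "
        "WHERE author_association = ? ORDER BY id DESC",
        (association,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO issue_comments (id, user, author_association, body, issue) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        # Two rows taken from the table above.
        (602862236, 9599, "MEMBER",
         "I'll run the `commits` and `issues` and `issue-comments` commands "
         "in addition to the `releases` command.", 521275281),
        (602862967, 9599, "MEMBER",
         "Following these instructions: "
         "https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/",
         521275281),
        # Invented row (not from the source) so the filter has something to skip.
        (1, 9599, "NONE", "placeholder comment", 521275281),
    ],
)

for comment_id, body in member_comments(conn):
    print(comment_id, body[:40])
```

Passing the association as a bound parameter rather than formatting it into the SQL string keeps the query safe if the value ever comes from user input.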