github: issue_comments: 525 rows where author_association = "MEMBER" sorted by html

525 rows where author_association = "MEMBER" sorted by html_url

id	html_url ▼	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
623723687	https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623723687	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15	MDEyOklzc3VlQ29tbWVudDYyMzcyMzY4Nw==	simonw 9599	2020-05-04T21:43:06Z	2020-05-04T21:43:06Z	MEMBER	It looks like I can map the photos I'm importing to these tables using the `ZUUID` column on `ZGENERICASSET` to get a `Z_PK`, which then maps to the rows in `ZCOMPUTEDASSETATTRIBUTES`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767
623730934	https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623730934	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15	MDEyOklzc3VlQ29tbWVudDYyMzczMDkzNA==	simonw 9599	2020-05-04T22:00:38Z	2020-05-04T22:00:48Z	MEMBER	Here's the query to create the new table: ```sql create table apple_photos_scores as select ZGENERICASSET.ZUUID, ZGENERICASSET.ZOVERALLAESTHETICSCORE, ZGENERICASSET.ZCURATIONSCORE, ZGENERICASSET.ZPROMOTIONSCORE, ZGENERICASSET.ZHIGHLIGHTVISIBILITYSCORE, ZCOMPUTEDASSETATTRIBUTES.ZBEHAVIORALSCORE, ZCOMPUTEDASSETATTRIBUTES.ZFAILURESCORE, ZCOMPUTEDASSETATTRIBUTES.ZHARMONIOUSCOLORSCORE, ZCOMPUTEDASSETATTRIBUTES.ZIMMERSIVENESSSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTERACTIONSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTERESTINGSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTRUSIVEOBJECTPRESENCESCORE, ZCOMPUTEDASSETATTRIBUTES.ZLIVELYCOLORSCORE, ZCOMPUTEDASSETATTRIBUTES.ZLOWLIGHT, ZCOMPUTEDASSETATTRIBUTES.ZNOISESCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCAMERATILTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCOMPOSITIONSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTLIGHTINGSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPATTERNSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPERSPECTIVESCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPOSTPROCESSINGSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTREFLECTIONSSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTSYMMETRYSCORE, ZCOMPUTEDASSETATTRIBUTES.ZSHARPLYFOCUSEDSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZTASTEFULLYBLURREDSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLCHOSENSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLFRAMEDSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLTIMEDSHOTSCORE from attached.ZGENERICASSET join attached.ZCOMPUTEDASSETATTRIBUTES on attached.ZGENERICASSET.Z_PK = attached.ZCOMPUTEDASSETATTRIBUTES.Z_PK; ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767
623739934	https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623739934	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15	MDEyOklzc3VlQ29tbWVudDYyMzczOTkzNA==	simonw 9599	2020-05-04T22:24:26Z	2020-05-04T22:24:26Z	MEMBER	Twitter thread with some examples of photos that are coming up from queries against these scores: https://twitter.com/simonw/status/1257434670750408705	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767
623805823	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623805823	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgwNTgyMw==	simonw 9599	2020-05-05T02:45:56Z	2020-05-05T02:45:56Z	MEMBER	I filed an issue with `osxphotos` about this here: https://github.com/RhetTbull/osxphotos/issues/121	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623806085	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806085	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgwNjA4NQ==	simonw 9599	2020-05-05T02:47:18Z	2020-05-05T02:47:18Z	MEMBER	In https://github.com/RhetTbull/osxphotos/issues/121#issuecomment-623249263 Rhet Turnbull spotted a table called `ZSCENEIDENTIFIER` which looked like it might have the right data, but the columns in it aren't particularly helpful: ``` Z_PK,Z_ENT,Z_OPT,ZSCENEIDENTIFIER,ZASSETATTRIBUTES,ZCONFIDENCE 8,49,1,731,5,0.11834716796875 9,49,1,684,6,0.0233648251742125 10,49,1,1702,1,0.026153564453125 ``` I love the look of those confidence scores, but what do the numbers mean?	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623806533	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806533	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgwNjUzMw==	simonw 9599	2020-05-05T02:50:16Z	2020-05-05T02:50:16Z	MEMBER	I figured there must be a separate database that Photos uses to store the text of the identified labels. I used "Open Files and Ports" in Activity Monitor against the Photos app to try and spot candidates... and found `/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite` - a 53MB SQLite database file. <img width="1365" alt="Item-0_and_Item-0_and_Item-0_and_Item-0" src="https://user-images.githubusercontent.com/9599/81031213-61ea0800-8e40-11ea-8237-cfce4a5128e0.png"> Here's the schema of that file: ``` $ sqlite3 psi.sqlite .schema CREATE TABLE word_embedding(word TEXT, extended_word TEXT, score DOUBLE); CREATE INDEX word_embedding_index ON word_embedding(word); CREATE VIRTUAL TABLE word_embedding_prefix USING fts5(extended_word) /* word_embedding_prefix(extended_word) */; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_data'(id INTEGER PRIMARY KEY, block BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_content'(id INTEGER PRIMARY KEY, c0); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID; CREATE TABLE groups(category INT2, owning_groupid INT, content_string TEXT, normalized_string TEXT, lookup_identifier TEXT, token_ranges_0 INT8, token_ranges_1 INT8, UNIQUE(category, owning_groupid, content_string, lookup_identifier, token_ranges_0, token_ranges_1)); CREATE TABLE assets(uuid_0 INT, uuid_1 INT, creationDate INT, UNIQUE(uuid_0, uuid_1)); CREATE TABLE ga(groupid INT, assetid INT, PRIMARY KEY(groupid, assetid)); CREATE TABLE collections(uuid_0 INT, uuid_1 INT, startDate INT, endDate INT, title TEXT, subtitle TEXT, keyAssetUUID_0 INT, keyAssetUUID_1 INT, typeAndNumberOfAssets INT32, sortDate DOUBLE, UNIQUE(uuid_0, uuid_1)); CREATE TABLE gc(groupid INT, collectionid INT, PRIMARY KEY(groupid, collectionid)); CREATE…	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623806687	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806687	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgwNjY4Nw==	simonw 9599	2020-05-05T02:51:16Z	2020-05-05T02:51:16Z	MEMBER	Running datasette against it directly doesn't work: ``` simon@Simons-MacBook-Pro search % datasette psi.sqlite Serve! files=('psi.sqlite',) (immutables=()) on port 8001 Usage: datasette serve [OPTIONS] [FILES]... Error: Connection to psi.sqlite failed check: no such tokenizer: PSITokenizer ``` Instead, I created a new SQLite database with a copy of some of the key tables, like this: ``` sqlite-utils rows psi.sqlite groups | sqlite-utils insert /tmp/search.db groups - sqlite-utils rows psi.sqlite assets | sqlite-utils insert /tmp/search.db assets - sqlite-utils rows psi.sqlite ga | sqlite-utils insert /tmp/search.db ga - sqlite-utils rows psi.sqlite collections | sqlite-utils insert /tmp/search.db collections - sqlite-utils rows psi.sqlite gc | sqlite-utils insert /tmp/search.db gc - sqlite-utils rows psi.sqlite lookup | sqlite-utils insert /tmp/search.db lookup - ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623807568	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623807568	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgwNzU2OA==	simonw 9599	2020-05-05T02:56:06Z	2020-05-05T02:56:06Z	MEMBER	I'm pretty sure this is what I'm after. The `groups` table has what looks like identified labels in the rows with category = 2025: <img width="1122" alt="words__groups__2_528_rows_where_where_category___2025" src="https://user-images.githubusercontent.com/9599/81031361-e0df4080-8e40-11ea-9060-6d850aa52140.png"> Then there's a `ga` table that maps groups to assets: <img width="304" alt="words__ga__633_653_rows" src="https://user-images.githubusercontent.com/9599/81031387-f48aa700-8e40-11ea-9a3d-da23903be928.png"> And an `assets` table which looks like it has one row for every one of my photos: <img width="645" alt="words__assets__40_419_rows" src="https://user-images.githubusercontent.com/9599/81031402-04a28680-8e41-11ea-8047-e9199d068563.png"> One major challenge: these UUIDs are split into two integer numbers, `uuid_0` and `uuid_1` - but the main photos database uses regular UUIDs like this: ![image](https://user-images.githubusercontent.com/9599/81031481-39164280-8e41-11ea-983b-005ced641a18.png) I need to figure out how to match up these two different UUID representations. I asked on Twitter if anyone has any ideas: https://twitter.com/simonw/status/1257500689019703296	{"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623811131	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623811131	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzgxMTEzMQ==	simonw 9599	2020-05-05T03:16:18Z	2020-05-05T03:16:18Z	MEMBER	Here's how to convert two integers into a UUID using Java. Not sure if it's the solution I need though (or how to do the same thing in Python): https://repl.it/repls/EuphoricSomberClasslibrary <img width="1494" alt="Repl_it_-_EuphoricSomberClasslibrary" src="https://user-images.githubusercontent.com/9599/81032267-0d488c00-8e44-11ea-9be7-680eaccd1611.png"> ```java import java.util.UUID; class Main { public static void main(String[] args) { java.util.UUID uuid = new java.util.UUID( 2544182952487526660L, -3640314103732024685L ); System.out.println( uuid ); } } ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623846880	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623846880	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg0Njg4MA==	simonw 9599	2020-05-05T04:06:08Z	2020-05-05T04:06:08Z	MEMBER	This function seems to convert them into UUIDs that match my photos: ```python def to_uuid(uuid_0, uuid_1): b = uuid_0.to_bytes(8, 'little', signed=True) + uuid_1.to_bytes(8, 'little', signed=True) return str(uuid.UUID(bytes=b)).upper() ```	{"total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623855841	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855841	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg1NTg0MQ==	simonw 9599	2020-05-05T04:54:28Z	2020-05-05T04:54:28Z	MEMBER	Things were not matching up for me correctly: <img width="1143" alt="search__select_json_object__img_src____https___photos_simonwillison_net_i______photos_sha256___________photos_ext______w_400___as_photo__groups_content_string__assets_uuid_0__assets_uuid_1__to_uuid_assets_uuid_0__assets_uuid_1__as_uuid__pho" src="https://user-images.githubusercontent.com/9599/81035923-ca8db080-8e51-11ea-95a7-6ee60bae7502.png"> I think that's because my import script didn't correctly import the existing `rowid` values.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623855885	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855885	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg1NTg4NQ==	simonw 9599	2020-05-05T04:54:39Z	2020-05-05T04:54:53Z	MEMBER	Trying this import mechanism instead: `sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite .dump | grep -v 'CREATE INDEX' | grep -v 'CREATE TRIGGER' | grep -v 'CREATE VIRTUAL TABLE' | sqlite3 search.db`	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623857417	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623857417	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg1NzQxNw==	simonw 9599	2020-05-05T05:01:47Z	2020-05-05T05:01:47Z	MEMBER	Even that didn't work - it didn't copy across the rowid values. I'm pretty sure that's what's wrong here: ``` sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite 'select rowid, uuid_0, uuid_1 from assets limit 10' 1619605|-9205353363298198838|4814875488794983828 1641378|-9205348195631362269|390804289838822030 1634974|-9205331524553603243|-3834026796261633148 1619083|-9205326176986145401|7563404215614709654 22131|-9205315724827218763|8370531509591906734 1645633|-9205247376092758131|-1311540150497601346 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623863902	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623863902	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg2MzkwMg==	simonw 9599	2020-05-05T05:31:53Z	2020-05-05T05:31:53Z	MEMBER	Yes! Turning those `rowid` values into `id` with this script did the job: ```python import sqlite3 import sqlite_utils conn = sqlite3.connect( "/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite" ) def all_rows(table): result = conn.execute("select rowid as id, * from {}".format(table)) cols = [c[0] for c in result.description] for row in result.fetchall(): yield dict(zip(cols, row)) if __name__ == "__main__": db = sqlite_utils.Database("psi_copy.db") for table in ("assets", "collections", "ga", "gc", "groups"): db[table].upsert_all(all_rows(table), pk="id", alter=True) ``` Then I ran this query: ```sql select json_object('img_src', 'https://photos.simonwillison.net/i/' || photos.sha256 || '.' || photos.ext || '?w=400') as photo, group_concat(strip_null_chars(groups.content_string), ' ') as words, assets.uuid_0, assets.uuid_1, to_uuid(assets.uuid_0, assets.uuid_1) as uuid from assets join ga on assets.id = ga.assetid join groups on ga.groupid = groups.id join photos on photos.uuid = to_uuid(assets.uuid_0, assets.uuid_1) where groups.category = 2024 group by assets.id order by random() limit 10 ``` And got these results! <img width="1054" alt="psi_copy__select_json_object__img_src____https___photos_simonwillison_net_i______photos_sha256___________photos_ext______w_400___as_photo__group_concat_strip_null_chars_groups_content_string________as_words__assets_uuid_0__assets_uuid_1__to" src="https://user-images.githubusercontent.com/9599/81037264-f1021a80-8e56-11ea-9924-6f9f55a0fb4b.png">	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
623865250	https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623865250	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16	MDEyOklzc3VlQ29tbWVudDYyMzg2NTI1MA==	simonw 9599	2020-05-05T05:38:16Z	2020-05-05T05:38:16Z	MEMBER	It looks like `groups.content_string` often has a null byte in it. I should clean this up as part of the import.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234
624278090	https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278090	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17	MDEyOklzc3VlQ29tbWVudDYyNDI3ODA5MA==	simonw 9599	2020-05-05T20:06:01Z	2020-05-05T20:06:01Z	MEMBER	https://www.python.org/dev/peps/pep-0508/#environment-markers I think I want `sys_platform` of `darwin`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Only install osxphotos if running on macOS 612860531
624278714	https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278714	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17	MDEyOklzc3VlQ29tbWVudDYyNDI3ODcxNA==	simonw 9599	2020-05-05T20:07:19Z	2020-05-05T20:07:19Z	MEMBER	From https://hynek.me/articles/conditional-python-dependencies/ I think this will look like: ```python setup( # ... install_requires=[ # ... "osxphotos>=0.28.13 ; sys_platform=='darwin'", ] ) ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Only install osxphotos if running on macOS 612860531
624364557	https://github.com/dogsheep/dogsheep-photos/issues/18#issuecomment-624364557	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/18	MDEyOklzc3VlQ29tbWVudDYyNDM2NDU1Nw==	simonw 9599	2020-05-05T23:49:18Z	2020-05-05T23:49:18Z	MEMBER	Label is `macos-latest`	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Switch CI solution to GitHub Actions with a macOS runner 612860758
624406285	https://github.com/dogsheep/dogsheep-photos/issues/19#issuecomment-624406285	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/19	MDEyOklzc3VlQ29tbWVudDYyNDQwNjI4NQ==	simonw 9599	2020-05-06T02:10:03Z	2020-05-06T02:10:03Z	MEMBER	Most annoying part of this is the difficulty of actually showing a photo. Maybe I need to run a local proxy that I can link to? A custom Datasette plugin perhaps?	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	apple-photos command should work even if upload has not run 613002220
615931488	https://github.com/dogsheep/dogsheep-photos/issues/2#issuecomment-615931488	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/2	MDEyOklzc3VlQ29tbWVudDYxNTkzMTQ4OA==	simonw 9599	2020-04-18T19:24:02Z	2020-04-18T19:24:02Z	MEMBER	I made a start on this last week with a https://github.com/simonw/heic-to-jpeg proxy.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to convert HEIC images to JPEG 602533352
624408220	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408220	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYyNDQwODIyMA==	simonw 9599	2020-05-06T02:18:47Z	2020-05-06T02:18:47Z	MEMBER	The `apple_photos` table has an indexed `uuid` column and a `path` column which stores the full path to that photo file on disk. I can write a custom Datasette plugin which takes the `uuid` from the URL, looks up the path, then serves up a thumbnail of the jpeg or heic image file. I'll prototype this as a one-off plugin first, then package it on PyPI for other people to install.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
624408370	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408370	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYyNDQwODM3MA==	simonw 9599	2020-05-06T02:19:27Z	2020-05-06T02:19:27Z	MEMBER	The plugin can be generalized: it can be configured to know how to take the URL path, look it up in ANY table (via a custom SQL query) to get a path on disk and then serve that.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
624408738	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408738	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYyNDQwODczOA==	simonw 9599	2020-05-06T02:21:05Z	2020-05-06T02:21:32Z	MEMBER	Here's rendering code from my hacked-together not-yet-released S3 image proxy: ```python from starlette.responses import Response from PIL import Image, ExifTags import pyheif for ORIENTATION_TAG in ExifTags.TAGS.keys(): if ExifTags.TAGS[ORIENTATION_TAG] == "Orientation": break ... # Load it into Pillow if ext == "heic": heic = pyheif.read_heif(image_response.content) image = Image.frombytes(mode=heic.mode, size=heic.size, data=heic.data) else: image = Image.open(io.BytesIO(image_response.content)) # Does EXIF tell us to rotate it? try: exif = dict(image._getexif().items()) if exif[ORIENTATION_TAG] == 3: image = image.rotate(180, expand=True) elif exif[ORIENTATION_TAG] == 6: image = image.rotate(270, expand=True) elif exif[ORIENTATION_TAG] == 8: image = image.rotate(90, expand=True) except (AttributeError, KeyError, IndexError): pass # Resize based on ?w= and ?h=, if set width, height = image.size w = request.query_params.get("w") h = request.query_params.get("h") if w is not None or h is not None: if h is None: # Set h based on w w = int(w) h = int((float(height) / width) * w) elif w is None: h = int(h) # Set w based on h w = int((float(width) / height) * h) w = int(w) h = int(h) image.thumbnail((w, h)) # ?bw= converts to black and white if request.query_params.get("bw"): image = image.convert("L") # ?q= sets the quality - defaults to 75 quality = 75 q = request.query_params.get("q") if q and q.isdigit() and 1 <= int(q) <= 100: quality = int(q) # Output as JPEG or PNG output_image = io.BytesIO() image_type = "JPEG" kwargs = {"quality": quality} if image.format == "PNG": image_type = "PNG" kwargs = {} …	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
625947133	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-625947133	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYyNTk0NzEzMw==	simonw 9599	2020-05-08T18:13:06Z	2020-05-08T18:13:06Z	MEMBER	`datasette-media` will be able to handle this once I implement https://github.com/simonw/datasette-media/issues/3	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
633626741	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633626741	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYzMzYyNjc0MQ==	simonw 9599	2020-05-25T15:38:55Z	2020-05-25T15:38:55Z	MEMBER	Sure, I should absolutely document this!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
633629944	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633629944	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYzMzYyOTk0NA==	simonw 9599	2020-05-25T15:47:42Z	2020-05-25T15:47:42Z	MEMBER	I'll add a proper section to the README, but for the moment here's how I do this. First, install `datasette` and the `datasette-media` plugin. Create a `metadata.yaml` file with the following content: ```yaml plugins: datasette-media: photo: sql: |- select path as filepath, 200 as resize_height from apple_photos where uuid = :key photo-big: sql: |- select path as filepath, 1024 as resize_height from apple_photos where uuid = :key ``` Now run `datasette -m metadata.yaml photos.db` - thumbnails will be served at http://127.0.0.1:8001/-/media/photo/F4469918-13F3-43D8-9EC1-734C0E6B60AD and larger sizes of the image at http://127.0.0.1:8001/-/media/photo-big/A8B02C7D-365E-448B-9510-69F80C26304D I also made myself two custom pages, one showing recent images and one showing random images. To do this, install the `datasette-template-sql` plugin and then create a `templates/pages` directory and add these files: `recent-photos.html` ```html <h1>Recent photos</h1> <div> {% for photo in sql("select * from apple_photos order by date desc limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` `random-photos.html` ```html <h1>Random photos</h1> <div> {% for photo in sql("with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` Now run `datasette -m metadata.yaml photos.db --template-dir=templates/` Visit http://127.0.0.1:8001/random-photos to see some random photos or http://127.0.0.1:8001/recent-photos for recent photos. This is using this mechanism: https://datasette.readthedocs.io/en/stable/custom_templates.html#custom-pages	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
633643921	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633643921	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYzMzY0MzkyMQ==	simonw 9599	2020-05-25T16:29:44Z	2020-05-25T16:29:44Z	MEMBER	https://github.com/dogsheep/dogsheep-photos/blob/dc43fa8653cb9c7238a36f52239b91d1ec916d5c/README.md#serving-photos-locally-with-datasette-media	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
633644225	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633644225	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYzMzY0NDIyNQ==	simonw 9599	2020-05-25T16:30:44Z	2020-05-25T16:30:44Z	MEMBER	I'll add docs on using `datasette-json-html` too.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
633704127	https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633704127	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	MDEyOklzc3VlQ29tbWVudDYzMzcwNDEyNw==	simonw 9599	2020-05-25T20:14:22Z	2020-05-25T20:14:22Z	MEMBER	https://github.com/dogsheep/dogsheep-photos/blob/0.4.1/README.md#serving-photos-locally-with-datasette-media	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Ability to serve thumbnailed Apple Photo from its place on disk 613006393
626388764	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388764	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM4ODc2NA==	simonw 9599	2020-05-10T20:58:52Z	2020-05-10T20:58:52Z	MEMBER	More from the debugger: ``` > /Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py(614)place() -> self._place = PlaceInfo5(self._info["reverse_geolocation"]) ``` And: ``` > /Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/utils.py(91)osxphoto_to_row() -> place = photo.place ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626388837	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388837	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM4ODgzNw==	simonw 9599	2020-05-10T20:59:32Z	2020-05-10T20:59:32Z	MEMBER	So it appears it's possible for `photo.place` to raise that exception. A workaround could be to catch that and treat those photos as not having a place.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626394989	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626394989	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM5NDk4OQ==	simonw 9599	2020-05-10T21:50:36Z	2020-05-10T21:50:36Z	MEMBER	https://github.com/Marketcircle/bpylist/pull/2 looks relevant here.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626395103	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395103	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM5NTEwMw==	simonw 9599	2020-05-10T21:51:36Z	2020-05-10T21:51:36Z	MEMBER	@RhetTbull I tried that workaround and it turns out I'm getting this error on ALL of my photos now! It's weird: a few days ago this wasn't happening. Now it's happening to everything. I'm not sure what I might have changed.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626395209	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395209	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM5NTIwOQ==	simonw 9599	2020-05-10T21:52:42Z	2020-05-10T21:52:42Z	MEMBER	Aha! It looks like I accidentally installed the old `bpylist` into the same environment: ``` $ pip freeze | grep bpylist bpylist==0.1.4 bpylist2==3.0.0 ```	{"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626395781	https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395781	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	MDEyOklzc3VlQ29tbWVudDYyNjM5NTc4MQ==	simonw 9599	2020-05-10T21:57:09Z	2020-05-10T21:57:09Z	MEMBER	Yes, I just recreated my virtual environment from scratch and the error went away. The problem occurred when I ran `pip install datasette-bplist` in the same virtual environment - https://github.com/simonw/datasette-bplist/blob/master/setup.py depends on `bpylist` which is incompatible with `bpylist2`.	{"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990
626941278	https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626941278	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22	MDEyOklzc3VlQ29tbWVudDYyNjk0MTI3OA==	simonw 9599	2020-05-11T20:25:58Z	2020-05-11T20:25:58Z	MEMBER	Interesting - do you know if there's anything the `exiftool` process handles that `ExifReader` doesn't? I'm actually just going to extract a subset of the EXIF data at first - since the original photo files will always be available I don't feel the need to get everything out for the first step. My plan is to use EXIF to help support photo collections that aren't in Apple Photos - I'm going to build a database table keyed by the `sha256` of each photo that extracts the camera make, lens, a few settings (ISO, aperture etc) and the GPS lat/lon.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Try out ExifReader 615626118
631120771	https://github.com/dogsheep/dogsheep-photos/issues/23#issuecomment-631120771	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/23	MDEyOklzc3VlQ29tbWVudDYzMTEyMDc3MQ==	simonw 9599	2020-05-19T22:32:48Z	2020-05-19T22:32:48Z	MEMBER	Documentation: https://github.com/dogsheep/photos-to-sqlite/blob/e2fab012551eed05278040b5d57e7373a1b9a0bf/README.md#creating-a-subset-database	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	create-subset command for creating a publishable subset of a photos database 621280529
631255206	https://github.com/dogsheep/dogsheep-photos/issues/24#issuecomment-631255206	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/24	MDEyOklzc3VlQ29tbWVudDYzMTI1NTIwNg==	simonw 9599	2020-05-20T06:00:25Z	2020-05-20T06:00:25Z	MEMBER	This needs documentation.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Configurable URL for images 621323348
631127454	https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631127454	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	MDEyOklzc3VlQ29tbWVudDYzMTEyNzQ1NA==	simonw 9599	2020-05-19T22:48:00Z	2020-05-21T15:58:32Z	MEMBER	I built #23 to help with this. $ dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" And publish with Vercel: $ datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Create a public demo 621332242
631251707	https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631251707	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	MDEyOklzc3VlQ29tbWVudDYzMTI1MTcwNw==	simonw 9599	2020-05-20T05:49:27Z	2020-05-21T15:58:42Z	MEMBER	Renaming this demo to `dogsheep-photos.dogsheep.net`	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Create a public demo 621332242
631253136	https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253136	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	MDEyOklzc3VlQ29tbWVudDYzMTI1MzEzNg==	simonw 9599	2020-05-20T05:53:58Z	2020-05-20T05:53:58Z	MEMBER	Updated deploy command: ``` datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map \ --title "Dogsheep Photos demo" ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Create a public demo 621332242
631253248	https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253248	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	MDEyOklzc3VlQ29tbWVudDYzMTI1MzI0OA==	simonw 9599	2020-05-20T05:54:18Z	2020-05-20T05:54:18Z	MEMBER	https://dogsheep-photos.dogsheep.net/	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Create a public demo 621332242
631253852	https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253852	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	MDEyOklzc3VlQ29tbWVudDYzMTI1Mzg1Mg==	simonw 9599	2020-05-20T05:56:17Z	2020-05-21T22:26:16Z	MEMBER	I have a `deploy-demo.sh` script now: ```bash #!/bin/bash if [ -f public.db ]; then rm public.db fi pipenv run dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" pipenv run sqlite-utils create-view public.db photos_on_a_map \ "select date, latitude, longitude, apple_photos.sha256, uploads.ext, json_object( 'title', 'Taken on ' || date, 'image', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=400', 'link', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=1200' ) as popup from apple_photos join uploads on apple_photos.sha256 = uploads.sha256 where latitude is not null order by date desc" \ --replace pipenv run datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-pretty-json \ "--install=datasette-cluster-map>=0.10" \ --title "Dogsheep Photos demo" ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Create a public demo 621332242
631226481	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226481	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNjQ4MQ==	simonw 9599	2020-05-20T04:18:29Z	2020-05-20T04:18:29Z	MEMBER	I just renamed the repository.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631226572	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226572	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNjU3Mg==	simonw 9599	2020-05-20T04:18:52Z	2020-05-20T04:18:52Z	MEMBER	Need to reconfigure Circle CI.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631226953	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226953	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNjk1Mw==	simonw 9599	2020-05-20T04:20:34Z	2020-05-20T04:20:34Z	MEMBER	Huh, it looks like Circle CI picked up the name change automatically. https://app.circleci.com/pipelines/github/dogsheep/dogsheep-photos	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631227020	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227020	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNzAyMA==	simonw 9599	2020-05-20T04:20:48Z	2020-05-20T04:21:16Z	MEMBER	Next time I push a release it will create `dogsheep-photos` on PyPI.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631227105	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227105	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNzEwNQ==	simonw 9599	2020-05-20T04:21:06Z	2020-05-20T04:21:06Z	MEMBER	Then I just need to push a final photos-to-sqlite release that updates the README to tell people about the name change.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631227245	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227245	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyNzI0NQ==	simonw 9599	2020-05-20T04:21:38Z	2020-05-20T04:21:38Z	MEMBER	I'm going to release 0.4 now.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631229409	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229409	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyOTQwOQ==	simonw 9599	2020-05-20T04:30:40Z	2020-05-20T04:30:40Z	MEMBER	https://pypi.org/project/photos-to-sqlite/ now links to dogsheep-photos.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
631229485	https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229485	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	MDEyOklzc3VlQ29tbWVudDYzMTIyOTQ4NQ==	simonw 9599	2020-05-20T04:31:02Z	2020-05-20T04:31:02Z	MEMBER	https://pypi.org/project/dogsheep-photos/ is live.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Rename project to dogsheep-photos 621444763
615932007	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615932007	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMjAwNw==	simonw 9599	2020-04-18T19:27:55Z	2020-04-18T19:27:55Z	MEMBER	Research thread: https://twitter.com/simonw/status/1249049694984011776 > I want to build some software that lets people store their own data in their own S3 bucket, but if possible I'd like not to have to teach people the incantations needed to get their bucket setup and minimum-permission credentials figures out https://testdriven.io/blog/storing-django-static-and-media-files-on-amazon-s3/ looks useful	{"total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615932204	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615932204	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMjIwNA==	simonw 9599	2020-04-18T19:29:22Z	2020-04-18T19:34:44Z	MEMBER	I'm going to call my bucket `dogsheep-photos-simon`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615933273	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615933273	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzMzI3Mw==	simonw 9599	2020-04-18T19:37:33Z	2020-04-18T19:37:33Z	MEMBER	https://console.aws.amazon.com/s3/bucket/create?region=us-west-1 ![S3_Management_Console](https://user-images.githubusercontent.com/9599/79669552-33e2a380-8171-11ea-9ab5-5785d34f652a.png) I created it with no public read-write access. I plan to use signed URLs via a transforming proxy to access images for display on the web.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615935577	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615935577	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzNTU3Nw==	simonw 9599	2020-04-18T19:54:59Z	2020-04-18T19:55:30Z	MEMBER	Creating IAM groups called `dogsheep-photos-simon-read-write` and `dogsheep-photos-simon-read`: https://console.aws.amazon.com/iam/home#/groups - I created them with no attached policies. Now I can attach an "inline policy" to each one. For the read-write group I go here: https://console.aws.amazon.com/iam/home#/groups/dogsheep-photos-simon-read-write ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669703-2d086080-8172-11ea-9597-83e0b155193e.png) Example policies are here: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html For the read-write one I went with: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "s3:*", "Resource": [ "arn:aws:s3:::dogsheep-photos-simon/*" ] } ] } ``` For the read-only policy I'm going to guess that this is appropriate: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::dogsheep-photos-simon/*" ] } ] } ``` I tried the policy simulator to test this out: https://policysim.aws.amazon.com/home/index.jsp?#groups/dogsheep-photos-simon-read - this worked: ![IAM_Policy_Simulator](https://user-images.githubusercontent.com/9599/79669893-cd12b980-8173-11ea-8dfb-5660ce3652da.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615936880	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615936880	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTkzNjg4MA==	simonw 9599	2020-04-18T20:04:31Z	2020-04-18T20:04:31Z	MEMBER	Next step: create two IAM users, one for each of those groups. https://console.aws.amazon.com/iam/home#/users$new?step=details ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669931-1bc05380-8174-11ea-9657-0e0c6a692d42.png) ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79669941-27137f00-8174-11ea-8ce7-249f0d4f96f6.png) I copied the keys into a secure note in 1password. Couldn't get into Transmit with them though! https://library.panic.com/transmit/transmit5/iam-roles/ may help.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615941746	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615941746	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0MTc0Ng==	simonw 9599	2020-04-18T20:29:36Z	2020-04-18T20:29:36Z	MEMBER	I'm going to create another user just for Transmit, with full S3 access. name: `dogsheep-photos-simon-s3-all-access` Rather than creating a group for that user, I'm trying the "Attach existing policies directly" option: ![IAM_Management_Console](https://user-images.githubusercontent.com/9599/79670182-03513880-8176-11ea-811a-c80aefb4538a.png) That user DID work with Transmit. I uploaded a test HEIC image. I used Transmit to copy a signed URL for it. ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3.us-west-1.amazonaws.com/IMG_7195.HEIC?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAWXFXAI...' | head -n 100 % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0HTTP/1.1 200 OK x-amz-id-2: gBOCYqZfbNAnv0R/uJ++qm2NbW5SgD4TapgF9RQjzzeDIThcCz/BkKU+YoxlG4NJHlcmMgAHyh4= x-amz-request-id: C2FE7FCC3BD53A84 Date: Sat, 18 Apr 2020 20:28:54 GMT Last-Modified: Sat, 18 Apr 2020 20:13:49 GMT ETag: "fe3e081239a123ef745517878c53b854" Accept-Ranges: bytes Content-Type: image/heic Content-Length: 1913097 Server: AmazonS3 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615942116	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615942116	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0MjExNg==	simonw 9599	2020-04-18T20:30:56Z	2020-04-18T20:30:56Z	MEMBER	Next step: attempt a programmatic upload using the `dogsheep-photos-simon-read-write` credentials from a Jupyter notebook. Also attempt a programmatic bucket listing and read using `dogsheep-photos-simon-read` credentials.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615944806	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615944806	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NDgwNg==	simonw 9599	2020-04-18T20:41:39Z	2020-04-18T20:41:39Z	MEMBER	This worked! ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670712-d868e380-8179-11ea-82a5-5dfd17356113.png) And this worked: ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670777-50370e00-817a-11ea-83cd-18ebf5702878.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615945056	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615945056	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NTA1Ng==	simonw 9599	2020-04-18T20:42:41Z	2020-04-18T20:42:41Z	MEMBER	But... `list_objects` failed for both of my keys (read and write): ![Dogsheep_Photos_S3_access](https://user-images.githubusercontent.com/9599/79670798-75c41780-817a-11ea-9907-2cbc4a2e497c.png)	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615946537	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615946537	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NjUzNw==	simonw 9599	2020-04-18T20:48:13Z	2020-04-18T20:48:13Z	MEMBER	How about generating a signed URL? ```python read_client.generate_presigned_url( "get_object", Params={ "Bucket": "dogsheep-photos-simon", "Key": "this_is_fine.jpg", }, ExpiresIn=600 ) ``` Gave me https://dogsheep-photos-simon.s3.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398 Which does this: ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398' HTTP/1.1 307 Temporary Redirect x-amz-bucket-region: us-west-1 x-amz-request-id: E78CD859AEE21D33 x-amz-id-2: 648mx+1+YSGga7NDOU7Q6isfsKnEPWOLC+DI4+x2o9FCc6pSCdIaoHJUbFMI8Vsuh1ADtx46ymU= Location: https://dogsheep-photos-simon.s3-us-west-1.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398 Content-Type: application/xml Transfer-Encoding: chunked Date: Sat, 18 Apr 2020 20:47:21 GMT Server: AmazonS3 <?xml version="1.0" encoding="UTF-8"?> <Error><Code>TemporaryRedirect</Code><Message>Please re-send this request to the specified temporary endpoint. Continue to use the original request endpoint for future requests.</Message><Endpoint>dogsheep-photos-simon.s3-us-west-1.amazonaws.com</Endpoint><Bucket>dogsheep-photos-simon</Bucket><RequestId>E78CD859AEE21D33</RequestId><HostId>648mx+1+YSGga7NDOU7Q6isfsKnEPWOLC+DI4+x2o9FCc6pSCdIaoHJUbFMI8Vsuh1ADtx46ymU=</HostId></Error>~ $ ``` So it redirects to another URL... 
which returns this: ``` ~ $ curl -i 'https://dogsheep-photos-simon.s3-us-west-1.amazonaws.com/this_is_fine.jpg?AWSAccessKeyId=AKIAWXFXAIOZNZ3JFO7I&Signature=x1zrS4w4OTGAACd7yHp9mYqXvN8%3D&Expires=1587243398' HTTP/1.1 200 OK x-amz-id-2: XafOl6mswj3yz0GJC9+Ptot1ll5sROVwqsMc10CUUfgpaUANTdIx2GhnONb5d1GVFJ6wlS2j3UY= x-amz-request-id: 258387C180411AFE Date: Sat, 18 Apr 2020 20:47:52 GMT Last-Modified: Sat, 18 Apr 2020 20:37:35 GMT E…	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615947229	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615947229	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NzIyOQ==	simonw 9599	2020-04-18T20:51:26Z	2020-04-18T20:51:26Z	MEMBER	Running the upload again like this resulted in the correct content-type: ```python client.upload_file( "/Users/simonw/Desktop/this_is_fine.jpg", "dogsheep-photos-simon", "this_is_fine.jpg", ExtraArgs={ "ContentType": "image/jpeg" } ) ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615947370	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615947370	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0NzM3MA==	simonw 9599	2020-04-18T20:52:13Z	2020-04-18T20:52:13Z	MEMBER	This is great! I now have a key that can upload photos, and a separate key that can download photos OR generate signed URLs to access those photos. Next step: a script that starts uploading my photos.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615948102	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615948102	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk0ODEwMg==	simonw 9599	2020-04-18T20:56:59Z	2020-04-18T20:56:59Z	MEMBER	I'm going to start with this: `photos-to-sqlite upload photos.db ~/path/to/directory` This will scan the provided directory (and all sub-directories) for image files. It will then: * Calculate a sha256 of the contents of that file * Upload the file to a key that's `sha256.jpg` or `.heic` * Upload a `sha256.json` file with the original path to the image * Add that image to a `uploads` table in `photos.db` Stretch goal: grab the EXIF data and include that in the `.json` upload AND the `uploads` database table.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
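The hashing and key-naming steps in the plan above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the actual `photos-to-sqlite` implementation: the function names, the `IMAGE_EXTENSIONS` set, and the `filepath` field in the JSON body are all assumptions made for the example, and the S3 upload and database insert are left out.

```python
import hashlib
import json
from pathlib import Path

# Assumed set of extensions; the thread only mentions .jpg and .heic.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".heic"}


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file's contents in chunks so large photos are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_uploads(directory):
    """Yield one dict per image found under directory (including sub-directories)."""
    for path in sorted(Path(directory).rglob("*")):
        ext = path.suffix.lower()
        if not path.is_file() or ext not in IMAGE_EXTENSIONS:
            continue
        sha = sha256_of_file(path)
        yield {
            "sha256": sha,
            "image_key": sha + ext,  # e.g. <sha256>.jpg or <sha256>.heic
            "json_key": sha + ".json",  # sidecar recording the original path
            "json_body": json.dumps({"filepath": str(path)}),
            "filepath": str(path),
        }
```

Each yielded dict carries what the later steps would need: the two S3 keys to write and a row for the `uploads` table.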
615957385	https://github.com/dogsheep/dogsheep-photos/issues/4#issuecomment-615957385	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4	MDEyOklzc3VlQ29tbWVudDYxNTk1NzM4NQ==	simonw 9599	2020-04-18T21:56:16Z	2020-04-18T21:58:11Z	MEMBER	Got this working! I'll do EXIF in a separate ticket #3.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Upload all my photos to a secure S3 bucket 602533539
615949574	https://github.com/dogsheep/dogsheep-photos/issues/5#issuecomment-615949574	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/5	MDEyOklzc3VlQ29tbWVudDYxNTk0OTU3NA==	simonw 9599	2020-04-18T21:06:07Z	2020-04-18T21:06:07Z	MEMBER	``` $ photos-to-sqlite s3-auth Create S3 credentials and paste them here: Access key ID: xxx Secret access key: yyy $ cat auth.json { "access_key_id": "xxx", "secret_access_key": "yyy" } ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	photos-to-sqlite s3-auth command 602551638
615979923	https://github.com/dogsheep/dogsheep-photos/issues/6#issuecomment-615979923	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/6	MDEyOklzc3VlQ29tbWVudDYxNTk3OTkyMw==	simonw 9599	2020-04-18T23:36:02Z	2020-04-18T23:36:02Z	MEMBER	I'll use a Click progress bar. To do this I need to first calculate the sum number of bytes in the photos that are going to be uploaded, then run the upload.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add progress bar to upload command 602575575
615983393	https://github.com/dogsheep/dogsheep-photos/issues/6#issuecomment-615983393	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/6	MDEyOklzc3VlQ29tbWVudDYxNTk4MzM5Mw==	simonw 9599	2020-04-18T23:53:10Z	2020-04-18T23:53:10Z	MEMBER	``` $ photos-to-sqlite upload photos3.db ~/Pictures/Photos\ Library.photoslibrary/Masters/2020 Uploading 2.09 GB [##----------------------------------] 6% 00:36:37 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add progress bar to upload command 602575575
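The "sum the bytes first, then upload" approach can be sketched without Click. The code below is a standard-library illustration only: `upload` and `report` are hypothetical callbacks standing in for the real S3 upload and for whatever advances the progress bar.

```python
import os


def total_bytes(paths):
    """Sum the sizes of all files so the progress bar knows its full length up front."""
    return sum(os.path.getsize(p) for p in paths)


def upload_with_progress(paths, upload, report):
    """Call upload(path) for each file, then report(done_bytes, total_bytes)."""
    total = total_bytes(paths)
    done = 0
    for path in paths:
        upload(path)
        done += os.path.getsize(path)
        report(done, total)
```

Measuring progress in bytes rather than file count keeps the time estimate meaningful when photo sizes vary widely.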
615993178	https://github.com/dogsheep/dogsheep-photos/issues/7#issuecomment-615993178	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/7	MDEyOklzc3VlQ29tbWVudDYxNTk5MzE3OA==	simonw 9599	2020-04-19T00:37:08Z	2020-04-19T00:37:08Z	MEMBER	https://pypi.org/project/ImageHash/ Is one option.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Integrate image content hashing 602585497
618100434	https://github.com/dogsheep/dogsheep-photos/issues/8#issuecomment-618100434	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/8	MDEyOklzc3VlQ29tbWVudDYxODEwMDQzNA==	simonw 9599	2020-04-23T00:02:53Z	2020-04-23T00:02:53Z	MEMBER	I don't think it matters one way or the other - I'm storing the sha256 in the filename, so the fact that I could read the MD5 back from the list bucket operation doesn't give me any benefits.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Should I have used MD5 instead of SHA256? 605147638
618100658	https://github.com/dogsheep/dogsheep-photos/issues/8#issuecomment-618100658	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/8	MDEyOklzc3VlQ29tbWVudDYxODEwMDY1OA==	simonw 9599	2020-04-23T00:03:35Z	2020-04-23T00:03:35Z	MEMBER	Also MD5 isn't guaranteed for the ETag: > If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Should I have used MD5 instead of SHA256? 605147638
618724149	https://github.com/dogsheep/dogsheep-photos/issues/9#issuecomment-618724149	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/9	MDEyOklzc3VlQ29tbWVudDYxODcyNDE0OQ==	simonw 9599	2020-04-23T23:35:29Z	2020-04-23T23:35:29Z	MEMBER	``` % photos-to-sqlite upload photos.db ~/Pictures/Photos\ Library.photoslibrary/originals Fetching existing keys from S3... Got 22,446 existing keys Calculating hashes [####--------------------------------] 13% 00:04:14 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	upload command should be resumable, should only upload photos not already uploaded 605938063
618725155	https://github.com/dogsheep/dogsheep-photos/issues/9#issuecomment-618725155	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/9	MDEyOklzc3VlQ29tbWVudDYxODcyNTE1NQ==	simonw 9599	2020-04-23T23:39:14Z	2020-04-23T23:39:14Z	MEMBER	A few minutes later... ``` Fetching existing keys from S3... Got 22,446 existing keys Calculating hashes [####################################] 100% 22,441 hashed files, 610 are not yet in S3 Uploading 0.99 GB Uploading 610 photos [------------------------------------] 1/610 03:10:35 ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	upload command should be resumable, should only upload photos not already uploaded 605938063
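The resumable behaviour shown in that output comes down to a set difference between locally hashed files and the keys already in the bucket. A minimal sketch, assuming the local files are held in a dict mapping the intended S3 key to the local path (a hypothetical structure, not taken from the tool's source):

```python
def keys_to_upload(hashed_files, existing_keys):
    """Return only the entries whose S3 key is not already present in the bucket.

    hashed_files: dict mapping S3 key (e.g. "<sha256>.jpg") to local path.
    existing_keys: iterable of keys fetched from the bucket listing.
    """
    existing = set(existing_keys)  # set lookup keeps this fast for tens of thousands of keys
    return {key: path for key, path in hashed_files.items() if key not in existing}
```

Because keys are derived from file contents, rerunning after an interruption skips every file that finished uploading the first time.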
739058820	https://github.com/dogsheep/dogsheep-photos/pull/29#issuecomment-739058820	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/29	MDEyOklzc3VlQ29tbWVudDczOTA1ODgyMA==	simonw 9599	2020-12-04T22:32:35Z	2020-12-04T22:32:35Z	MEMBER	Thanks for this!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Fixed bug in SQL query for photo scores 638375985
706775706	https://github.com/dogsheep/evernote-to-sqlite/issues/1#issuecomment-706775706	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDcwNjc3NTcwNg==	simonw 9599	2020-10-11T22:14:00Z	2020-10-11T22:14:00Z	MEMBER	A live demo would be good too.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Documentation on how to use this with Datasette 718934942
777798330	https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11	MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA==	simonw 9599	2021-02-11T21:18:58Z	2021-02-11T21:18:58Z	MEMBER	Thanks for the fix!	{"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	XML parse error 792851444
905203570	https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905203570	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13	IC_kwDOEhK-wc419E9y	simonw 9599	2021-08-25T05:51:22Z	2021-08-25T05:53:27Z	MEMBER	The debugger showed me that it broke on a string that looked like this: ```xml <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"> <en-note> <h1 title="Q3 2018 Reflection & Development"> <span title=Q3 2018 Reflection & Development"> Q3 2018 Reflection & Development </span> </h1> ... ``` Yeah that is not valid XML!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426
905206234	https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905206234	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13	IC_kwDOEhK-wc419Fna	simonw 9599	2021-08-25T05:58:42Z	2021-08-25T05:58:42Z	MEMBER	https://github.com/dogsheep/evernote-to-sqlite/blob/36a466f142e5bad52719851c2fbda0c05cd35b99/evernote_to_sqlite/utils.py#L34-L42 Not sure why I was round-tripping the `content_xml` like that - I will try not doing that.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426
906635938	https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-906635938	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13	IC_kwDOEhK-wc42Ciqi	simonw 9599	2021-08-26T18:18:27Z	2021-08-26T18:18:27Z	MEMBER	It looks like I was using the round-trip to dump the `<?xml version="1.0" encoding="UTF-8" standalone="no"?>` and `<!DOCTYPE` prefixes.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426
906646452	https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-906646452	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13	IC_kwDOEhK-wc42ClO0	simonw 9599	2021-08-26T18:34:34Z	2021-08-26T18:35:20Z	MEMBER	I tried this ampersand fix: https://regex101.com/r/ojU2H9/1 ```python # https://regex101.com/r/ojU2H9/1 _invalid_ampersand_re = re.compile(r'&(?![a-z0-9]+;)') def fix_bad_xml(xml): # More fixes for things like '&' not as part of an entity return _invalid_ampersand_re.sub('&amp;', xml) ``` Even with that I'm still getting total garbage in the `<en-note>` content - it's just HTML, not even trying to be XML.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426
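For reference, here is a self-contained variant of that ampersand fix. It illustrates the technique rather than reproducing the project's code: the function name is hypothetical, and the regex is extended (as an assumption about what is desirable) so that numeric character references such as `&#38;` are also left alone.

```python
import re
import xml.etree.ElementTree as ET

# Match '&' unless it already starts a named, decimal, or hexadecimal entity reference.
_stray_ampersand_re = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_stray_ampersands(xml):
    """Replace bare '&' characters with '&amp;' so an XML parser can read the text."""
    return _stray_ampersand_re.sub("&amp;", xml)


def parses(xml):
    """Return True if ElementTree accepts the string as well-formed XML."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False
```

This only addresses stray ampersands. As the comment notes, content that is really HTML (unquoted attributes, unclosed tags) will still fail to parse as XML.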
706784028	https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706784028	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4	MDEyOklzc3VlQ29tbWVudDcwNjc4NDAyOA==	simonw 9599	2020-10-11T23:20:32Z	2020-10-11T23:20:32Z	MEMBER	I haven't done the FTS on OCR yet. I'm going to move that to another ticket because it requires more thought.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Configure FTS + add an index on the date columns 718938508
706786548	https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706786548	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4	MDEyOklzc3VlQ29tbWVudDcwNjc4NjU0OA==	simonw 9599	2020-10-11T23:39:46Z	2020-10-11T23:39:46Z	MEMBER	Should have used porter stemming for this.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Configure FTS + add an index on the date columns 718938508
706776180	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776180	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjc3NjE4MA==	simonw 9599	2020-10-11T22:17:55Z	2020-10-11T22:17:55Z	MEMBER	We could even do server-side thumbnailing for some of these images, but I'm inclined to serve up the full size ones and set a width on the image element based on the `width` attribute on `<en-media>`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706776242	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776242	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjc3NjI0Mg==	simonw 9599	2020-10-11T22:18:30Z	2020-10-11T22:19:48Z	MEMBER	Alternatively, rather than relying on `datasette-media` this could base64-embed the images. `evernote-to-sqlite` could register itself as a Datasette plugin that knows how to do this. Maybe rename the column to `evernote_content` and register a render cell hook that knows how to rewrite those note bodies so that they are visible? Might need to feed them through Bleach too, just in case any nasty code can get into them.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706776447	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776447	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjc3NjQ0Nw==	simonw 9599	2020-10-11T22:20:32Z	2020-10-11T22:20:32Z	MEMBER	Or... I could do this client-side. JavaScript that looks for `<en-media>` tags and fetches the data using `fetch()` wouldn't be too hard to write.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706776680	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776680	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjc3NjY4MA==	simonw 9599	2020-10-11T22:22:16Z	2020-10-11T22:22:16Z	MEMBER	Maybe the best way to do this is with a custom route, `/-/evernote/note-id` - that way I can clean the HTML and resolve the other things in the `<en-note>` structure without using `render_cell()` and the like. My concern about using `render_cell()` is that it could lead to weird security problems when combined with `?sql=` queries.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706776808	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776808	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjc3NjgwOA==	simonw 9599	2020-10-11T22:23:14Z	2020-10-11T22:23:14Z	MEMBER	... but it's still important to be able to get to the rendered note directly from the browse notes `/evernote/notes` page. Maybe use a simple `render_cell()` hook that just knows how to generate the link to the rendered note page?	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706834800	https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706834800	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	MDEyOklzc3VlQ29tbWVudDcwNjgzNDgwMA==	simonw 9599	2020-10-12T03:24:57Z	2020-10-16T20:16:28Z	MEMBER	Here's my first attempt at a plugin for this: ```python from datasette import hookimpl import jinja2 START = "<en-note" END = "</en-note>" TEMPLATE = """ <div style="max-width: 500px; white-space: normal; overflow-wrap: break-word;">{}</div> """.strip() EN_MEDIA_SCRIPT = """ Array.from(document.querySelectorAll('en-media')).forEach(el => { let hash = el.getAttribute('hash'); let type = el.getAttribute('type'); let path = `/evernote/resources_data/${hash}.json?_shape=array`; fetch(path).then(r => r.json()).then(rows => { let b64 = rows[0].data.encoded; let data = `data:${type};base64,${b64}`; el.innerHTML = `<img style="max-width: 300px" src="${data}">`; }); }); """ @hookimpl def render_cell(value, table): if not table: # Don't render content from arbitrary SQL queries, could be XSS hole return if not value or not isinstance(value, str): return value = value.strip() if value.startswith(START) and value.endswith(END): trimmed = value[len(START) : -len(END)] trimmed = trimmed.split(">", 1)[1] # Replace those horrible double newlines trimmed = trimmed.replace("<div><br /></div>", "<br>") return jinja2.Markup(TEMPLATE.format(trimmed)) @hookimpl def extra_body_script(): return EN_MEDIA_SCRIPT ``` It works! It does however demonstrate that Evernote's "clip this webpage" feature means there is a LOT of weird HTML that can get into a note. It looks like they've filtered out the scripts but I wouldn't bet on it - they certainly don't filter out many of the inline styles. So running Bleach is almost certainly a good idea.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out how to display images from <en-media> tags inline in Datasette 718938889
706785086	https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785086	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6	MDEyOklzc3VlQ29tbWVudDcwNjc4NTA4Ng==	simonw 9599	2020-10-11T23:28:50Z	2020-10-11T23:28:50Z	MEMBER	The XML for the OCR stuff is a bit weird. Currently I'm doing this to it: https://github.com/dogsheep/evernote-to-sqlite/blob/c33d7b043a45eb3e88676e5fa3ce31755199d9f8/evernote_to_sqlite/utils.py#L70-L78 This can produce some odd results, for example: > Sure 'Sure, 'Sure. Sure, Sure. sure sure. sure ? If you If Yau [you live jive In m 1n an area devoid of natural wonders, wanders, wonders ? wonders wonders. your mind will be blown, blown' blown. blown ? -e i ? ,1 IL it ? at ? KY ? fl ft bat at Which came from this image: ![image](https://user-images.githubusercontent.com/9599/95692952-5dd7c880-0bde-11eb-939a-d10b800a4105.png) The XML for that is: ```xml <recoIndex docType="unknown" objType="image" objID="05ffb72b307bf495f064243c7099d94f" engineVersion="6.5.17.7" recoType="service" lang="en" objWidth="1000" objHeight="1504"> <item x="68" y="75" w="104" h="37"> <t w="60">Sure</t> <t w="52">'Sure,</t> <t w="47">'Sure.</t> <t w="33">Sure,</t> <t w="26">Sure.</t> </item> <item x="182" y="83" w="92" h="26"> <t w="62">sure</t> <t w="58">sure.</t> <t w="46">sure ?</t> </item> <item x="69" y="132" w="107" h="45"> <t w="81">If you</t> <t w="64">If Yau</t> <t w="31">[you</t> </item> <item x="186" y="132" w="67" h="35"> <t w="85">live</t> <t w="51">jive</t> </item> <item x="263" y="140" w="36" h="27"> <t w="82">In</t> <t w="56">m</t> <t w="53">1n</t> </item> <item x="309" y="140" w="53" h="27"> <t w="82">an</t> </item> <item x="372" y="141" w="90" h="26"> <t w="94">area</t> </item> <item x="472" y="132" w="138" h="35"> <t w="85">devoid</t> </item> <item x="620" y="132" w="43" h="35"> <t w="82">of</t> </item> <item x="68" y="190" w="137" h="35"> <t w="87">natural</t> </item> <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">won…	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Better handling of OCR data 718949182
706785201	https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785201	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6	MDEyOklzc3VlQ29tbWVudDcwNjc4NTIwMQ==	simonw 9599	2020-10-11T23:29:39Z	2020-10-11T23:29:39Z	MEMBER	It looks to me like each of those `<item>` blocks has a number of guesses in order of confidence: ```xml <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">wonders.</t> </item> ``` So maybe the best approach here is to just take the first `t` element within each `item`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Better handling of OCR data 718949182
777827396	https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7	MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng==	simonw 9599	2021-02-11T22:13:14Z	2021-02-11T22:13:14Z	MEMBER	My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace' 743297582
777821383	https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9	MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw==	simonw 9599	2021-02-11T22:01:28Z	2021-02-11T22:01:28Z	MEMBER	Aha! I think I've figured out what's going on here. The CData blocks containing the notes look like this: `<![CDATA[<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><div>This note includes two images.</div><div><br /></div>...` The DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities: ``` <!--=========== External character mnemonic entities ===================--> <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"> %HTMLlat1; <!ENTITY % HTMLsymbol PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"> %HTMLsymbol; <!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"> %HTMLspecial; ``` So I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this: ```python {'Aacute': 'Á', 'aacute': 'á', 'Aacute;': 'Á', 'aacute;': 'á', 'Abreve;': 'Ă', 'abreve;': 'ă', 'ac;': '∾', 'acd;': '∿', # ... } ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	ParseError: undefined entity &scaron; 748372469
777839351	https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10	MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ==	simonw 9599	2021-02-11T22:37:55Z	2021-02-11T22:37:55Z	MEMBER	I've merged these changes by hand now, thanks!	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	BugFix for encoding and not update info. 770712149
544646516	https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-544646516	https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDU0NDY0NjUxNg==	simonw 9599	2019-10-21T18:30:14Z	2019-10-21T18:30:14Z	MEMBER	Thanks to help from Dr. Laura Cantino at Science Hack Day San Francisco I've been able to pull together this query: ```sql select rsid, genotype, case genotype when 'AA' then 'brown eye color, 80% of the time' when 'AG' then 'brown eye color' when 'GG' then 'blue eye color, 99% of the time' end as interpretation from genome where rsid = 'rs12913832' ``` See also https://www.snpedia.com/index.php/Rs12913832 - in particular this table: <img width="321" alt="rs12913832_-_SNPedia" src="https://user-images.githubusercontent.com/9599/67232392-216ff300-f3f6-11e9-8e14-b5f50c0c0d16.png">	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out some interesting example SQL queries 496415321
544648863	https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-544648863	https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1	MDEyOklzc3VlQ29tbWVudDU0NDY0ODg2Mw==	simonw 9599	2019-10-21T18:36:03Z	2019-10-21T18:36:03Z	MEMBER	<img width="1418" alt="natalie__select_rsid__genotype__case_genotype_when__AA__then__brown_eye_color__80__of_the_time__when__AG__then__brown_eye_color__when__GG__then__blue_eye_color__99__of_the_time__end_as_interpretation_from_genome_where_rsid____rs12913832__an" src="https://user-images.githubusercontent.com/9599/67232810-f4701000-f3f6-11e9-90e2-8fe2cca1d98d.png">	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Figure out some interesting example SQL queries 496415321
549230337	https://github.com/dogsheep/github-to-sqlite/issues/10#issuecomment-549230337	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/10	MDEyOklzc3VlQ29tbWVudDU0OTIzMDMzNw==	simonw 9599	2019-11-04T05:47:18Z	2019-11-04T05:47:18Z	MEMBER	This definition isn't quite right - it's not pulling the identity of the user who starred the repo (`users.login` ends up being the owner login instead).	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add this repos_starred view 516967682
622461122	https://github.com/dogsheep/github-to-sqlite/issues/10#issuecomment-622461122	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/10	MDEyOklzc3VlQ29tbWVudDYyMjQ2MTEyMg==	simonw 9599	2020-05-01T16:34:39Z	2020-05-01T16:34:39Z	MEMBER	Blocked on #37	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add this repos_starred view 516967682
622980203	https://github.com/dogsheep/github-to-sqlite/issues/10#issuecomment-622980203	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/10	MDEyOklzc3VlQ29tbWVudDYyMjk4MDIwMw==	simonw 9599	2020-05-02T16:34:29Z	2020-05-02T16:34:29Z	MEMBER	Fixed definition: ```sql select stars.starred_at, starring_user.login as starred_by, repos.* from repos join stars on repos.id = stars.repo join users as starring_user on stars.user = starring_user.id join users on repos.owner = users.id order by starred_at desc; ```	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add this repos_starred view 516967682
594151327	https://github.com/dogsheep/github-to-sqlite/issues/12#issuecomment-594151327	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/12	MDEyOklzc3VlQ29tbWVudDU5NDE1MTMyNw==	simonw 9599	2020-03-03T20:26:15Z	2020-03-03T20:32:23Z	MEMBER	Better version (since this also includes JSON array of repository topics): ```sql CREATE VIEW recent_releases AS select repos.rowid as rowid, json_object("label", repos.full_name, "href", repos.html_url) as repo, json_object( "href", releases.html_url, "label", releases.name ) as release, substr(releases.published_at, 0, 11) as date, releases.body as body_markdown, releases.published_at, coalesce(repos.topics, '[]') as topics from releases join repos on repos.id = releases.repo order by releases.published_at desc ``` That `repos.rowid as rowid` bit is necessary because otherwise clicking on a link in facet-by-topic doesn't return any results.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add this view for seeing new releases 520756546
594155249	https://github.com/dogsheep/github-to-sqlite/issues/12#issuecomment-594155249	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/12	MDEyOklzc3VlQ29tbWVudDU5NDE1NTI0OQ==	simonw 9599	2020-03-03T20:35:17Z	2020-03-03T20:35:17Z	MEMBER	`swarm-to-sqlite` has an example of adding views here: https://github.com/dogsheep/swarm-to-sqlite/blob/f2c89dd613fb8a7f14e5267ccc2145463b996190/swarm_to_sqlite/utils.py#L141 I think that approach can be improved by first checking if the view exists, then dropping it, then recreating it. Could even try to see if the view exists and matches what we were going to set it to and do nothing if that is the case.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	Add this view for seeing new releases 520756546
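
The check, drop, recreate idea in the last row above (issue 12) could look roughly like this. It is a minimal sketch using only the standard-library `sqlite3` module. `ensure_view` is a hypothetical helper name, not a function from `swarm-to-sqlite` or `github-to-sqlite`, and it assumes SQLite stores the `CREATE VIEW` text unchanged when it is written in this exact form.

```python
import sqlite3


def ensure_view(conn, name, select_sql):
    """Create or replace a view, skipping the work if it already matches.

    Hypothetical helper for illustration only. Returns True if the view
    was created or recreated, False if the existing one was left alone.
    """
    create_sql = "CREATE VIEW [{}] AS {}".format(name, select_sql.strip())
    # sqlite_master keeps the text of the statement that defined the view
    row = conn.execute(
        "select sql from sqlite_master where type = 'view' and name = ?",
        (name,),
    ).fetchone()
    if row is not None:
        if row[0] == create_sql:
            # View exists and already has the definition we want
            return False
        conn.execute("DROP VIEW [{}]".format(name))
    conn.execute(create_sql)
    return True
```

Calling it twice with the same definition should recreate nothing the second time; calling it with a changed `select_sql` drops and rebuilds the view.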


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
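
The filter and sort used for this listing can be tried against the schema above. The following is a minimal sketch, assuming an in-memory database and three made-up rows; the sample URLs, the `member_comments` function name and the inserted values are illustrative and not taken from the real database. The `users` and `issues` tables are left out, which works because SQLite does not enforce foreign keys unless asked to.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
"""


def member_comments(conn):
    # Same filter and ordering as the page: MEMBER comments sorted by html_url
    return conn.execute(
        "select id, html_url, body from issue_comments "
        "where author_association = ? order by html_url",
        ("MEMBER",),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "insert into issue_comments (id, html_url, user, author_association, body) "
    "values (?, ?, ?, ?, ?)",
    [
        # Made-up sample rows, for illustration only
        (2, "https://example.com/issues/2#issuecomment-2", 9599, "MEMBER", "second"),
        (1, "https://example.com/issues/1#issuecomment-1", 9599, "MEMBER", "first"),
        (3, "https://example.com/issues/1#issuecomment-3", 1, "NONE", "not a member"),
    ],
)
rows = member_comments(conn)
```

Because `html_url` is a `TEXT` column the ordering is lexicographic, which is why comments on `issues/13` are listed before those on `issues/4` in the rows above.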