issue_comments
7,983 rows sorted by html_url
This data as json, CSV (advanced)
Suggested facets: reactions
created_at (date) >30 ✖
- 2021-03-22 66
- 2021-11-19 60
- 2020-10-15 52
- 2020-09-22 51
- 2020-10-30 49
- 2022-03-21 46
- 2020-12-18 43
- 2020-06-09 42
- 2020-06-18 41
- 2020-10-20 40
- 2022-01-09 40
- 2020-05-27 39
- 2021-11-16 39
- 2021-12-16 39
- 2020-12-30 38
- 2020-10-09 36
- 2021-11-20 36
- 2022-01-20 36
- 2022-03-19 36
- 2020-09-15 34
- 2021-11-29 34
- 2020-06-08 33
- 2021-01-04 33
- 2021-05-27 33
- 2022-02-06 33
- 2020-06-01 32
- 2022-03-05 32
- 2019-06-24 31
- 2020-09-21 31
- 2021-08-13 31
- …
id | html_url ▼ | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
686238498 | https://github.com/dogsheep/dogsheep-beta/issues/10#issuecomment-686238498 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/10 | MDEyOklzc3VlQ29tbWVudDY4NjIzODQ5OA== | simonw 9599 | 2020-09-03T04:05:05Z | 2020-09-03T04:05:05Z | MEMBER | Since the first two categories are `created` and `saved` this one should be called `received`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Category 3: received 691557547 | |
686618669 | https://github.com/dogsheep/dogsheep-beta/issues/11#issuecomment-686618669 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/11 | MDEyOklzc3VlQ29tbWVudDY4NjYxODY2OQ== | simonw 9599 | 2020-09-03T16:47:34Z | 2020-09-03T16:53:25Z | MEMBER | I think a `is_public` integer column which defaults to 0 would be good here. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Public / Private mechanism 692125110 | |
686774592 | https://github.com/dogsheep/dogsheep-beta/issues/13#issuecomment-686774592 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/13 | MDEyOklzc3VlQ29tbWVudDY4Njc3NDU5Mg== | simonw 9599 | 2020-09-03T21:30:21Z | 2020-09-03T21:30:21Z | MEMBER | This is partially supported: the custom search SQL we run doesn't escape them, but the `?_search` used to calculate facet counts does. So this is a bug. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Support advanced FTS queries 692386625 | |
695124698 | https://github.com/dogsheep/dogsheep-beta/issues/15#issuecomment-695124698 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/15 | MDEyOklzc3VlQ29tbWVudDY5NTEyNDY5OA== | simonw 9599 | 2020-09-18T23:17:38Z | 2020-09-18T23:17:38Z | MEMBER | This can be part of the demo instance in #6. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add a bunch of config examples 694136490 | |
694548909 | https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-694548909 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 | MDEyOklzc3VlQ29tbWVudDY5NDU0ODkwOQ== | simonw 9599 | 2020-09-17T23:15:09Z | 2020-09-17T23:15:09Z | MEMBER | I have sort by date now, #21. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Timeline view 694493566 | |
695851036 | https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695851036 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 | MDEyOklzc3VlQ29tbWVudDY5NTg1MTAzNg== | simonw 9599 | 2020-09-20T23:34:57Z | 2020-09-20T23:34:57Z | MEMBER | Really basic starting point is to add facet by date. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Timeline view 694493566 | |
695877627 | https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695877627 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 | MDEyOklzc3VlQ29tbWVudDY5NTg3NzYyNw== | simonw 9599 | 2020-09-21T02:42:29Z | 2020-09-21T02:42:29Z | MEMBER | Fun twist: assuming `timestamp` is always stored as UTC, I need the interface to be timezone aware so I can see e.g. everything from 4th July 2020 in the San Francisco timezone definition of 4th July 2020. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Timeline view 694493566 | |
687880459 | https://github.com/dogsheep/dogsheep-beta/issues/17#issuecomment-687880459 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/17 | MDEyOklzc3VlQ29tbWVudDY4Nzg4MDQ1OQ== | simonw 9599 | 2020-09-06T19:36:32Z | 2020-09-06T19:36:32Z | MEMBER | At some point I may even want to support search types which are indexed from (and inflated from) more than one database file. I'm going to ignore that for the moment though. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename "table" to "type" 694500679 | |
689226390 | https://github.com/dogsheep/dogsheep-beta/issues/17#issuecomment-689226390 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/17 | MDEyOklzc3VlQ29tbWVudDY4OTIyNjM5MA== | simonw 9599 | 2020-09-09T00:36:07Z | 2020-09-09T00:36:07Z | MEMBER | Alternative names: - type - record_type - doctype I think `type` is right. It matches what Elasticsearch used to call their equivalent of this (before they removed the feature!). https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Rename "table" to "type" 694500679 | |
688622995 | https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688622995 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 | MDEyOklzc3VlQ29tbWVudDY4ODYyMjk5NQ== | simonw 9599 | 2020-09-08T05:15:21Z | 2020-09-08T05:15:21Z | MEMBER | Alternatively it could run as it does now but add a `DELETE FROM index1.search_index WHERE key not in (select key from ...)`. I'm not sure which would be more efficient. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Deleted records stay in the search index 695553522 | |
688623097 | https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688623097 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 | MDEyOklzc3VlQ29tbWVudDY4ODYyMzA5Nw== | simonw 9599 | 2020-09-08T05:15:51Z | 2020-09-08T05:15:51Z | MEMBER | I'm inclined to go with the first, simpler option. I have longer term plans for efficient incremental index updates based on clever trickery with triggers. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Deleted records stay in the search index 695553522 | |
688625430 | https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688625430 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 | MDEyOklzc3VlQ29tbWVudDY4ODYyNTQzMA== | simonw 9599 | 2020-09-08T05:24:50Z | 2020-09-08T05:24:50Z | MEMBER | I thought about allowing tables to define a incremental indexing SQL query - maybe something that can return just records touched in the past hour, or records since a recorded "last indexed record" value. The problem with this is deletes - if you delete a record, how does the indexer know to remove it? See #18 - that's already caused problems. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out incremental re-indexing 695556681 | |
688626037 | https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688626037 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 | MDEyOklzc3VlQ29tbWVudDY4ODYyNjAzNw== | simonw 9599 | 2020-09-08T05:27:07Z | 2020-09-08T05:27:07Z | MEMBER | A really clever way to do this would be with triggers. The indexer script would add triggers to each of the database tables that it is indexing - each in their own database. Those triggers would then maintain a `_index_queue_` table. This table would record the primary key of rows that are added, modified or deleted. The indexer could then work by reading through the `_index_queue_` table, re-indexing (or deleting) just the primary keys listed there, and then emptying the queue once it has finished. This would add a small amount of overhead to insert/update/delete queries run against the table. My hunch is that the overhead would be miniscule, but I could still allow people to opt-out for tables that are so high traffic that this would matter. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Figure out incremental re-indexing 695556681 | |
685115519 | https://github.com/dogsheep/dogsheep-beta/issues/2#issuecomment-685115519 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/2 | MDEyOklzc3VlQ29tbWVudDY4NTExNTUxOQ== | simonw 9599 | 2020-09-01T20:31:57Z | 2020-09-01T20:31:57Z | MEMBER | Actually this doesn't work: you can't turn on stemming for specific tables, because all of the content goes into a single `search_index` table which is configured the same way. So stemming needs to be a global option. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Apply porter stemming 689809225 | |
685121074 | https://github.com/dogsheep/dogsheep-beta/issues/2#issuecomment-685121074 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/2 | MDEyOklzc3VlQ29tbWVudDY4NTEyMTA3NA== | simonw 9599 | 2020-09-01T20:42:00Z | 2020-09-01T20:42:00Z | MEMBER | Documentation at the bottom of the Usage section here: https://github.com/dogsheep/dogsheep-beta/blob/0.2/README.md#usage | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Apply porter stemming 689809225 | |
694551406 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551406 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1MTQwNg== | simonw 9599 | 2020-09-17T23:22:07Z | 2020-09-17T23:22:07Z | MEMBER | Neat, I can debug this with the new `--pdb` option: datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694551646 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551646 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1MTY0Ng== | simonw 9599 | 2020-09-17T23:22:48Z | 2020-09-17T23:22:48Z | MEMBER | Looks like its happening in a Jinja fragment template for one of the results: ``` /Users/simon/Dropbox/Development/dogsheep-beta/dogsheep_beta/__init__.py(169)process_results() -> output = compiled.render({**result, **{"json": json}}) /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/asyncsupport.py(71)render() -> return original_render(self, *args, **kwargs) /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(1090)render() -> self.environment.handle_exception() /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(832)handle_exception() -> reraise(*rewrite_traceback_stack(source=source)) /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/_compat.py(28)reraise() -> raise value.with_traceback(tb) <template>(5)top-level template code() > /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads() -> raise TypeError(f'the JSON object must be str, bytes or bytearray, ' ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694552393 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552393 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1MjM5Mw== | simonw 9599 | 2020-09-17T23:25:01Z | 2020-09-17T23:25:17Z | MEMBER | Ran `locals()` In the debugger: `{'range': <class 'range'>, 'dict': <class 'dict'>, 'lipsum': <function generate_lorem_ipsum at 0x10aeff430>, 'cycler': <class 'jinja2.utils.Cycler'>, 'joiner': <class 'jinja2.utils.Joiner'>, 'namespace': <class 'jinja2.utils.Namespace'>, 'rank': -9.383801886431414, 'rowid': 14297, 'type': 'twitter.db/tweets', 'key': '312658917933076480', 'title': 'Tweet by @chrisstreeter', 'category': 2, 'timestamp': '2013-03-15T20:17:49+00:00', 'search_1': '@simonw are you at pycon? Would love to meet you.', 'display': {'avatar_url': 'https://pbs.twimg.com/profile_images/806275088597204993/38yLHfJi_normal.jpg', 'user_name': 'Chris Streeter', 'screen_name': 'chrisstreeter', 'followers_count': 280, 'tweet_id': 312658917933076480, 'created_at': '2013-03-15T20:17:49+00:00', 'full_text': '@simonw are you at pycon? Would love to meet you.', 'media_urls_2': '[]', 'media_urls': '[]'}, 'json': <module 'json' from '/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py'>}` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694552681 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552681 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1MjY4MQ== | simonw 9599 | 2020-09-17T23:25:54Z | 2020-09-17T23:25:54Z | MEMBER | This is the template fragment it's rendering: ```html+jinja <div style="overflow: hidden;"> <p>Tweet by <a href="https://twitter.com/{{ display.screen_name }}">@{{ display.screen_name }}</a> ({{ display.user_name }}, {{ "{:,}".format(display.followers_count or 0) }} followers) on <a href="https://twitter.com/{{ display.screen_name }}/status/{{ display.tweet_id }}">{{ display.created_at }}</a></p> </p> <blockquote>{{ display.full_text }}</blockquote> {% if display.media_urls and json.loads(display.media_urls) %} {% for url in json.loads(display.media_urls) %} <img src="{{ url }}" style="height: 200px;"> {% endfor %} {% endif %} </div> ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694553579 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694553579 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1MzU3OQ== | simonw 9599 | 2020-09-17T23:28:37Z | 2020-09-17T23:28:37Z | MEMBER | More investigation in pdb: ``` (dogsheep-beta) dogsheep-beta % datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb > /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads() -> raise TypeError(f'the JSON object must be str, bytes or bytearray, ' (Pdb) list 336 if s.startswith('\ufeff'): 337 raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)", 338 s, 0) 339 else: 340 if not isinstance(s, (bytes, bytearray)): 341 -> raise TypeError(f'the JSON object must be str, bytes or bytearray, ' 342 f'not {s.__class__.__name__}') 343 s = s.decode(detect_encoding(s), 'surrogatepass') 344 345 if "encoding" in kw: 346 import warnings (Pdb) bytes <class 'bytes'> (Pdb) locals()['s'] Undefined (Pdb) type(locals()['s']) <class 'jinja2.runtime.Undefined'> ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694554584 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694554584 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1NDU4NA== | simonw 9599 | 2020-09-17T23:31:25Z | 2020-09-17T23:31:25Z | MEMBER | I'd prefer it if errors in these template fragments were displayed as errors inline where the fragment should have been inserted, rather than 500ing the whole page - especially since the template fragments are user-provided and could have all kinds of odd errors in them which should be as easy to debug as possible. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
694557425 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694557425 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NDU1NzQyNQ== | simonw 9599 | 2020-09-17T23:41:01Z | 2020-09-17T23:41:01Z | MEMBER | I removed all of the `json.loads()` calls and I'm still getting that `Undefined` error. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
695113871 | https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-695113871 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 | MDEyOklzc3VlQ29tbWVudDY5NTExMzg3MQ== | simonw 9599 | 2020-09-18T22:30:17Z | 2020-09-18T22:30:17Z | MEMBER | I think I know what's going on here: https://github.com/dogsheep/dogsheep-beta/blob/0f1b951c5131d16f3c8559a8e4d79ed5c559e3cb/dogsheep_beta/__init__.py#L166-L171 This is a logic bug - the `compiled` variable could be the template from the previous loop! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814 | |
695108895 | https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695108895 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 | MDEyOklzc3VlQ29tbWVudDY5NTEwODg5NQ== | simonw 9599 | 2020-09-18T22:11:32Z | 2020-09-18T22:11:32Z | MEMBER | I'm going to make this a new plugin configuration setting, `template_debug`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | template_debug mechanism 704685890 | |
695109140 | https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695109140 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 | MDEyOklzc3VlQ29tbWVudDY5NTEwOTE0MA== | simonw 9599 | 2020-09-18T22:12:20Z | 2020-09-18T22:12:20Z | MEMBER | Documented here: https://github.com/dogsheep/dogsheep-beta/blob/534fc9689227eba70e69a45da0cee5820bbda9e1/README.md#datasette-plugin | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | template_debug mechanism 704685890 | |
695855646 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855646 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg1NTY0Ng== | simonw 9599 | 2020-09-21T00:16:11Z | 2020-09-21T00:16:11Z | MEMBER | Should I do this with offset/limit or should I do proper keyset pagination? I think keyset because then it will work well for the full search interface with no filters or search string. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695855723 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855723 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg1NTcyMw== | simonw 9599 | 2020-09-21T00:16:52Z | 2020-09-21T00:17:53Z | MEMBER | It feels a bit weird to implement keyset pagination against results sorted by `rank` because the ranks could change substantially if the search index gets updated while the user is paginating. I may just ignore that though. If you want reliable pagination you can get it by sorting by date. Maybe it doesn't even make sense to offer pagination if you sort by relevance? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695856398 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856398 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg1NjM5OA== | simonw 9599 | 2020-09-21T00:22:20Z | 2020-09-21T00:22:20Z | MEMBER | I'm going to try for keyset pagination sorted by relevance just as a learning exercise. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695856967 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856967 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg1Njk2Nw== | simonw 9599 | 2020-09-21T00:26:59Z | 2020-09-21T00:26:59Z | MEMBER | It's a shame Datasette doesn't currently have an easy way to implement sorted-by-rank keyset-paginated using a TableView or QueryView. I'll have to do this using the custom SQL query constructed in the plugin: https://github.com/dogsheep/dogsheep-beta/blob/bed9df2b3ef68189e2e445427721a28f4e9b4887/dogsheep_beta/__init__.py#L8-L43 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695875274 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695875274 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg3NTI3NA== | simonw 9599 | 2020-09-21T02:28:58Z | 2020-09-21T02:28:58Z | MEMBER | Datasette's implementation is complex because it has to support compound primary keys: https://github.com/simonw/datasette/blob/a258339a935d8d29a95940ef1db01e98bb85ae63/datasette/utils/__init__.py#L88-L114 - but that's not something that's needed for dogsheep-beta. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695879237 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879237 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg3OTIzNw== | simonw 9599 | 2020-09-21T02:53:29Z | 2020-09-21T02:53:29Z | MEMBER | If previous page ended at `2018-02-11T16:32:53+00:00`: ```sql select search_index.rowid, search_index.type, search_index.key, search_index.title, search_index.category, search_index.timestamp, search_index.search_1 from search_index where date("timestamp") = '2018-02-11' and timestamp < '2018-02-11T16:32:53+00:00' order by search_index.timestamp desc, rowid limit 41 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
695879531 | https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879531 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 | MDEyOklzc3VlQ29tbWVudDY5NTg3OTUzMQ== | simonw 9599 | 2020-09-21T02:55:28Z | 2020-09-21T02:55:54Z | MEMBER | Actually for the tie-breaker it should be something like https://latest.datasette.io/fixtures?sql=select+pk%2C+created%2C+planet_int%2C+on_earth%2C+state%2C+city_id%2C+neighborhood%2C+tags%2C+complex_array%2C+distinct_some_null+from+facetable+where+%28created+%3E+%3Ap1+or+%28created+%3D+%3Ap1+and+%28%28pk+%3E+%3Ap0%29%29%29%29+order+by+created%2C+pk+limit+11&p0=10&p1=2019-01-16+08%3A00%3A00 ```sql where ( created > :p1 or ( created = :p1 and ((pk > :p0)) ) ) ``` But with `rowid` and `timestamp` in place of `pk` and `created`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Pagination 705215230 | |
711089647 | https://github.com/dogsheep/dogsheep-beta/issues/28#issuecomment-711089647 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/28 | MDEyOklzc3VlQ29tbWVudDcxMTA4OTY0Nw== | simonw 9599 | 2020-10-17T22:43:13Z | 2020-10-17T22:43:13Z | MEMBER | Since my personal Dogsheep uses Datasette authentication, I'm going to need to pass through cookies. https://github.com/simonw/datasette/issues/1020 will solve that in the future but for now I need to solve it explicitly. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Switch to using datasette.client 723861683 | |
712266834 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-712266834 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDcxMjI2NjgzNA== | simonw 9599 | 2020-10-19T16:01:23Z | 2020-10-19T16:01:23Z | MEMBER | Might just be a documented pattern for how to configure this in YAML templates. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747029636 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747029636 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAyOTYzNg== | simonw 9599 | 2020-12-16T21:14:03Z | 2020-12-16T21:14:03Z | MEMBER | I think I can do this as a cunning trick in `display_sql`. Consider this example query: https://til.simonwillison.net/tils?sql=select%0D%0A++path%2C%0D%0A++snippet%28til_fts%2C+-1%2C+%27b4de2a49c8%27%2C+%278c94a2ed4b%27%2C+%27...%27%2C+60%29+as+snippet%0D%0Afrom%0D%0A++til%0D%0A++join+til_fts+on+til.rowid+%3D+til_fts.rowid%0D%0Awhere%0D%0A++til_fts+match+escape_fts%28%3Aq%29%0D%0A++and+path+%3D+%27asgi_lifespan-test-httpx.md%27%0D%0A&q=pytest ```sql select path, snippet(til_fts, -1, 'b4de2a49c8', '8c94a2ed4b', '...', 60) as snippet from til join til_fts on til.rowid = til_fts.rowid where til_fts match escape_fts(:q) and path = 'asgi_lifespan-test-httpx.md' ``` The `and path = 'asgi_lifespan-test-httpx.md'` bit means we only get back a specific document - but the snippet highlighting is applied to it. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747030964 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747030964 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzMDk2NA== | simonw 9599 | 2020-12-16T21:14:54Z | 2020-12-16T21:14:54Z | MEMBER | To do this I'll need the search term to be passed to the `display_sql` SQL query: https://github.com/dogsheep/dogsheep-beta/blob/4890ec87b5e2ec48940f32c9ad1f5aae25c75a4d/dogsheep_beta/__init__.py#L164-L171 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747031608 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747031608 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzMTYwOA== | simonw 9599 | 2020-12-16T21:15:18Z | 2020-12-16T21:15:18Z | MEMBER | Should I pass any other details to the `display_sql` here as well? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
747034481 | https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747034481 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 | MDEyOklzc3VlQ29tbWVudDc0NzAzNDQ4MQ== | simonw 9599 | 2020-12-16T21:17:05Z | 2020-12-16T21:17:05Z | MEMBER | I'm just going to add `q` for the moment. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add search highlighting snippets 724759588 | |
684250044 | https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-684250044 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 | MDEyOklzc3VlQ29tbWVudDY4NDI1MDA0NA== | simonw 9599 | 2020-09-01T05:01:09Z | 2020-09-01T05:01:23Z | MEMBER | Maybe this starts out as a custom templated canned query. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Datasette plugin to provide custom page for running faceted, ranked searches 689810340 | |
685961809 | https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-685961809 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 | MDEyOklzc3VlQ29tbWVudDY4NTk2MTgwOQ== | simonw 9599 | 2020-09-02T19:54:24Z | 2020-09-02T19:54:24Z | MEMBER | This should implement search highlighting too, as seen on https://til.simonwillison.net/til/search?q=cloud <img width="1029" alt="TIL_search__cloud" src="https://user-images.githubusercontent.com/9599/92029959-32c6a300-ed1b-11ea-8b5e-971950980c38.png"> | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Datasette plugin to provide custom page for running faceted, ranked searches 689810340 | |
686689612 | https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-686689612 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 | MDEyOklzc3VlQ29tbWVudDY4NjY4OTYxMg== | simonw 9599 | 2020-09-03T18:44:20Z | 2020-09-03T18:44:20Z | MEMBER | Facets are now displayed but selecting them doesn't work yet. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Datasette plugin to provide custom page for running faceted, ranked searches 689810340 | |
748426501 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426501 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjUwMQ== | simonw 9599 | 2020-12-19T06:12:22Z | 2020-12-19T06:12:22Z | MEMBER | I deliberately added support for advanced FTS in https://github.com/dogsheep/dogsheep-beta/commit/cbb2491b85d7ff416d6d429b60109e6c2d6d50b9 for #13 but that's the cause of this bug. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426581 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426581 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjU4MQ== | simonw 9599 | 2020-12-19T06:13:17Z | 2020-12-19T06:13:17Z | MEMBER | One fix for this could be to try running the raw query, but if it throws an error run it again with the query escaped. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426663 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426663 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjY2Mw== | simonw 9599 | 2020-12-19T06:14:06Z | 2020-12-19T06:14:06Z | MEMBER | Looks like I already do that here: https://github.com/dogsheep/dogsheep-beta/blob/9ba4401017ac24ffa3bc1db38e0910ea49de7616/dogsheep_beta/__init__.py#L141-L146 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
748426877 | https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426877 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 | MDEyOklzc3VlQ29tbWVudDc0ODQyNjg3Nw== | simonw 9599 | 2020-12-19T06:16:11Z | 2020-12-19T06:16:11Z | MEMBER | Here's why: if "fts5" in str(e): But the error being raised here is: sqlite3.OperationalError: no such column: to I'm going to attempt the escaped on on every error. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Searching for "github-to-sqlite" throws an error 771316301 | |
684395444 | https://github.com/dogsheep/dogsheep-beta/issues/4#issuecomment-684395444 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/4 | MDEyOklzc3VlQ29tbWVudDY4NDM5NTQ0NA== | simonw 9599 | 2020-09-01T06:00:03Z | 2020-09-01T06:00:03Z | MEMBER | I ran `sqlite-utils optimize beta.db` against my test DB and the size reduced from 183M to 176M - and a 450ms search ran in 359ms. So not a huge improvement but still worthwhile. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Optimize the FTS table 689839399 | |
686689366 | https://github.com/dogsheep/dogsheep-beta/issues/5#issuecomment-686689366 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/5 | MDEyOklzc3VlQ29tbWVudDY4NjY4OTM2Ng== | simonw 9599 | 2020-09-03T18:43:50Z | 2020-09-03T18:43:50Z | MEMBER | No longer needed thanks to #9 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Add a context column that's not searchable 689847361 | |
685895540 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685895540 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTg5NTU0MA== | simonw 9599 | 2020-09-02T17:46:44Z | 2020-09-02T17:46:44Z | MEMBER | Some opet questions about this: - Should I restrict to two exclusive categories here, or should I have a generic category mechanism that can be expanded to more than two? - Should an item be able to exist in more than one category? Do I want to be able to mark an indexed item as both by-me and liked-by-me for example? This question is more interesting if the number of categories is greater than two. - How should this be modeled? Single column, multiple boolean columns, JSON array, m2m against separate table? - What's the best way to make this performant | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685962280 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685962280 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTk2MjI4MA== | simonw 9599 | 2020-09-02T19:55:26Z | 2020-09-02T19:59:58Z | MEMBER | Relevant: https://charlesleifer.com/blog/a-tour-of-tagging-schemas-many-to-many-bitmaps-and-more/ SQLite supports bitwise operators Binary AND (&) and Binary OR (|) - I could try those. Not sure how they interact with indexes though. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685965516 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685965516 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTk2NTUxNg== | simonw 9599 | 2020-09-02T20:01:54Z | 2020-09-02T20:01:54Z | MEMBER | Relevant post: https://sqlite.org/forum/forumpost/9f06fedaa5 - drh says: > Indexes are one-to-one. There is one entry in the index for each row in the table. > > You are asking for an index that is many-to-one - multiple index entries for each table row. > > A Full-Text Index is basically a many-to-one index. So if all of your array entries really are words, you could probably get this to work using a Full-Text Index. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685966361 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685966361 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTk2NjM2MQ== | simonw 9599 | 2020-09-02T20:03:29Z | 2020-09-02T20:03:41Z | MEMBER | I'm going to implement the first version of this as an indexed integer `category` column which has 1 for "about me" and 2 for "liked by me" - and space for other category numerals in the future, albeit a row can only belong to one category. I'll think about a full tagging system separately. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685966707 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685966707 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTk2NjcwNw== | simonw 9599 | 2020-09-02T20:04:08Z | 2020-09-02T20:04:08Z | MEMBER | I'll make `category` a foreign key to a `categories` table so Datasette can automatically show the `name` column. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685970384 | https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685970384 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 | MDEyOklzc3VlQ29tbWVudDY4NTk3MDM4NA== | simonw 9599 | 2020-09-02T20:11:41Z | 2020-09-02T20:11:59Z | MEMBER | Default categories: - 1 = created - 2 = saved | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for differentiating between "by me" and "liked by me" 691265198 | |
685960072 | https://github.com/dogsheep/dogsheep-beta/issues/8#issuecomment-685960072 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/8 | MDEyOklzc3VlQ29tbWVudDY4NTk2MDA3Mg== | simonw 9599 | 2020-09-02T19:50:47Z | 2020-09-02T19:50:47Z | MEMBER | This doesn't actually help, because the Datasette table view page doesn't then support adding the `where search_index_fts match :query` bit. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Create a view for running faceted searches 691369691 | |
686153967 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686153967 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjE1Mzk2Nw== | simonw 9599 | 2020-09-03T00:17:16Z | 2020-09-03T00:17:55Z | MEMBER | Maybe I can take advantage of https://sqlite.org/np1queryprob.html here - I could define a SQL query for fetching the "display" version of each item, and include a Jinja template fragment in the configuration as well. Maybe something like this: ```yaml photos.db: photos_with_apple_metadata: sql: |- select sha256 as key, 'Photo in ' || coalesce(place_city, 'unknown') as title, ( select group_concat(normalized_string, ' ') from labels where labels.uuid = photos_with_apple_metadata.uuid ) as search_1, date as timestamp, 1 as category from photos_with_apple_metadata display_sql: |- select sha256, place_city, date from photos_with_apple_metadata where sha256 = :key display: |- <img src="https://photos.simonwillison.net/i/{{ display.sha256 }}.jpeg?w=600"> <p>Taken in {{ display.place_city }} on {{ display.date }}</p> ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686154486 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686154486 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjE1NDQ4Ng== | simonw 9599 | 2020-09-03T00:18:54Z | 2020-09-03T00:18:54Z | MEMBER | `display_sql` could be optional. If it's not defined, a `row` object is passed to the template which is the row that's stored in `search_index`. If `display_sql` IS defined then it's executed and the result is made available as a `display` object in addition to the `row` object. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686154627 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686154627 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjE1NDYyNw== | simonw 9599 | 2020-09-03T00:19:22Z | 2020-09-03T00:19:22Z | MEMBER | If this performs well enough (100 displayed items will be 100 extra `display_sql` calls) then I'll go with this as the design for the feature. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686158454 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686158454 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjE1ODQ1NA== | simonw 9599 | 2020-09-03T00:32:42Z | 2020-09-03T00:32:42Z | MEMBER | If this turns out to be too inefficient I could add a `display` text column to the `search_index` table which is designed to be populated with arbitrary JSON by the indexing query, which can then be used to render the template fragment. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686163754 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686163754 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjE2Mzc1NA== | simonw 9599 | 2020-09-03T00:46:21Z | 2020-09-03T00:46:21Z | MEMBER | Challenge: the `dogsheep-beta.yml` configuration file that is passed to the `dogsheep-beta index` command needs to also be made available to Datasette itself, so that it can read the configuration. Let's say it can either be duplicated in the `plugins` configuration block of the `metadata.yml` OR you can do this in `metadata.yml`: ```yaml plugins: dogsheep-beta: config_file: dogsheep-beta.yml ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686688963 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686688963 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjY4ODk2Mw== | simonw 9599 | 2020-09-03T18:42:59Z | 2020-09-03T18:42:59Z | MEMBER | I'm pleased with how this works now. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686689122 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686689122 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4NjY4OTEyMg== | simonw 9599 | 2020-09-03T18:43:20Z | 2020-09-03T18:43:20Z | MEMBER | Needs documentation. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
686767208 | https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686767208 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 | MDEyOklzc3VlQ29tbWVudDY4Njc2NzIwOA== | simonw 9599 | 2020-09-03T21:12:14Z | 2020-09-03T21:12:14Z | MEMBER | Documentation: https://github.com/dogsheep/dogsheep-beta/blob/0.4/README.md#custom-results-display | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Mechanism for defining custom display of results 691521965 | |
623193947 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623193947 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5Mzk0Nw== | simonw 9599 | 2020-05-03T22:36:17Z | 2020-05-03T22:36:17Z | MEMBER | I'm going to use [osxphotos](https://github.com/RhetTbull/osxphotos) for this. Since I've already got code to upload photos and insert them into a table based on their `sha256` hash, my first go at this will be to import data using the tool and foreign-key it to the `sha256` hash in the existing table. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623195197 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623195197 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5NTE5Nw== | simonw 9599 | 2020-05-03T22:44:33Z | 2020-05-03T22:44:33Z | MEMBER | Command will be this: $ photos-to-sqlite apple-photos photos.db This will populate a `apple_photos` table with the data imported by the `osxphotos` library, plus the calculated sha256. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623198653 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623198653 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5ODY1Mw== | simonw 9599 | 2020-05-03T23:09:57Z | 2020-05-03T23:09:57Z | MEMBER | For locations: I'll add `place_x` columns for all of these: ``` (Pdb) photo.place.address._asdict() {'street': None, 'sub_locality': None, 'city': 'Loreto', 'sub_administrative_area': 'Loreto', 'state_province': 'BCS', 'postal_code': None, 'country': 'Mexico', 'iso_country_code': 'MX'} ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623198986 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623198986 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5ODk4Ng== | simonw 9599 | 2020-05-03T23:12:31Z | 2020-05-03T23:12:46Z | MEMBER | To get the taken date in UTC: ``` from datetime import timezone (Pdb) photo.date.astimezone(timezone.utc).isoformat() '2018-02-13T20:21:31.620000+00:00' (Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".") ['2018-02-13T20:21:31', '620000+00:00'] (Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".")[0] '2018-02-13T20:21:31' (Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".")[0] + "+00:00" '2018-02-13T20:21:31+00:00' ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623199214 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199214 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5OTIxNA== | simonw 9599 | 2020-05-03T23:14:08Z | 2020-05-03T23:14:08Z | MEMBER | Albums have UUIDs: ``` (Pdb) photo.album_info[0].__dict__ {'_uuid': '17816791-ABF3-447B-942C-9FA8065EEBBA', '_db': osxphotos.PhotosDB(dbfile='/Users/simon/Pictures/Photos Library.photoslibrary/database/photos.db'), '_title': 'Geotaggable Photos geotagged'} ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623199701 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199701 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5OTcwMQ== | simonw 9599 | 2020-05-03T23:17:38Z | 2020-05-03T23:17:38Z | MEMBER | Record burst_uuid as a column: ``` (Pdb) with_bursts[0]._info["burstUUID"] '703FAA23-57BF-40B4-8A33-D9CEB143391B' ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623199750 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199750 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzE5OTc1MA== | simonw 9599 | 2020-05-03T23:17:58Z | 2020-05-03T23:17:58Z | MEMBER | Reading this source code is really useful for figuring out how to store a photo in a DB table: https://github.com/RhetTbull/osxphotos/blob/7444b6d173918a3ad2a07aefce5ecf054786c787/osxphotos/photoinfo.py | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
623232984 | https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623232984 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 | MDEyOklzc3VlQ29tbWVudDYyMzIzMjk4NA== | simonw 9599 | 2020-05-04T02:41:32Z | 2020-05-04T02:41:32Z | MEMBER | Needs documentation. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import photo metadata from Apple Photos into SQLite 602533300 | |
618796564 | https://github.com/dogsheep/dogsheep-photos/issues/12#issuecomment-618796564 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/12 | MDEyOklzc3VlQ29tbWVudDYxODc5NjU2NA== | simonw 9599 | 2020-04-24T04:35:25Z | 2020-04-24T04:35:25Z | MEMBER | Code: https://github.com/dogsheep/photos-to-sqlite/blob/a388cf1f1b6b67752d669466cda8b171b6582171/photos_to_sqlite/cli.py#L109-L114 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | If less than 500MB, show size in MB not GB 606033104 | |
620273692 | https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620273692 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13 | MDEyOklzc3VlQ29tbWVudDYyMDI3MzY5Mg== | simonw 9599 | 2020-04-27T22:42:50Z | 2020-04-27T22:42:50Z | MEMBER | ``` >>> def ext_counts(directory): ... counts = {} ... for path in pathlib.Path(directory).glob("**/*"): ... ext = path.suffix ... counts[ext] = counts.get(ext, 0) + 1 ... return counts ... >>> >>> ext_counts("/Users/simon/Pictures/Photos Library.photoslibrary/originals") {'': 16, '.heic': 15478, '.jpeg': 21691, '.mov': 946, '.png': 2262, '.gif': 38, '.mp4': 116, '.aae': 2} ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Also upload movie files 607888367 | |
620309185 | https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620309185 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13 | MDEyOklzc3VlQ29tbWVudDYyMDMwOTE4NQ== | simonw 9599 | 2020-04-28T00:39:45Z | 2020-04-28T00:39:45Z | MEMBER | I'm going to leave this until I have the mechanism for associating a live photo video with the photo. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Also upload movie files 607888367 | |
620769348 | https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620769348 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 | MDEyOklzc3VlQ29tbWVudDYyMDc2OTM0OA== | simonw 9599 | 2020-04-28T18:09:21Z | 2020-04-28T18:09:21Z | MEMBER | Pricing is pretty good: free for first 1,000 calls per month, then $1.50 per thousand after that. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Annotate photos using the Google Cloud Vision API 608512747 | |
620771067 | https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771067 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 | MDEyOklzc3VlQ29tbWVudDYyMDc3MTA2Nw== | simonw 9599 | 2020-04-28T18:12:34Z | 2020-04-28T18:15:38Z | MEMBER | Python library docs: https://googleapis.dev/python/vision/latest/index.html I'm creating a new project for this called simonwillison-photos: https://console.cloud.google.com/projectcreate https://console.cloud.google.com/home/dashboard?project=simonwillison-photos Then I enabled the Vision API. The direct link to https://console.cloud.google.com/flows/enableapi?apiid=vision-json.googleapis.com which they provided in the docs didn't work - it gave me a "You don't have sufficient permissions to use the requested API" error - but starting at the "Enable APIs" page and searching for it worked fine. I created a new service account as an "owner" of that project: https://console.cloud.google.com/apis/credentials/serviceaccountkey (and complained about it on Twitter and through their feedback form) `pip install google-cloud-vision` ```python from google.cloud import vision client = vision.ImageAnnotatorClient.from_service_account_file("simonwillison-photos-18c570b301fe.json") # Photo of a lemur response = client.annotate_image( { "image": { "source": { "image_uri": "https://photos.simonwillison.net/i/1b3414ee9ade67ce04ade9042e6d4b433d1e523c9a16af17f490e2c0a619755b.jpeg" } }, "features": [ {"type": vision.enums.Feature.Type.IMAGE_PROPERTIES}, {"type": vision.enums.Feature.Type.OBJECT_LOCALIZATION}, {"type": vision.enums.Feature.Type.LABEL_DETECTION}, ], } ) response ``` Output is: ``` label_annotations { mid: "/m/09686" description: "Vertebrate" score: 0.9851104021072388 topicality: 0.9851104021072388 } label_annotations { mid: "/m/04rky" description: "Mammal" score: 0.975814163684845 topicality: 0.975814163684845 } label_annotations { mid: "/m/01280g" description: "Wildlife" score: 0.8973650336265564 topicality: 0.8973650336265564 } label_annotations { mid: "/m/02f9pk" description: "Lemur" score: 0.8270352482795715 … | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Annotate photos using the Google Cloud Vision API 608512747 | |
620771698 | https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771698 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 | MDEyOklzc3VlQ29tbWVudDYyMDc3MTY5OA== | simonw 9599 | 2020-04-28T18:13:48Z | 2020-04-28T18:13:48Z | MEMBER | For face detection: ``` {"type": vision.enums.Feature.Type.Type.FACE_DETECTION} ``` For OCR: ``` {"type": vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION} ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Annotate photos using the Google Cloud Vision API 608512747 | |
620772190 | https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620772190 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 | MDEyOklzc3VlQ29tbWVudDYyMDc3MjE5MA== | simonw 9599 | 2020-04-28T18:14:43Z | 2020-04-28T18:14:43Z | MEMBER | Database schema for this will require some thought. Just dumping the output into a JSON column isn't going to be flexible enough - I want to be able to FTS against labels and OCR text, and potentially query against other characteristics too. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Annotate photos using the Google Cloud Vision API 608512747 | |
620774507 | https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620774507 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 | MDEyOklzc3VlQ29tbWVudDYyMDc3NDUwNw== | simonw 9599 | 2020-04-28T18:19:06Z | 2020-04-28T18:19:06Z | MEMBER | The default timeout is a bit aggressive and sometimes failed for me if my resizing proxy took too long to fetch and resize the image. `client.annotate_image(..., timeout=3.0)` may be worth trying. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Annotate photos using the Google Cloud Vision API 608512747 | |
623723026 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623723026 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDYyMzcyMzAyNg== | simonw 9599 | 2020-05-04T21:41:30Z | 2020-05-04T21:41:30Z | MEMBER | I'm going to put these in a table called `apple_photos_scores` - I'll also pull in the following columns from the `ZGENERICASSET` table: * `ZOVERALLAESTHETICSCORE` * `ZCURATIONSCORE` * `ZHIGHLIGHTVISIBILITYSCORE` * `ZPROMOTIONSCORE` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
623723687 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623723687 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDYyMzcyMzY4Nw== | simonw 9599 | 2020-05-04T21:43:06Z | 2020-05-04T21:43:06Z | MEMBER | It looks like I can map the photos I'm importing to these tables using the `ZUUID` column on `ZGENERICASSET` to get a `Z_PK` which then maps to the rows in `ZGENERICASSET`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
623730934 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623730934 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDYyMzczMDkzNA== | simonw 9599 | 2020-05-04T22:00:38Z | 2020-05-04T22:00:48Z | MEMBER | Here's the query to create the new table: ```sql create table apple_photos_scores as select ZGENERICASSET.ZUUID, ZGENERICASSET.ZOVERALLAESTHETICSCORE, ZGENERICASSET.ZCURATIONSCORE, ZGENERICASSET.ZPROMOTIONSCORE, ZGENERICASSET.ZHIGHLIGHTVISIBILITYSCORE, ZCOMPUTEDASSETATTRIBUTES.ZBEHAVIORALSCORE, ZCOMPUTEDASSETATTRIBUTES.ZFAILURESCORE, ZCOMPUTEDASSETATTRIBUTES.ZHARMONIOUSCOLORSCORE, ZCOMPUTEDASSETATTRIBUTES.ZIMMERSIVENESSSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTERACTIONSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTERESTINGSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZINTRUSIVEOBJECTPRESENCESCORE, ZCOMPUTEDASSETATTRIBUTES.ZLIVELYCOLORSCORE, ZCOMPUTEDASSETATTRIBUTES.ZLOWLIGHT, ZCOMPUTEDASSETATTRIBUTES.ZNOISESCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCAMERATILTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCOMPOSITIONSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTLIGHTINGSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPATTERNSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPERSPECTIVESCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPOSTPROCESSINGSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTREFLECTIONSSCORE, ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTSYMMETRYSCORE, ZCOMPUTEDASSETATTRIBUTES.ZSHARPLYFOCUSEDSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZTASTEFULLYBLURREDSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLCHOSENSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLFRAMEDSUBJECTSCORE, ZCOMPUTEDASSETATTRIBUTES.ZWELLTIMEDSHOTSCORE from attached.ZGENERICASSET join attached.ZCOMPUTEDASSETATTRIBUTES on attached.ZGENERICASSET.Z_PK = attached.ZCOMPUTEDASSETATTRIBUTES.Z_PK; ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
623739934 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623739934 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDYyMzczOTkzNA== | simonw 9599 | 2020-05-04T22:24:26Z | 2020-05-04T22:24:26Z | MEMBER | Twitter thread with some examples of photos that are coming up from queries against these scores: https://twitter.com/simonw/status/1257434670750408705 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
748436115 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436115 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDc0ODQzNjExNQ== | nickvazz 8573886 | 2020-12-19T07:43:38Z | 2020-12-19T07:47:36Z | NONE | Hey Simon! I really enjoy datasette so far, just started trying it out today following your iPhone photos [example](https://simonwillison.net/2020/May/21/dogsheep-photos/). I am not sure if you had run into this or not, but it seems like they might have changed one of the column names from `ZGENERICASSET` to `ZASSET`. Should I open a PR? Would change: - [here](https://github.com/dogsheep/dogsheep-photos/blob/master/dogsheep_photos/cli.py#L209-L213) - [here](https://github.com/dogsheep/dogsheep-photos/blob/master/dogsheep_photos/cli.py#L238) - [here](https://github.com/dogsheep/dogsheep-photos/blob/master/dogsheep_photos/cli.py#L240) | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
748436779 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436779 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDc0ODQzNjc3OQ== | RhetTbull 41546558 | 2020-12-19T07:49:00Z | 2020-12-19T07:49:00Z | CONTRIBUTOR | @nickvazz ZGENERICASSET changed to ZASSET in Big Sur. Here's a list of other changes to the schema in Big Sur: https://github.com/RhetTbull/osxphotos/wiki/Changes-in-Photos-6---Big-Sur | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
748562288 | https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748562288 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 | MDEyOklzc3VlQ29tbWVudDc0ODU2MjI4OA== | RhetTbull 41546558 | 2020-12-20T04:44:22Z | 2020-12-20T04:44:22Z | CONTRIBUTOR | @nickvazz @simonw I opened a [PR](https://github.com/dogsheep/dogsheep-photos/pull/31) that replaces the SQL for `ZCOMPUTEDASSETATTRIBUTES` to use osxphotos which now exposes all this data and has been updated for Big Sur. I did regression tests to confirm the extracted data is identical, with one exception which should not affect operation: the old code pulled data from `ZCOMPUTEDASSETATTRIBUTES` for missing photos while the main loop ignores missing photos and does not add them to `apple_photos`. The new code does not add rows to the `apple_photos_scores` table for missing photos. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767 | |
623805823 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623805823 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgwNTgyMw== | simonw 9599 | 2020-05-05T02:45:56Z | 2020-05-05T02:45:56Z | MEMBER | I filed an issue with `osxphotos` about this here: https://github.com/RhetTbull/osxphotos/issues/121 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623806085 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806085 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgwNjA4NQ== | simonw 9599 | 2020-05-05T02:47:18Z | 2020-05-05T02:47:18Z | MEMBER | In https://github.com/RhetTbull/osxphotos/issues/121#issuecomment-623249263 Rhet Turnbull spotted a table called `ZSCENEIDENTIFIER` which looked like it might have the right data, but the columns in it aren't particularly helpful: ``` Z_PK,Z_ENT,Z_OPT,ZSCENEIDENTIFIER,ZASSETATTRIBUTES,ZCONFIDENCE 8,49,1,731,5,0.11834716796875 9,49,1,684,6,0.0233648251742125 10,49,1,1702,1,0.026153564453125 ``` I love the look of those confidence scores, but what do the numbers mean? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623806533 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806533 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgwNjUzMw== | simonw 9599 | 2020-05-05T02:50:16Z | 2020-05-05T02:50:16Z | MEMBER | I figured there must be a separate database that Photos uses to store the text of the identified labels. I used "Open Files and Ports" in Activity Monitor against the Photos app to try and spot candidates... and found `/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite` - a 53MB SQLite database file. <img width="1365" alt="Item-0_and_Item-0_and_Item-0_and_Item-0" src="https://user-images.githubusercontent.com/9599/81031213-61ea0800-8e40-11ea-8237-cfce4a5128e0.png"> Here's the schema of that file: ``` $ sqlite3 psi.sqlite .schema CREATE TABLE word_embedding(word TEXT, extended_word TEXT, score DOUBLE); CREATE INDEX word_embedding_index ON word_embedding(word); CREATE VIRTUAL TABLE word_embedding_prefix USING fts5(extended_word) /* word_embedding_prefix(extended_word) */; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_data'(id INTEGER PRIMARY KEY, block BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_content'(id INTEGER PRIMARY KEY, c0); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID; CREATE TABLE groups(category INT2, owning_groupid INT, content_string TEXT, normalized_string TEXT, lookup_identifier TEXT, token_ranges_0 INT8, token_ranges_1 INT8, UNIQUE(category, owning_groupid, content_string, lookup_identifier, token_ranges_0, token_ranges_1)); CREATE TABLE assets(uuid_0 INT, uuid_1 INT, creationDate INT, UNIQUE(uuid_0, uuid_1)); CREATE TABLE ga(groupid INT, assetid INT, PRIMARY KEY(groupid, assetid)); CREATE TABLE collections(uuid_0 INT, uuid_1 INT, startDate INT, endDate INT, title TEXT, subtitle TEXT, keyAssetUUID_0 INT, keyAssetUUID_1 INT, typeAndNumberOfAssets INT32, sortDate DOUBLE, UNIQUE(uuid_0, uuid_1)); CREATE TABLE gc(groupid INT, collectionid INT, PRIMARY KEY(groupid, collectionid)); CREATE… | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623806687 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806687 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgwNjY4Nw== | simonw 9599 | 2020-05-05T02:51:16Z | 2020-05-05T02:51:16Z | MEMBER | Running datasette against it directly doesn't work: ``` simon@Simons-MacBook-Pro search % datasette psi.sqlite Serve! files=('psi.sqlite',) (immutables=()) on port 8001 Usage: datasette serve [OPTIONS] [FILES]... Error: Connection to psi.sqlite failed check: no such tokenizer: PSITokenizer ``` Instead, I created a new SQLite database with a copy of some of the key tables, like this: ``` sqlite-utils rows psi.sqlite groups | sqlite-utils insert /tmp/search.db groups - sqlite-utils rows psi.sqlite assets | sqlite-utils insert /tmp/search.db assets - sqlite-utils rows psi.sqlite ga | sqlite-utils insert /tmp/search.db ga - sqlite-utils rows psi.sqlite collections | sqlite-utils insert /tmp/search.db collections - sqlite-utils rows psi.sqlite gc | sqlite-utils insert /tmp/search.db gc - sqlite-utils rows psi.sqlite lookup | sqlite-utils insert /tmp/search.db lookup - ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623807568 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623807568 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgwNzU2OA== | simonw 9599 | 2020-05-05T02:56:06Z | 2020-05-05T02:56:06Z | MEMBER | I'm pretty sure this is what I'm after. The `groups` table has what looks like identified labels in the rows with category = 2025: <img width="1122" alt="words__groups__2_528_rows_where_where_category___2025" src="https://user-images.githubusercontent.com/9599/81031361-e0df4080-8e40-11ea-9060-6d850aa52140.png"> Then there's a `ga` table that maps groups to assets: <img width="304" alt="words__ga__633_653_rows" src="https://user-images.githubusercontent.com/9599/81031387-f48aa700-8e40-11ea-9a3d-da23903be928.png"> And an `assets` table which looks like it has one row for every one of my photos: <img width="645" alt="words__assets__40_419_rows" src="https://user-images.githubusercontent.com/9599/81031402-04a28680-8e41-11ea-8047-e9199d068563.png"> One major challenge: these UUIDs are split into two integer numbers, `uuid_0` and `uuid_1` - but the main photos database uses regular UUIDs like this:  I need to figure out how to match up these two different UUID representations. I asked on Twitter if anyone has any ideas: https://twitter.com/simonw/status/1257500689019703296 | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623811131 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623811131 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzgxMTEzMQ== | simonw 9599 | 2020-05-05T03:16:18Z | 2020-05-05T03:16:18Z | MEMBER | Here's how to convert two integers unto a UUID using Java. Not sure if it's the solution I need though (or how to do the same thing in Python): https://repl.it/repls/EuphoricSomberClasslibrary <img width="1494" alt="Repl_it_-_EuphoricSomberClasslibrary" src="https://user-images.githubusercontent.com/9599/81032267-0d488c00-8e44-11ea-9be7-680eaccd1611.png"> ```java import java.util.UUID; class Main { public static void main(String[] args) { java.util.UUID uuid = new java.util.UUID( 2544182952487526660L, -3640314103732024685L ); System.out.println( uuid ); } } ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623845014 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623845014 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg0NTAxNA== | RhetTbull 41546558 | 2020-05-05T03:55:14Z | 2020-05-05T03:56:24Z | CONTRIBUTOR | I'm traveling w/o access to my Mac so can't help with any code right now. I suspected ZSCENEIDENTIFIER was a foreign key into one of these psi.sqlite tables. But looks like you're on to something connecting groups to assets. As for the UUID, I think there's two ints because each is 64-bits but UUIDs are 128-bits. Thus they need to be combined to get the 128 bit UUID. You might be able to use Apple's [NSUUID](https://developer.apple.com/documentation/foundation/nsuuid?language=objc), for example, by wrapping with pyObjC. Here's one [example](https://github.com/ronaldoussoren/pyobjc/blob/881c82a7ba90f193934b52b44143360c80dce5e5/pyobjc-framework-Cocoa/PyObjCTest/test_nsuuid.py) of using this in PyObjC's test suite. Interesting it's stored this way instead of a UUIDString as in Photos.sqlite. Perhaps it for faster indexing. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623846880 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623846880 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg0Njg4MA== | simonw 9599 | 2020-05-05T04:06:08Z | 2020-05-05T04:06:08Z | MEMBER | This function seems to convert them into UUIDs that match my photos: ```python def to_uuid(uuid_0, uuid_1): b = uuid_0.to_bytes(8, 'little', signed=True) + uuid_1.to_bytes(8, 'little', signed=True) return str(uuid.UUID(bytes=b)).upper() ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623855841 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855841 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg1NTg0MQ== | simonw 9599 | 2020-05-05T04:54:28Z | 2020-05-05T04:54:28Z | MEMBER | Things were not matching up for me correctly: <img width="1143" alt="search__select_json_object__img_src____https___photos_simonwillison_net_i______photos_sha256___________photos_ext______w_400___as_photo__groups_content_string__assets_uuid_0__assets_uuid_1__to_uuid_assets_uuid_0__assets_uuid_1__as_uuid__pho" src="https://user-images.githubusercontent.com/9599/81035923-ca8db080-8e51-11ea-95a7-6ee60bae7502.png"> I think that's because my import script didn't correctly import the existing `rowid` values. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623855885 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855885 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg1NTg4NQ== | simonw 9599 | 2020-05-05T04:54:39Z | 2020-05-05T04:54:53Z | MEMBER | Trying this import mechanism instead: `sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite .dump | grep -v 'CREATE INDEX' | grep -v 'CREATE TRIGGER' | grep -v 'CREATE VIRTUAL TABLE' | sqlite3 search.db` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623857417 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623857417 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg1NzQxNw== | simonw 9599 | 2020-05-05T05:01:47Z | 2020-05-05T05:01:47Z | MEMBER | Even that didn't work - it didn't copy across the rowid values. I'm pretty sure that's what's wrong here: ``` sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite 'select rowid, uuid_0, uuid_1 from assets limit 10' 1619605|-9205353363298198838|4814875488794983828 1641378|-9205348195631362269|390804289838822030 1634974|-9205331524553603243|-3834026796261633148 1619083|-9205326176986145401|7563404215614709654 22131|-9205315724827218763|8370531509591906734 1645633|-9205247376092758131|-1311540150497601346 ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623863902 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623863902 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg2MzkwMg== | simonw 9599 | 2020-05-05T05:31:53Z | 2020-05-05T05:31:53Z | MEMBER | Yes! Turning those `rowid` values into `id` with this script did the job: ```python import sqlite3 import sqlite_utils conn = sqlite3.connect( "/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite" ) def all_rows(table): result = conn.execute("select rowid as id, * from {}".format(table)) cols = [c[0] for c in result.description] for row in result.fetchall(): yield dict(zip(cols, row)) if __name__ == "__main__": db = sqlite_utils.Database("psi_copy.db") for table in ("assets", "collections", "ga", "gc", "groups"): db[table].upsert_all(all_rows(table), pk="id", alter=True) ``` Then I ran this query: ```sql select json_object('img_src', 'https://photos.simonwillison.net/i/' || photos.sha256 || '.' || photos.ext || '?w=400') as photo, group_concat(strip_null_chars(groups.content_string), ' ') as words, assets.uuid_0, assets.uuid_1, to_uuid(assets.uuid_0, assets.uuid_1) as uuid from assets join ga on assets.id = ga.assetid join groups on ga.groupid = groups.id join photos on photos.uuid = to_uuid(assets.uuid_0, assets.uuid_1) where groups.category = 2024 group by assets.id order by random() limit 10 ``` And got these results! <img width="1054" alt="psi_copy__select_json_object__img_src____https___photos_simonwillison_net_i______photos_sha256___________photos_ext______w_400___as_photo__group_concat_strip_null_chars_groups_content_string________as_words__assets_uuid_0__assets_uuid_1__to" src="https://user-images.githubusercontent.com/9599/81037264-f1021a80-8e56-11ea-9924-6f9f55a0fb4b.png"> | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
623865250 | https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623865250 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 | MDEyOklzc3VlQ29tbWVudDYyMzg2NTI1MA== | simonw 9599 | 2020-05-05T05:38:16Z | 2020-05-05T05:38:16Z | MEMBER | It looks like `groups.content_string` often has a null byte in it. I should clean this up as part of the import. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234 | |
624278090 | https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278090 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 | MDEyOklzc3VlQ29tbWVudDYyNDI3ODA5MA== | simonw 9599 | 2020-05-05T20:06:01Z | 2020-05-05T20:06:01Z | MEMBER | https://www.python.org/dev/peps/pep-0508/#environment-markers I think I want `sys_platform` of `darwin`. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Only install osxphotos if running on macOS 612860531 | |
624278714 | https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278714 | https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 | MDEyOklzc3VlQ29tbWVudDYyNDI3ODcxNA== | simonw 9599 | 2020-05-05T20:07:19Z | 2020-05-05T20:07:19Z | MEMBER | From https://hynek.me/articles/conditional-python-dependencies/ I think this will look like: ```python setup( # ... install_requires=[ # ... "osxphotos>=0.28.13 ; sys_platform=='darwin'", ] ) ``` | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | Only install osxphotos if running on macOS 612860531 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
updated_at (date) >30 ✖