{"id": 585526292, "node_id": "MDU6SXNzdWU1ODU1MjYyOTI=", "number": 1, "title": "Set up full text search", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-03-21T15:57:35Z", "updated_at": "2020-03-21T19:47:46Z", "closed_at": "2020-03-21T19:45:52Z", "author_association": "MEMBER", "pull_request": null, "body": "Should run against `title` and `text` in `items`, and `about` and `id` in `users`.", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/1/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 952179830, "node_id": "MDU6SXNzdWU5NTIxNzk4MzA=", "number": 2, "title": "Command for fetching Hacker News threads from the search API", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2021-07-25T02:00:45Z", "updated_at": "2021-07-25T03:12:57Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "I want to be able to fetch every item for a domain, e.g. https://news.ycombinator.com/from?site=simonwillison.net", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 952189173, "node_id": "MDU6SXNzdWU5NTIxODkxNzM=", "number": 3, "title": "Use HN algolia endpoint to retrieve trees", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2021-07-25T03:35:27Z", "updated_at": "2021-07-25T18:41:17Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "The `trees` command currently has to make a request for every single comment. Algolia have an endpoint that bundles the entire thread together into a single request.\r\n\r\n`https://hn.algolia.com/api/v1/items/ID`\r\n\r\nHere's an example that loads quickly, with about 50 comments: https://hn.algolia.com/api/v1/items/27941108\r\n\r\nIt doesn't appear to use pagination at all - if a thread is big then the response is big.\r\n\r\nI ran this search to find some stories with more than 1000 comments: https://hn.algolia.com/api/v1/search?tags=story&numericFilters=num_comments%3E=1000\r\n\r\nHere's one: https://news.ycombinator.com/item?id=25015967 with 4759 comments. Hitting the API takes 41s and returns 3.7 MB of JSON!\r\n```\r\nwget 'https://hn.algolia.com/api/v1/items/25015967' 0.03s user 0.04s system 0% cpu 41.368 total\r\n/tmp % ls -lah 25015967 \r\n-rw-r--r-- 1 simon wheel 3.7M Jul 24 20:31 25015967\r\n```", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1205867842, "node_id": "I_kwDODtX3eM5H4BVC", "number": 4, "title": "Retrieve the top-level story for a comment", "user": {"value": 1755789, "label": "telotortium"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-04-15T20:25:39Z", "updated_at": "2022-04-15T20:25:39Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I think that each comment inserted into the database should include a column `onstory` that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after \"on:\" at the top of an HN comment page ([example](https://news.ycombinator.com/item?id=18358028)). We could do this either by directly retrieving the HTML page and using Beautiful Soup to find that link, or alternatively recurse up the tree in the Firebase API using the `parent` field (probably using `functools.lru_cache` in case a person has commented a bunch of times on the same story).", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1353418822, "node_id": "PR_kwDODtX3eM497MOV", "number": 5, "title": "The program fails when the user has no submissions", "user": {"value": 2467, "label": "fernand0"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-08-28T17:25:45Z", "updated_at": "2022-08-28T17:25:45Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/hacker-news-to-sqlite/pulls/5", "body": "Tested with:\r\n \r\n hacker-news-to-sqlite user hacker-news.db fernand0\r\n\r\nResult:\r\n`\r\nTraceback (most recent call last):\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/bin/hacker-news-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1130, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1055, in main\r\n rv = self.invoke(ctx)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1657, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 760, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/hacker_news_to_sqlite/cli.py\", line 27, in user\r\n submitted = user.pop(\"submitted\", None) or []\r\nAttributeError: 'NoneType' object has no attribute 'pop'\r\n`\r\n\r\nThere is a problem of style with the patch (but not sure what to do) because with the new inicialization ( submitted = []) the part \r\n\r\n or []\r\n\r\nis not needed. Maybe there is a more adequate way of doing this.", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1641117021, "node_id": "PR_kwDODtX3eM5M66op", "number": 6, "title": "Add permalink virtual field to items table", "user": {"value": 1231935, "label": "xavdid"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-03-26T22:22:38Z", "updated_at": "2023-03-29T18:38:52Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/hacker-news-to-sqlite/pulls/6", "body": "I added a virtual column (no storage overhead) to the output that easily links back to the source. It works nicely out of the box with datasette:\r\n\r\n![](https://cdn.zappy.app/faf43661d539ee0fee02c0421de22d65.png)\r\n\r\nI got bit a bit by https://github.com/simonw/sqlite-utils/issues/411, so I went with a manual `table_xinfo` and creating the table via execute. Happy to adjust if that issue moves, but this seems like it works.\r\n\r\nI also added my best-guess instructions for local development on this package. I'm shooting in the dark, so feel free to replace with how you work on it locally.", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null}