issues
6 rows where repo = 248903544
This data as json, CSV (advanced)
Suggested facets: user, state, comments, author_association, type, created_at (date), updated_at (date), closed_at (date)
id ▼ | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
585526292 | MDU6SXNzdWU1ODU1MjYyOTI= | 1 | Set up full text search | simonw 9599 | closed | 0 | 1 | 2020-03-21T15:57:35Z | 2020-03-21T19:47:46Z | 2020-03-21T19:45:52Z | MEMBER | Should run against `title` and `text` in `items`, and `about` and `id` in `users`. | hacker-news-to-sqlite 248903544 | issue | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | completed | ||||||
952179830 | MDU6SXNzdWU5NTIxNzk4MzA= | 2 | Command for fetching Hacker News threads from the search API | simonw 9599 | open | 0 | 4 | 2021-07-25T02:00:45Z | 2021-07-25T03:12:57Z | MEMBER | I want to be able to fetch every item for a domain, e.g. https://news.ycombinator.com/from?site=simonwillison.net | hacker-news-to-sqlite 248903544 | issue | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | ||||||||
952189173 | MDU6SXNzdWU5NTIxODkxNzM= | 3 | Use HN algolia endpoint to retrieve trees | simonw 9599 | open | 0 | 3 | 2021-07-25T03:35:27Z | 2021-07-25T18:41:17Z | MEMBER | The `trees` command currently has to make a request for every single comment. Algolia have an endpoint that bundles the entire thread together into a single request. `https://hn.algolia.com/api/v1/items/ID` Here's an example that loads quickly, with about 50 comments: https://hn.algolia.com/api/v1/items/27941108 It doesn't appear to use pagination at all - if a thread is big then the response is big. I ran this search to find some stories with more than 1000 comments: https://hn.algolia.com/api/v1/search?tags=story&numericFilters=num_comments%3E=1000 Here's one: https://news.ycombinator.com/item?id=25015967 with 4759 comments. Hitting the API takes 41s and returns 3.7 MB of JSON! ``` wget 'https://hn.algolia.com/api/v1/items/25015967' 0.03s user 0.04s system 0% cpu 41.368 total /tmp % ls -lah 25015967 -rw-r--r-- 1 simon wheel 3.7M Jul 24 20:31 25015967 ``` | hacker-news-to-sqlite 248903544 | issue | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | ||||||||
1205867842 | I_kwDODtX3eM5H4BVC | 4 | Retrieve the top-level story for a comment | telotortium 1755789 | open | 0 | 0 | 2022-04-15T20:25:39Z | 2022-04-15T20:25:39Z | NONE | I think that each comment inserted into the database should include a column `onstory` that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after "on:" at the top of an HN comment page ([example](https://news.ycombinator.com/item?id=18358028)). We could do this either by directly retrieving the HTML page and using Beautiful Soup to find that link, or alternatively recurse up the tree in the Firebase API using the `parent` field (probably using `functools.lru_cache` in case a person has commented a bunch of times on the same story). | hacker-news-to-sqlite 248903544 | issue | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | ||||||||
1353418822 | PR_kwDODtX3eM497MOV | 5 | The program fails when the user has no submissions | fernand0 2467 | open | 0 | 0 | 2022-08-28T17:25:45Z | 2022-08-28T17:25:45Z | FIRST_TIME_CONTRIBUTOR | dogsheep/hacker-news-to-sqlite/pulls/5 | Tested with: hacker-news-to-sqlite user hacker-news.db fernand0 Result: ` Traceback (most recent call last): File "/home/ftricas/.pyenv/versions/3.10.6/bin/hacker-news-to-sqlite", line 8, in <module> sys.exit(cli()) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/hacker_news_to_sqlite/cli.py", line 27, in user submitted = user.pop("submitted", None) or [] AttributeError: 'NoneType' object has no attribute 'pop' ` There is a problem of style with the patch (but not sure what to do) because with the new inicialization ( submitted = []) the part or [] is not needed. Maybe there is a more adequate way of doing this. | hacker-news-to-sqlite 248903544 | pull | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 0 | ||||||
1641117021 | PR_kwDODtX3eM5M66op | 6 | Add permalink virtual field to items table | xavdid 1231935 | open | 0 | 1 | 2023-03-26T22:22:38Z | 2023-03-29T18:38:52Z | FIRST_TIME_CONTRIBUTOR | dogsheep/hacker-news-to-sqlite/pulls/6 | I added a virtual column (no storage overhead) to the output that easily links back to the source. It works nicely out of the box with datasette: ![](https://cdn.zappy.app/faf43661d539ee0fee02c0421de22d65.png) I got bit a bit by https://github.com/simonw/sqlite-utils/issues/411, so I went with a manual `table_xinfo` and creating the table via execute. Happy to adjust if that issue moves, but this seems like it works. I also added my best-guess instructions for local development on this package. I'm shooting in the dark, so feel free to replace with how you work on it locally. | hacker-news-to-sqlite 248903544 | pull | {"url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/6/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 0 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issues] ( [id] INTEGER PRIMARY KEY, [node_id] TEXT, [number] INTEGER, [title] TEXT, [user] INTEGER REFERENCES [users]([id]), [state] TEXT, [locked] INTEGER, [assignee] INTEGER REFERENCES [users]([id]), [milestone] INTEGER REFERENCES [milestones]([id]), [comments] INTEGER, [created_at] TEXT, [updated_at] TEXT, [closed_at] TEXT, [author_association] TEXT, [pull_request] TEXT, [body] TEXT, [repo] INTEGER REFERENCES [repos]([id]), [type] TEXT , [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT); CREATE INDEX [idx_issues_repo] ON [issues] ([repo]); CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]); CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]); CREATE INDEX [idx_issues_user] ON [issues] ([user]);