github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
779088071 | MDU6SXNzdWU3NzkwODgwNzE= | 54 | Archive import appears to be broken on recent exports | 21148 | open | 0 | 5 | 2021-01-05T14:18:01Z | 2023-01-04T11:06:55Z | CONTRIBUTOR | I requested a Twitter export yesterday, and unfortunately they seem to have changed it such that `twitter-to-sqlite import` can't handle it anymore 😢 So far I've ran into two issues. The first was easy to work around, but the second will take more investigation. If I can find the time I'll keep working on it and update this issue accordingly. The issues (so far): ### 1. Data seems to have moved to a `data/` subdirectory Running `twitter-to-sqlite import` on the raw zip file reports a bunch of "not yet implemented" errors, and then exits without actually importing anything: ``` ❯ twitter-to-sqlite import tarchive.db twitter.zip ... data/manifest: not yet implemented data/account-creation-ip: not yet implemented data/account-suspension: not yet implemented ... (dozens of more lines like this, including critical stuff like data/tweets) ... ``` (`tarchive.db` now exists, but is empty) Workaround: unpack the zip file, and run `twitter-to-sqlite import tarchive.db path/to/archive/data` That gets further, but: ### 2. Some schema(s?) have changed At least, the `blocks` schema seems different now: ``` ❯ twitter-to-sqlite import tarchive.db archive/data direct-messages-group: not yet implemented branch-links: not yet implemented periscope-expired-broadcasts: not yet implemented direct-messages: not yet implemented mute: not yet implemented Traceback (most recent call last): File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/bin/twitter-to-sqlite", line 8, in <module> sys.exit(cli()) File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN… | 206156866 | issue | {"url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/54/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | ||||||||
275814941 | MDU6SXNzdWUyNzU4MTQ5NDE= | 141 | datasette publish can fail if /tmp is on a different device | 21148 | closed | 0 | 2949431 | 5 | 2017-11-21T18:28:05Z | 2020-04-29T03:27:54Z | 2017-12-08T16:06:36Z | CONTRIBUTOR | `datasette publish` uses hard links to avoid copying the db into a tmp directory. This can fail if `/tmp` is on another device, because hardlinks can't cross devices. You'll see something like this: ``` $ datasette publish heroku whatever.db ... OSError: [Errno 18] Invalid cross-device link: '/mnt/c/Users/jacob/c/datasette/whatever.db' -> '/tmp/tmpvxq2yof6/whatever.db' ``` [In my case this is failing because I'm on a Windows machine, using WSL, so my code's on a different virtual filesystem from the Linux subsystem, Because Reasons.] I'm not sure if it's possible to detect this (can you figure out which device `/tmp` is on?), or what the fallback should be (soft link? copy?). | 107914493 | issue | {"url": "https://api.github.com/repos/simonw/datasette/issues/141/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | completed |