home / github

Menu
  • GraphQL API

issues

Table actions
  • GraphQL API for issues

3 rows where comments = 3, type = "issue" and user = 15178711

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

id ▼ node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
751195017 MDU6SXNzdWU3NTExOTUwMTc= 1111 Accessing a database's `.json` is slow for very large SQLite files asg017 15178711 open 0     3 2020-11-26T00:27:27Z 2021-01-04T19:57:53Z   CONTRIBUTOR   I have a SQLite DB that's pretty large, 23GB and something like 300 million rows. I expect that most queries I run on it will be slow, which is fine, but there are some things that Datasette does that makes working with the DB very slow. Specifically, when I access the `.json` metadata for a table (which I believe it comes from `datasette/views/database.py`, it takes 43 seconds for the request to come in: ```bash $ time curl localhost:9999/out.json {"database": "out", "size": 24291454976, "tables": [{"name": "PageviewsHour", "columns": ["file", "code", "page", "pageviews"], "primary_keys": [], "count": null, "hidden": false, "fts_table": null, "foreign_keys": {"incoming": [], "outgoing": [{"other_table": "PageviewsHourFiles", "column": "file", "other_column": "file_id"}]}, "private": false}, {"name": "PageviewsHourFiles", "columns": ["file_id", "filename", "sha256", "size", "day", "hour"], "primary_keys": ["file_id"], "count": null, "hidden": false, "fts_table": null, "foreign_keys": {"incoming": [{"other_table": "PageviewsHour", "column": "file_id", "other_column": "file"}], "outgoing": []}, "private": false}, {"name": "sqlite_sequence", "columns": ["name", "seq"], "primary_keys": [], "count": 1, "hidden": false, "fts_table": null, "foreign_keys": {"incoming": [], "outgoing": []}, "private": false}], "hidden_count": 0, "views": [], "queries": [], "private": false, "allow_execute_sql": true, "query_ms": 43340.23213386536} real 0m43.417s user 0m0.006s sys 0m0.016s ``` I suspect this is because a `COUNT(*)` is happening under the hood, which, when I run it through sqlite directly, does take around the same time: ```bash $ time sqlite3 out.db < <(echo "select count(*) from PageviewsHour;") 362794272 real 0m44.523s user 0m2.497s sys 0m6.703s ``` I'm using the `.json` request in the [Observable Datasette Client](https://observablehq.com/@asg017/datasette-client) to 1) verify that a link passed in is a reachable Datasette instance, and 2) a quick way to look at metadata for a db. A few differe… datasette 107914493 issue     {"url": "https://api.github.com/repos/simonw/datasette/issues/1111/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}    
1060631257 I_kwDOBm6k_c4_N_LZ 1528 Add new `"sql_file"` key to Canned Queries in metadata? asg017 15178711 open 0     3 2021-11-22T21:58:01Z 2022-06-10T03:23:08Z   CONTRIBUTOR   Currently for canned queries, you have to inline SQL in your `metadata.yaml` like so: ```yaml databases: fixtures: queries: neighborhood_search: sql: |- select neighborhood, facet_cities.name, state from facetable join facet_cities on facetable.city_id = facet_cities.id where neighborhood like '%' || :text || '%' order by neighborhood title: Search neighborhoods ``` This works fine, but for a few reasons, I usually have my canned queries already written in separate `.sql` files. I'd like to instead re-use those instead of re-writing it. So, I'd like to see a new `"sql_file"` key that works like so: `metadata.yaml`: ```yaml databases: fixtures: queries: neighborhood_search: sql_file: neighborhood_search.sql title: Search neighborhoods ``` `neighborhood_search.sql`: ```sql select neighborhood, facet_cities.name, state from facetable join facet_cities on facetable.city_id = facet_cities.id where neighborhood like '%' || :text || '%' order by neighborhood ``` Both of these would work in the exact same way, where Datasette would instead open + include `neighborhood_search.sql` on startup. A few reasons why I'd like to keep my canned queries SQL separate from metadata.yaml: - Keeping SQL in standalone SQL files means syntax highlighting and other text editor integrations in my code - Multiline strings in yaml, while functional, are a tad cumbersome and are hard to edit - Works well with other tools (can pipe `.sql` files into the `sqlite3` CLI, or use with other SQLite clients easier) - Typically my canned queries are quite long compared to everything else in my metadata.yaml, so I'd love to separate it where possible Let me know if this is a feature you'd like to see, I can try to send up a PR if this sounds right! datasette 107914493 issue     {"url": "https://api.github.com/repos/simonw/datasette/issues/1528/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}    
1865869205 I_kwDOBm6k_c5vNueV 2157 Proposal: Make the `_internal` database persistent, customizable, and hidden asg017 15178711 open 0     3 2023-08-24T20:54:29Z 2023-08-31T02:45:56Z   CONTRIBUTOR   The current `_internal` database is used by Datasette core to cache info about databases/tables/columns/foreign keys of databases in a Datasette instance. It's a temporary database created at startup, that can only be seen by the root user. See an [example `_internal` DB here](https://latest.datasette.io/_internal), after [logging in as root](https://latest.datasette.io/login-as-root). The current `_internal` database has a few rough edges: - It's part of `datasette.databases`, so many plugins have to specifically exclude `_internal` from their queries [examples here](https://github.com/search?q=datasette+hookimpl+%22_internal%22+language%3APython+-path%3Adatasette%2F&ref=opensearch&type=code) - It's only used by Datasette core and can't be used by plugins or 3rd parties - It's created from scratch at startup and stored in memory. Why is fine, the performance is great, but persistent storage would be nice. Additionally, it would be really nice if plugins could use this `_internal` database to store their own configuration, secrets, and settings. For example: - `datasette-auth-tokens` [creates a `_datasette_auth_tokens` table](https://github.com/simonw/datasette-auth-tokens/blob/main/datasette_auth_tokens/__init__.py#L15) to store auth token metadata. This could be moved into the `_internal` database to avoid writing to the gues database - `datasette-socrata` [creates a `socrata_imports`](https://github.com/simonw/datasette-socrata/blob/1409aa9b4d2fc3aff286b52e73af33b5786d56d0/datasette_socrata/__init__.py#L190-L198) table, which also can be in `_internal` - `datasette-upload-csvs` [creates a `_csv_progress_`](https://github.com/simonw/datasette-upload-csvs/blob/main/datasette_upload_csvs/__init__.py#L154) table, which can be in `_internal` - `datasette-write-ui` wants to have the ability for users to toggle whether a table appears editable, which can be either in `datasette.yaml` or on-the-fly by storing config in `_internal` In general, these are specific features that Datasette plugins wou… datasette 107914493 issue     {"url": "https://api.github.com/repos/simonw/datasette/issues/2157/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}    

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
Powered by Datasette · Queries took 119.357ms · About: simonw/datasette-graphql