issue_comments: 1264737290
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/485#issuecomment-1264737290 | https://api.github.com/repos/simonw/datasette/issues/485 | 1264737290 | IC_kwDOBm6k_c5LYlwK | 9599 | 2022-10-02T21:29:59Z | 2022-10-02T21:29:59Z | OWNER | To clarify: the feature this issue is talking about relates to the way Datasette automatically displays foreign key relationships, for example on this page: https://github-to-sqlite.dogsheep.net/github/commits <img width="1233" alt="image" src="https://user-images.githubusercontent.com/9599/193476985-d41148cf-2b2f-49b9-b717-e92145afab31.png"> Each of those columns is a foreign key to another table. The link text that is displayed there comes from the "label column" that has either been configured or automatically detected for that other table. I wonder if this could be handled with a tiny machine learning model that's trained to help pick the best label column? Inputs to that model could include: - The names of the columns - The number of unique values in each column - The type of each column (or maybe only `TEXT` columns should be considered) - How many `null` values there are - Is the column marked as unique? - What's the average (or median or some other statistic) string length of values in each column? Output would be the most likely label column, or some indicator that no likely candidates had been found. My hunch is that this would be better solved using a few extra heuristics rather than by training a model, but it does feel like an interesting opportunity to experiment with a tiny ML model. Asked for tips about this on Twitter: https://twitter.com/simonw/status/1576680930680262658 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 447469253 |
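
The "few extra heuristics" option mentioned in the comment could be sketched roughly as follows. This is a minimal illustration, not Datasette's actual label-column detection: the function names (`column_features`, `score`, `pick_label_column`), the `PREFERRED_NAMES` hints, the weights, and the `threshold` are all assumptions made for this example. It gathers the inputs listed in the comment (column name, number of unique values, declared type, null count, unique constraint, average string length) with plain `sqlite3` queries and scores only `TEXT` columns.

```python
import sqlite3

# Hypothetical name hints; an assumption for this sketch only.
PREFERRED_NAMES = ("name", "title")


def column_features(conn, table):
    """Collect the per-column statistics listed in the comment."""
    # Columns covered by a single-column UNIQUE index.
    unique_cols = set()
    for idx in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
        # idx = (seq, name, unique, origin, partial)
        if idx[2]:
            cols = conn.execute(f'PRAGMA index_info("{idx[1]}")').fetchall()
            if len(cols) == 1:
                unique_cols.add(cols[0][2])  # (seqno, cid, name)

    features = []
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    for _cid, name, col_type, _nn, _dflt, _pk in conn.execute(
        f'PRAGMA table_info("{table}")'
    ).fetchall():
        distinct, nulls, avg_len = conn.execute(
            f'SELECT COUNT(DISTINCT "{name}"), SUM("{name}" IS NULL), '
            f'AVG(LENGTH("{name}")) FROM "{table}"'
        ).fetchone()
        features.append({
            "name": name,
            "type": col_type,
            "distinct": distinct,
            "nulls": nulls or 0,   # SUM() is NULL on an empty table
            "avg_len": avg_len,    # None if every value is NULL
            "unique": name in unique_cols,
        })
    return features


def score(f, row_count):
    """Illustrative weights only; these are guesses, not tuned values."""
    if f["type"].upper() != "TEXT" or row_count == 0:
        return 0.0
    s = 0.0
    if f["name"].lower() in PREFERRED_NAMES:
        s += 2.0
    s += f["distinct"] / row_count   # reward mostly-unique values
    s -= f["nulls"] / row_count      # penalise missing values
    if f["unique"]:
        s += 0.5
    if f["avg_len"] is not None and f["avg_len"] > 80:
        s -= 1.0                     # long text is a poor link label
    return s


def pick_label_column(conn, table, threshold=1.0):
    """Return the best-scoring column name, or None if nothing qualifies."""
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    scored = [(score(f, row_count), f["name"])
              for f in column_features(conn, table)]
    best = max(scored, default=(0.0, None))
    return best[1] if best[0] >= threshold else None
```

Returning `None` below the threshold corresponds to the "no likely candidates had been found" output described in the comment. The same feature dictionaries could serve as training inputs if the tiny-model approach were tried instead.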