github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030732909 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030732909 | IC_kwDOCGYnMM49b7xt | 9599 | 2022-02-06T01:47:06Z | 2022-02-06T01:47:06Z | OWNER | Here's an idea for an API design: ```python geojson_geometry = {} # ... GeoJSON goes here db["places"].insert( {"name": "Wales", "geometry": geojson_geometry}, geojson="geometry" ) ``` That `geojson=` parameter takes either a single column name or an iterable of column names. Any column in that list is expected to be a compatible `geometry` and the correct conversion functions will be applied. That solves for GeoJSON, but it's a bit ugly. Should I add `wkt=` and maybe even `kml=` and `gml=` and so-on too? Definitely not, that's way too many ugly and inscrutable new parameters. More importantly: if I want to support the following how would I do it? ```python db["places"].insert( {"name": "London", "point": (51.509865, -0.118092)} ) ``` Here I want to provide a `(latitude, longitude)` pair and have it inserted correctly into a `point` column. Could do this, but again it's messy: ```python db["places"].insert( {"name": "London", "point": (51.509865, -0.118092)}, point="point" ) ``` And again, what about those `(longitude, latitude)` people? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030735774 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030735774 | IC_kwDOCGYnMM49b8ee | 9599 | 2022-02-06T02:08:19Z | 2022-02-06T02:08:59Z | OWNER | Maybe I should leave this entirely up to documented patterns in the `conversions={}` dictionary? But even that's not ideal for the co-ordinate case. Consider the following: ```python db["places"].insert( {"name": "London", "point": (51.509865, -0.118092)}, conversions={"point": "GeomFromText(?, 4326)"}, ) ``` The challenge here is that the SpatiaLite function `GeomFromText()` expects a WKT string, which looks like this: POINT(-0.118092 51.509865) The existing `conversions=` mechanism doesn't support applying Python code to convert the `(lat, lon)` tuple to that value. It doesn't even support passing a Python tuple as a `?` parameter - so I don't think I could come up with a SQL string that would do the right thing here either. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736047 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030736047 | IC_kwDOCGYnMM49b8iv | 9599 | 2022-02-06T02:10:18Z | 2022-02-06T02:10:18Z | OWNER | So maybe back to that earlier idea where the code introspects the table, figures out that `"point"` is a geometry table of type POINT, then applies the necessary conversions to the raw Python data? That feels overly-complicated to me, especially since nothing else in the `.insert()` method currently relies on table introspection. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736589 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030736589 | IC_kwDOCGYnMM49b8rN | 9599 | 2022-02-06T02:14:52Z | 2022-02-06T02:14:52Z | OWNER | Another idea: introduce a helper function transform pattern, something a bit like this: ```python transformer = make_transformer({ "point": lambda pair: "POINT({} {})".format(pair[1], pair[0]) }) db["places"].insert_all( transformer([{"name": "London", "point": (51.509865, -0.118092)}]) conversions={"point": "GeomFromText(?, 4326)"}, ) ``` The `make_transformer(...)` function builds an object that can work as a wrapping iterator, applying those transform functions to everything in the sequence that it wraps. So the above code would handle converting `(lat, lon)` to `POINT(lon lat)` - then the `conversions=` applies `GeomFromText`. Naming is a challenge here: `.transform()` and `.convert()` and `conversions=` all have existing meanings within the `sqlite-utils` Python library. It's also a bit of a messy way of solving this. It's not exactly a smooth API for inserting a bunch of lat/lon coordinate pairs! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736848 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030736848 | IC_kwDOCGYnMM49b8vQ | 9599 | 2022-02-06T02:17:35Z | 2022-02-06T02:17:35Z | OWNER | Note that GeoJSON itself uses `(longitude, latitude)` so I should probably stick to that order here too. https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.1 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030738023 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030738023 | IC_kwDOCGYnMM49b9Bn | 9599 | 2022-02-06T02:28:05Z | 2022-02-06T02:29:24Z | OWNER | Here's the definitive guide to `latitude, longitude` v.s. `longitude, latitude`: https://macwright.com/lonlat/ > Which is right? > > Neither. This is an opinion with no right answer. Geographical tradition favors lat, lon. Math and software prefer lon, lat. I asked on Twitter here: https://twitter.com/simonw/status/1490148001569906688 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030739566 | IC_kwDOCGYnMM49b9Zu | 9599 | 2022-02-06T02:45:25Z | 2022-02-06T02:50:27Z | OWNER | Another idea - my favourite option so far: ```python from sqlite_utils.utils import LongitudeLatitude db["places"].insert( { "name": "London", "point": (-0.118092, 51.509865) }, conversions={"point": LongitudeLatitude}, ) ``` Here `LongitudeLatitude` is a magical value which does TWO things: it sets up the `GeomFromText(?, 4326)` SQL function, and it handles converting the `(51.509865, -0.118092)` tuple into a `POINT({} {})` string. This would involve a change to the `conversions=` contract - where it usually expects a SQL string fragment, but it can also take an object which combines that SQL string fragment with a Python conversion function. Best of all... this resolves the `lat, lon` v.s. `lon, lat` dilemma because you can use `from sqlite_utils.utils import LongitudeLatitude` OR `from sqlite_utils.utils import LatitudeLongitude` depending on which you prefer! | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740570 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740570 | IC_kwDOCGYnMM49b9pa | 9599 | 2022-02-06T02:56:17Z | 2022-02-06T02:57:00Z | OWNER | Thinking about types. The type of the `conversions` parameter right now is a bit lazy: ```python conversions: Optional[dict] = None, ``` That becomes: ```python Optional[Dict[str, Union[str, Conversion]]] ``` Where `Conversion` is an abstract base class which expects implementations to have a `.sql() -> str` and a `.convert(value) -> str` method. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740653 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740653 | IC_kwDOCGYnMM49b9qt | 25778 | 2022-02-06T02:57:17Z | 2022-02-06T02:57:17Z | CONTRIBUTOR | I like the idea of having stock conversions you could import. I'd actually move them to a dedicated module (call it `sqlite_utils.conversions` or something), because it's different from other utilities. Maybe they even take configuration, or they're composable. ```python from sqlite_utils.conversions import LongitudeLatitude db["places"].insert( { "name": "London", "lng": -0.118092, "lat": 51.509865, }, conversions={"point": LongitudeLatitude("lng", "lat")}, ) ``` I would definitely use that for every CSV I get with lat/lng columns where I actually need GeoJSON. | {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740771 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740771 | IC_kwDOCGYnMM49b9sj | 9599 | 2022-02-06T02:58:29Z | 2022-02-06T02:58:29Z | OWNER | That example you have there is really neat - I like the idea that they can also be used to populate completely new columns that are derived from the other column inputs. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740826 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740826 | IC_kwDOCGYnMM49b9ta | 25778 | 2022-02-06T02:59:10Z | 2022-02-06T02:59:10Z | CONTRIBUTOR | All this said, I don't think it's unreasonable to point people to dedicated tools like `geojson-to-sqlite`. If I'm dealing with a bunch of GeoJSON or Shapefiles, I need to something to read those anyway (or I need to figure out virtual tables). But something like this might make it easier to build those libraries, or standardize the underlying parts. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740846 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740846 | IC_kwDOCGYnMM49b9tu | 9599 | 2022-02-06T02:59:21Z | 2022-02-06T02:59:21Z | OWNER | I wonder if there are any interesting non-geospatial canned conversions that it would be worth including? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740963 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030740963 | IC_kwDOCGYnMM49b9vj | 9599 | 2022-02-06T03:00:33Z | 2022-02-06T03:00:33Z | OWNER | Yeah, having this be a general purpose mechanism which has a few canned examples for handling geospatial stuff is a lot neater than having a mechanism for this that's exclusive to SpatiaLite. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030741289 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030741289 | IC_kwDOCGYnMM49b90p | 25778 | 2022-02-06T03:03:43Z | 2022-02-06T03:03:43Z | CONTRIBUTOR | > I wonder if there are any interesting non-geospatial canned conversions that it would be worth including? Off the top of my head: - Un-nesting JSON objects into columns - Splitting arrays - Normalizing dates and times - URL munging with `urlparse` - Converting strings to numbers Some of this is easy enough with SQL functions, some is easier in Python. Maybe that's where having pre-built classes gets really handy, because it saves you from thinking about which way it's implemented. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030807433 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030807433 | IC_kwDOCGYnMM49cN-J | 6025893 | 2022-02-06T10:54:09Z | 2022-02-06T10:54:09Z | NONE | > Interesting that some accept an SRID and others do not - presumably GeomFromGeoJSON() always uses SRID=4326? The ewtk/ewkb ones don't accept an SRID is because ewkt encodes the SRID in the string, so you would do this with a wkt string: `GeomFromText('POINT(529090 179645)', 27700)` but for ewkt it would be `GeomFromEWKT('SRID=27700;POINT(529090 179645)')` The specs for KML and GeoJSON specify a Coordinate Reference System for the format - https://datatracker.ietf.org/doc/html/rfc7946#section-4 - https://docs.opengeospatial.org/is/12-007r2/12-007r2.html#1274 GML can specify the SRID in the XML at feature level e.g: ``` <gml:Point srsName="EPSG:27700"> <gml:coordinates>529090, 179645</gml:coordinates> </gml:Point> ``` There's a few more obscure formats in there, but broadly I think it is safe to assume an SRID param exists on the function for cases where the SRID is not implied by or specified in the input format. | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030871591 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030871591 | IC_kwDOCGYnMM49cdon | 9599 | 2022-02-06T16:57:22Z | 2022-02-06T16:57:22Z | OWNER | I wonder if I could implement the above such that this *also* works: ```python db["places"].insert( { "name": "London", "point": LongitudeLatitude(-0.118092, 51.509865) } ) ``` This feels like a very natural way to work with single inserts. The challenge is writing the code inside `.insert_all()` such that it can handle these special objects in the input column values in addition to them being passed in `conversions=`. I'm feeling very good about this direction in general though, it feels like it takes the existing but not particularly elegant `conversions=` mechanism and upgrades it to be far more useful, while maintaining backwards compatibility. | {"total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1} | 1124731464 | |
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030902158 | https://api.github.com/repos/simonw/sqlite-utils/issues/399 | 1030902158 | IC_kwDOCGYnMM49clGO | 9599 | 2022-02-06T19:53:54Z | 2022-02-06T19:53:54Z | OWNER | Moving the design of this new `Conversion` subclass mechanism to: - https://github.com/simonw/sqlite-utils/issues/402 | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} | 1124731464 |