Conversation
In this PR you are using the Python interfaces to SQLite3 and PG. Have you tested the alternative using db.execute? The reason I am asking is that the GRASS db drivers have a lot of error handling that might be missing from the Python interfaces to the respective DB drivers. Using the Python interfaces could cause cryptic errors if something goes wrong that might be better explained by the GRASS db drivers.
Any general Python testing instructions should be applicable, like this or this. You should probably focus just on SQLite, because PostgreSQL is more difficult to set up for the tests (although it is possible). If working within the NC SPM location won't work for your tests, you can try to write a plain Python unittest test and use e.g. --exec to run GRASS (the testing framework is from the times before --exec, so it does not integrate with it well, but I already had to use this approach here).
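To illustrate the suggested route, here is a minimal plain-unittest sketch. The test data and class name are hypothetical, and it parses CSV text directly the way txt2numpy would, so it also runs outside a GRASS session; inside a session you would start it with something like `grass /path/to/location/mapset --exec python -m unittest test_txt2numpy`.

```python
import unittest
from io import BytesIO

import numpy as np


class TestTxt2Numpy(unittest.TestCase):
    """Plain unittest sketch (hypothetical names and sample data)."""

    def test_parses_csv(self):
        # Parse header + two rows of CSV, as txt2numpy would internally
        arr = np.genfromtxt(
            BytesIO(b"cat,value\n1,2.5\n2,3.5\n"),
            delimiter=",",
            names=True,
        )
        self.assertEqual(list(arr["cat"]), [1.0, 2.0])
        self.assertEqual(list(arr["value"]), [2.5, 3.5])
```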
I have not tested. The Python interfaces offer some additional functionality and direct translation between Python objects and database data types. DB handling in pygrass (where these functions are supposed to end up) also uses Python DB adapters. But it is a good point to double check that potential errors are caught properly!
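One way to double check is to wrap adapter calls so that errors are surfaced with context rather than left cryptic. The helper below is a hypothetical sketch (not pygrass API): it catches the adapter's own exception type and re-raises with the failing SQL attached.

```python
import sqlite3


def execute_checked(connection, sql, params=()):
    """Hypothetical helper: run a statement via the Python adapter and
    re-raise adapter errors with the failing SQL for context."""
    try:
        cursor = connection.cursor()
        cursor.execute(sql, params)
        connection.commit()
        return cursor
    except sqlite3.Error as err:
        raise RuntimeError(f"SQL failed: {sql!r}: {err}") from err


conn = sqlite3.connect(":memory:")
execute_checked(conn, "CREATE TABLE t (a INTEGER)")
execute_checked(conn, "INSERT INTO t(a) VALUES (?)", (1,))
```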
| """ | ||
| sql_to_dtype = { | ||
| "sqlite": { | ||
| "INTEGER": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10], |
Do these really need to be numbers and not things like numpy.int32?
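For comparison, a hypothetical variant of the sql_to_dtype mapping that maps SQL type names straight to NumPy dtypes instead of bare numbers (the entries shown are illustrative, not the PR's actual table):

```python
import numpy as np

# Hypothetical mapping: SQL type name -> NumPy dtype
sql_to_dtype = {
    "sqlite": {
        "INTEGER": np.int64,
        "REAL": np.float64,
        "TEXT": np.str_,
    }
}

# The dtype can then be used directly when allocating arrays
arr = np.zeros(3, dtype=sql_to_dtype["sqlite"]["INTEGER"])
```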
lib/python/pygrass/utils.py
Outdated
| insert_sql = "INSERT INTO {}({}) VALUES %s;".format( | ||
| table, | ||
| ", ".join(structured_array.dtype.names), | ||
| ",".join(["?"] * len(structured_array.dtype.names)), |
Use named placeholders such as {table} and table= when there is more than one item, to avoid confusion. Here there seems to be some confusion with %s and the third argument.
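A sketch of the suggested style: named str.format placeholders for the SQL identifiers plus "?" parameter markers for the SQLite values, which also removes the stray %s. The table and column names are illustrative.

```python
# Named placeholders make each argument's destination explicit
names = ("cat", "value")
insert_sql = "INSERT INTO {table}({columns}) VALUES ({placeholders});".format(
    table="mytable",
    columns=", ".join(names),
    placeholders=", ".join(["?"] * len(names)),
)
# insert_sql is now "INSERT INTO mytable(cat, value) VALUES (?, ?);"
```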
|
@ninsbl: would you mind rebasing this PR?
|
Fixed the bad merge by undoing that commit, doing a reset, and then merging upstream/main again. Now there aren't 5000+ files changed with 2000+ commits, but really only the single changed file with about 3 commits.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
wenzeslaus left a comment
@ninsbl Can you please review this older PR to see how/if your motivation for this changed over time?
The two functions are related based on the original motivation, but as they stand now they are independent and have different levels of complexity, so splitting the PR into two would help merge at least parts of this faster.
| if type(tablestring).__name__ == "str": | ||
| tablestring = grasscore.encode(tablestring, encoding=encoding) | ||
| elif type(tablestring).__name__ != "bytes": |
Why not test the types or instances (isinstance) instead of comparing type names as strings?
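What the isinstance version would look like, as a standalone sketch: it uses str.encode in place of grasscore.encode so it runs on its own, and the function name is hypothetical.

```python
def encode_table_string(tablestring, encoding="utf-8"):
    """Hypothetical sketch: isinstance states the intent directly
    and also accepts subclasses of str/bytes."""
    if isinstance(tablestring, str):
        return tablestring.encode(encoding)
    if not isinstance(tablestring, bytes):
        raise TypeError("tablestring must be str or bytes")
    return tablestring
```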
    if structured:
        kwargs["dtype"] = None

    return np.genfromtxt(BytesIO(tablestring), **kwargs)
What is the new function adding on top of np.genfromtxt, the encoding?
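For reference, the structured case can largely be reproduced with np.genfromtxt alone, including the encoding handling; the sample data here is made up.

```python
from io import BytesIO

import numpy as np

# Plain-NumPy equivalent of the structured txt2numpy path
text = b"cat,value\n1,2.5\n2,3.5\n"
arr = np.genfromtxt(
    BytesIO(text), delimiter=",", names=True, dtype=None, encoding="utf-8"
)
```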
def txt2numpy(
    tablestring,
    sep=",",
    names=None,
    null_value=None,
    fill_value=None,
    comments="#",
    usecols=None,
    encoding=None,
    structured=True,
):
Put this one in a separate PR. It seems much easier to get merged, and perhaps it could be integrated into grass.tools if really useful. Or maybe improvements in format=json and format=csv when used together with np.genfromtxt are a better route?
def numpy2table(
    np_array,
    table,
    connection,
    formats=None,
    names=False,
    column_prefix="column",
    update_formats=True,
    overwrite=True,
):
If the motivation is the same as the referenced https://trac.osgeo.org/grass/ticket/3639, wouldn't a tool which takes a CSV table and puts it into a database table be more fitting? Like db.in.ogr but specialized for CSV (db.in.table or db.in.csv).
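The core of such a db.in.csv-style tool would be quite small. The sketch below is hypothetical (function and argument names are made up, all columns are created as TEXT for simplicity, and column names are assumed safe to interpolate): read the CSV header and rows, create the table, insert the rows.

```python
import csv
import sqlite3
from io import StringIO


def csv_to_table(csv_text, table, connection):
    """Hypothetical sketch of a db.in.csv core: header -> columns,
    remaining rows -> parameterized inserts."""
    rows = list(csv.reader(StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f"{name} TEXT" for name in header)
    connection.execute(f"CREATE TABLE {table} ({columns})")
    placeholders = ", ".join(["?"] * len(header))
    connection.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})", data
    )
    connection.commit()


conn = sqlite3.connect(":memory:")
csv_to_table("cat,value\n1,2.5\n", "t", conn)
```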
This PR would add two new functions:
I would very much appreciate a thorough review of the two functions, especially the one writing to the DB.
Also, hints on how to properly add examples for doctest (or other unit tests) would be very welcome...