geonetwork datadir checker useless ressources #11
jeanmi151 wants to merge 23 commits into georchestra:master
I know this is WIP/draft, but in its current state this is not a background task sent to celery at all, so if the GN datadir is huge the web page won't be sent to the client until the size has been calculated. It will get worse if we add more checks, for example checksumming all files and reporting duplicates. And if it's not a proper celery task like the others, there's no point in adding a card on the home template, which relies on fetching the result of the task from the redis backend.
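To make the cost concrete, here is a minimal sketch of the two checks mentioned above (total size, and duplicates by checksum). It is not the code from this PR: the function names `datadir_size` and `find_duplicates` are hypothetical, and the celery wiring is deliberately left out. Both functions walk the whole tree and the second one reads every byte of every file, which is why this kind of work belongs in a background task rather than in the request handler.

```python
import hashlib
import os
from collections import defaultdict


def datadir_size(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def find_duplicates(root, chunk_size=1 << 20):
    """Group regular files under root by SHA-256 digest.

    Returns a list of groups (sorted lists of paths) that share a digest.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path) or os.path.islink(path):
                continue
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Wrapped in a celery task, these functions would run in a worker and the page would only have to fetch the stored result.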
|
I've added a README in https://github.com/georchestra/gaia/tree/master/geordash/checks that tries to document how to add an async task via celery.
Force-pushed: c61110d to 9bf6ea5
Force-pushed: c52e6d8 to eb011cf
|
I can't figure out why I am getting this message on the home page. It comes from script.js, with this JSON result from http://localhost:8080/gaia/tasks/lastresultbytask/geordash.checks.gn_datadir.check_gn_meta?taskargs= @landryb any ideas?
The calls on the homepage assume that the job results come from a grouptask (which is the case for all other cards on the homepage), and thus loop over the results to accumulate the problem count in https://github.com/georchestra/gaia/blob/master/geordash/static/js/script.js#L23. In your case the task is a single task, so value is a dict and not an array of results, and the JS blows up.
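The shape mismatch can be illustrated with a small sketch. The real code is JavaScript in script.js; this is only a Python mirror of the same logic, and the `problems` key and the function name `count_problems` are assumptions made for the example, not the actual result format.

```python
def count_problems(value):
    """Count problems for either a single task result or a group result.

    Hypothetical shapes: a single task returns a dict such as
    {"problems": [...]}, while a group task returns a list of such dicts.
    """
    # Normalise the single-task case to a one-element list so that the
    # accumulation loop works for both shapes.
    results = value if isinstance(value, list) else [value]
    return sum(len(result.get("problems", [])) for result in results)
```

Either the homepage code accepts both shapes along these lines, or the new check is turned into a group task like the others.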
|
This is starting to look good ! i still see some hardcoded bits for the GN database name/host that should come from the new maybe another cosmetic nitpick, but the whole |
I can't make the filesizeformat function work, |
it should work with the jinja2 v3.1.2 in debian12.. and i dont think special imports are needed, eg if in |
|
All good for me, ready for reviews.
landryb left a comment:
I haven't tested the code at runtime yet...
    "dashboard", __name__, url_prefix="/gaia", template_folder="templates/dashboard"
)


def debug_only(f):
What is the use of this method?

It is a new decorator, used at line 78 on the /debug route: if debug is enabled the route answers with the /debug content, otherwise it returns a 404.

I would love to keep this debug route, it was super useful during development.
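For readers who have not seen the diff, the behaviour described above can be sketched as follows. This is a framework-free illustration, not the PR's code: `DEBUG` and `NotFound` are hypothetical stand-ins for the application's debug setting and for an HTTP 404 response (in a Flask app one would typically check `current_app.debug` and call `abort(404)` instead).

```python
import functools

# Hypothetical stand-in for the application's debug setting.
DEBUG = {"enabled": False}


class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 response."""


def debug_only(f):
    """Only run the wrapped view when debug is enabled, else signal a 404."""
    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        if not DEBUG["enabled"]:
            raise NotFound("debug routes are disabled")
        return f(*args, **kwargs)
    return wrapped


@debug_only
def debug_page():
    # Placeholder for the content served on the /debug route.
    return "debug content"
```

Checking the flag inside the wrapper rather than at decoration time means the route follows the setting as it is when the request arrives.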
        parser.read_file(lines)
        self.sections["geonetwork"] = parser["section"]

    def tostr(self):
I tend not to commit debug-only methods...

Super useful for developing :)
(and debugging ;) )
I would like to keep it.
|
finally looking at it, i think |
|
will check monday, but it blew at runtime, i guess it didnt find the database/schema: in my case iirc, geonetwork has its own database so the tables are in the |
afaict, this which is another hack ? i don't have it on my instance (probably because i use the default also, it should first use the oh and ... definition of the |
|
with the following diff, the job can apparently connect to the public schema of my geonetwork database (havent tried running the job yet), taking params from i had to comment out the collection name thing otherwise it blew with conflicts, and i dont remember why i had to use it for mapstore. |
|
Right, I took the wrong variable for the DB connection.
Force-pushed: c83ff8c to a6bc3b5
It seems I found a way to query the database correctly in my last commit.
Yeah, it works here now on 4.2.8. Now that I've finally been able to test it at runtime, I have a few more remarks: some to fix, some that would be welcome improvements, and some for future work? So here's a proper review:
|
okay
I will try to do that, but I am not very confident with front-end stuff
okay
I removed them
Like the total number of folders?
Okay, I will correct that
"it will only detect leftover folders from removed metadatas" --> yes, that is what we are aiming for here
We will keep that for future improvements I think, but it is a good idea



The aim of this PR is to add a checker that spots files which are no longer needed because geonetwork forgot to delete them.
It reads the database records (metadata table) and searches in /mnt/geonetwork_datadir/data/metadata_data/
(value taken from https://github.com/georchestra/datadir/blob/docker-master/geonetwork/geonetwork.properties#L2)
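
The comparison described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: it assumes that metadata_data contains range folders (for example 00000-00099) which in turn contain one folder per metadata, named after its numeric id, and that `known_ids` holds the ids read from the metadata table. The function name `find_leftover_dirs` is hypothetical.

```python
import os


def find_leftover_dirs(metadata_data_dir, known_ids):
    """Return per-metadata folders whose id is not in the database.

    Assumed layout: <metadata_data_dir>/<range folder>/<metadata id>/
    """
    known = {int(i) for i in known_ids}
    leftovers = []
    for range_name in sorted(os.listdir(metadata_data_dir)):
        range_path = os.path.join(metadata_data_dir, range_name)
        if not os.path.isdir(range_path):
            continue
        for entry in sorted(os.listdir(range_path)):
            entry_path = os.path.join(range_path, entry)
            # Only folders with a purely numeric name are treated as ids.
            if not (os.path.isdir(entry_path) and entry.isdecimal()):
                continue
            if int(entry) not in known:
                leftovers.append(entry_path)
    return leftovers
```

As noted in the review, a check of this kind only detects leftover folders from removed metadatas, not unused files inside folders that still belong to an existing record.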