feat: Add support for HIF format#53
Conversation
|
Seems that the first phase is ready. |
| @@ -73,6 +73,10 @@ h1[5,2] = 6.5 | |||
| @test get_vertex_meta(h1, 1) == get_vertex_meta(loaded_hg, 1) | |||
| @test get_hyperedge_meta(h1, 2) == get_hyperedge_meta(loaded_hg, 2) | |||
|
|
|||
There was a problem hiding this comment.
Make unit tests also on data from the https://github.com/pszufe/HIF-standard/tree/main/tests/test_files/HIF-compliant from the HIF-standard libraries.
Make a copy of those files to test/data/HIF-standard folder in the package repository.
Provide in the above folder README.md explaining the files come from the above library.
Each of the files should be able to load and later save.
While saving provide code comparing both JSON outputs (parse both JSONs as Dicts and compare the Dicts)
There was a problem hiding this comment.
I understand we still do not have yet code that iterates over HIF test files.
There was a problem hiding this comment.
How to handle the fact, that in order to load such a file I need to explicitly pass the types of values therein?
There was a problem hiding this comment.
I. e. is manually loading each file separately the only way forward? I don't see how I can dynamically do it in a loop iterating over the directory contents
src/io.jl
Outdated
| _ = format | ||
|
|
||
| json_hg = Dict{Symbol, Any}() | ||
| incidences = [] |
There was a problem hiding this comment.
pls no untyped vectors /containers
https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-abstract-container
most likely this are something like Union{String, Int, Nothing} (I am not sure - please check what types are possible)
src/io.jl
Outdated
| he_meta = h.he_meta | ||
|
|
||
| if any(isnothing, h.v_meta) | ||
| v_meta = [i for i in 1:length(h.v_meta)] |
There was a problem hiding this comment.
what is the intention of this code?
SimpleHypegraphs supports both Nothing as type of metadata (in that case there is no metadata) as well as concrete types
There was a problem hiding this comment.
Yes, but as far as I understand, we cannot have Nothing in edge or node metadata in the HIF format. This is probably not the best take, but the idea was that if there are missing metadata points, then we need to fallback to the solution, where edge or node ids are indexes in the matrix
src/io.jl
Outdated
| edge_dict = Dict(i => val for (i, val) in pairs(he_meta)) | ||
|
|
||
|
|
||
| types = collect(typeof(h).parameters) |
There was a problem hiding this comment.
most likely yo wanted to have those in the method definition:
hg_save(io::IO, h::Hypergraph{T, V, E, D}, format::HIF_Format) where {T, V, E, D}
Check how parametric types work in Julia:
https://docs.julialang.org/en/v1/manual/types/
src/io.jl
Outdated
| incidences = [] | ||
| v_meta = h.v_meta | ||
| he_meta = h.he_meta | ||
|
|
There was a problem hiding this comment.
do not change type of a variable
https://docs.julialang.org/en/v1/manual/performance-tips/#Avoid-changing-the-type-of-a-variable
src/io.jl
Outdated
| E = types[3] | ||
|
|
||
| for node_idx in 1:length(v_meta) | ||
| for edge_idx in 1:length(he_meta) |
There was a problem hiding this comment.
use eachindex as this is the recommended syntax
https://discourse.julialang.org/t/best-practices/85138
src/io.jl
Outdated
| for node_idx in 1:length(v_meta) | ||
| for edge_idx in 1:length(he_meta) | ||
| node = node_dict[node_idx] | ||
| if V == String |
There was a problem hiding this comment.
dispatch on type (use multiple-dispatch property of the language)
BTW there is string(xs...) method - so string() works with any type.
src/io.jl
Outdated
|
|
||
| weight = h[node_idx, edge_idx] | ||
|
|
||
| if isnothing(weight) |
There was a problem hiding this comment.
iterating over an entire sparse matrix is extremely inefficient (see the nested loop above)
Use gethyperedges or getvertices methods.
src/io.jl
Outdated
| format::HIF_Format; | ||
| T::Type{U} = Bool, | ||
| D::Type{<:AbstractDict{Int, U}} = Dict{Int, T}, | ||
| V::Union{Type{String}, Type{Int}} = String, |
There was a problem hiding this comment.
since V and E are parametric the code should perform type casting when collecting their values from JSON.
So this should be V(val) or E(val). For string you will need to handle this separately as for the String target the string method should be called.
src/io.jl
Outdated
|
|
||
| data = JSON3.read(read(io, String)) | ||
|
|
||
| nodes = get(data, "nodes", []) |
There was a problem hiding this comment.
no untyped containers pls (see the comment above)
src/SimpleHypergraphs.jl
Outdated
| include("hypergraph.jl") | ||
| include("io.jl") | ||
|
|
||
| include("hif/hif.jl") |
There was a problem hiding this comment.
Let's just make those three files into one named io_hif.jl without creating the hif subfolder.
In Julia packages we do not try to keep one mehod per file
src/hif/hif_load.jl
Outdated
| T::Type{U}, | ||
| D::Type{<:AbstractDict{Int,U}}, | ||
| ) where {U<:Real} | ||
| node_metadata = Vector{Union{JSON3.Object, Nothing}}() |
There was a problem hiding this comment.
You do not want the user suddenly see some JSON3.Object just cast it to a String.
Additionally, the code could check if this is some more complex data structure and than make it into a Dict.
Perhaps what could help would be using second parameter for reading JSON3.read("test.json", Dict{String, Any})
src/hif/hif_load.jl
Outdated
| append!(edge_metadata, [nothing for _ in 1:k]) | ||
| end | ||
|
|
||
| return Hypergraph{T,JSON3.Object,JSON3.Object,D}(n, k, node_metadata, edge_metadata) |
There was a problem hiding this comment.
User wants to get as metadata parametric types something like String or Dict{String, String} for more complex metadata. I think we do not need to support deeper nesting than one level.
src/hif/hif_load.jl
Outdated
|
|
||
| function add_weights_from_incidences!( | ||
| h::Hypergraph, | ||
| incidences::JSON3.Array{JSON3.Object}, |
There was a problem hiding this comment.
those types will need to be updated once you sort out the way you read the data.
Also using just AbstractVector as function type argument in such places is good enough.
| @@ -73,6 +73,10 @@ h1[5,2] = 6.5 | |||
| @test get_vertex_meta(h1, 1) == get_vertex_meta(loaded_hg, 1) | |||
| @test get_hyperedge_meta(h1, 2) == get_hyperedge_meta(loaded_hg, 2) | |||
|
|
|||
There was a problem hiding this comment.
I understand we still do not have yet code that iterates over HIF test files.
|
Hello team! I noticed the HIF demo that was shared in the paper is using the code from this branch. Note " Is there a possibility to move this PR forward to create a release from the main branch? Thank you!! CC @alessant |
|
Resolved in #55. |
Work in progress.
TODO:
hq_loadfunction withHIF_Formatswitchhq_savelikewise