GW_KR vs KR #59

@mattiasaine

I have been looking into some publicly available Hi-C data from GEO (GSE147123), which provides .hic and .mcool files for a number of cell lines we are interested in. I have been using mustache for loop calling and plotgardener for visualization from the .hic files. Both seem to use straw to read the data, and with both packages I have run into problems reading data at most resolutions below 250 kb: a fraction of cell line/chromosome combinations simply fail, which makes uniform processing and analysis difficult.

I think this traces back to non-convergence of KR normalization, which apparently leaves empty slots in the .hic files. The .hic files also contain data for the normalization method "GW_KR", though, and this mostly works: failed chromosomes are very rare. Manually changing the data read-in function in mustache.py to use "GW_KR" instead of "KR" worked across all the cell lines we are interested in at 5 kb and 10 kb resolution.
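To make the failure pattern concrete, here is a minimal sketch of how one could scan which chromosomes are readable under each normalization and then pick a single normalization that works everywhere. The helper names (`scan_norms`, `pick_uniform_norm`, `hicstraw_reader`) are hypothetical and not part of mustache or straw. The `hicstraw.straw` call follows the signature shown in the hicstraw README, and treating either an exception or an empty result as a failure is an assumption, since I have not pinned down exactly how the missing KR vectors surface.

```python
from typing import Callable, Dict, List, Optional, Sequence


def hicstraw_reader(norm: str, hic_path: str, chrom: str, resolution: int):
    """Read intra-chromosomal contacts with hicstraw (assumed API, see lead-in)."""
    import hicstraw  # imported lazily so the helpers below work without it

    return hicstraw.straw("observed", norm, hic_path, chrom, chrom, "BP", resolution)


def scan_norms(
    reader: Callable[[str, str, str, int], Sequence],
    hic_path: str,
    chroms: Sequence[str],
    resolution: int,
    norms: Sequence[str] = ("KR", "GW_KR"),
) -> Dict[str, List[str]]:
    """Return, for each normalization, the chromosomes that could not be read."""
    failures: Dict[str, List[str]] = {norm: [] for norm in norms}
    for norm in norms:
        for chrom in chroms:
            try:
                records = reader(norm, hic_path, chrom, resolution)
            except Exception:
                # Assumption: a missing normalization vector may raise.
                records = []
            if not records:
                # Assumption: it may also come back as an empty result.
                failures[norm].append(chrom)
    return failures


def pick_uniform_norm(failures: Dict[str, List[str]]) -> Optional[str]:
    """Pick the first normalization with no failed chromosomes, if any."""
    for norm, failed in failures.items():
        if not failed:
            return norm
    return None
```

For a real file this would be called as `scan_norms(hicstraw_reader, "sample.hic", chroms, 10000)`, where the file name and the chromosome names (for example "1" versus "chr1") are placeholders that depend on the file. Choosing one normalization for all chromosomes, rather than falling back per chromosome, keeps the processing uniform; whether "GW_KR" results are comparable to "KR" results is the open question here.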

My question is whether this change is appropriate. Are the results still roughly as valid, or is there some other fix you would suggest?

I also found that installing via the Conda approach on the landing page initially gave a broken package that could not read any .hic file. I think this traced back to mustache.py calling "import straw" instead of "import hicstraw" in miniconda3/envs/mustache/lib/python3.8/site-packages/mustache/mustache.py. Replacing that file with the one from this repository, or changing that line, fixed it.
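A quick way to check which of the two module names is importable in a given environment is sketched below. The helper `first_available` is hypothetical and uses only the standard library; it just reports which name can be found, it does not import the module or check that its API matches what mustache.py expects.

```python
import importlib.util
from typing import Optional, Sequence


def first_available(candidates: Sequence[str] = ("hicstraw", "straw")) -> Optional[str]:
    """Return the first top-level module name that can be found, else None."""
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("straw module found in this environment:", first_available())
```

Run inside the mustache Conda environment, this shows whether the installed package is importable as `hicstraw`, as `straw`, or not at all.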

Thanks!
