NeRV-3D-DC: A Nonlinear Dimensionality Reduction visualization method for 3D Chromosome Structure Reconstruction with high Resolution Hi-C Data

ghaiyan/NeRV-3D-DC

Python environment

Python 3.8.10 with numpy, pandas, matplotlib, xlrd, and openpyxl

Data used in the experiments

../chrtest: simulation data and results

../GM12878: results at 50 kb and 5 kb resolution on real Hi-C data

../IMR90: results at 50 kb and 5 kb resolution on real Hi-C data

Generate the simulated structure

Run 'generate_test_structer.ipynb'

Generate the Hi-C contact matrix

Run the functions in "normalize.py"

or

bash generateKR_hic_matrix.sh

then call the Python function

tuple2matrix(in_dir, out_dir, resolution)
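
The conversion from contact tuples to a dense matrix can be sketched as below. This is only an illustration and not the repository's tuple2matrix: the function name tuples_to_matrix, the in-memory (position_1, position_2, frequency) input, and the binning by integer division are all assumptions, and the input is assumed to be non-empty. It uses plain Python lists so it runs without extra dependencies.

```python
def tuples_to_matrix(contacts, resolution):
    """Build a symmetric dense contact matrix from (pos1, pos2, count) tuples.

    Hypothetical helper for illustration; positions are genomic coordinates
    in base pairs and `resolution` is the bin size in base pairs.
    """
    # Map genomic positions to bin indices at the given resolution.
    binned = [(p1 // resolution, p2 // resolution, c) for p1, p2, c in contacts]

    # The matrix must be large enough to hold the largest bin index seen.
    n = max(max(i, j) for i, j, _ in binned) + 1
    matrix = [[0.0] * n for _ in range(n)]

    for i, j, c in binned:
        matrix[i][j] = c
        matrix[j][i] = c  # Hi-C contact maps are symmetric
    return matrix


# Three contacts at 50 kb resolution give a 3 x 3 matrix.
contacts = [(0, 0, 10.0), (0, 50000, 4.0), (50000, 100000, 2.5)]
dense = tuples_to_matrix(contacts, resolution=50000)
```

Bin pairs that never appear in the input stay at 0.0, which is how sparse Hi-C tuple files are usually interpreted.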

Reconstruct the simulated structure

Change the paths of the input Hi-C contact file and the output file, and the conversion factor alpha, then run:

bash generateStructue.sh
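
As a rough illustration of what a conversion factor alpha usually does in Hi-C 3D reconstruction, the sketch below turns contact frequencies into target distances with the common inverse power law d = 1 / f^alpha. Whether NeRV-3D-DC applies exactly this formula is an assumption; the function name contacts_to_distances and the use of infinity for bin pairs with no contacts are also choices made here for illustration.

```python
import math


def contacts_to_distances(matrix, alpha):
    """Convert a contact-frequency matrix to target distances.

    Assumes the inverse power law d_ij = 1 / f_ij ** alpha (hypothetical
    helper; not taken from the repository's scripts).
    """
    n = len(matrix)
    # Pairs with no observed contact get an undefined (infinite) distance.
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                dist[i][j] = 0.0  # a bin is at distance zero from itself
            elif matrix[i][j] > 0:
                dist[i][j] = 1.0 / (matrix[i][j] ** alpha)
    return dist


freq = [
    [0.0, 4.0, 0.0],
    [4.0, 0.0, 16.0],
    [0.0, 16.0, 0.0],
]
wish = contacts_to_distances(freq, alpha=0.5)
```

With alpha = 0.5, a frequency of 4 maps to a distance of 0.5 and a frequency of 16 maps to 0.25, so more frequent contacts are placed closer together.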

Reconstruct the 3D structure from a real Hi-C contact matrix (low-resolution Hi-C data)

Change the paths of the input Hi-C contact file and the output file, and the conversion factor alpha, then run:

bash generateStructuetruehic.sh

Reconstruct the 3D structure from a real Hi-C contact matrix (high-resolution Hi-C data)

bash generateHighStructure.sh

Plot the structure you have generated

Run 'plot3D.ipynb'

Evaluate the quality of the 3D structures generated by different methods

python evalMetrics.py

Before running, modify the directories of your structure files.
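
The metrics computed by evalMetrics.py are not listed here. As one example of how two reconstructed structures are commonly compared, the sketch below computes the RMSD between their pairwise distances, which does not change when a structure is translated or rotated. The function names and the toy coordinates are hypothetical, and this is not necessarily what the script computes.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Euclidean distances between all pairs of 3D points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(coords_a, coords_b):
    """RMSD between the pairwise-distance lists of two structures.

    Hypothetical helper: both structures must have the same number of
    points (at least two), listed in the same bin order.
    """
    da = pairwise_distances(coords_a)
    db = pairwise_distances(coords_b)
    if len(da) != len(db) or not da:
        raise ValueError("structures must have the same number of points (>= 2)")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
# Same shape, moved elsewhere in space: pairwise distances are unchanged.
shifted = [(x + 5.0, y - 2.0, z + 1.0) for x, y, z in reference]
# Same shape, scaled by 2: every pairwise distance doubles.
scaled = [(2 * x, 2 * y, 2 * z) for x, y, z in reference]

rmsd_shifted = distance_rmsd(reference, shifted)
rmsd_scaled = distance_rmsd(reference, scaled)
```

Because the score depends on scale, structures produced by different tools generally need to be brought to a common scale before they are compared this way.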

Plot the comparison of metrics

python plotmetrics.py

or

run 'plotmetric.ipynb'

Evaluate the 3D structure with available FISH data

run calculateFISHRMSDLoop.sh

or run evaluate_with_FISH.ipynb

System used to run the scripts

cpu: 95

cpu cores: 24*95

mem: 503 GB

Parameters for other tools that can run on high-resolution Hi-C contact matrices

1) miniMDS: download the source code from https://github.com/seqcode/miniMDS

By default, full MDS is used:

python minimds.py GM12878_combined_22_5kb.bed

To use partitioned MDS:

python minimds.py --partitioned GM12878_combined_22_5kb.bed

2) Hierarchical3DGenome: download the source code from https://github.com/BDM-Lab/Hierarchical3DGenome

java -jar HierarchicalModeller.jar chr_id resolution observed_contact_data normalized_contact_data domain_file output_folder

Parameters:

chr_id: chromosome ID, e.g. 1, 2, ...

resolution: e.g. 5000

observed_contact_data: observed Hi-C contact file; each line describes one contact as 3 numbers separated by spaces: position_1 position_2 interaction_frequency (input/chr10_5kb.RAWobserved) (can be downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525)

normalized_contact_data: normalized Hi-C contact file; each line describes one contact as 3 numbers separated by spaces: position_1 position_2 interaction_frequency (input/chr10_5kb_gm12878_list.txt) (can be downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525 and then normalized)

domain_file: file containing the domains identified by Juicer (input/GSE63525_GM12878_primary+replicate_Arrowhead_domainlist_whole.txt) (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525)

output_folder: output folder
