jandot edited this page Sep 13, 2010 · 3 revisions

Configuration file

The file is called config.yml and is in YAML format, a human-readable data format built from indented key-value pairs.

---
width: 1280
height: 800
data_directory: 'data/example'

Spaces are significant. The file should start with a single line containing three dashes. The rest consists of key-value pairs, each written as a key, a colon, a space and then the value. At minimum, width, height and data_directory have to be specified. If you want the application to start already zoomed into one or more loci, add a loci key with a list of loci, each formatted as in the example below. Use the same list format even if you specify only one locus.

---
width: 1280
height: 800
data_directory: 'data/example'
loci:
  - chromosome: 2
    start: 132_500_000
    stop: 133_000_000
  - chromosome: 16
    start: 33_650_000
    stop: 34_000_000
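
The rules above (three required keys, and loci always given as a list) can be checked with a small script before starting the application. The following is only a sketch: check_config is a hypothetical helper, not part of the application, and the dictionary stands in for whatever a YAML loader would return for config.yml.

```python
REQUIRED_KEYS = ("width", "height", "data_directory")
LOCUS_KEYS = ("chromosome", "start", "stop")


def check_config(config):
    """Return a list of problems found in a parsed config.yml mapping."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    loci = config.get("loci", [])
    if not isinstance(loci, list):
        # A single locus must still be written as a one-item list.
        return problems + ["loci must be a list, even for a single locus"]
    for i, locus in enumerate(loci):
        if not isinstance(locus, dict):
            problems.append(f"locus {i}: expected chromosome/start/stop keys")
            continue
        for key in LOCUS_KEYS:
            if key not in locus:
                problems.append(f"locus {i}: missing key: {key}")
    return problems


# Stand-in for the parsed contents of config.yml (assumption: already loaded).
config = {
    "width": 1280,
    "height": 800,
    "data_directory": "data/example",
    "loci": [{"chromosome": 2, "start": 132_500_000, "stop": 133_000_000}],
}
print(check_config(config))  # prints []
```

An empty list means the three required keys are present and every locus has chromosome, start and stop.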

For a bit more information on YAML, see the YAML documentation.

Chromosome meta data

The file is called meta_data.tsv. It contains one line per chromosome with four columns: chromosome name, length, centromere start and centromere stop.

1     247249719     121100000     128000000
2     242951149     91000000      95700000
3     199501827     89400000      93200000
4     191273063     48700000      52400000
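
As an illustration of the four-column layout, the sketch below reads such lines into a dictionary and checks that each centromere lies within its chromosome. The function name read_chromosome_meta is hypothetical, and splitting on any whitespace is an assumption chosen so that both tabs and spaces are accepted.

```python
def read_chromosome_meta(lines):
    """Map chromosome name -> length and centromere coordinates."""
    chromosomes = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, length, cen_start, cen_stop = line.split()
        chromosomes[name] = {
            "length": int(length),
            "centromere_start": int(cen_start),
            "centromere_stop": int(cen_stop),
        }
    return chromosomes


def centromere_is_consistent(entry):
    """True if the centromere is ordered and fits inside the chromosome."""
    return 0 <= entry["centromere_start"] <= entry["centromere_stop"] <= entry["length"]


example = [
    "1\t247249719\t121100000\t128000000",
    "2\t242951149\t91000000\t95700000",
]
meta = read_chromosome_meta(example)
print(meta["2"]["length"])  # prints 242951149
print(all(centromere_is_consistent(e) for e in meta.values()))  # prints True
```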

Read pair information

The filename has to be read_pairs.txt. These data are best heavily filtered: large datasets (>50,000 read pairs) will slow down the display considerably. For example, you can remove read pairs that map as expected, as well as singletons (where a read pair is the only one linking two disparate loci together). You can also cluster read pairs that link the same two loci into one, or just pick a representative. I hope to add scripts that do this later.

Columns:

  • source chromosome
  • source position
  • target chromosome
  • target position
  • code:
    • DIST: read pairs are further apart than they should be
    • FF: readpairs are in forward-forward orientation
    • RR: readpairs are in reverse-reverse orientation
    • RF: readpairs are in reverse-forward orientation
  • mapping quality
1     1016287        1     3590188       DIST     59
1     1016287        1     1025027       FF       55
1     197161848      1     204332960     DIST     39.6
1     197161848      1     197162987     RR       53.2
1     143840263      1     148159432     DIST     63
1     143840263      1     143841351     RR       42
1     65970649       1     67123551      DIST     39
1     10234366       2     179529743     DIST     52
1     14153568       2     198789357     DIST     49.7
1     32760776       2     126917453     DIST     61.5
1     37680860       2     132743496     DIST     48
1     52182495       2     213606476     DIST     49
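
The singleton filtering suggested above could look roughly like the sketch below. It is not one of the promised scripts: the function names are hypothetical, and "the same two loci" is approximated by binning both positions into fixed-size windows. The 10 Mb window is an arbitrary value picked only so that the few example rows show an effect.

```python
from collections import Counter

BIN_SIZE = 10_000_000  # hypothetical window size; tune for real data


def parse_read_pair(line):
    """Split one read_pairs.txt line into typed fields."""
    chr1, pos1, chr2, pos2, code, quality = line.split()
    return (chr1, int(pos1), chr2, int(pos2), code, float(quality))


def link_key(pair, bin_size=BIN_SIZE):
    """Identify the two loci a pair links by binning both positions."""
    chr1, pos1, chr2, pos2 = pair[:4]
    return (chr1, pos1 // bin_size, chr2, pos2 // bin_size)


def drop_singletons(pairs, bin_size=BIN_SIZE):
    """Keep only pairs whose locus-to-locus link is supported by more than one pair."""
    counts = Counter(link_key(p, bin_size) for p in pairs)
    return [p for p in pairs if counts[link_key(p, bin_size)] > 1]


rows = [
    "1 1016287 1 3590188 DIST 59",
    "1 1016287 1 1025027 FF 55",
    "1 197161848 1 204332960 DIST 39.6",
    "1 10234366 2 179529743 DIST 52",
]
pairs = [parse_read_pair(row) for row in rows]
kept = drop_singletons(pairs)
print(len(kept))  # prints 2
```

Only the first two rows survive, because they are the only ones that fall into the same pair of windows. Picking a single representative per link would be a small extension of the same grouping.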

Copy number

This file contains copy number information, which is basically a reworked overview of read depth.

Columns:

  • chromosome
  • start position
  • stop position
  • value
10       43330        702224      31.7698
10      704324        706808      71.5
10      709328      16263066      37.2231
10    16265184      16269314      75
10    16271414      35286266      38.6989
10    35288524      35293178      0.6667
10    35295366      38243048      25.751
10    38245201      42648604      0.175
10    42651614      46390208      28.112
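
To make the segment layout concrete, here is a minimal sketch that parses such lines and looks up the value at a given position. The function names are hypothetical, and treating both start and stop as inclusive is an assumption; the file format description does not say.

```python
def parse_copy_number(lines):
    """Turn copy number lines into (chromosome, start, stop, value) tuples."""
    segments = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, stop, value = line.split()
        segments.append((chrom, int(start), int(stop), float(value)))
    return segments


def value_at(segments, chrom, position):
    """Return the value of the segment covering position, or None in a gap."""
    for seg_chrom, start, stop, value in segments:
        # Assumption: start and stop are both inclusive.
        if seg_chrom == chrom and start <= position <= stop:
            return value
    return None


segments = parse_copy_number([
    "10       43330        702224      31.7698",
    "10      704324        706808      71.5",
])
print(value_at(segments, "10", 705000))  # prints 71.5
print(value_at(segments, "10", 703000))  # prints None
```

The second lookup falls between two segments, which is why it returns None rather than a value.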

Segmental duplications

Not used at the moment, but the file should be there.

Genes

Filename is genes.txt. Columns:

  • gene name
  • chromosome
  • start position
  • stop position
MPZL3         11     117602619      117628245
ATPBD3        19      56293588       56299659
POLR2H         3     185563888      185569057
C2orf37        2     171999071      172049806
SIM1           6     100939606      101019494
KIF19         17      69833946       69863554
AL163171.4    14      82228800       82229152
AL356799.3    14      81998185       81998284
AL357095.4    14      81965968       81966252
CLCN2          3     185547034      185562085
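
As a sketch of how the four gene columns might be used, the code below parses a few of the lines above and lists the genes overlapping a locus. The function names are hypothetical and the query coordinates are chosen only for illustration; this is not how the application itself reads the file.

```python
def parse_genes(lines):
    """Turn genes.txt lines into (name, chromosome, start, stop) tuples."""
    genes = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, chrom, start, stop = line.split()
        genes.append((name, chrom, int(start), int(stop)))
    return genes


def genes_in_locus(genes, chrom, start, stop):
    """Names of genes on chrom that overlap the interval start..stop."""
    return [
        name
        for name, g_chrom, g_start, g_stop in genes
        if g_chrom == chrom and g_start <= stop and g_stop >= start
    ]


genes = parse_genes([
    "POLR2H         3     185563888      185569057",
    "SIM1           6     100939606      101019494",
    "CLCN2          3     185547034      185562085",
])
print(genes_in_locus(genes, "3", 185_560_000, 185_565_000))  # prints ['POLR2H', 'CLCN2']
```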
