fixes to load properly VCF samples dict from _samples.avro #9

Open
jpdna wants to merge 4 commits into fnothaft:genotypes-rdd from jpdna:genotypes-rdd-fix-vcf-sampledict

jpdna commented May 25, 2016

This PR fixes a problem in the current code, where the following sequence of actions:

  1. Run 'vcf2adam' to produce ADAM Parquet from a genotype-containing VCF
  2. Load that ADAM genotype Parquet data using ADAMContext.loadGenotypes()

fails for two reasons:

This PR fixes these issues so that writing and loading genotype data works correctly.

fnothaft and others added 4 commits May 20, 2016 09:28
Resolves bigdatagenomics#909:

* Refactors `org.bdgenomics.adam.rdd.variation` to add `GenomicRDD`s for
  `Genotype`, `Variant`, and `VariantContext`. These classes write
  sequence and sample metadata to disk.
* Refactors `ADAMRDDFunctions` to an abstract class in preparation for
  further refactoring in bigdatagenomics#1011.
* Adds an `AvroGenomicRDD` trait, which consolidates Parquet + Avro metadata
  writing code across all Avro data models.