
Added a script to convert vcf to spacemix inputs. #4

Open
StuntsPT wants to merge 3 commits into gbradburd:master from StuntsPT:master

@StuntsPT StuntsPT commented Feb 8, 2016

This script makes it a lot easier to get those input files.
It converts directly from vcf to something that spacemix can read directly.
I hope it helps.

@peterdfields
Collaborator

This script seems to work well! Maybe it could be modified to cope better with variation in VCF format? Rather than hard-coding the position of the line that contains the VCF column names, the script could detect that line and use it for the downstream conversion.

@StuntsPT
Author

That should be an easy improvement to make (I'm thinking of grep).
Good idea.
I'll implement that as soon as I have a moment.

@StuntsPT
Author

The reason I was using head instead of grep was simply performance. "Grepping" over huge VCF files takes a while, but I learned something new in the process:
grep -m X stops after the "Xth" match, which effectively solves the performance problem, no matter how large the VCF is.
Thanks for calling it out.
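
For readers following along, here is a minimal sketch of the detection idea discussed above. It is not taken from the script in this pull request. The function names `vcf_header_line` and `vcf_sample_names` are hypothetical, and the sketch assumes a tab-delimited VCF whose column-header line starts with `#CHROM` and includes a FORMAT column, so that sample names begin at column 10. Note that `-m` is supported by GNU and BSD grep but is not part of POSIX.

```shell
# Hypothetical helper: print the VCF column-header line (the one starting with "#CHROM").
# -m 1 makes grep stop reading after the first matching line,
# so the rest of a large file is never scanned.
vcf_header_line() {
    grep -m 1 '^#CHROM' "$1"
}

# Hypothetical helper: print one sample name per line.
# Assumes the nine fixed columns (CHROM through FORMAT) precede the samples.
vcf_sample_names() {
    vcf_header_line "$1" | cut -f 10- | tr '\t' '\n'
}
```

Calling `vcf_sample_names input.vcf` (with `input.vcf` standing in for any VCF file) would then list the sample columns regardless of how many `##` metadata lines precede the header.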
