This repository was archived by the owner on Jan 14, 2026. It is now read-only.
Hey, not sure if you're aware, but there's a lot of garbage in there, since OpenSNP is probably not checking what users upload.
Here's a normalized list of file types I've found in your db:
7-zip
Apple binary property list
ASCII text
bgzip
bzip2
Composite-documents
CSV
data?
empty
Excel
EXE (???)
gzip
JPEG
PDF
PNG
RAR
RSID sidtune (?!)
Unicode Text
VCF
Word
XML
Zip
zlib
I was curious about the EXEs; at least they don't seem to contain any viruses. One of them is from a tool called "MyHeritage Family Builder Genealogy Software" and all the rest are called "23andme to FASTA".
It shouldn't be too hard to clean this up and to add some checks when people upload something. I did this analysis using the Linux file utility, and I think the same check could probably be done on the server side as well. Watch out for command injection if you do. A neat improvement would be to have all the files in the same format.
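To make the server-side idea a bit more concrete, here's a rough Python sketch of how an upload check could call file without going through a shell, which is where the command injection risk comes from. The function names and the allowlist are made up for illustration and have nothing to do with the actual OpenSNP code. The exact MIME strings file reports can also differ between versions, so treat those as assumptions too.

```python
import subprocess

# Hypothetical allowlist: which formats to accept is up to the maintainers,
# and the exact MIME strings reported by `file` may vary by version.
ALLOWED_MIME_TYPES = {"text/plain", "text/csv", "application/gzip", "application/zip"}


def build_file_command(path):
    """Build the argv list for `file`.

    Passing a list (no shell) means the path is never interpreted by a shell,
    and `--` stops a path starting with `-` from being read as an option.
    """
    return ["file", "--brief", "--mime-type", "--", path]


def detect_mime_type(path):
    """Return the MIME type `file` reports for the given path."""
    result = subprocess.run(
        build_file_command(path),
        capture_output=True,
        text=True,
        check=True,  # raise if `file` exits with an error
    )
    return result.stdout.strip()


def is_allowed(mime_type):
    """Check a detected MIME type against the (assumed) allowlist."""
    return mime_type in ALLOWED_MIME_TYPES
```

An upload handler could then call something like is_allowed(detect_mime_type(uploaded_path)) and reject the file when it returns False. Since the user-supplied name only ever appears as a single list element, a filename like "x; rm -rf ~" is just treated as an odd filename.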
I'm attaching a list of files with their format: file_type.csv
Also, the phenotype section doesn't seem very well monitored, as someone created a "naked body phenotype" just to share a naked picture of himself. Not sure about the scientific value of that lol