Genetic disorders are traditionally categorized into three main groups: single-gene, chromosomal, and multifactorial disorders. Single-gene (Mendelian) disorders result from errors in the DNA sequence of a single gene and include autosomal dominant, autosomal recessive, X-linked recessive, X-linked dominant, and Y-linked (holandric) disorders. Chromosomal disorders are due to chromosomal aberrations, including numerical and structural abnormalities. Our study aims to use probabilistic methods to gain insight into the distribution, parental influence, and risk factors associated with genetic disorders.
We will apply a series of probabilistic techniques to our dataset, which contains 22,083 patients and 45 variables. The dataset is taken from a 2021 Kaggle challenge.
Calculate the probability of each genetic disorder given the presence of specific maternal or paternal gene indicators or contributions.
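A minimal sketch of this calculation, assuming the data can be read as one record per patient: the conditional probability of a disorder given a gene indicator is estimated as a ratio of counts. The records below are invented for illustration, and the keys `maternal_gene` and `disorder` are hypothetical names, not the dataset's actual column names.

```python
from collections import Counter, defaultdict

# Toy records, invented for illustration only (not drawn from the dataset).
# The keys "maternal_gene" and "disorder" are hypothetical column names.
records = [
    {"maternal_gene": "Yes", "disorder": "Leigh syndrome"},
    {"maternal_gene": "Yes", "disorder": "Leigh syndrome"},
    {"maternal_gene": "Yes", "disorder": "Cystic fibrosis"},
    {"maternal_gene": "No", "disorder": "Cystic fibrosis"},
    {"maternal_gene": "No", "disorder": "Tay-Sachs"},
    {"maternal_gene": "Yes", "disorder": "Tay-Sachs"},
]


def conditional_probabilities(rows, given, target):
    """Estimate P(target = t | given = g) as count(g, t) / count(g)."""
    joint = defaultdict(Counter)
    for row in rows:
        joint[row[given]][row[target]] += 1
    table = {}
    for g, counts in joint.items():
        total = sum(counts.values())
        table[g] = {t: n / total for t, n in counts.items()}
    return table


table = conditional_probabilities(records, "maternal_gene", "disorder")
for indicator, dist in table.items():
    for disorder, p in sorted(dist.items()):
        print(f"P({disorder} | maternal_gene={indicator}) = {p:.2f}")
```

The same function could be applied to a paternal indicator by changing the `given` key, which would allow the two conditional distributions to be compared side by side.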
Examine the probability of concurrent genetic disorders, e.g., Tay-Sachs and Leigh syndrome. Look into multiple linear regression.
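One simple way to quantify concurrence, sketched here as an illustration rather than as the study's chosen method, is to compare the joint probability of two disorders with the product of their marginal probabilities. The ratio (often called lift) equals 1 when the two disorders occur independently. The patient sets below are invented, and the sketch assumes a patient record can carry more than one diagnosis, which may not hold for the actual dataset.

```python
# Toy data invented for illustration; each set lists one hypothetical
# patient's diagnoses. Whether the real dataset records more than one
# diagnosis per patient is an assumption.
patients = [
    {"Tay-Sachs", "Leigh syndrome"},
    {"Tay-Sachs"},
    set(),
    {"Tay-Sachs", "Leigh syndrome"},
    set(),
    {"Leigh syndrome"},
    set(),
    {"Tay-Sachs"},
]


def co_occurrence(rows, a, b):
    """Return P(a), P(b), P(a and b), and lift = P(a and b) / (P(a) * P(b))."""
    n = len(rows)
    p_a = sum(a in r for r in rows) / n
    p_b = sum(b in r for r in rows) / n
    p_ab = sum(a in r and b in r for r in rows) / n
    # Lift is undefined if either disorder never occurs in the sample.
    lift = p_ab / (p_a * p_b) if p_a > 0 and p_b > 0 else float("nan")
    return p_a, p_b, p_ab, lift


p_ts, p_ls, p_both, lift = co_occurrence(patients, "Tay-Sachs", "Leigh syndrome")
print(f"P(Tay-Sachs) = {p_ts:.3f}")
print(f"P(Leigh syndrome) = {p_ls:.3f}")
print(f"P(both) = {p_both:.3f}, lift = {lift:.3f}")
```

A lift above 1 suggests the disorders co-occur more often than independence would predict; with small counts the estimate is noisy and should be read with caution.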
Explore conditional probabilities of specific disorders given clinical factors, e.g., cystic fibrosis (CF) and respiratory rate; radiation history and cancer; hemochromatosis and red blood cell (RBC) count. Multiple linear regression is a candidate method here as well.
Identify potential causal relationships, such as whether maternal gene indicators have a stronger causal link to certain disorders than paternal indicators. Question: Are specific genetic markers causally linked to the severity, rather than just the presence, of disorders (e.g., mutations that exacerbate mitochondrial myopathy)? Impact: This could improve the stratification of patients by risk severity, allowing more intensive monitoring and tailored therapeutic strategies for high-risk groups. Use Bayesian statistics.
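As a hedged starting point for the Bayesian comparison of maternal and paternal indicators, the sketch below uses a Beta-Binomial model: a uniform Beta(1, 1) prior on each disorder rate is updated with observed counts, and Monte Carlo draws estimate the posterior probability that the maternal rate exceeds the paternal rate. The counts are invented, the function names are hypothetical, and the prior is an assumption. Note that this measures association only; a causal reading would need further assumptions about confounding.

```python
import random


def posterior_params(cases, total, prior_a=1.0, prior_b=1.0):
    """Beta(prior_a, prior_b) prior plus binomial data gives a Beta posterior."""
    return prior_a + cases, prior_b + (total - cases)


def prob_first_rate_higher(post_1, post_2, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_1 > rate_2) under two Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post_1) > rng.betavariate(*post_2)
        for _ in range(draws)
    )
    return wins / draws


# Invented counts for illustration only: patients with a given disorder
# among those carrying a maternal vs. a paternal gene indicator.
maternal = posterior_params(cases=30, total=100)
paternal = posterior_params(cases=18, total=100)

p = prob_first_rate_higher(maternal, paternal)
print(f"P(maternal rate > paternal rate | data) ~ {p:.3f}")
```

The conjugate Beta prior keeps the update to simple addition of counts. Addressing the severity question would need a different likelihood (for example, an ordinal or continuous outcome), which this sketch does not cover.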