Series
Genetic association studies
Section snippets
Direct association
The first of these forms of association is termed direct association, and studies of direct association target polymorphisms which are themselves putative causal variants. This type of study is the easiest to analyse and the most powerful, but the difficulty is the identification of candidate polymorphisms. A mutation in a codon which leads to an aminoacid change is a candidate causal variant. However, it is likely that many causal variants responsible for heritability of common complex
Indirect association
In the second type of association, the polymorphism is a surrogate for the causal locus and this type of association allows us to search for causal genes in indirect association studies. However, indirect associations are even weaker than the direct associations they reflect, and it will usually be necessary to type several surrounding markers to have a high chance of picking up the indirect association. Indirect association studies are more difficult to analyse, and there is still debate as to
Confounded association
The final type of association is that due to confounding by stratification and admixture (substructure) within the population. Confounding, as in the rest of epidemiology, raises the possibility both of generating false findings (positive confounding) and of obscuring true causal associations (negative confounding). However, although the problem of unobserved confounding is intractable in classic epidemiology, dictating limits on the size of causal effect that can be safely inferred from
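The way stratification can generate a false finding can be illustrated with a small numerical sketch. The two subpopulations, their sizes, carrier frequencies, and disease risks below are invented for illustration and do not come from the article. Within each subpopulation carrier status and disease are independent, yet pooling the two produces an apparent association.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b = carrier cases, carrier controls
    c, d = non-carrier cases, non-carrier controls
    """
    return (a * d) / (b * c)


def stratum_counts(size, carrier_freq, disease_risk):
    """Expected counts when carrier status and disease are independent."""
    carriers = size * carrier_freq
    noncarriers = size - carriers
    return (
        carriers * disease_risk,
        carriers * (1 - disease_risk),
        noncarriers * disease_risk,
        noncarriers * (1 - disease_risk),
    )


# Hypothetical subpopulations: both the variant and the disease are
# more common in the second one (illustrative numbers only).
pop1 = stratum_counts(1000, carrier_freq=0.2, disease_risk=0.1)
pop2 = stratum_counts(1000, carrier_freq=0.8, disease_risk=0.3)
pooled = tuple(x + y for x, y in zip(pop1, pop2))

or_pop1 = odds_ratio(*pop1)      # close to 1: no association within stratum
or_pop2 = odds_ratio(*pop2)      # close to 1: no association within stratum
or_pooled = odds_ratio(*pooled)  # roughly 2.16: spurious pooled association

print(or_pop1, or_pop2, or_pooled)
```

The pooled odds ratio departs from 1 only because subpopulation membership is related to both allele frequency and disease risk.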
Direct association: patterns of genotype–phenotype relationship
We shall consider a diallelic locus, directly related to either a quantitative trait or to a discrete trait such as presence (prevalence), or occurrence (incidence), of a disease. Multiallelic loci lead to more complicated scenarios and generate tests with many degrees of freedom. Even in the simplest diallelic case, different patterns for the genotype–phenotype relationship must be considered. Since there are three possible genotypes, which have a natural order (1/1, 1/2, and 2/2), the
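As a minimal sketch of the natural ordering of the three genotypes, each genotype at a diallelic locus can be coded by the number of copies of allele 2 it carries. The function names and the string representation of genotypes are assumptions made for this example, not notation from the article.

```python
from collections import Counter

# Ordered genotypes at a diallelic locus with alleles labelled 1 and 2.
GENOTYPE_ORDER = ("1/1", "1/2", "2/2")


def normalise(genotype):
    """Treat '2/1' and '1/2' as the same unordered genotype."""
    return "/".join(sorted(genotype.split("/")))


def copies_of_allele_2(genotype):
    """Return 0, 1 or 2: the position of the genotype in the natural order."""
    return GENOTYPE_ORDER.index(normalise(genotype))


def allele_2_frequency(genotypes):
    """Allele frequency estimated by counting alleles across genotypes."""
    counts = Counter(normalise(g) for g in genotypes)
    n = sum(counts[g] for g in GENOTYPE_ORDER)
    return (counts["1/2"] + 2 * counts["2/2"]) / (2 * n)


# Hypothetical sample of ten individuals.
sample = ["1/1"] * 5 + ["1/2"] * 3 + ["2/1"] + ["2/2"]
print([copies_of_allele_2(g) for g in sample])
print(allele_2_frequency(sample))
```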
Linear dose-response modelling
In classic mendelian genetics of fully penetrant discrete traits, the description of an allele as dominant implies that the corresponding phenotype will occur irrespective of the number of copies of the allele carried. A recessive allele requires both copies to be present for the phenotype to be evident. In a diallelic system, if neither allele is dominant, 1/2 heterozygotes will display an intermediate phenotype. Fisher16 used the term dominance in a different way to describe the related
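One common way to put dominant, recessive, and intermediate (dose-response) patterns on the same footing in case-control data is to assign a score to each genotype and apply a Cochran-Armitage trend test with 1 degree of freedom. The sketch below is an illustration under that assumption, not necessarily the analysis the authors describe. The score sets treat allele 2 as the allele of interest, and the counts are invented.

```python
import math

# Genotype scores for 1/1, 1/2, 2/2, with allele 2 as the allele of interest.
SCORES = {
    "additive": (0, 1, 2),   # heterozygote intermediate (linear dose-response)
    "dominant": (0, 1, 1),   # one copy of allele 2 is sufficient
    "recessive": (0, 0, 1),  # two copies of allele 2 are required
}


def trend_test(cases, controls, model="additive"):
    """Cochran-Armitage trend test; returns (chi-square, p value) on 1 df."""
    x = SCORES[model]
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    r_total = sum(cases)
    sum_rx = sum(r * xi for r, xi in zip(cases, x))
    sum_nx = sum(t * xi for t, xi in zip(totals, x))
    sum_nx2 = sum(t * xi * xi for t, xi in zip(totals, x))
    numerator = n * (n * sum_rx - r_total * sum_nx) ** 2
    denominator = r_total * (n - r_total) * (n * sum_nx2 - sum_nx ** 2)
    stat = numerator / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical genotype counts in the order 1/1, 1/2, 2/2.
cases = [30, 50, 20]
controls = [50, 40, 10]
for model in SCORES:
    print(model, trend_test(cases, controls, model))
```

With dominant or recessive scores the statistic reduces to the ordinary Pearson χ² for the corresponding collapsed 2 × 2 table.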
Epistasis
The general issue of dominance relates to the extent to which the joint effect of two alleles at a single autosomal locus might be different from the sum (or product in a multiplicative model) of the effects anticipated for each allele independently. A related issue is the degree to which the combined effect of alleles at two or more loci can reasonably be modelled by the individual locus contributions. The fact that inheritance of some traits could only be explained by joint action of two
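Whether two loci appear to interact depends on whether their individual contributions are expected to combine as a sum or as a product. The sketch below computes both contrasts for a 2 × 2 layout of risks by carrier status at two loci. The function names and the risks are hypothetical.

```python
def additive_interaction(r00, r10, r01, r11):
    """Departure from additivity: zero when the effects of the loci sum.

    r00 = risk with neither variant, r10 = variant at locus A only,
    r01 = variant at locus B only, r11 = variants at both loci.
    """
    return r11 - r10 - r01 + r00


def multiplicative_interaction(r00, r10, r01, r11):
    """Departure from multiplicativity: one when relative risks multiply."""
    return (r11 * r00) / (r10 * r01)


# Hypothetical risks: locus A doubles the risk and locus B triples it.
risks = dict(r00=0.01, r10=0.02, r01=0.03, r11=0.06)
print(additive_interaction(**risks))        # positive: not additive
print(multiplicative_interaction(**risks))  # close to 1: multiplicative
```

The same set of risks shows no interaction on the multiplicative scale and a positive interaction on the additive scale, so any statement about the presence of interaction needs to name the scale.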
Indirect association: patterns of linkage disequilibrium
The mapping of susceptibility genes for common complex disorders and genes for other common traits by the indirect method depends on the existence of association, at the population level, between the causal variants and nearby markers. Such association, because of the proximity of loci on the genome, is termed linkage disequilibrium. (Some use this term to describe any population-wide association between loci, whether due to proximity or to another reason such as population stratification and
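The strength of association between alleles at two diallelic loci is often summarised by D, D′, and r². The sketch below computes these standard measures from haplotype frequencies. The function name, the argument order, and the frequencies are assumptions made for this example.

```python
def ld_measures(f11, f12, f21, f22):
    """Return (D, D', r^2) from the four haplotype frequencies.

    f11 = frequency of the haplotype carrying allele 1 at both loci,
    f12 = allele 1 at the first locus and allele 2 at the second, etc.
    The sign of D' is retained; many reports quote its absolute value.
    """
    total = f11 + f12 + f21 + f22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_a = f11 + f12  # allele 1 frequency at the first locus
    p_b = f11 + f21  # allele 1 frequency at the second locus
    if min(p_a, 1 - p_a, p_b, 1 - p_b) <= 0:
        raise ValueError("both loci must be polymorphic")
    d = f11 - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical haplotype frequencies.
d, d_prime, r2 = ld_measures(0.4, 0.1, 0.1, 0.4)
print(d, d_prime, r2)  # approximately 0.15, 0.6, 0.36
```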
Study designs
Familiar epidemiological designs such as population-based case-control or cohort designs19, 52 are often used for genetic association studies, and the data are analysed in much the same way, with risk factors such as smoking and obesity replaced by the presence or absence of a particular genetic polymorphism. Risk can be considered in terms of either a predisposing allele or genotype, or in terms of multiple categories of disease risk such as the risks associated with different alleles at
Statistical analysis
The analysis of data depends crucially on the study design. In the simplest case, familiar methods such as logistic regression, χ² tests of association, and odds ratios may be suitable. At a single marker, the issue arises as to whether to analyse on the basis of allele counts or genotype counts. Suppose we have case and control data for a single diallelic genetic locus (table 4). A simple χ² test for independence has 2 degrees of freedom. Two odds ratios can be calculated: af/be (for genotype
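The genotype-based test can be sketched for a 3 × 2 table of genotype by case-control status. Table 4 is not reproduced here, so the layout below (cases and controls listed in the order 1/1, 1/2, 2/2, with 1/1 as the reference genotype for the odds ratios) is an assumption, and the counts are invented. For 2 degrees of freedom the χ² upper-tail probability is exactly exp(−x/2), so no statistical library is needed.

```python
import math


def genotype_chi2(cases, controls):
    """Pearson chi-square for a 3x2 genotype-by-status table (2 df)."""
    n = sum(cases) + sum(controls)
    col_totals = (sum(cases), sum(controls))
    stat = 0.0
    for row in zip(cases, controls):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.exp(-stat / 2)  # exact upper tail for 2 df
    return stat, p_value


def genotype_odds_ratios(cases, controls):
    """Odds ratios for 1/2 and 2/2, each relative to 1/1 (assumed reference)."""
    ref_case, ref_control = cases[0], controls[0]
    return tuple(
        (case * ref_control) / (control * ref_case)
        for case, control in zip(cases[1:], controls[1:])
    )


# Hypothetical counts in the order 1/1, 1/2, 2/2.
cases = [30, 50, 20]
controls = [50, 40, 10]
stat, p_value = genotype_chi2(cases, controls)
or_het, or_hom = genotype_odds_ratios(cases, controls)
print(stat, p_value)
print(or_het, or_hom)
```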
Significance and importance
The standards of statistical proof that have become acceptable in the general biomedical literature are not appropriate for genetic association studies. Something akin to a multiple testing problem pervades the discipline, although there has been no clear consensus about how it should be dealt with. Approaches such as the Bonferroni correction are not appropriate because it is not the number of tests in any one investigation that is important. Rather it is that the vast majority of loci tested
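One way to see why conventional significance thresholds can mislead is to apply Bayes' theorem to a single significant result. The sketch below assumes that only a small fraction of tested loci are truly associated with the trait; this assumption is introduced for the example rather than taken from the text. The prior, power, and thresholds are all hypothetical.

```python
def probability_true_association(prior, power, alpha):
    """Probability that a significant result reflects a true association.

    prior = assumed fraction of tested loci that are truly associated
    power = probability of a significant result at a true locus
    alpha = significance threshold (false-positive rate at a null locus)
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical scenario: 1 in 1000 tested loci is truly associated.
conventional = probability_true_association(prior=0.001, power=0.8, alpha=0.05)
stringent = probability_true_association(prior=0.001, power=0.8, alpha=0.0001)
print(conventional)  # about 0.016
print(stringent)     # about 0.89
```

Under these assumed values, fewer than 2 in 100 results significant at the 0.05 level would be true associations, whatever the number of tests carried out in the individual study.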
References (110)
- et al. Genetic associations in large versus small studies: an empirical assessment. Lancet (2003)
- et al. Control of confounding of genetic associations in stratified populations. Am J Hum Genet (2003)
- et al. Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet (1999)
- et al. Association mapping in structured populations. Am J Hum Genet (2000)
- et al. Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model. Am J Hum Genet (2001)
- et al. The power of genomic control. Am J Hum Genet (2000)
- Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol (1991)
- et al. A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics (1995)
- et al. Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am J Hum Genet (2000)
- et al. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet (2004)