Fine Scale Mapping and the Coalescent
The Fundamental Problem; The Data; Genotype to Phenotype Functions; Types of Mapping; Population Set-up & Measures of Dependency; The Calculations; Practical Considerations

Genotype and Phenotype Covariation: Gene Mapping
Sampling genotypes and phenotypes.
Result: the mapping function. A set of characters. Binary decision (0,1). Quantitative character.

Pedigree Analysis & Association Mapping (adapted from McVean and others)
Pedigree analysis: pedigree known; few meioses (max 100s); resolution: cMorgans (Mbases).
Association mapping: pedigree unknown; many meioses (>10^4); resolution: 10^-5 Morgans (Kbases); 2N generations.

Causes of linkage disequilibrium
[Figure: disease locus and marker locus, shown at time t ago and now.]
Creates LD: drift, selection, admixture.
Breaks down LD: recombination, gene conversion.

Significance of a Single Association
Test for independence in a 2 x 2 contingency table (disease locus against marker locus).

Measuring Linkage Disequilibrium between 2 Loci with 2 Alleles (remade from McVean)
$D_{A,B} = f_{A,B} - f_A f_B = -D_{a,B} = -D_{A,b} = D_{a,b}$
Correlation coefficient measure ($r^2$), range [0,1]: Hill & Robertson (1968).
Measure with range constrained by allele frequencies ($D'$), range [0,1]: Lewontin (1964).
Odds-ratio formulation: Devlin & Risch (1995).
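The pairwise measures above can be made concrete with a small sketch. The function name `ld_measures`, the argument names and the example counts are illustrative assumptions, not taken from the slides. It computes D, the Hill & Robertson correlation measure r^2 and Lewontin's D' from four haplotype counts, together with the 2 x 2 chi-square statistic, which equals n times r^2 for n haplotypes.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Pairwise LD statistics from haplotype counts at two biallelic loci.

    Argument names are illustrative; both loci must be polymorphic.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    f_AB = n_AB / n
    f_A = (n_AB + n_Ab) / n          # frequency of allele A
    f_B = (n_AB + n_aB) / n          # frequency of allele B
    denom = f_A * (1 - f_A) * f_B * (1 - f_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    d = f_AB - f_A * f_B             # D_{A,B} = f_{A,B} - f_A f_B
    r2 = d * d / denom               # squared correlation coefficient

    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(f_A * (1 - f_B), (1 - f_A) * f_B)
    else:
        d_max = min(f_A * f_B, (1 - f_A) * (1 - f_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    chi2 = n * r2                    # 1-df statistic for the 2 x 2 table
    return {"D": d, "r2": r2, "D_prime": d_prime, "chi2": chi2}


# Hypothetical counts for haplotypes AB, Ab, aB, ab.
example = ld_measures(40, 10, 10, 40)
```

For these made-up counts the result is D of about 0.15, r^2 of about 0.36, D' of about 0.6 and a chi-square of about 36.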

Combine Single (Pairwise) to Multiple Tests
Bonferroni correction.
Sharper bounds using linkage information.

Examples of Associations: Pairwise, Triple (Martin et al. 2000)
6 markers with low association; causative SNP.
ApoE and Alzheimer's syndrome.

The coalescent with recombination or gene conversion (adapted from Hudson 1990)
[Figure panels: recombination; gene conversion.]
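As a minimal sketch of the Bonferroni step mentioned above: with m tests and an overall level alpha, each single test is compared against alpha/m. The function name and the p-values below are hypothetical, chosen only to illustrate six markers.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold alpha/m and a significance flag per test."""
    m = len(p_values)
    threshold = alpha / m
    return threshold, [p <= threshold for p in p_values]


# Six hypothetical single-marker p-values.
threshold, significant = bonferroni([0.001, 0.02, 0.04, 0.3, 0.009, 0.5])
```

The correction treats the tests as if they could all fail independently, so it is conservative when markers are in LD with one another. That is why sharper bounds can use linkage information.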

Local trees for recombination and gene conversion
[Figure: local trees along the sequence for a sample labelled 1 to 4. Gene conversion: Tree 1, Tree 2, Tree 1. Recombination: Tree 1, Tree 2, Tree 3.]

Measures of tree similarity
[Figure: a target tree for a sample labelled 1 to 5 compared with other local trees. Categories: same tree as target (region with no recombination), same topology as target, same MRCA as target.]

Local trees of the target and other positions (from Mikkel Schierup)
[Figure: curves for same tree, same MRCA and same topology as the target tree.]
Only recombination, r=2. Also gene conversion, g/r=4. Sample size = 20.

Quantifying the mosaicism caused by Gene Conversion (from Mikkel Schierup)
Probability that the largest segment does not include the target.
A and B are the most distant markers in significant LD with the target. What is the proportion of markers between these also in significant LD?

Development of multi-locus association methods (based on Morris et al. 2002)
Single marker methods: Kaplan et al. (1995), Rannala & Slatkin (1998). Problem: difficult to combine markers.
Haplotype methods with star-shaped genealogies: Terwilliger (1995), Graham & Thompson (1998), McPeek & Strahs (1999), Morris et al. (2000). Problem: wrong genealogy, gives overconfidence in result.
Haplotype methods based on the coalescent: Rannala & Reeve (2001), Morris et al. (2002), Larribe et al. (2003). Problem: computationally intensive.

Probability of Data I: 3 step approach
I. Probability of data given topology and branch lengths: Felsenstein (1981) for each column; multiply for all columns.
II. Integrate over branch lengths.
III. Sum over topologies.
Example data: TCAGCCT, TCAGCAT, GCAGGTT.
Conclusion: exact calculation is computationally intractable!

Probability of Data II: Griffiths & Tavaré, TPB 46.2.131-149
[Figure: the sample n = (ACCTAGGAT, TCCTAGGAT, TCCTAGGAT) and its predecessor configurations: 393 mutations; (1,2) coalescence, giving (ACCTAGGAT, TCCTAGGAT).]
q(n'') is determined by the equilibrium distribution.
Griffiths-Ethier-Tavaré recursions.
Griffiths-Marjoram (1996) included recombination in the equations.
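Step I above can be sketched in a few lines. This is an illustration under stated assumptions, not the calculation used in the lecture: the Jukes-Cantor substitution model, the tree shape, the branch lengths and the sequence names `s1` to `s3` are all hypothetical choices. Only the three sequences come from the slide. The code applies Felsenstein's pruning recursion to each column and multiplies over columns.

```python
import math

BASES = "ACGT"


def jc_prob(a, b, t):
    """Jukes-Cantor probability of base a becoming b along a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def partial(node, column):
    """Map each base x to P(observed data below node | node carries x)."""
    if isinstance(node, str):                      # leaf: node is a sequence name
        return {x: 1.0 if column[node] == x else 0.0 for x in BASES}
    result = {x: 1.0 for x in BASES}
    for child, t in node:                          # internal node: (child, length) pairs
        below = partial(child, column)
        for x in BASES:
            result[x] *= sum(jc_prob(x, y, t) * below[y] for y in BASES)
    return result


def column_likelihood(tree, column):
    """Likelihood of one alignment column, with uniform base frequencies at the root."""
    root = partial(tree, column)
    return sum(0.25 * root[x] for x in BASES)


def alignment_likelihood(tree, seqs):
    """Multiply the column likelihoods over all columns."""
    length = len(next(iter(seqs.values())))
    lik = 1.0
    for i in range(length):
        lik *= column_likelihood(tree, {name: s[i] for name, s in seqs.items()})
    return lik


# Hypothetical topology ((s1, s2), s3) with made-up branch lengths.
INNER = (("s1", 0.1), ("s2", 0.1))
TREE = ((INNER, 0.2), ("s3", 0.3))
SEQS = {"s1": "TCAGCCT", "s2": "TCAGCAT", "s3": "GCAGGTT"}
likelihood = alignment_likelihood(TREE, SEQS)
```

This covers only one topology and one set of branch lengths. Steps II and III would repeat it over all branch lengths and all topologies, which is where the exact calculation becomes intractable.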

Example: Solving Linear System
[Figure: unknowns q( ) connected by coefficients r(,).]
Construct a Markov transition function, A(x,y), with the following properties:
i) A(x,y) > 0 when r(x,y) > 0
ii) The chain visits A with certainty.
Introduced in coalescence theory by Griffiths & Tavaré (1994).
Griffiths & Marjoram (1996) included recombination.
Donnelly-Stephens-Fearnhead (2000-) accelerated these algorithms.

Larribe and Lessard (2002)
The position of the marker locus is missing data.
Data: haplotype, phenotype, multiplicity (15 3 6 2 1 2 1).
Where is the disease-causing locus?
Likelihood as a function of disease locus position.
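The idea of solving a linear system with a Markov chain can be illustrated on a toy problem. Everything in this sketch is a hypothetical example, not taken from the slides: the three states, the coefficients in `R`, the proposal chain in `PROPOSAL` and the function names. Each unknown q(x) is a weighted sum of other q values, and state 2 has a known value. A path is simulated from the proposal chain until it reaches the known state, and the product of the ratios r(x,y)/A(x,y) along the path is an unbiased estimate of q at the start.

```python
import random

# Hypothetical toy system:
#   q(0) = 0.3*q(1) + 0.2*q(2)
#   q(1) = 0.4*q(0) + 0.1*q(2)
#   q(2) = 1.0 is known
R = {0: {1: 0.3, 2: 0.2}, 1: {0: 0.4, 2: 0.1}}
KNOWN = {2: 1.0}

# Proposal chain: positive wherever r(x, y) > 0, and it reaches
# the known state with probability 1.
PROPOSAL = {0: {1: 0.5, 2: 0.5}, 1: {0: 0.5, 2: 0.5}}


def one_path(start, rng):
    """Simulate one path to a known state and return its importance weight."""
    state, weight = start, 1.0
    while state not in KNOWN:
        targets = list(PROPOSAL[state])
        probs = [PROPOSAL[state][y] for y in targets]
        nxt = rng.choices(targets, weights=probs)[0]
        weight *= R[state][nxt] / PROPOSAL[state][nxt]
        state = nxt
    return weight * KNOWN[state]


def estimate_q(start, n_paths=50_000, seed=1):
    """Average the path weights to estimate q(start)."""
    rng = random.Random(seed)
    return sum(one_path(start, rng) for _ in range(n_paths)) / n_paths
```

Solving the toy system by hand gives q(0) = 0.23/0.88 and q(1) = 0.18/0.88, so `estimate_q(0)` and `estimate_q(1)` should land close to those values. In the coalescent setting the state space is far too large to solve directly, which is the motivation for sampling paths instead.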

Bayesian approach to LD mapping
Continuous version of Bayes formula: $f(P \mid D) = \dfrac{L(P)\, f(P)}{\int L(P)\, f(P)\, dP}$
$f(\text{parameters})$ = prior distribution of parameters
$P(\text{data} \mid \text{parameters}) = L(\text{parameters})$ = likelihood function
$f(P \mid D)$ = posterior distribution of parameters given data
The evolutionary parameter (e.g. disease location) is considered to have a prior distribution (any prior knowledge we may have), and we learn about the parameters through data.
Advantage: $f(\text{parameters} \mid \text{data})$ is the full distribution of the parameters of interest given the data, from which e.g. confidence intervals can be obtained.

Marginal posterior distribution of disease position: the basic equation
Parameters in the Shattered Coalescent Model: Morris, Whittaker and Balding (2001, 2003, 2004)
$P(x, h, W, T, z, N, r \mid A, U) \propto L(A, U \mid x, h, W, T, z, N)\; p(W, T, z \mid r)\; p(r)$
$p(r) = 2r$; $p(W, T, z \mid r)$ is the prior distribution of genealogies (coalescent-like).
x: location of disease locus
h: population marker-haplotype proportions
W: branch lengths of genealogical tree
T: topology (branching pattern)
z: parental status
N: effective population size
r: shattering parameter
A, U: cases, controls
Probability of haplotypes associated with the mutant: at recombination, markers are incorporated from the population distribution.

Morris et al.: The Shattered Coalescent (Morris, Whittaker & Balding, 2002)
Advantages: allows for multiple origins of the disease mutant + sporadic occurrences of the disease without the mutation.
[Figure: coalescent tree.]

Monte-Carlo (Metropolis) sampling and integration: Metropolis et al. (1953) (due to Jesper Nymann)
Evaluate the function at the current point p, f(p) = x.
Suggest a new point, p'.
Evaluate the function at this point, f(p') = y.
If x < y, go to point p'.
If x > y, go to point p' with probability y/x.
Projection on one axis is equivalent to integration over the remaining parameters.
[Figure: example of Metropolis sampling.]
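The Metropolis steps listed above can be sketched as follows. The target density, the uniform proposal, the step size and all names are assumptions made for illustration. The target is a hypothetical unnormalised two-parameter density, not the shattered coalescent posterior. Keeping only the first coordinate of each sample shows the projection onto one axis, which plays the role of integrating out the other parameter.

```python
import math
import random


def metropolis(f, start, n_steps, step=1.0, seed=0):
    """Sample from a density proportional to f using a symmetric uniform proposal."""
    rng = random.Random(seed)
    p = list(start)
    x = f(p)                                   # f at the current point; must be > 0
    samples = []
    for _ in range(n_steps):
        p_new = [c + rng.uniform(-step, step) for c in p]   # suggest p'
        y = f(p_new)
        # Move if f is higher at p'; otherwise move with probability y/x.
        if y >= x or rng.random() < y / x:
            p, x = p_new, y
        samples.append(tuple(p))
    return samples


def target(p):
    """Hypothetical unnormalised density: independent normals with sd 1 and sd 2."""
    a, b = p
    return math.exp(-0.5 * a * a - 0.125 * b * b)


samples = metropolis(target, [0.0, 0.0], 40_000, step=1.5, seed=42)
first_axis = [s[0] for s in samples[1000:]]    # projection, after a short burn-in
mean_first = sum(first_axis) / len(first_axis)
```

Because only the ratio y/x is used, the normalising constant of the density never has to be computed. That is what makes the scheme practical for posteriors known only up to proportionality.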

Example 1 – Cystic fibrosis (due to Jesper Nymann)
Morris et al. (2002).

Example 2 – BRCA2 (due to Jesper Nymann)
Iceland Genomics Corporation: 1132 cases, 54 with known mutation; 758 controls.

Example 2 – BRCA2 continued (due to Jesper Nymann)
[Figure: two panels with marker positions 1 to 15 on the horizontal axis; true location marked.]
Multipoint calculation for the full BRCA2 dataset.
Multipoint calculation where the 54 known-mutation cases have been removed.

Books & Web sites

Books
Encyclopedia of the Human Genome (2003). Nature Publishing Group.
Liu, J.S. (2001) "Monte Carlo Strategies in Scientific Computing". Springer Verlag.
Ott, J. (1999) Analysis of Human Genetic Linkage, 3rd edition. Publisher: Johns Hopkins.
Strachan & Read (2004) Human Molecular Genetics III. Publisher: Biosciences.
Weiss, K. (1993) "Genetic Variation and Human Disease". Cambridge University Press.

Web sites
Jeff Reeve and Bruce Rannala: a multipoint linkage disequilibrium disease mapping program (DMLE+) that allows genotype data to be used directly and allows estimation of allele ages.
Liu, J.S., Sabatti, C., Teng, J., Keats, B.J.B. and N. Risch (version upgraded by Xin Lu, June 9, 2002): the software for the Bayesian haplotype analysis method developed by these authors in the article "Bayesian Analysis of Haplotypes for Linkage Disequilibrium Mapping", Genome Research 11:1716, 2001.
J. N. Madsen, M.H. Schierup, C. Storm, L. Schauser and T. Mailund: CoaSim is a tool for simulating the coalescent process with recombination and gene conversion under the assumption of exponential population growth.

