diff --git a/introduction/introduction.tex b/introduction/introduction.tex index 01a4167..eea2a57 100644 --- a/introduction/introduction.tex +++ b/introduction/introduction.tex @@ -3,12 +3,67 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{An overview of meiotic recombination} +% \section{Hypothesis and goals} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Meiosis occurs in all sexually reproducing organisms, and is essential to the generation of gametes. Recombination plays a key role in this process, facilitating the pairing and alignment of chromosomes, while the exchange of genetic material has important implications in inheritance, natural selection, and evolution. +Most research has focused on crossover, the reciprocal exchange of genetic material between homologous chromosomes. +There are a number of factors that influence the placement of crossovers within the genome, and there is tremendous variability among individuals, sexes, and species. +In this thesis, I aim to use a number of methods to gain insight into factors affecting crossover placement, and into the recombination process as a whole. + +Much of existing research focuses on human subjects, and has a specific goal of learning more about recombination in humans. +I have focused the majority of work within this thesis on research in human subjects, with the specific goal of learning more about human recombination. +% In Chapter \ref{ch:cointEsc}, I present +In Chapter \ref{ch:cointEsc}, I present a pedigree analysis of recombination in humans. +Using a large set of human families, I identify crossovers within the genome. +I use this data to investigate specific properties of recombination and how it differs between sexes, individuals, populations. 
+I analyze hotspot usage on an individual level, as well as across males and females, and look for age effects on recombination within this dataset. +I also further describe how crossover interference varies between individuals, sexes, and populations. + + +In Chapter \ref{ch:dogPed}, I focus on crossover patterns using pedigrees of inbred domestic dogs. +Dogs, and the entire canid family, are unique among mammals because they have undergone a series of mutations within PRDM9, rendering this gene inactive in meiosis. +Since this protein has been shown to be essential to meiosis, its absence raises a number of questions as to how recombination has been altered. +Dogs provide an interesting cohort in which to study the effects of the loss of PRDM9 on the recombination landscape as a whole. +Comparing recombination properties in dogs to those of humans will provide additional insight into recombination in other organisms, and within our own species. +% This data provides valuable insight into the effects of PRDM9, which is missing in dogs, on crossover within our own species. +% I have also included a chapter in which I analyse recombination in domestic dogs. + +In Chapter \ref{ch:cointExtras}, I present unpublished data. +This consists of an extension to the data from Chapter \ref{ch:cointEsc}, which extends the analysis of age effects in humans. +Additionally, I include a re-analysis of public data from single cell sperm and oocytes. +Here, I focus on crossover interference properties, which can provide valuable information on how crossover placement varies on an individual level. + + +Finally, in Chapter \ref{ch:geneConv}, I focus on gene conversion, an alternate outcome of recombination. +Gene conversion (or non-crossover) is the non-reciprocal transfer of genetic information that is limited to smaller intervals. 
+I examine current methods for crossover and gene-conversion detection, and propose a new model to detect gene conversion events using admixed population genetic data. + +In general, this thesis can be divided into two main sections, each focusing on one of the two outcomes of recombination: crossover and gene conversion. +In part one (Chapters \ref{ch:cointEsc}, \ref{ch:cointExtras}, and \ref{ch:dogPed}), I focus on crossover, and use a pedigree approach to study recombination in humans and dogs. +Part two (Chapter \ref{ch:geneConv}) focuses on a discussion of a statistical model for the detection of gene conversion events. +% Overall, +These two aspects dovetail to provide a further picture of the recombination landscape as a whole. + + + + +% % Hypothesis / goal of this thesis. +% \begin{titemize} +% %\item To examine how the absence of PRDM9 has affected the recombination landscape in dogs. +% \item To examine how crossover interference varies between individuals and across species. +% \item To further investigate possibilities to model gene conversion in the human genome. +% \end{titemize} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{An overview of meiotic recombination} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + \afterpage{ \begin{figure}[P] @@ -29,7 +84,7 @@ \section{An overview of meiotic recombination} \subsection{The biology of meiotic recombination} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -Prior to meiosis, a diploid cell contains pairs of homologous chromosomes, one of the pair inherited from the father, the other from the mother. +Prior to meiosis, a diploid cell contains pairs of homologous chromosomes, with one of the pair inherited from the father, and the other from the mother. 
This diploid DNA is replicated just prior to the cell entering the meiotic cycle, in premeiotic S-phase\cite{Bell2002} to generate exact copies of each pair of chromosomes, referred to as sister chromatids. Meiosis consists of two stages; recombination occurs in the first, meiosis I, while meiosis II sees sister chromatids separate into their respective daughter cells. Meiosis I is the most complex and lengthy stage, with chromosome pairing, synapsis, and recombination all occurring in succession within prophase I, which is divided into several sub-stages. @@ -40,22 +95,22 @@ \subsection{The biology of meiotic recombination} In the zygotene phase (``paired threads''), pairing between the unraveled DNA begins to occur at regions of homology in a process known as synapsis. Homologous chromosomes connect to to the SC by transverse filaments, drawing them into the SC structure in a progressive zipper-like mechanism, completing synapsis\cite{Yang2009}. -Evidence from cytological studies suggests that DNA DSBs occur prior to, or at this stage\cite{Oliver-Bonet2005,Gruhn2013}. +Evidence from cytological studies suggests that DNA DSBs occur at this stage\cite{Oliver-Bonet2005,Gruhn2013}. By the pachytene stage (``thick threads''), synapsis, and the SC assembly are fully complete, and the pairs of homologous chromosomes bound within the SC are referred to as a tetrad or bivalent. -Recombination occurs here, mediated by the structure of the SC. +Strand exchange and DSB repair occur here, mediated by the structure of the SC. A subset of the DSBs are processed as crossovers, at locations called chiasmata. The remaining DSBs are repaired through a different pathway, as non-crossovers, also known as gene conversion. In the diplotene stage (``two threads''), the SC is disassembled, allowing the tetrad to relax slightly. The homologous chromosomes are still held together at chiasmata locations. 
-In the final substage of prophase I, diakinesis (``moving through''), the chromosomes condense into visible threads, while the cellular machinery begins to prepare for cell division. -The remaining step of meiosis I are metaphase I, and anaphase I. -Chiasmata, holding the chromosomes together as crossover points are cut, allowing the homologous chromosomes to segregate to their respective cellular poles. +In the final substage of prophase I, diakinesis (``moving through''), the chromosomes condense into visible threads. -Meiosis II is procedurally similar to mitosis, with different results. -Here, the separation of sister chromatids occurs, producing four haploid gametes. +The cellular machinery begins to prepare for cell division, which occurs in the remaining steps of meiosis I, metaphase I and anaphase I. +Chiasmata, which hold the chromosomes together at crossover points, are cut, allowing the homologous chromosomes to segregate to their respective cellular poles. +Following this, the cell proceeds through meiosis II, which is procedurally similar to mitosis. +Here, the separation of sister chromatids occurs, and four haploid gametes are produced. @@ -66,13 +121,13 @@ \subsection{The biology of meiotic recombination} RAD51 has been detected in autosomes as early as late leptotene or early zygotene\cite{Oliver-Bonet2005}, suggesting that DSBs form early. Roughly concurrent with DSB formation in late leptotene, the axial elements of the SC assemble\cite{Yang2009}. -The axial elements of the SC run along its length, and each associates with one pair of sister chromatids. +The axial elements form the backbone of the SC, and each associates with one pair of sister chromatids. The axial elements are attached to chromatin, containing the compacted DNA of the sister chromatids in a series of loops that radiate outwards from the core axis of the SC. 
-In the zygotene stage, peak levels of MSH4 foci are found, which mark most DSBs and is though to promote synapsis\cite{Oliver-Bonet2005}. -MLH1 foci, thought to specifically influence DSBs to be repaired as crossovers begin to appear in late zygotene. -By then end of zygotene, the paired sister chromatids in each axial element complete synapsis. -The axial elements progressively join together at homologous regions by transverse elements that ``zip'' the structure together, bound in the central core by central element protein\cite{Yang2009}. +In the zygotene stage, peak levels of MSH4 foci are found, which mark most DSBs and are thought to promote synapsis\cite{Oliver-Bonet2005}. +MLH1 foci, thought to specifically influence DSBs to be repaired as crossovers, begin to appear in late zygotene. +By the end of zygotene, the paired sister chromatids in each axial element complete synapsis. +The axial elements progressively join together at homologous regions, assisted by transverse elements that ``zip'' the structure together, and are bound in the central core by central element proteins\cite{Yang2009}. When complete by the end of zygotene, the SC is composed of two axial elements, a central element, and a number of transverse elements\cite{Yang2009}. The DNA is bound within this complex, with homologous pairing within the SC core, and compacted DNA within chromatin located outside the central core in large loops. @@ -82,7 +137,7 @@ \subsection{The biology of meiotic recombination} % sex chromosomes break later Additional evidence from mouse studies suggests that sex chromosomes have a different timing of these events. There are two isoforms of \textit{Spo11} in humans, and mice, and a recent study in mice suggests that they may have differing functions, with \textit{Spo11$\beta$} being expressed earlier in meiosis, coinciding with most DSBs occurring on the autosomes. 
-Male mice with only \textit{Spo11$\beta$} had meiotic defects, with the majority of spermatocytes failing to recombine in the pseudoautosomal region (PAR). +Male mice with only \textit{Spo11$\beta$} had meiotic defects, with the majority of spermatocytes failing to recombine in the pseudoautosomal region (PAR)\cite{Kauppi2011}. Following this, \textit{Spo11$\alpha$} was found to be expressed later in meiosis, and coincided with DSBs located within the sex chromosomes, including the PAR\cite{Kauppi2011,DeMassy2013}. This evidence indicates that the initiation of DSBs is a complex, multi-stage process, with autosomal DNA processed earlier than DNA from the sex chromosomes. @@ -97,7 +152,7 @@ \subsection{The biology of meiotic recombination} This indicates that recombination rate must remain above a certain level to prevent non-disjunction. This is supported by the finding that many trisomies involve achiasmate chromosomes, in which recombination is absent in that chromosome\cite{Nagaoka2012}. -This supports the idea that chiasmata, which serve to tether homolgous chromosomes together after the dissolution of the SC, and through the first, provide a crucial tension that serves to inhibit non-disjunction. +This supports the idea that chiasmata, which serve to tether homologous chromosomes together after the dissolution of the SC, and through the end of anaphase I, provide a crucial tension that serves to inhibit non-disjunction. Research suggests that there is a requirement of one chiasma per chromosome to prevent non-disjunction, but that there may be a backup mechanism to enable chromosomes to properly segregate even without any chiamata\cite{Fledel-Alon2009}. @@ -109,7 +164,7 @@ \subsection{Timing of meiotic events} %Human: The fundamental steps of meiosis are the same in males and females, but the timing of these events, both prior and during, differs significantly between the sexes\cite{Lynn2004}, and even between species. 
In humans, male meiosis begins at puberty and continues in a cycle that lasts throughout the lifespan. -As male meiosis is continually occurring, the precursor cells undergo a minimum of 30 mitotic divisions prior to entering meiosis, and this number continues to rise with age. +The precursor cells undergo a minimum of 30 mitotic divisions prior to entering meiosis, and this number continues to rise with age, since male meiosis is continually occurring. For example, a 15 year old male is estimated to have 35 germ-cell divisions, with this number rising to 380 at age 30, and 840 by age 50\cite{Crow2000a}. % Male progenitor cells undergo potentially many more mitotic divisions prior meiotic entry. As the number of mitioc proliferations increases in males, so does the number of mutations accumulated through DNA replication errors. @@ -120,10 +175,11 @@ \subsection{Timing of meiotic events} In females, meiosis begins prenatally, and oocytes progress through the diplotene stage of prophase I before undergoing an arrest period\cite{Hassold2001,Crow2000a}. This arrest is called the dictyotene stage, or dictyate arrest, and meiosis is frozen at the point at which the chromosomes have fully synapsed and chiasmata have formed (Figure \ref{fig:introTiming}). This arrest period ends only upon ovulation, and thus meiosis can be potentially very lengthy, taking one to five decades to complete. -Additionally, while each male meiosis produces four haploid sperm products, female meiosis yields one haploid oocyte contain the majority of the cytoplasm, the remaining meiosis I and II division products produce polar bodies, which contain DNA but typically apoptose\cite{Schmerler2011}. +Additionally, while each male meiosis produces four haploid sperm products, female meiosis yields one haploid oocyte containing the majority of the cytoplasm. +The remaining meiosis I and II division products produce polar bodies, which contain DNA but typically apoptose\cite{Schmerler2011}. 
-Dog meiosis differs from that of humans in some key respects. +Meiosis in dogs differs from that of humans in some key respects. Meiosis in female dogs begins later, starting in the neonatal period\cite{Freixa1987}. The meiotic arrest occurs at the same dictyotene stage in both species, but is shorter in dogs, given the later onset of meiosis in dogs as well as a reduced lifespan. In addition, while meiosis exits the arrest period prior to ovulation in humans, dogs ovulate immature, primary oocytes, which only mature to fertility 48-60 hours after ovulation\cite{Tsutsui1989,Chastant-Maillard2011}. @@ -143,14 +199,19 @@ \subsection{Timing of meiotic events} Several biological possibilities have been proposed to explain what happens to oocytes during meiotic arrest. One is the production line hypothesis, first proposed in 1968\cite{Henderson1968}, which proposes that oocytes exit meiosis and are ovulated in the order in which they enter. -An implication of this is an oocyte produced early in the fetal stage, which will thus exit early, must have more robust crossover connections and are less prone to aneuploidy when compared to later oocytes. +An implication of this is that an oocyte produced early in the fetal stage, which will thus exit early, must have more robust crossover connections and are less prone to aneuploidy when compared to later oocytes. Furthermore, the production line hypothesis suggests that chiasmata frequency would decrease with age in females, as the rate of aneuploidy increases. Several tests of the production line hypothesis have been done in mammals. A study in mice found support for the existence of a production line in mice\cite{Polani1991}, supporting the idea that oocytes exit meiosis in the order in which they enter. However several studies in humans contradict the assumption of a decrease in crossover count in older mothers\cite{Kong2004,Martin2015}. 
Most recently, a study in over 8,000 human oocytes found no evidence for a decrease in crossover count with age. -%%% other possibilities? +Another possibility, suggested by a study in the Icelandic population\cite{Kong2004}, is that oocytes that have higher recombination rates are more likely to survive to become successful embryos. +Since a higher rate of recombination is linked to a lower incidence of aneuploidy, it is possible that oocytes with lower crossover counts were more likely to be aneuploid. +These aneuploid oocytes would therefore be discarded, for example by a cell cycle checkpoint mechanism. +Therefore, these could not be observed in a study looking at presumably healthy, or at least viable, offspring, and would give the impression of a recombination rate increase. + +%%% other possibilities? rec events during arrest period. @@ -165,14 +226,14 @@ \section{Historical studies of meiotic recombination} Thomas Hunt Morgan first observed the separation of linked traits while studying Drosophila in 1911\cite{Morgan1911}, and proposed the theory of crossing over between chromosomes. In addition he suggested that the recombination rate could increase with the distance between factors. Morgan's student, Alfred Henry Sturtevant, quantified this change in rate over physical distance into ``map distance,'' using this concept to construct the first genetic map. -This map represented the order of, and crossover rates between, genes on the X chromosome in Drosophila\cite{Sturtevant1913}. +This map represented the order of, and crossover rates between genes on the X chromosome in Drosophila\cite{Sturtevant1913}. In addition, Sturtevant observed that one crossover tended to inhibit the placement of a second nearby, an early description of interference. A later study by Harriet Creighton and Barbara McClintock in corn (\textit{Zea mays}) in 1931 demonstrated that recombination between genes was tied to an exchange of chromosomal segments\cite{Creighton1931}. 
Tracing the inheritance of markers from one generation to the next within a family pedigree provided the first genome-wide measurement of recombination across the human genome, prior to the completion of the Human Genome Project. Early studies used restriction fragment length polymorphism (RFLP) probes to identify specific loci within the genome, and determine if they are linked. An early study described the use of RFLPs to generate a linkage map of recombination in the human genome\cite{Botstein1980}. -Further linkage studies increased the marker density across the genome by using microsatellite, short tandem repeat polymorphisms (STRPs) and other approaches to capturing genetic variation\cite{Morton1991,Matise1994,Dib1996}. +Further linkage studies increased the marker density across the genome by using microsatellite, short tandem repeat polymorphisms (STRPs), and other approaches to capturing genetic variation\cite{Morton1991,Matise1994,Dib1996}. The Marshfield map, generated in 1998 by \citet{Broman1998}, was an important step in characterizing recombination on a genome-wide basis. With the completion of the Human Genome Project and the publication of the draft sequence of the human genome\cite{Venter2001,Lander2001}, human genetic variation has become increasingly well characterized. @@ -193,10 +254,6 @@ \section{Methods for studying recombination} \subsection{Pedigree analysis} Tracking the transmission of alleles from one generation to the next within known pedigrees provided the first data on recombination in early linkage studies, and pedigree analyses are still widely in use today. -It is interesting to note that for a pedigree analysis, -while whole-genome sequencing technology allows the discovery of a higher density of markers across the genome, its use is often not worth the higher cost. 
-A higher variant coverage will help to narrow the region of uncertainty surrounding a particular crossover, but it will most likely not assist in the detection of additional crossovers in a single meiosis. - Regardless of the method used for obtaining markers, the principle of detecting recombination in a pedigree remains. Crossovers can be identified by tracing the allele transmissions from parent to child. Figure \ref{fig:introPedfig} provides a simple visual example, showing a family quartet. @@ -204,6 +261,9 @@ \subsection{Pedigree analysis} The male child has a 1-0 haplotype, and therefore must have inherited a recombinant haplotype from his mother. We can identify here a crossover event and localize that event to an interval flanked by two informative genetic variants. This region of uncertainty can vary in size and depends on the spacing and genotypes of polymorphic variants within the genome. +It is interesting to note that for a pedigree analysis, +while whole-genome sequencing technology allows the discovery of a higher density of markers across the genome, its use is often not worth the higher cost. +A higher variant coverage will help to narrow the region of uncertainty surrounding a particular crossover, but it will most likely not assist in the detection of additional crossovers in a single meiosis. Beyond this intuitive example, the problem of determining the parental phase of a recombinant chromosome has been addressed in a number of methods. % \afterpage{ @@ -229,7 +289,7 @@ \subsection{Linkage disequilibrium approach} The inference of recombination in such a dataset relies on the quantification of levels of linkage disequilibrium (LD) within the samples, a measurement of linkage between loci. For example, when alleles at one locus are inherited completely independently of alleles at another locus they are considered to be in linkage equilibrium. 
-However many alleles exhibit a non-random association, and individuals of the same species tend to share haplotype segments that reflect a shared evolutionary history. +However, many alleles exhibit a non-random association, and individuals of the same species tend to share haplotype segments that reflect a shared evolutionary history. When two alleles on an ancestral haplotype are inherited together they are considered linked, and are not independent of each other. These alleles exhibit evidence of linkage disequilibrium, a deviation from the assumption of random assortment of alleles. @@ -239,9 +299,9 @@ \subsection{Linkage disequilibrium approach} From measurements of LD within a number of unrelated samples, methods based on coalescent theory have been developed to estimate the recombination rate\cite{Auton2012}. In software such as LDhat\cite{Mcvean2004,Auton2007,Auton2014}, the population-scale recombination rate, $\rho$, is estimated from the data, and the per-generation recombination rate can be calculated by the relationship $\rho = 4 N_e r$ -where $N_e$ represents the effective population size. +where $N_e$ represents the effective population size, and $r$ the per-generation recombination rate. These methods are quite powerful and have produced high-quality estimates of recombination in humans\cite{hapmap2007}, however, they are subject to limitations. -First, that this method requires the knowledge of the genealogical history of a sample, which is unknown, and thus relies on an often simplistic approximation. +First, this method requires knowledge of the genealogical history of a sample, which is unknown, and thus relies on a simplifying approximation. Second, these maps by their nature generate sex-averaged data only, since recombination events that are inferred have occurred over the course of potentially thousands of generations. 
@@ -256,12 +316,12 @@ \subsubsection{Sperm cell assays} Sperm typing was first used in 1989 to study crossing over in humans\cite{Cui1989}, and uses an allele-specific polymerase chain reaction (PCR) assay to identify recombination events at a given locus. In this method, DNA is extracted from multiple haploid sperm cells from a single donor and subject to PCR. -A common reverse primer is used in conjunction with two different allele-specific forward primers, which correspond to polymorphic site in the diploid genome, and are designed to produce different amplicon sizes depending on the matching nucleotide. +A common reverse primer is used in conjunction with two different allele-specific forward primers, which correspond to a polymorphic site in the diploid genome, and are designed to produce different amplicon sizes depending on the matching nucleotide. Analysis of the PCR products from many sperm cells can reveal the phase of the donor individual, and the recombinant status of each sperm cell. % \cite{Jeffreys1998,Jeffreys2000,Jeffreys2004}. % first, second(TAP2), review Sperm typing has been used to produce high-quality data from a number of loci throughout the genome. -One of the first major findings to come out of sperm typing was the characterization of a recombination hotspot in the human major histocompatability complex (MHC), first within one gene, \textit{TAP2}\cite{Jeffreys2000}, then expanded to cover a wider 216 kb region of the MHC\cite{Jeffreys2001}. +One of the first major findings to come out of sperm typing was the characterization of recombination hotspots in the human major histocompatability complex (MHC), first within one gene, \textit{TAP2}\cite{Jeffreys2000}, then expanded to cover a wider 216 kb region of the MHC\cite{Jeffreys2001}. 
All six hotspots found within the region of the MHC were found to be tightly correlated to regions in which LD broke down, providing molecular evidence that recombination hotpots have severe effects on LD patterns. @@ -270,7 +330,7 @@ \subsubsection{Sperm cell assays} Further work using whole genome, single-cell sequencing approaches has shed light on recombination on an individual basis for the first time. A study by \citet{Lu2012} sequenced 99 sperm cells from an Asian male, providing valuable information on the level of variation across the entire genome of a single individual. -Here, +% Here, In addition, a further study analyzed genome-wide recombination using more than 100 sperm cells\cite{Wang2012}. These studies are promising, and when expanded to include measurements from multiple individuals, will provide much needed data on individual level variation in crossover, and potentially gene conversion as well. @@ -294,28 +354,25 @@ \subsubsection{Recombination initiation maps} \citet{Pratto2014} utilized chromatin immunoprecipitation coupled with sequencing to identify DSBs associatated with the strand-exchange protein DMC1 to generate DSB maps in four unrelated human males. This method identifies the majority of DSBs within the meiotic cell, only a fraction of which will be resolved as crossovers that could be identified via genotyping methods, the remainder ending up as non-crossover gene conversions. The researchers found the DSB cluster into hotspots, of which 51\% overlapped with the LD crossover hotspots\cite{hapmap2007}, and 80\% of DSB hotspots overlapped regions with elevated recombination rate. -In addition, the DSB locations were largely tied to the specific PRDM9 allele for each particular individual. -PRDM9$_\text{A}$ and PRDM$_\text{B}$ alleles appear to specify similar DSB hotspots, while PRDM9$_\text{C}$ has a separate specificity. -PRDM9 heterozygosity also affects hotspot strength. 
% \section{Current ``gold standard'' maps (Hapmap2 LD map, deCODE pedigree map).} \section{Genetic maps of recombination.} \subsection{Marshfield map} -The Marshfield map, generated by \citet{Broman1998} in 1998, was the first genetic map of the human genome at a resolution high enough to make inferences on the recombination properties in humans, using $>$8,000 short tandem repeat polymorphsims (STRPs) in 188 meioses. +The Marshfield map, generated by \citet{Broman1998} in 1998, was the first genetic map of the human genome with a resolution high enough to make inferences on the recombination properties in humans, using $>$8,000 short tandem repeat polymorphsims (STRPs) in 188 meioses. Here, estimates of the genome wide map lengths, inferences on individual variation, and sex differences in recombination were highlighted. %% move to heterochiasmy? The ratio of female to male autosomal map length was estimated at 1.56, indicating that the recombination rate in females is substantially higher than males. -This ratio has proved stable over a number of studies in the intervening years (summarized in Table \ref{tab:introHeterochiasmy}). +This ratio has proved stable over a number of subsequent studies (summarized in Table \ref{tab:introHeterochiasmy}). Analysis of this this ratio as a function of chromosome position revealed that male recombination tends to be highest in the telomeres, while females had a higher ratio towards the centromeres. -This study provided valuable insight into sex dimporphism in recombination +This study provided valuable insight into broad-scale sex dimorphism in recombination, and raised questions as to the extent of fine-scale differences between males and females. 
\subsection{deCODE maps} -Another major stride in pedigree-based genetic maps came from deCODE genetics, an Icelandic pharmaceutical company that used their database of genealogical and genetic data on many Icelandic families to infer recombination, producing many high quality studies. -The first was in 2002, where 146 families, comprising 1257 meioses, were genotyped using 5136 microsatellite markers\cite{Kong2002}. +Another major stride in pedigree-based genetic maps came from deCODE genetics, an Icelandic pharmaceutical company that used their database of genealogical and genetic data on many Icelandic families to infer recombination, producing several high quality studies. +The first was in 2002, in which 146 families, comprising 1,257 meioses, were genotyped using 5,136 microsatellite markers\cite{Kong2002}. This data, in conjunction with the draft sequence of the human genome\cite{Venter2001,Lander2001} was used to improve the marker order and their placement within the reference sequence. -The genetic map generated from this study confirmed much found by \citet{Broman1998} in terms of sex dimorphism in recombination, and further characterized fine scale variation between the chromosomes. +The genetic map generated from this study confirmed much found by \citet{Broman1998} in terms of sex dimorphism in recombination, and further characterized fine scale variation. One particular finding was that of recombination ``jungles,'' regions of high crossover rate that clustered towards the telomeres. In addition, recombination rate was found to correlate with GC content, CpG motif occurrance, and tracts of poly(A)/poly(T), together explaining $\sim$37\% of variation. @@ -326,7 +383,7 @@ \subsection{deCODE maps} This enables recombination be called in 15,257 parent-offspring pairs. 
%%% One effect of this parent-child phasing approach is that inference of recombination events near the telomeres is difficult, and \citet{Kong2010} omit the most distal 5 Mb for each chromosome. -The omission of the telomeric regions, where male recombination is higher, contributes to the inflation of the female:male map length ratio (Table \ref{tab:introHeterochiasmy}), at 1.78 in this study, although the true value is more likely to be closer to the consensus ratio of around 1.6. +The omission of the telomeric regions, where male recombination is higher, contributes to the inflation of the female:male map length ratio, at 1.78 in this study, although the true value is more likely to be closer to the consensus ratio of around 1.6 (Table \ref{tab:introHeterochiasmy}). Since its release in 2010 the deCODE genetic map has proven quite valuable as a high quality sex-specific map of recombination in the human genome. @@ -678,12 +735,14 @@ \subsection{PRDM9 alleles} Furthermore the PRDM9 allele status of an individual has a strong effect on the hotspot overlap. In the Hutterite study, A/A individuals have significantly different overlap compared to A/I, and A/B, with the A/I heterozygous having lower hotspot usage overall\cite{Baudat2010}. - Sperm typing in men of African ancestry revealed that PRDM9 variants similar to the ``C'' allele (termed C-type), were more common in this population. Furthermore, these C-type alleles specified different hotspots with a motif different from those seen Europeans\cite{Berg2011}. These ``African-enhanced hotspots'' all contained a common motif, CCNCNNTNNNCNTNNC, but were associated with the PRDM9 C-type alleles. - +A study using recombination initiation maps of recombination to locate DSBs found that PRDM9 alleles affected DSB locations, which represent a set of both crossovers and gene conversion\cite{Pratto2014}. +The DSB locations were largely tied to the specific PRDM9 allele for each particular individual. 
+PRDM9$_\text{A}$ and PRDM9$_\text{B}$ alleles appear to specify similar DSB hotspots, while PRDM9$_\text{C}$ has a separate specificity.
+PRDM9 heterozygosity was also found to influence hotspot strength.
 %%%%%%%%%%%%%%%%%%%%
@@ -1049,50 +1108,6 @@ \section{Gene conversion}
 % \cite{Cole2014} length
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Hypothesis and goals}
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-In this thesis, I aim to use a number of genetic methods to gain insight into the recombination process as a whole.
-Much of this research focuses on human subjects, and has a specific goal of learning more about recombination in humans.
-I have, however, included a chapter in which I analyse recombination in domestic dogs.
-Dogs, and the entire canid family, are unique among mammals because they have undergone a series of mutations within PRDM9, rendering this gene inactive in meiosis.
-Since this protein has been shown to be essential to meiosis in mice, its absence raises a number of questions as to how recombination has altered to
-dogs provide an interesting cohort on which to study the effects of the loss of PRDM9 on the recombination landscape as a whole.
-Comparing recombination properties in dogs to those of humans will provide additional insight into recombination in other organisms.
-
-This thesis can be divided into two main sections, each focusing on one of the two outcomes of recombination: crossover and gene conversion.
-In part one (Chapters \ref{ch:cointEsc}, \ref{ch:cointExtras}, and \ref{ch:dogPed}), I focus on crossover, and use a pedigree approach to study recombination in humans and dogs.
-
-In Chapter \ref{ch:cointEsc}, I present a pedigree analysis of recombination in humans.
-Using a large set of human families, I identify crossovers within the genome.
-I use this data to investigate specific properties of recombination and how it differs between sexes, individuals, populations
-I analyze hotspot useage on an individual level, as well as across males and females.
-I look for age effects on recombination within this dataset.
-I am to further describe how crossover interference varies between individuals, sexes, and populations.
-
-In Chapter \ref{ch:cointExtras}, I present a previously unpublished data.
-This consists of an extension to the data from Chapter \ref{ch:cointEsc}, which extends the analysis of age effects in humans.
-Additionally, I include a re-analysis of public data from single cell sperm and oocytes.
-This provides valuable information on how recombination varies on an individual level.
-
-In Chapter \ref{ch:dogPed}, I focus on crossover patterns using pedigree of inbred domestic dogs.
-This data provides valuable insight into the effects of PRDM9, which is missing in dogs, on crossover within our own species.
-
-Finally, in Chapter \ref{ch:geneConv}, I focus on gene conversion.
-I propose a new model to detect gene conversion events using admixed population genetic data.
-
-
-% % Hypothesis / goal of this thesis.
-% \begin{titemize}
-% %\item To examine how the absence of PRDM9 has affected the recombination landscape in dogs.
-% \item To examine how crossover interference varies between individuals and across species.
-% \item To further investigate possibilities to model gene conversion in the human genome.
-% \end{titemize}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
diff --git a/main.tex b/main.tex
index 160012c..333493b 100644
--- a/main.tex
+++ b/main.tex
@@ -14,6 +14,7 @@
 \usepackage{array} % get left justified columns
 \newcolumntype{P}[1]{>{\raggedright\arraybackslash}p{#1}} % get left justified columns
 \usepackage{pdfpages}
+\raggedbottom % solve inconsistent paragraph spacing issue
 %%% bibliography:
 \usepackage[super,comma,sort&compress,sectionbib]{natbib}
@@ -193,7 +194,7 @@
 \Author \\
 \end{centering}
 \vspace{10pt}
-% \DoubleSpacing
+ \DoubleSpacing
 Recombination during meiosis is an essential process to the generation of gametes.
@@ -255,7 +256,7 @@ \chapter{Acknowledgements}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapterstyle{ger}
-% \DoubleSpacing
+ \DoubleSpacing
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapter{Introduction} \label{ch:introduction}
 \include{introduction/introduction}