2. (3 pts.) Someone in class asked whether it wasn't possible for a region of the genome to appear to be IBD (identical by descent) due to a chance sharing of alleles at a number of consecutive loci. Where in their paper (and how) do Ralph and Coop address this possibility? What do they conclude?
Use the resource below to answer the question above.
"The geography of recent genetic ancestry across Europe." Ralph P, Coop G. PMID: 23667324
IBD, or identity by descent, means in the context of heredity and genetics that DNA sequences in two individuals are similar because both individuals inherited them from a common ancestor. The length of an IBD segment provides information about how far back that common ancestor is from the two individuals. IBS, or identity by state, on the other hand, refers only to the similarity in DNA sequence between two individuals, which may or may not be the result of a common ancestor. Two sequences can be identical by state without being IBD, for example when a mutation at a single-nucleotide polymorphism (SNP) site makes two sequences that originally differed by one nucleotide match each other, or when the same alleles happen to be shared by chance at a number of consecutive loci. In such cases the stretch of DNA under consideration appears to be IBD because the sequences match, but it is not. In other words, all IBD segments are IBS, but IBS can occur with or without IBD (similarity in DNA sequence does not always imply the existence of a common ancestor). Such cases of IBS without IBD often lead to false-positive results when looking for IBD segments between individuals or populations.
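To make the IBS idea concrete, here is a minimal sketch that finds the stretches over which two haplotypes carry the same allele. The function name `ibs_runs` and the toy sequences are hypothetical and purely illustrative; this is not the method used by Ralph and Coop. Note that the code can only report identity by state. Whether a matching run is also IBD cannot be read off the sequences alone.

```python
def ibs_runs(hap_a, hap_b, min_len=1):
    """Return (start, end) index pairs, end exclusive, of maximal runs
    of consecutive sites where the two haplotypes carry the same allele."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same sites")
    runs = []
    start = None  # index where the current matching run began, if any
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last site.
    if start is not None and len(hap_a) - start >= min_len:
        runs.append((start, len(hap_a)))
    return runs


# Toy haplotypes that differ only at the fourth site (index 3).
print(ibs_runs("ACGTTGCA", "ACGATGCA"))  # [(0, 3), (4, 8)]
```

Raising `min_len` keeps only the longer matching runs, which mirrors the idea that length is the main evidence for true IBD.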
In the "Materials and Methods" section of the paper by Ralph and Coop (cited in the question), the authors discuss the possibility of such false positives under the sub-heading "Power and False Positive Simulations". Here they state that long IBS haplotypes can also result from multiple short IBD haplotypic segments lying consecutively on the DNA. A haplotype is the set of alleles located together on a single chromosome. IBD, and the distance of the common ancestor from the current individuals, is assessed from the length of the sequence shared between the two individuals. But Ralph and Coop point out that such a shared stretch is not always one continuous IBD segment; it may instead be a number of shorter IBD segments placed side by side on the DNA molecule. Two short shared segments may thus be separated by a non-matching sequence that has undergone recombination and does not fall under the IBD category. In other words, multiple loci linked consecutively on the same chromosome may each be true IBD individually, but together they do not form one long continuous stretch of IBD, because the intervening sequences between them differ. This creates uncertainty about which length should be used to determine ancestry: the length of each shorter stretch of true IBD, or the whole length of all the consecutive IBD segments taken together.
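The concatenation problem described above can be illustrated with a small sketch. The function `merge_segments`, the gap tolerance `max_gap`, and the block coordinates (in hypothetical centimorgan units) are all assumptions made for illustration and are not taken from the paper. The point is only that a detector which tolerates small gaps can report several short blocks as one long block.

```python
def merge_segments(segments, max_gap):
    """Merge (start, end) segments whose separation is at most max_gap.

    Coordinates are assumed to be in one genetic-map unit, for example cM.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


# Three hypothetical short IBD blocks separated by small non-matching gaps.
short_blocks = [(10.0, 11.5), (11.6, 13.0), (13.1, 14.2)]

print(merge_segments(short_blocks, max_gap=0.0))  # three blocks, each under 2 cM
print(merge_segments(short_blocks, max_gap=0.2))  # one apparent block of about 4.2 cM
```

With no gap tolerance the three short blocks stay separate. With a small tolerance they are reported as a single block that looks much longer, and would therefore suggest a more recent common ancestor than any of the individual blocks does.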
In this context, Ralph and Coop conclude that such concatenated stretches of consecutive shorter IBD segments do not represent single haplotypes free of recombinant stretches of DNA, and therefore do not qualify as true IBD segments. They state that this problem with false positives decreases as the genetic length of the shared haplotype increases. That is, the longer the stretch of similarity (without intervening recombinant DNA), the higher the chance of eliminating false positives and obtaining true IBD data.
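A rough calculation shows why chance sharing becomes less of a concern as the shared stretch gets longer. The sketch below assumes biallelic loci that are statistically independent of one another, with hypothetical allele frequencies. Real neighbouring loci are correlated through linkage disequilibrium, so these numbers only illustrate the trend and are not an estimate of the false-positive rate reported in the paper. The function name `chance_match_probability` is likewise made up for this example.

```python
import math


def chance_match_probability(allele_freqs):
    """Probability that two randomly drawn haplotypes carry the same allele
    at every locus, assuming independent biallelic loci.

    allele_freqs holds the frequency p of one allele at each locus; the
    per-locus match probability is p**2 + (1 - p)**2.
    """
    return math.prod(p * p + (1 - p) * (1 - p) for p in allele_freqs)


# Hypothetical case: every locus has allele frequency 0.5.
for n_loci in (5, 20, 100):
    print(n_loci, chance_match_probability([0.5] * n_loci))
```

Under these assumptions the probability of a match at every locus shrinks geometrically with the number of loci, which is consistent with the conclusion that false positives become rarer for longer shared haplotypes.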