Consider a region of single-stranded DNA 7 nucleotides long, with the 5' end on the left. Assume any site in this region may be occupied with equal probability by any of the 4 bases A, T, G, or C.
How many base sequences are possible?
What is the probability the sequence contains only the base G?
What is the probability the sequence contains only two types of bases?
What is the probability the sequence is CGTGAGA?
What is the probability the sequence does not contain any A or G bases?
What is the probability the sequence consists of alternating G and C bases?
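The problem reduces to the following:
We have 7 places to fill, each with one of 4 nucleotides, and repetition is allowed. Every sequence is equally likely, so each probability below is (no. of favorable seq.)/(total no. of seq.).
1. No. of possible base sequences: each of the 7 places can be filled in 4 ways (by any of A, G, T, or C)
= 4*4*4*4*4*4*4 = 4^7 = 16384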
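As a sanity check on this count, the sequences can be enumerated directly. This is a minimal Python sketch; the names BASES, LENGTH, and all_sequences are illustrative and not part of the original problem.

```python
from itertools import product

BASES = "ATGC"   # the four equally likely bases
LENGTH = 7       # number of sites in the region

# Every sequence is one choice of base per site, with repetition allowed.
all_sequences = ["".join(p) for p in product(BASES, repeat=LENGTH)]

total = len(all_sequences)
print(total)  # 16384, i.e. 4**7
```

Enumeration is cheap here because the space is small; for longer regions the closed form 4^n is the practical route.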
2. P(seq. has only G) = (no. of seq. with only G)/(total no. of seq.)
=> only one sequence consists entirely of G => GGGGGGG
=> P(seq. has only G) = 1/16384 = 6.1 * 10^(-5)
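3. No. of seq. with exactly two types of bases = (no. of ways to select 2 bases out of 4)*(no. of ways to fill 7 places with those 2 bases, using each at least once)
= (4C2)*(2^7 - 2)
= [4!/(2!2!)] * 126
= (24/4)*126
= 756
The 2 arrangements subtracted are the ones that use only one of the two chosen bases (e.g. AAAAAAA and TTTTTTT for the pair A, T); they contain one type of base, not two.
P(seq. has exactly two types of bases) = 756/16384 = 0.0461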
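The number of sequences with exactly two types of bases can be checked by brute force. The sketch below (variable names are illustrative) counts the sequences whose set of distinct bases has size 2 and compares the result with (4C2)*(2^7 - 2). Single-base sequences such as GGGGGGG contain only one type of base, so they are not counted.

```python
from itertools import product
from math import comb

BASES = "ATGC"
LENGTH = 7

# Count sequences in which exactly two distinct bases appear.
two_base_count = sum(
    1 for seq in product(BASES, repeat=LENGTH) if len(set(seq)) == 2
)

# Choose the pair of bases, then drop the 2 fillings that use only one of them.
formula = comb(4, 2) * (2**LENGTH - 2)
probability = two_base_count / 4**LENGTH

print(two_base_count, formula)  # 756 756
print(round(probability, 4))    # 0.0461
```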
4. P(Seq. is CGTGAGA) = 1/16384 = 6.1 * 10^(-5)
5. P(seq. does not contain A or G) = no. of seq without A or G/16384
= (no. of ways to arrange T and C in 7 places with repetition)/16384
= (2^7)/16384
= 128/16384
= 0.0078
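Because the sites are independent, the same probability can also be obtained as a product over sites: each site is T or C with probability 2/4. A small sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction

LENGTH = 7
p_site = Fraction(2, 4)          # a single site is T or C
p_no_a_or_g = p_site ** LENGTH   # all 7 independent sites are T or C

print(p_no_a_or_g)                   # 1/128
print(round(float(p_no_a_or_g), 4))  # 0.0078
```

This agrees with the counting answer, since 128/16384 = 1/128.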
6. P(seq. alternates G and C) = (no. of such seq.)/16384
=> no. of seq. with alternating G & C = 2 (CGCGCGC or GCGCGCG), since the first base fixes all the others
=> P(seq. alternates G and C) = 2/16384 = 0.00012
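The alternating case can be confirmed the same way. This sketch assumes "alternating" means the whole 7-base sequence uses only G and C and no two neighbouring bases are equal; the helper is_alternating_gc is a hypothetical name introduced for illustration.

```python
from itertools import product

LENGTH = 7

def is_alternating_gc(seq):
    """True if seq uses only G and C and no two neighbouring bases are equal."""
    return set(seq) <= {"G", "C"} and all(a != b for a, b in zip(seq, seq[1:]))

# Filter all 4**7 sequences down to the alternating G/C ones.
alternating = sorted(
    "".join(p)
    for p in product("ATGC", repeat=LENGTH)
    if is_alternating_gc("".join(p))
)

print(alternating)                                # ['CGCGCGC', 'GCGCGCG']
print(round(len(alternating) / 4**LENGTH, 5))     # 0.00012
```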