1. What are the k-mers of length k = 21 for this sequence read in FASTQ format?
@K000384:75:HM57CBBXX:1:1101:25530:1384 1:N:0:GTGGCC
CTGGCACTGGGCTTCAAGCTGGGCTACCTTCTGTTT
+
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
[please be detailed, thanks]
Answer-
FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. A sequence in FASTA format begins with a single-line description (starting with ">"), followed by lines of sequence data.
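As a small illustration of that layout, the sketch below writes the read from the question as a FASTA record: one description line, then the sequence wrapped into fixed-width lines. The function name `to_fasta` and the 70-character line width are assumptions chosen for this example, not part of any particular tool.

```python
def to_fasta(description, sequence, width=70):
    """Format one sequence as a FASTA record (hypothetical helper)."""
    # Split the sequence into lines of at most `width` characters.
    body = "\n".join(
        sequence[i:i + width] for i in range(0, len(sequence), width)
    )
    # The description line starts with ">" and is followed by the sequence lines.
    return f">{description}\n{body}\n"


# Header (without the leading "@") and sequence taken from the FASTQ read above.
description = "K000384:75:HM57CBBXX:1:1101:25530:1384 1:N:0:GTGGCC"
read = "CTGGCACTGGGCTTCAAGCTGGGCTACCTTCTGTTT"

fasta_record = to_fasta(description, read)
print(fasta_record, end="")
```

The quality line of the FASTQ record is dropped here, since FASTA has no place to store it.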
The FASTA sequence needs to be looked up on the NCBI website:
https://www.ncbi.nlm.nih.gov
Select Nucleotide as the search database.
Then paste the sequence into the search to get the FASTA sequence.
ATTGACAGAACCTTGACCCTGGCCACAGGCTGGGCTCTGCCCTGACTGTGGGTGAGTGGCTTGCTTGGCT
CACCTGGCTGGTGGCTGGGCTGGGTCTCAGTTCAGGTGTGTGTGATTAGGAGGCAGCCATGCAGGACCTT
TTGTTAGTGGCAGAAGAAACTTCAGGGGACAAGGTAGTTGGAAGTGACTGATTTCCCAGCGGGGAGGACT
TCAAGAATGTTCCTTTTCCCAGCTTTGAGTTTCTCTGGGTTGTACTGGGAGAGGCTCTTAGAGAGGTGGG
TGTAGGGAAGGGAGCCCTCTGCCCTCTCTACATCCGTCTCCCGGGGTGGGTCCCAGAGCCCTGGCGGTGG
GGCCCCATGACTCCTCTTGAGCAAAGACTGCAAGTGTCAACTTTTCTAAAGGTGCAGGTTCCCCGTTCAC
TCCAGTTCATCACTGGTTCCTCAAGGGACCACCTGGATTTTATCCTGCTTCAAGTCATCCTAATCCATTC
TCTAATTTGGATCGGGATGTGAGGCTGCTGAGCAAGAGTGGTAAGCCACTTTCTGACCTCATTTCCTCAC
GGGGATAGGTAAGCCTGCCTGTAGACTTGGATGAGAGCAAGTACCTGCACAGGGCTGGGCACAGACAAGG
GACAGTTGCTGTGGGGCTTGACATGAAGTAGTGTGTCACGTATTTTCACTGAACCAGACTCATTACACTC
TGGAGGCTGGCTCAGCATGGAACCTTCTAGAAAGGCAAGGCTCCCACCATACCCAAAGCTATGAGTGGCC
CCATGAGCTGTTTCTGATCTCCAGGCTGGATTGAGACCCAGCCACAACACCAGCTGGACCCTGTCTCCTC
AGTGATTTTCCTAAAGTGTGCTTTCCTAGGCTGGGTGTATGGGGGACAGGGGTCGCAGGGCTGGAGCTTG
GGGGGTGGCAGGTGGCAGGGATGGCTGGAGGGCATGCTGAATGCTCTGAATCCTGACTGCCTTCTCCCTT
TCCGTTTCTGGCCAATCCTGTACTGGCATCTTTCTCCACCGATGGTCCAATTACAGTGCTTGATTACCTG
GACGAAACAATGGAAGGTAGGCCCCCAGACCA
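For the 21-mers themselves, one approach is a sliding window over the read given in the question: a read of length L contains L - k + 1 overlapping k-mers, so the 36-base read gives 36 - 21 + 1 = 16 of them. The sketch below is a minimal Python illustration. The function name `kmers` is hypothetical, and it lists forward-strand k-mers only (no reverse complements or canonical k-mers), which is an assumption about what the question is asking for.

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k, left to right."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Sequence line of the FASTQ read from the question.
read = "CTGGCACTGGGCTTCAAGCTGGGCTACCTTCTGTTT"
K = 21

read_kmers = kmers(read, K)

# Print each k-mer with its 1-based start and end position in the read.
for start, kmer in enumerate(read_kmers, start=1):
    print(f"{start:2d}-{start + K - 1:2d}  {kmer}")
```

The first k-mer printed is CTGGCACTGGGCTTCAAGCTG (positions 1-21) and the last is AAGCTGGGCTACCTTCTGTTT (positions 16-36). Only the sequence line of the FASTQ record is used; the header, the "+" separator, and the quality line do not contribute to the k-mers.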