Question

In: Biology

You’re interested in studying the catalytic activity of a rare protein expressed in an exotic plant but present in very low amounts in plant tissues. Your supervisor asks you to search an NCBI genomic database for a potential homologous DNA sequence and you think you found a good match. Propose a strategy to obtain large quantities of this plant protein for your studies. Your complete experimental strategy must be outlined to receive full marks for this question.

Solutions

Expert Solution

DNA may be preserved for thousands of years in very cold or dry environments, and plant tissue fragments and pollen trapped in soils and shallow aquatic sediments are well suited for the molecular characterization of past floras. However, one obstacle in this area of study is the limiting bias in the bioinformatic classification of short fragments of degraded DNA from the large, complex genomes of plants.

Methods

To establish one possible baseline protocol for the rapid classification of short-read shotgun metagenomic data for reconstructing plant communities, the read classification programs Kraken, Centrifuge, and MegaBLAST were tested on simulated and ancient data, with classification against a targeted reference database.

The Arabidopsis thaliana genome and the draft sequences for two rice genomes have provided a reference platform for plant genomics. The in-depth analysis of the known and predicted coding sequences from these plants has provided an invaluable resource on the gene content of model plants and has allowed basic functional resolution of the plant genome.

The currently available plant genomes cover the basic gene repertoire needed for dicotyledonous and monocotyledonous plants. There is, however, a gene information deficit for the other plants that form model systems for, e.g., root development in sugar beet, nitrogen-fixing root nodule formation in Lotus japonicus, or perhaps fruit development in avocado. There is a similar information deficit for researchers working on rapidly evolving and highly specific genes in other plant species. This deficit is unlikely to be resolved by high-throughput genome sequencing in the near future.

IMPLEMENTATION AND DATABASE STRUCTURE

Sputnik has been implemented as an EST, cluster and peptide management, annotation and data display pipeline. The application has been programmed as a collection of Python scripts that interact with a PostgreSQL relational database system.

These annotations contain such valuable information as the tissue used in the original cDNA library production, plant cultivar/variety information, and developmental stage information. Additionally, clone library information and any keywords or additional library descriptions are archived for subsequent searches. This infrastructure is additionally used on proprietary EST collections, where available annotation on plant, cDNA library and tissue challenges is applied on an ad hoc basis.
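As a rough illustration of how library annotations like these could be stored and searched, here is a minimal sketch. It is not Sputnik's actual schema: the table name `cdna_library`, its columns, the helper names, and the demo rows are all hypothetical, and the standard-library `sqlite3` module is used as a stand-in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not taken from Sputnik. SQLite stands in for PostgreSQL here.
SCHEMA = """
CREATE TABLE cdna_library (
    library_id          INTEGER PRIMARY KEY,
    species             TEXT NOT NULL,
    tissue              TEXT,
    cultivar            TEXT,
    developmental_stage TEXT,
    keywords            TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up library records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [  # made-up demo rows, not real library records
        ("Lotus japonicus", "root nodule", "demo-cv-1", "mature", "nodule formation"),
        ("sugar beet", "root", None, "seedling", "root development"),
        ("avocado", "fruit", "demo-cv-2", "ripening", "fruit development"),
    ]
    conn.executemany(
        "INSERT INTO cdna_library "
        "(species, tissue, cultivar, developmental_stage, keywords) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def find_libraries(conn, term):
    """Return (species, tissue) pairs whose tissue or keywords contain term."""
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT species, tissue FROM cdna_library "
        "WHERE tissue LIKE ? OR keywords LIKE ? "
        "ORDER BY species",
        (pattern, pattern),
    )
    return cur.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for species, tissue in find_libraries(demo, "root"):
        print(species, "-", tissue)
```

Keeping tissue, cultivar, and developmental stage as separate columns is one simple way to make the "subsequent searches" mentioned above possible with plain SQL filters.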

DATABASE CONTENTS

Currently, EST collections from all plant species with in excess of 10 000 public sequences have been integrated into Sputnik. Table 1 shows the plant EST collections available within Sputnik and basic statistics on the EST collections. The collection of species includes additional model species in terms of tuber development, fruit development, nodule association and other agronomically important plant species. In excess of 2 million ESTs have been analysed and resolved into ∼550 000 sequence clusters and singletons. The clustered sequences form the core basis for the annotation.
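To make the "clusters and singletons" bookkeeping concrete, here is a small sketch. It assumes the clustering itself has already been done by some upstream tool and that each EST simply carries a cluster identifier; the function name `summarize_clusters` and the demo identifiers are hypothetical.

```python
from collections import Counter


def summarize_clusters(assignments):
    """Summarize an EST -> cluster-id mapping.

    A cluster with exactly one member is counted as a singleton;
    anything larger is counted as a multi-member cluster.
    """
    sizes = Counter(assignments.values())
    singletons = sum(1 for n in sizes.values() if n == 1)
    clusters = sum(1 for n in sizes.values() if n > 1)
    return {
        "ests": len(assignments),
        "clusters": clusters,
        "singletons": singletons,
        "unique_sequences": clusters + singletons,
    }


if __name__ == "__main__":
    # Made-up assignments for illustration only.
    demo = {
        "est1": "c1", "est2": "c1",
        "est3": "c2",
        "est4": "c3", "est5": "c3", "est6": "c3",
    }
    print(summarize_clusters(demo))
```

In this toy input, six ESTs collapse into two multi-member clusters plus one singleton, which is the same kind of reduction as the 2 million ESTs to ∼550 000 unique sequences described above.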

Krishna Tulsi, a member of the Lamiaceae family, is a herb well known for its spiritual, religious and medicinal importance in India. The common name of this plant is ‘Tulsi’ (or ‘Tulasi’ or ‘Thulasi’), and it is considered sacred by Hindus. We present the draft genome of Ocimum tenuiflorum L. (subtype Krishna Tulsi) in this report. Paired-end and mate-pair sequence libraries were generated for the whole genome and sequenced with the Illumina HiSeq 1000, resulting in an assembled genome of 374 Mb, with a genome coverage of 61% (612 Mb estimated genome size). We have also studied the transcriptomes (RNA-Seq) of two subtypes of O. tenuiflorum, Krishna and Rama Tulsi, and report the relative expression of genes in both varieties.
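The 61% figure follows directly from the two sizes quoted above. A minimal sketch of the arithmetic, assuming "genome coverage" here means the assembled length as a fraction of the estimated genome size (not sequencing depth); the function name is hypothetical:

```python
def assembly_coverage_percent(assembled_mb, estimated_mb):
    """Assembled length as a percentage of the estimated genome size."""
    if estimated_mb <= 0:
        raise ValueError("estimated genome size must be positive")
    return 100.0 * assembled_mb / estimated_mb


if __name__ == "__main__":
    # Figures quoted in the text: 374 Mb assembled, 612 Mb estimated.
    print(round(assembly_coverage_percent(374, 612), 1))
```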

Thank You.

