The DFCI Gene Indices

Frequently Asked Questions About the DFCI Gene Indices


This page answers commonly asked questions about the DFCI Gene Indices.
Before you begin, please note:

  • All of the Gene Indices are built with the same structure so the answers to the questions are valid for ANY of the gene indices.
  • All answers that require navigating the TGI web pages begin from the main index page. From there, each species can be reached by clicking its picture icon or its organism name.

Category: Availability

  1. How do I deep link to a THC/TC, EST, EGAD, or Library Report?
  2. How can I download data for a specific gene index database?
  3. How do I access the files I downloaded and what are they?
  4. How do I obtain data for the gene index related tools like EGO or Resourcerer?
  5. Is it possible to obtain protein sequences for a specific gene index database?
  6. How can I obtain TGI software?
  7. How do I order clones for the EST sequences I've found searching DFCI's Gene Indices?
  8. Can I order clones for THCs or TCs?
  9. Where can I find publications on the DFCI Gene Indices and its tools?

1. How do I deep link to a THC/TC, EST, EGAD, or Library Report?

  1. Get the species information (e.g. g_gallus).
  2. Get the identifier (e.g. TC4081).
  3. Construct the URL as follows:

For THC/TC,

    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/tc_report.pl?gudb=GUDB_name&tc=TC_NUMBER">TC_NUMBER</A>
    Example:
    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/tc_report.pl?gudb=g_gallus&tc=TC304078">TC304078</A>

For EST,

    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/est_report.pl?EST=EST_NUMBER&species=SPECIES">EST_NUMBER</A>
    Example:
    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/est_report.pl?EST=EST246553&species=Tomato">EST246553</A>

For HT, ET, or NP,

    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/egad_report.pl?id=ET_NUMBER">ET_NUMBER</A>
    Example:
    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/egad_report.pl?id=NP251625">NP251625</A>

For a Library Report (cat#),

    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/lib_report.pl?CATNUM=CAT_NUMBER&species=SPECIES">CAT_NUMBER</A>
    Example:
    <A HREF="http://compbio.dfci.harvard.edu/cgi-bin/tgi/lib_report.pl?CATNUM=1380&species=S.pombe">1380</A>
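
The URL patterns above can also be generated programmatically. The following is a minimal Python sketch, not an official TGI tool: the function name `report_url` and the short report keys ("tc", "est", "egad", "lib") are made up for illustration, while the script names and query parameters are copied from the examples above.

```python
from urllib.parse import quote

BASE = "http://compbio.dfci.harvard.edu/cgi-bin/tgi"

# Query patterns copied from the examples above.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "tc": "tc_report.pl?gudb={species}&tc={identifier}",
    "est": "est_report.pl?EST={identifier}&species={species}",
    "egad": "egad_report.pl?id={identifier}",
    "lib": "lib_report.pl?CATNUM={identifier}&species={species}",
}


def report_url(kind, identifier, species=None):
    """Build a deep link for one report type.

    kind       -- one of the keys in PATTERNS
    identifier -- TC, EST, HT/ET/NP identifier, or library cat#
    species    -- species or database name; not used for "egad"
    """
    pattern = PATTERNS[kind]
    if "{species}" in pattern and not species:
        raise ValueError(f"report type {kind!r} needs a species")
    return BASE + "/" + pattern.format(
        identifier=quote(str(identifier), safe=""),
        species=quote(species or "", safe=""),
    )


print(report_url("tc", "TC304078", "g_gallus"))
print(report_url("lib", 1380, "S.pombe"))
```

The values are percent-encoded before being placed in the query string, so a species name containing a space or other reserved character would still yield a valid URL.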

2. How can I download data for a specific gene index database?

The DFCI gene index databases are freely available for download from the public FTP site. Please visit ftp://occams.dfci.harvard.edu/pub/bio/tgi/data to obtain data.
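
For scripted downloads, the same FTP location can be reached with Python's standard `ftplib` module. This is a sketch under assumptions: it assumes the server accepts anonymous logins (the usual meaning of a public FTP site) and that it is still reachable. The function names are illustrative, and any file name passed to `download` must come from the actual directory listing.

```python
from ftplib import FTP
from urllib.parse import urlparse

DATA_URL = "ftp://occams.dfci.harvard.edu/pub/bio/tgi/data"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_remote(url=DATA_URL, timeout=30):
    """Return the names found in the remote directory."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login (assumption)
        ftp.cwd(path)
        return ftp.nlst()


def download(filename, dest, url=DATA_URL, timeout=30):
    """Fetch one file from the remote directory into a local file."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp, open(dest, "wb") as out:
        ftp.login()  # anonymous login (assumption)
        ftp.cwd(path)
        ftp.retrbinary(f"RETR {filename}", out.write)
```

Calling `list_remote()` first and then passing one of the returned names to `download` avoids guessing at file names.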


3. How do I access the files I downloaded and what are they?

To retrieve the data files, extract the archive with pkunzip (Windows) or unzip (Linux).
Each gene index database comes with three files:

  • GI.mmddyy: multi-FASTA file with TC sequences (annotation in the defline) and singleton sequences.
  • GI.TC_EST.mmddyy: lists current TC identifiers followed by their component GenBank accession numbers.
  • GI.TCs.mmddyy: TC history in the defline.

GO assignments for the THC/TCs are also available:

  • GI.GO.mmddyy: multi-FASTA file indexed by TC; information includes GO ID, GO term, E.C. number, and GO category.

Oligomer data is available for some gene indices. OLIGO_README explains the file format.
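
As an illustration of working with the downloaded files, the sketch below reads a multi-FASTA member straight out of a zip archive using only the Python standard library. It assumes the archive is an ordinary zip file and that the member follows the usual FASTA convention (a defline starting with ">" followed by sequence lines). The function names are hypothetical, and the archive and member names must be taken from the actual download.

```python
import io
import zipfile


def read_fasta(lines):
    """Yield (defline, sequence) pairs from an iterable of text lines."""
    defline, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if defline is not None:
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []
        elif defline is not None:
            chunks.append(line)
    if defline is not None:
        yield defline, "".join(chunks)


def fasta_from_zip(archive, member):
    """Yield FASTA records from one member of a zip archive.

    archive -- path to the zip file, or a file-like object
    member  -- name of the file inside the archive
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="ascii", errors="replace")
            yield from read_fasta(text)
```

Because both functions are generators, a large gene index file is processed one record at a time rather than loaded into memory in full. The defline is returned unparsed, since the exact annotation layout is not described here.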


4. How do I obtain data for the gene index related tools like EGO or Resourcerer?

The DFCI gene index related tools are freely available for download from the public FTP site. Please visit ftp://occams.dfci.harvard.edu/pub/bio/tgi and look under the project-specific subdirectories.


5. Is it possible to obtain protein sequences for a specific gene index database?

Gene index protein sequences are not available at this time.
Where available, Open Reading Frame (ORF) predictions from ESTScan and framefinder (and also DIANA for HGI) are shown in the individual TC Reports. The predicted ORFs are hyperlinked to ORF Reports, which show the amino acid sequences and allow protein database searches.


6. How can I obtain TGI software?

Please go to the software main page for the availability of various TGI tools.


7. How do I order clones for the EST sequences I've found searching DFCI's Gene Indices?

DFCI does not distribute clones; they are available from outside sources. Clones for human, rat, and mouse sequences are available through the ATCC and the DFCI/ATCC Special Collection. Arabidopsis clones are accessible through The Arabidopsis Information Resource (TAIR). Tomato clones are available through the Clemson University Genomics Institute. Potato clones can be obtained through the University of Arizona Genomics Institute.

The best way to check whether a clone is available, or to order one, is to consult the GenBank record of the EST sequence.
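
One possible way to script that check is sketched below. It is not part of the Gene Indices themselves: it assumes the EST record is retrieved as GenBank flat-file text through the NCBI E-utilities `efetch` service, and that the record carries the standard `/clone` and `/clone_lib` source qualifiers. Records may lack these qualifiers or give ordering details elsewhere (for example in the COMMENT block), so an empty result is not proof that no clone exists. The function names are illustrative.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Matches /clone="..." and /clone_lib="..." qualifiers; values may wrap
# across lines in a GenBank flat file.
QUALIFIER = re.compile(r'/(clone|clone_lib)="([^"]*)"')


def genbank_url(accession):
    """Build an efetch URL that returns a GenBank flat-file record."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_genbank(accession, timeout=30):
    """Download the GenBank record text for one accession."""
    with urlopen(genbank_url(accession), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def clone_info(record_text):
    """Return the first /clone and /clone_lib values found in a record."""
    info = {}
    for key, value in QUALIFIER.findall(record_text):
        # Collapse line wraps and repeated spaces inside the value.
        info.setdefault(key, " ".join(value.split()))
    return info
```

A typical call would be `clone_info(fetch_genbank(accession))`, where `accession` is one of the GenBank accession numbers listed for an EST.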


8. Can I order clones for THCs or TCs?

No. THCs ("Tentative Human Consensus" sequences) and TCs ("Tentative Consensus" sequences) are virtual transcripts created by assembling ESTs, so no physical clone exists for them. You can, however, obtain clones for the individual ESTs underlying a THC or TC. Please note that DFCI does not distribute clones.


9. Where can I find publications on the DFCI Gene Indices and its tools?

Please go to the TGI publications page for references to articles on the DFCI Gene Indices, its related tools, and studies done using TGI tools.



Acknowledgements
   The Gene Index Project is supported in part by funding from the US National Science Foundation through grant #DBI-0552416.