Save time and resources with the local CGSB repository of commonly used genomic data sets. Data is obtained from Ensembl and NCBI. New versions/releases will be made available periodically or upon request. Previous versions/releases will be preserved.

Types of data available:

  • Whole genome FASTA sequences
  • Transcriptome data
  • FASTA indexes
  • GTF files
  • GFF files
  • Bowtie indexes
  • BWA indexes
  • Picard reference dictionary

Locations:

mercer:/scratch/work/cgsb/reference_genomes
butinah:/scratch/Reference_Genomes
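
To see what is currently available, you can browse the repository directory on the cluster. The following is a minimal sketch, assuming Python 3 is available on the login node. The internal directory layout is not described on this page, so the function simply lists whatever subdirectories exist down to a chosen depth. The names `list_entries` and `MERCER_ROOT` are illustrative, not part of any CGSB tooling.

```python
import os

# Repository root on mercer, as listed above.
# On butinah, pass "/scratch/Reference_Genomes" instead.
MERCER_ROOT = "/scratch/work/cgsb/reference_genomes"


def list_entries(root=MERCER_ROOT, max_depth=2):
    """Return sorted relative paths of directories under root.

    Directories are reported down to max_depth levels below root.
    No particular layout (organism, release, etc.) is assumed.
    """
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # stop descending below max_depth
        if depth > 0:
            found.append(rel)
    return sorted(found)


if __name__ == "__main__":
    if os.path.isdir(MERCER_ROOT):
        for entry in list_entries():
            print(entry)
    else:
        print(f"{MERCER_ROOT} not found; run this on the cluster")
```

Calling `list_entries("/scratch/Reference_Genomes")` does the same for butinah. Raising `max_depth` shows more of the tree at the cost of a slower scan on a shared filesystem.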

Publicly Available Datasets:

ERCC:
ERCC92

Fungi:
Saccharomyces cerevisiae
Candida orthopsilosis
Lachancea thermotolerans
Meyerozyma guilliermondii

Invertebrate:
Caenorhabditis elegans
Drosophila melanogaster

Metagenomic:
Marine metagenome

Plant:
Arabidopsis thaliana
Chlamydomonas reinhardtii
Chlorella
Coccomyxa subellipsoidae
Medicago truncatula
Oryza sativa
Phoenix dactylifera

Vertebrate mammalian:
Homo sapiens
Mus musculus
Sus scrofa

Vertebrate other:
Danio rerio

Don’t see your organism/version? Send requests to mkhalfan @ nyu.edu (Mohammed Khalfan – NY) or nd48 @ nyu.edu (Nizar Drou – AD)