I recently helped the Rockman lab basecall their MinION sequencing data on the HPC, leveraging the power of the GPUs available there. This allowed us to bring the total time required for basecalling down to around five hours, from the two weeks(!) it was going to take on the desktop. Since more people are beginning… [read more]
OpenStack, a project originally started by NASA and Rackspace, is an open source cloud computing platform that enables users to access and control pools of compute, storage, and networking resources. TechCrunch calls it “one of the most important and complex open-source projects you’ve never heard of”. Like competitor Amazon Web Services (AWS) and other cloud… [read more]
reform is a Python-based command-line tool that allows for fast, easy and robust editing of reference genome sequence and annotation files. With the increase in use of genome editing tools such as CRISPR/Cas9, and the use of reference genome based analyses, the ability to edit existing reference genome sequences and annotations to include novel… [read more]
The NYU Center for Genomics and Systems Biology in New York and Abu Dhabi have developed a new website with resources for mastering NGS analysis: https://learn.gencore.bio.nyu.edu/ Modules are designed to provide hands-on experience with analyzing next-generation sequencing data. Standard pipelines are presented that provide the user with a step-by-step guide to using state… [read more]
In this post we will build a pipeline for the HPC using Python 3. We will begin by building the foundation for a pipeline in Python in part 1, and then use that to build a simple NGS analysis pipeline in part 2. At NYU, we submit jobs to the HPC using the Slurm Workload… [read more]
Save time and resources with the local CGSB repository of commonly used genomic data sets. Data is obtained from Ensembl and NCBI. New versions/releases will be added periodically or upon request. Previous versions/releases will be preserved. All files are readable from within the shared genome resource. There is no need to copy the file(s) to… [read more]
Salmon and kallisto might sound like a tasty entree from a hip Tribeca restaurant, but the duo are in fact a pair of next-generation applications for rapid transcript quantification. They represent a new approach to transcript quantification using NGS that has a number of advantages over existing alignment-based methods. I’ve tried them both out and… [read more]
In this post we’ll discuss maximizing your performance on the HPC. This entry is aimed towards experienced HPC users; for new users, please see Getting Started on the HPC. Recent advances in sequencing technology have made High Performance Computing (HPC) more critical than ever in data-driven biological research. NYU’s HPC resources are available to all… [read more]
Identifying genomic variants, such as single nucleotide polymorphisms (SNPs) and DNA insertions and deletions (indels), can play an important role in scientific discovery. To this end, a pipeline has been developed to allow researchers at the CGSB to rapidly identify and annotate variants. The pipeline employs the Genome Analysis Toolkit (GATK) to perform variant calling… [read more]
The Genomics Core Facility (Gencore) is a state-of-the-art genomics facility enabling high-throughput genome sequencing, proteomics and quantitative analyses.
Gencore is located in the Center for Genomics and Systems Biology in the Biology Department at NYU-NY and the Center for Genomics and Systems Biology in NYU-AD.