Non-fungible tokens (NFTs) for academic publications?

Being rejected from the preprint server bioRxiv seemed like a new low for me. But I was heartened to know that I was in good company. At least the bioRxiv team provided an explanation: “every submitted manuscript is examined by affiliate scientists to determine its suitability for posting. bioRxiv is intended for full research papers and on screening, our affiliate scientists determined that this manuscript fell short of that description.” Fair enough. Read more…

Streamlined RNA-Seq Analysis Using Nextflow

nf-core is a community effort to collect a curated set of analysis pipelines built using Nextflow. This post will walk you through running the nf-core RNA-Seq workflow. The pipeline uses the STAR aligner by default and quantifies data using Salmon, providing gene/transcript counts and extensive quality control. Prior to alignment, the pipeline uses Trim Galore to automatically trim low-quality bases from the 3′ end of reads and perform adapter trimming, attempting to auto-detect which Read more…

JBrowse Genome Browser

During the summer of 2020, the Ghedin and Gresham labs at New York University sequenced several SARS-CoV-2 isolates from clinical samples acquired in New York City. To visualize and share the data among researchers and collaborators, we built a JBrowse web server. JBrowse is web-based genome visualization software that lets you view genomic data files such as FASTA, VCF, BAM, CRAM, and GFF3 files. To benefit all researchers at NYU engaged in genomics Read more…

Variant Calling Pipeline using GATK4

This is an updated version of the variant calling pipeline post published in 2016 (link). This updated version employs GATK4 and is available as a containerized Nextflow script on GitHub. Identifying genomic variants, including single nucleotide polymorphisms (SNPs) and DNA insertions and deletions (indels), from next-generation sequencing data is an important part of scientific discovery. At the NYU Center for Genomics and Systems Biology (CGSB) this task is central to many research programs. For Read more…

Single Cell RNA-Seq Analysis

Single Cell RNA-Seq Allows For An Unprecedented Look At Plant Root Meristem Cell Identity

In the Kenneth Birnbaum Lab, we are interested in understanding how the plant root is able to grow continuously over the plant’s life and maintain its specific root structure (Fig. 1). More specifically, what is the role of the stem cells present at the very tip of the root (QC and initials) in this maintenance and development? The fascinating part of plant growth in general is its capacity for continuous growth over the plant’s life. This Read more…

Beginner’s Guide to Bioinformatics Tools for Analyzing Microbiome Data

Next-generation sequencing technologies have made sequencing fast and inexpensive, and are increasingly used to study microbial communities. RNA-seq metatranscriptome and WGS metagenome studies aim to investigate microbial communities at the genome and transcriptome levels. In this article, I will introduce a few tools that I frequently use to analyze metagenomic and metatranscriptomic datasets. Generating microbial community taxonomy profiles: Since a variety of microbes live in the microbial community at Read more…

Three Useful Nextflow Patterns Every Computational Biologist Should Know

In this article I’ll go over three Nextflow patterns I frequently use to make development of Nextflow data processing pipelines easier and faster. I use each of these in most of my workflows, so they really come in handy. I am assuming here that you know what processes, channels, strings, closures, directives and operators are and are somewhat comfortable writing Groovy and Nextflow code. If you want further details on any of the topics I Read more…

HighPrep PCR Beads as an AMPureXP Alternative

Comparing HighPrep PCR and AMPureXP for cleanup and size selection. Written by Hana Husic. High-throughput sequencing requires precise size selection of DNA fragments in order to increase the amount of usable data generated. If the fragments are too small, sequencing reads could be contaminated with adapter sequences. If the fragments are too long, library quantification is not as accurate and the run could under-cluster, producing fewer reads. Therefore, size selection is one of the most Read more…

Gene Set Enrichment Analysis in Minutes with the NASQAR Web App

Gene Set Enrichment Analysis (GSEA) is a common method for analyzing RNA-Seq data that determines whether a predefined set of genes (for example, those in a GO term or KEGG pathway) shows statistically significant and concordant differences between two biological phenotypes. There are myriad tools for GSEA, and one that I particularly like is clusterProfiler. Developed as an R-based tool, clusterProfiler has until now been inaccessible to users unfamiliar Read more…

Analyze your Data Faster with NASQAR: Nucleic Acid SeQuence Analysis Resource

The bioinformatics team at the NYU Center for Genomics and Systems Biology in Abu Dhabi and New York has recently developed NASQAR (Nucleic Acid SeQuence Analysis Resource), a web-based platform providing an intuitive interface to popular R-based bioinformatics data analysis and visualization tools including Seurat, DESeq2, Shaman, clusterProfiler, and more. These tools, although powerful, typically require significant computational experience and lack a graphical user interface (GUI), making them inaccessible to many researchers. NASQAR addresses this Read more…