Submit a request via the Biology Computation Support Form to be added to the CGSB Linux working group on the HPC and to be granted access to your lab’s sequencing results directory.
Data Policy and Retention
Demultiplexed fastqs or raw lane fastqs are copied to lab directories on /scratch on the HPC.
Owners of the data will have read access to the fastqs on /scratch.
Sequencing data in lab directories on /scratch are backed up and not subject to flushing.
Raw and processed sequencing run directories are archived and backed up locally for a minimum of five years.
Raw sequencing run directories can be made available to users on request.
Lab shares are kept for up to three years after the PI's departure from CGSB.
HPC Best Practices
Your job should run in, and write its output to, your personal directory on /scratch, e.g. /scratch/netID/my-project/job-xyz/
Your Slurm scripts should live in your job or project directory.
Keeping a copy of the job script in its run directory is good practice, as it lets you check later which parameters were used and facilitates reproducibility.
All other scripts (e.g. Python scripts or other executables) should live in your home directory (i.e. /home/netID/).
If you need to run a script that you created (a Python script, for example), call it from your home directory (accessible via the $HOME variable) in your Slurm script.
If you need a software package which is not available on the HPC, please email the HPC team at hpc@nyu.edu with your request. You can check if a module already exists by typing module avail tool_name on the command line.
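The directory layout recommended above can be sketched as a small shell helper that creates a run directory and writes the Slurm job script into it, so the script stays next to the output it produced. This is a minimal sketch, not an official CGSB or HPC template: the function name write_job_script, the file names job.slurm and results.txt, the script path scripts/analyze.py, and all resource values are hypothetical placeholders to adapt to your own job.

```shell
#!/bin/bash
# Sketch only: function name, file names, and resource values are hypothetical.

# write_job_script RUN_DIR
# Creates RUN_DIR and writes a Slurm job script (job.slurm) into it.
write_job_script() {
    local run_dir="$1"

    mkdir -p "$run_dir" || return 1

    # The quoted 'EOF' keeps $HOME and $SLURM_SUBMIT_DIR literal in the file;
    # they are expanded when the job runs, not when the file is written.
    cat > "$run_dir/job.slurm" <<'EOF'
#!/bin/bash
#SBATCH --job-name=job-xyz
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH --output=slurm-%j.out

# Run in the directory the job was submitted from (the run directory),
# so all output lands on /scratch next to this script.
cd "$SLURM_SUBMIT_DIR" || exit 1

# Load any required modules here (check first with: module avail tool_name).

# Call your own script from your home directory via $HOME.
# scripts/analyze.py is a hypothetical example path.
python "$HOME/scripts/analyze.py" > results.txt
EOF
}
```

Assuming the function has been defined in your shell, you might call write_job_script /scratch/netID/my-project/job-xyz, then change into that directory and submit with sbatch job.slurm. Because the job script is written into the run directory itself, the parameters used for the run remain available alongside the results.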