OneCellPipe




Summary

OneCellPipe is available for use on Cheaha. From the documentation<ref>https://1cell-bio.com/wp-content/uploads/2019/01/onecellpipe_version-1.1_Dec18.pdf</ref>, OneCellPipe

is a software wrapper ... which controls the management and execution of the indrops software pipeline for processing single-cell sequencing data generated using 1CellBio’s inDrop™ sequencing technology. The software leverages the NextFlow workflow management software to control the processing steps in a validated and consistent Singularity environment.

As mentioned, the pipeline is distributed as a NextFlow pipeline file that automatically downloads its dependencies. To make it work seamlessly on Cheaha, some setup is required. First, obtain the pipeline. Second, modify the Singularity configuration so that cache files are not written to /tmp. Finally, run the pipeline on your data.

Setup

To obtain the pipeline, run the following commands in a terminal in an interactive job. They download the pipeline from a public repository and place it in the /data/user/<username> directory, which the commands reach through $USER_DATA. Wherever <username> appears on this page, replace it with your Cheaha login name. You can use echo $USER to find your login name.

cd $USER_DATA
wget https://s3.amazonaws.com/da-ocb-public/onecellpipe.1.19_cf.zip
unzip onecellpipe.1.19_cf.zip
rm onecellpipe.1.19_cf.zip

This process should create a subdirectory onecellpipe. Inside that is another directory bin. The file bin/nextflow.singularity.config must be modified. Replace the file contents with the contents of the block below, and save the changes.

singularity {
    enabled = true
    // Adjust this to a shared directory on your system when using compute clusters.
    // Also add --cache <DIR> or update the cache variable in bin/nextflow.standard.config when changing this!
    cacheDir = '/scratch/<username>'
    autoMounts = true
}

As mentioned in the comments in the above block, it is also necessary to modify the bin/nextflow.standard.config file. Open that file and find the line starting with cache and replace it with the following.

cache = '/scratch/<username>'
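The edit above can also be scripted. The following is a minimal sketch, not part of OneCellPipe: set_cache_line is a hypothetical helper, and the demonstration file contents are made up, since the real contents of nextflow.standard.config are not reproduced here. It assumes the setting sits on a line that begins with cache followed by an equals sign, as the instructions above describe.

```shell
#!/usr/bin/env bash
# Sketch: point the "cache" setting at a scratch directory.
# set_cache_line is a hypothetical helper, not part of OneCellPipe.

set_cache_line() {
    local config_file="$1"
    local cache_dir="$2"
    # Match only a line that begins with "cache" followed by "=",
    # so other settings such as cacheDir are left alone.
    # A backup copy of the file is kept with a .bak suffix.
    sed -i.bak -E "s|^([[:space:]]*)cache[[:space:]]*=.*|\1cache = '${cache_dir}'|" "$config_file"
}

# Demonstration on a throwaway file with made-up contents.
demo_dir="$(mktemp -d)"
demo_file="$demo_dir/nextflow.standard.config"
printf "%s\n" "cache = '/tmp/example'" "cacheDir = '/tmp/other'" > "$demo_file"

set_cache_line "$demo_file" "/scratch/${USER:-username}"
cat "$demo_file"
```

On Cheaha the call would presumably be set_cache_line "$USER_DATA/onecellpipe/bin/nextflow.standard.config" "/scratch/$USER", assuming the archive was unpacked in $USER_DATA as described in Setup. The helper does not escape special characters, so it is only suitable for plain directory paths.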

Usage

To use the pipeline, navigate to the directory containing onecellpipe.nf and run the following commands. Replace <input_directory> with the location of your input files, and replace <output_directory> with any existing subdirectory of /data/user/<username>. The --worker and --worker2 parameters are both set to $SLURM_NTASKS, which tells the pipeline to use all of the cores available to the SLURM job for parallel processing and should speed up the run.

module load Singularity/2.6.1-GCC-5.4.0-2.26
module load Nextflow/19.10.0
export SINGULARITY_BINDPATH=/data:/
nextflow onecellpipe.nf --dir <input_directory> --out $USER_DATA/<output_directory> --worker $SLURM_NTASKS --worker2 $SLURM_NTASKS
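Because $SLURM_NTASKS is only defined inside a SLURM job, it can help to assemble the command with a fallback before running it. The sketch below is illustrative only: build_onecellpipe_cmd is a hypothetical helper, the fallback to a single worker is an assumption rather than documented pipeline behavior, and the paths in the demonstration are placeholders. It prints the command instead of executing it, so it can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch: build the nextflow command line, falling back to one worker
# when SLURM_NTASKS is not set (for example, outside a SLURM job).
# build_onecellpipe_cmd is a hypothetical helper, not part of OneCellPipe.

build_onecellpipe_cmd() {
    local input_dir="$1"
    local output_dir="$2"
    # Assumption: a single worker is a safe default when SLURM_NTASKS is unset.
    local workers="${SLURM_NTASKS:-1}"
    printf 'nextflow onecellpipe.nf --dir %s --out %s --worker %s --worker2 %s\n' \
        "$input_dir" "$output_dir" "$workers" "$workers"
}

# Dry run with placeholder paths: prints the command without running it.
build_onecellpipe_cmd "/path/to/input" "/path/to/output"
```

The module load and export lines shown above are still needed before the printed command is actually run.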

References

<references />