UploadLargeData: Difference between revisions


Revision as of 20:56, 23 June 2011

Load and Link approach

transfer and uncompress (slow)

  1. log in to cheaha.uabgrid.uab.edu (Linux)
  2. create directory for this data set in your scratch dir
    1. mkdir /lustre/scratch/user/proj1
    2. make sure the galaxy user can enter that directory (og+x grants search permission on each directory in the path)
      1. chmod og+x /lustre/scratch/user
      2. chmod og+x /lustre/scratch/user/proj1
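  3. transfer the files with SCP (run from the machine that holds the files)
    1. scp *.fastq.gz user@cheaha.uabgrid.uab.edu:/lustre/scratch/user/proj1
    2. I used "Secure Shell Client" for Windows, available from UAB IT (http://www.uab.edu/it/software/displaytitle.php?ItemID=34)
    3. an open source client is PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/)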
  4. UNCOMPRESS fastq.gz files!!
    1. cd /lustre/scratch/user/proj1, then use one of the following
      1. find `pwd` -name "*.gz" -exec ksh -c 'qrsh "gzip -d \{}" &' \;
      2. ls -1 *.gz | xargs -I _f_ ksh -c 'qrsh -cwd gzip -d _f_ &'
      3. gzip -d filename
  5. make sure the files are readable by galaxy user
    1. chmod og+r /lustre/scratch/user/proj1/*
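
The directory setup, uncompress and permission steps above can be sketched as one small script to run on the cluster after the scp transfer has finished. This is only a sketch under stated assumptions: the function names prepare_dir and uncompress_and_share are made up for illustration, and gzip runs serially in the login shell here rather than being farmed out with qrsh as in step 4.

```shell
#!/bin/sh
# Sketch of steps 2, 4 and 5 above. Function names are hypothetical.
# gzip runs one file at a time here instead of one qrsh job per file.

# Create the project directory and let other users traverse into it.
prepare_dir() {
    proj_dir=$1
    mkdir -p "$proj_dir" || return 1
    # og+x on a directory grants search permission; the parent scratch
    # directory needs the same bit (step 2 above).
    chmod og+x "$proj_dir"
}

# Uncompress every .gz file in the directory, then make all entries readable.
uncompress_and_share() {
    proj_dir=$1
    for f in "$proj_dir"/*.gz; do
        [ -e "$f" ] || continue    # the glob matched nothing
        gzip -d "$f" || return 1
    done
    for f in "$proj_dir"/*; do
        [ -e "$f" ] || continue    # the directory is empty
        chmod og+r "$f" || return 1
    done
    return 0
}

# Example, using the layout from this page:
# prepare_dir /lustre/scratch/user/proj1
# uncompress_and_share /lustre/scratch/user/proj1
```

For many large files the qrsh variants in step 4 will likely finish sooner, since each file is uncompressed as its own job.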

link into galaxy dataset (fast)

  1. get Admin privileges on galaxy
    1. either get Shantanu to make you admin in Galaxy
    2. or grab someone who is (John, Curtis)
  2. In Galaxy GUI:
    1. admin > Manage Data Libraries > create new library
      1. add Datasets
        1. Upload option: Upload files from system path
          1. Get a list of absolute path names using one of the following
            1. cd /lustre/scratch/user/proj1 THEN RUN find `pwd` -name "*.fastq"
            2. find /lustre/scratch/user/proj1 -name "*.fastq"
          2. paste list of absolute path names into URL/Text box in Web Admin GUI
        2. Change "Copy data into Galaxy?" to "Link to files without copying into Galaxy"
        3. Put something mnemonic in Message box.
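
The path list for the URL/Text box can also be written to a file so it is easy to copy and paste. A minimal sketch, assuming a hypothetical helper named list_fastq_paths; it wraps the same find command shown above and resolves the directory to an absolute path first, so it also works when given a relative directory.

```shell
#!/bin/sh
# Print the absolute path of every .fastq file under a directory, one per line.
# The function name is hypothetical; the find call matches the one above.
list_fastq_paths() {
    # Resolve the argument to an absolute directory, like `find \`pwd\``.
    abs_dir=$(CDPATH= cd -- "$1" && pwd) || return 1
    find "$abs_dir" -type f -name "*.fastq" | sort
}

# Example, using the layout from this page:
# list_fastq_paths /lustre/scratch/user/proj1 > proj1_paths.txt
```

Files that are still compressed (*.fastq.gz) are not matched, which is a quick check that the uncompress step has finished.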

link data into a history (fast)

I could then select the datasets and, at the bottom of the page, choose "For selected datasets: <Import to histories>" to get them into a history so I can compute on them.