Galaxy Importfs: Difference between revisions

(added: overview of ftp upload, data transfer methods and data import in Galaxy)
 
(updated importfs directory name from /lustre/importfs/galaxy to /scratch/importfs/galaxy)

Revision as of 17:12, 7 March 2013

Galaxy provides an FTP upload option in its UI for importing files from a user's FTP directory. This is a two-step process:

  1. Stage the data files in the FTP (data import) directory
  2. Select the files that need to be imported into Galaxy

The first step, staging data in the FTP directory, happens outside of Galaxy. In the second step, a Galaxy user selects files in the FTP staging directory and imports them into Galaxy. Once files in the FTP staging directory have been imported, the Galaxy application deletes the originals from the staging directory.
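Because the staged file is deleted after import, it can be useful to stage a copy rather than move the only copy of a file. The following is a minimal sketch of that idea; the function name stage_copy and its arguments are illustrative and not part of Galaxy or Cheaha.

```shell
# Hypothetical helper: copy a file into a staging directory so that the
# source file stays in place after Galaxy removes the staged copy.
stage_copy() {
    src="$1"          # file to stage
    staging_dir="$2"  # e.g. /scratch/importfs/galaxy/$USER on Cheaha
    # cp (rather than mv) leaves the source file untouched
    cp -- "$src" "$staging_dir"/
}
```

For example, stage_copy ~/data/seq.bam "/scratch/importfs/galaxy/$USER" would stage seq.bam while leaving ~/data/seq.bam untouched (the ~/data path is only an example).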

Although Galaxy refers to this data upload method as 'FTP upload', any other transfer tool, such as SCP or wget, can be used to place data files in this directory. The UAB Galaxy instance supports Galaxy's 'FTP upload' option, with the Galaxy application configured to import files from a '/scratch/importfs/galaxy/$USER' directory. The '/scratch/importfs/galaxy/$USER' location is on the Cheaha cluster, which the Galaxy application uses for analysis. The following documentation outlines the steps involved in using Galaxy's 'FTP upload' method.

Account setup

  1. You need a Cheaha cluster account in order to transfer your data files to the '/scratch/importfs/galaxy/$USER' directory. Please refer to the Cheaha_GettingStarted#Access page for instructions on getting a Cheaha account.
  2. Your '/scratch/importfs/galaxy/$USER' directory should be configured within 30 minutes after you get your Cheaha account.
  3. Make sure you can log in to the Cheaha cluster as documented in Cheaha_GettingStarted#Login.
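Once logged in to Cheaha, a quick check along the following lines can confirm that the staging directory exists and is writable. This is only a sketch; the function name check_staging_dir is an assumption made for illustration.

```shell
# Hypothetical helper: report whether a staging directory is usable.
check_staging_dir() {
    dir="$1"
    # -d: the path exists and is a directory; -w: the current user can write to it
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ready: $dir"
    else
        echo "not ready: $dir" >&2
        return 1
    fi
}
```

On Cheaha this would be called as check_staging_dir "/scratch/importfs/galaxy/$USER".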

Data Transfer

Data transfer to Cheaha is briefly described in Cheaha_GettingStarted#Uploading_Data. The following examples cover Galaxy-specific use cases for transferring data into the '/scratch/importfs/galaxy/$USER' directory.

Pulling data from external URLs

A file available at an external FTP or HTTP location can be downloaded with a network downloader tool. Below is an example using wget, which works for both FTP and HTTP URLs.

# Change to Directory "/scratch/importfs/galaxy/$USER"
$ cd /scratch/importfs/galaxy/$USER
$ wget http://pavgi.uabgrid.uab.edu/seq.bam

Note that the above is a simple wget example. wget has many other useful options, which you can review in its manual page with the 'man wget' command.
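As one illustration of those options, the sketch below wraps wget with -c (resume a partially downloaded file) and -P (save into a given directory), so the download does not depend on the current working directory. The function name fetch_to_staging is hypothetical, and whether resuming works depends on the remote server.

```shell
# Hypothetical helper: download a URL straight into a staging directory.
fetch_to_staging() {
    url="$1"          # external FTP or HTTP URL
    staging_dir="$2"  # e.g. /scratch/importfs/galaxy/$USER
    # -c continues a partial download; -P sets the directory files are saved to
    wget -c -P "$staging_dir" "$url"
}
```

For example: fetch_to_staging http://pavgi.uabgrid.uab.edu/seq.bam "/scratch/importfs/galaxy/$USER".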

Copying data to Cheaha using SCP

SCP is a file transfer protocol for securely copying files between two computer systems. Mac and Linux systems come with SCP support natively; on Windows you will need to download an external client such as pscp or WinSCP. The following example shows native SCP usage from a Linux or Mac system. Other SCP clients should work in a similar manner.

# scp <path-to-file-on-local-desktop> <BlazerID>@cheaha.uabgrid.uab.edu:/scratch/importfs/galaxy/<BlazerID>/
$ scp users.csv pavgi@cheaha.uabgrid.uab.edu:/scratch/importfs/galaxy/pavgi/

The above command transfers a file from a local desktop system to the '/scratch/importfs/galaxy/<BlazerID>/' directory on Cheaha.
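scp also accepts several source files in one invocation, which is convenient when staging more than one file. The sketch below assumes a hypothetical wrapper named upload_to_staging; the host name and path follow the example above.

```shell
# Hypothetical helper: copy one or more local files to a user's staging
# directory on Cheaha in a single scp call.
upload_to_staging() {
    blazer_id="$1"   # your BlazerID
    shift            # remaining arguments are the local files to upload
    scp "$@" "${blazer_id}@cheaha.uabgrid.uab.edu:/scratch/importfs/galaxy/${blazer_id}/"
}
```

For example: upload_to_staging pavgi users.csv samples.csv (samples.csv is a made-up second file name).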

Data import in Galaxy

Files deposited in the '/scratch/importfs/galaxy/$USER' directory are visible to the Galaxy application and are listed in the data upload method, as shown below.

[Figure: FTP Staged Files.png]

The above example shows only one file staged in the importfs directory; however, you can stage multiple files and select them for import into Galaxy. All files successfully imported into Galaxy will show up in the history panel on the right and will be deleted from the importfs directory.
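To see from the command line which files are currently staged (and, after an import, to confirm that they have been removed), a listing of the directory is enough. The helper below is a sketch with an assumed name, list_staged.

```shell
# Hypothetical helper: list the entries currently in a staging directory,
# one name per line.
list_staged() {
    dir="$1"   # e.g. /scratch/importfs/galaxy/$USER
    ls -1 "$dir"
}
```

An empty listing after a successful import is expected, since Galaxy deletes the imported files from the importfs directory.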