Galaxy: Difference between revisions

 
(31 intermediate revisions by 4 users not shown)
Line 2: Line 2:


= Overview =
= Overview =
The UAB Galaxy platform for experimental biology and comparative genomics designed to help you analyze multiple alignments, compare genomic annotations, profile metagenomic samples and more from your web browser. This platform is built on [http://main.g2.bx.psu.edu/ Galaxy], backed by the [http://docs.uabgrid.uab.edu/wiki/Cheaha Cheaha compute cluster], and powered by [http://uabgrid.uab.edu/  UABgrid]. Documentation on the UAB installation can be found on the [http://docs.uabgrid.uab.edu/wiki/Galaxy UAB Galaxy wiki].
The UAB Galaxy platform for experimental biology and comparative genomics is designed to help you analyze multiple alignments, compare genomic annotations, profile metagenomic samples and more from your web browser. This platform is built on [http://main.g2.bx.psu.edu/ Galaxy], backed by the [http://docs.uabgrid.uab.edu/wiki/Cheaha Cheaha compute cluster], and powered by [http://uabgrid.uab.edu/  UABgrid].  
 
The primary uses of UAB Galaxy are to provide a simple web interface for NGS (short read sequencing) analysis for genomic and transcriptomic datasets, using tools like BWA, Bowtie, Tophat and Cufflinks, as well as simple sequence manipulation via the EMBOSS toolkit.
 
== Using Galaxy / [[UAB Galaxy Workshop Tutorial|Tutorials]] ==
 
There are numerous [http://wiki.g2.bx.psu.edu/Learn/Screencasts general tutorials] online at the [http://main.g2.bx.psu.edu/ Penn State public Galaxy site] that are worth looking at.
 
There are also several [[UAB Galaxy Workshop Tutorial|UAB tutorials on NGS Analysis with Galaxy]], created for [[2011_HPC_Boot_Camp|HPC Boot Camp 2011]] and a nice talk by Jeremy Goecks during [[2011|Research Computing Day 2011.]]
 
== Support ==
UAB galaxy-users list-serv: [https://listserv.uab.edu/scgi-bin/wa?SUBED1=GALAXY-HELP&A=1 subscribe] [https://listserv.uab.edu/scgi-bin/wa?SUBED1=GALAXY-HELP&A=1 search].
 
UAB galaxy-help list-serv: [mailto:galaxy-help@listserv.uab.edu] to contact admins of the UAB galaxy instance.
 
== Privacy ==
 
Note that your data will be stored on the cluster filesystem, and while not accessible to ordinary users, it can be easily accessed by any of the galaxy or cluster administrators. It is not encrypted. Do not store sensitive information in this system.


= Galaxy@UAB =
= Galaxy@UAB =
The UAB Galaxy instance can be accessed at http://galaxy.uabgrid.uab.edu using BlazerID credentials. The https/ssl access will be available soon. The UAB Galaxy instance is using revision 50e249442c5a from the upstream [https://bitbucket.org/galaxy/galaxy-dist galaxy repository].
The UAB Galaxy instance can be accessed at https://galaxy.uabgrid.uab.edu using BlazerID credentials. No account on the cluster is needed.  
However, the tools installed for galaxy (BWA, etc) can be accessed via the command line if you have an account on the cluster.


[http://docs.uabgrid.uab.edu/wiki/UploadLargeData Temporary Protocol] for moving large sequence files (>2GB) to UAB's galaxy instance (or very large numbers of files).
== Loading Data ==
See [[Galaxy_File_Uploads]].


== Hardware ==
== Available Tools ==  
Behind the scenes the Galaxy server at UAB is powered by [http://docs.uabgrid.uab.edu/wiki/Cheaha Cheaha cluster].  
Following is a partial list highlighting some of the important tools available. Additional tools can be installed upon request. To search for tools already integrated into the Galaxy system, see the [http://toolshed.g2.bx.psu.edu/ Galaxy ToolShed].


== Available Tools ==
Following is a list of tools available through Galaxy platform right now. More description will be added soon.


{| border="1"
{| border="1"
Line 20: Line 37:
|-
|-
! bwa
! bwa
| 0.5.9 || Further information
| 0.5.9-r26 || Align genomic short reads to a reference genome
|-
|-
! bowtie  
! bowtie  
| 0.12.7 || Further Information
| 0.12.7 || Align genomic short reads to a reference genome
|-
! tophat
| 1.4.0 || Align transcriptome short reads to a reference genome
|-
|-
! lastz
! cufflinks, cuffdiff, cuffcompare
| 1.02.00 || Further information
| 1.3.0 || Reconstruct and quantify transcript levels from tophat alignments.
|-
|-
! samtools
! samtools
| 0.1.12a || Further information
| 0.1.12a || Alignment (SAM/BAM file) manipulations
|-
! velvet
| 1.1.03 || De novo assembly
|-
! [http://en.wikipedia.org/wiki/EMBOSS EMBOSS]
| 6.3.1  || European Molecular Biology Open Software Suite - sequence manipulation and format conversion
|-
|-
! Legacy blast (megablast)
|}
| 2.2.25 || Further information
 
== Installed Genome Indexes ==
 
You can always use your own genome by uploading the .fasta into your history, but alignments against installed (pre-indexed) genomes run much more quickly. If you need an additional genome installed, please contact [mailto:galaxy-help@vo.uabgrid.uab.edu].
{| border="1"
|+
! dbkey !! Genome !! Accessions
|-
|-
! srma
| hg19 || Human Feb. 2009 (GRCh37/hg19) (hg19)
| 0.1.15 || Further information
|-
|-
! velvet
| hg18 || Human Mar. 2006 (NCBI36/hg18) (hg18)
| 1.1.03 || Further information
|-
|-
! Top Hat
| hg17 || Human May 2004 (NCBI35/hg17) (hg17)
| 1.2.0 || Further information
|-
|-
! Cuff Links
| hg16 || Human July 2003 (NCBI34/hg16) (hg16)
| 1.0.1 || Further information
|-
|-
! Lift Over
| mm10 || Mouse Dec. 2011 (GRCm38/mm10) (mm10)
| 26-Apr-2011 18:26  2.6M || Further information
|-
|-
! R
| mm9 || Mouse July 2007 (NCBI37/mm9) (mm9)
| R-2.13.0 || Further information
|-
|-
! RPy
| mm8 || Mouse Feb. 2006 (NCBI36/mm8) (mm8)
| 1.0.3 || Further information
|-
|-
! ps2pdf
| mm7 || Mouse Aug. 2005 (NCBI35/mm7) (mm7)
| ?? || Further information
|-
|-
! MACS
| mm6
| 1.4.0rc2 || Further information
|-
|-
! taxonomy2tree
| mm5
| r3 || Further information
|-
! sputnik
| NA || Further information
|-
! beam2
| Unknown || Further information
|-
|-
! addscores
|sacCer3 || S. cerevisiae Apr. 2011 (SacCer_Apr2011/sacCer3) (sacCer3)
| NA || Further information
|-
! clustalw
| 2.1 || Further information
|-
! gmaj
| NA || Further information
|-
! gpass
| NA ||  Further information
|-
! HYPHY
| 2.0020110330 beta ||  Further information
|-
! laj
| NA ||  Further information
|-
|-
! pass2
|sacCer2 || S. cerevisiae June 2008 (SGD/sacCer2) (sacCer2)
| NA || Further information
|-
|-
! twoBitToFa
|ce10 || C. elegans Oct. 2010 (WS220/ce10) (ce10)
| NA || Further information
|-
! Perl
| revision 5 version 8 subversion 8 ||  Further information
|-
|-
! perM
|rn5 || Rat Mar. 2012 (RGSC 5.0/rn5) (rn5)
| 3.3 ||  Further information
|-
! GNUPlot
| 4.4.3 ||  Further information
|-
! Numpy
| 1.6.0 ||  Further information
|-
|-
! numexpr
|rn4 || Rat Nov. 2004 (Baylor 3.4/rn4) (rn4)
| 1.4.2 ||  Further information
|-
! hdf5
| 1.8.7 ||  Further information
|-
|-
! Cython
|danRer7 || Zebrafish Jul. 2010 (Zv9/danRer7) (danRer7)
| 0.14.1 || Further information
|-
|-
! Python Tables (tables)
|eschColi_APEC_O1 || Escherichia coli APEC O1 || chr=5082025
| 2.2.1|| Further information
|-
|-
! FastX Toolkit
|eschColi_CFT073 || Escherichia coli CFT073 || chr=5231428
| 0.0.13 || Further information
 
|}
 
== Adding Novel Datasets ==
 
=== Prerequisites ===
 
You should have checked out your own galaxy instance and run it from git as described in http://projects.uabgrid.uab.edu/galaxy/wiki/GalaxyDevelopment
 
=== Introduction ===
 
In order to add a new data set, a series of dependent files must be created and configured on cheaha.uabgrid.uab.edu. The configuration files are located in or under:
* /share/apps/galaxy/galaxy-latest
 
The dependent files should be located in or under:
* /lustre/project/public_datasets
Some of the older data sets are still located in or under:
* /lustre/project/galaxy/public_dataset
 
This section describes setting up a basic genome and covers only how to set up 3 critical pieces:
* bwa
* bowtie
* samtools
 
You will need an account on cheaha, and you will also need to be in the galaxy-admin group.
 
=== FASTA File ===
Download your FASTA file (doing any conversions needed) and place the file in:
/lustre/project/public_datasets/primary/MY_GENOME/MY_GENOME.fa
 
You should follow the naming conventions in tool-data/shared/ucsc/builds.txt as shown below.
* sacCer2 S. cerevisiae June 2008 (SGD/sacCer2) (sacCer2)
 
For instance, if there is already an entry for your build of S. cerevisiae, then use the dbkey (leftmost column) in builds.txt to name MY_GENOME. In this case it would be sacCer2. In some cases (tree shrew, obscure chimeric mouse genomes you construct yourself) there will be no entry. You will need to edit and create one yourself and update builds.txt.
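
The lookup described above can be sketched as a small shell function. This is only an illustration: the function name <code>dbkey_in_builds</code> is hypothetical, and it assumes builds.txt holds one build per line as a dbkey, a tab, and a description.

```shell
# Sketch: look up a dbkey in a builds.txt-style file.
# Assumption: each line is "<dbkey><TAB><description>"; change -F if the
# file on your system uses a different separator.
dbkey_in_builds() {
    builds_file=$1
    dbkey=$2
    # Print the description of the first line whose first column equals
    # the dbkey exactly; exit status is 0 if found, 1 if absent.
    awk -F '\t' -v key="$dbkey" '
        $1 == key { print $2; found = 1; exit }
        END { exit !found }
    ' "$builds_file"
}
```

For example, <code>dbkey_in_builds tool-data/shared/ucsc/builds.txt sacCer2</code> would print the description if the build is already listed. A non-zero exit status suggests that a new entry has to be added to builds.txt.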
 
Make sure the extension is .fa. If there are multiple files, they can be concatenated together as shown below.
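
When several FASTA files are concatenated, one quick sanity check is to compare the number of sequence records before and after. The helper below is a minimal sketch; the name <code>count_fasta_records</code> is made up for illustration and is not part of the UAB setup.

```shell
# Sketch: count FASTA records (header lines starting with ">") across
# one or more files.
count_fasta_records() {
    # cat merges all arguments into one stream, so grep prints a single
    # total instead of one count per file.
    cat "$@" | grep -c '^>'
}
```

Running it once on the per-chromosome files and once on the combined file should give the same number if nothing was dropped or duplicated.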
 
=== Directory Creation (example for sacCer2) ===
 
<pre>
mkdir /lustre/project/public_datasets/primary/sacCer2
cd /lustre/project/public_datasets/primary/sacCer2
wget http://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/chromFa.tar.gz
tar xzvf chromFa.tar.gz
cat chr*.fa 2micron.fa > sacCer2.fa
mkdir /lustre/project/public_datasets/derived/sacCer2
cd /lustre/project/public_datasets/derived/sacCer2
mkdir bowtie
mkdir bowtie/color
mkdir bowtie/base
mkdir bwa
mkdir sam
</pre>
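
The directory layout above can also be created in one step for any genome name. The function below is a sketch under the assumption that the same primary/derived layout is wanted for every genome. The name <code>make_genome_dirs</code> and the root-directory argument are illustrative, added so the function can be tried in a scratch directory first.

```shell
# Sketch: create the primary/derived directory layout for one genome.
# The root is a parameter; on cheaha it would be
# /lustre/project/public_datasets (per the paths used on this page).
make_genome_dirs() {
    root=$1
    genome=$2
    # -p creates missing parent directories and does nothing for
    # directories that already exist, so the call is safe to repeat.
    mkdir -p "$root/primary/$genome" \
             "$root/derived/$genome/bowtie/color" \
             "$root/derived/$genome/bowtie/base" \
             "$root/derived/$genome/bwa" \
             "$root/derived/$genome/sam"
}
```

Usage would look like <code>make_genome_dirs /lustre/project/public_datasets MY_GENOME</code>.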
 
 
=== Bowtie Indices (Required for tophat too) ===
Switch to your bowtie base directory and copy in the 2 bowtie scripts from another directory, for example sacCer2.
 
<pre>
cd /lustre/project/public_datasets/derived/MY_GENOME/bowtie/base
cp /lustre/project/public_datasets/derived/sacCer2/bowtie/base/reindex_bowtie_sacCer2.bsh .
cp /lustre/project/public_datasets/derived/sacCer2/bowtie/base/submit_index_job .
</pre>
 
Find/Replace sacCer2 with the name of your genome (MY_GENOME) in both files. Run:
<pre>
./submit_index_job
</pre>
This will submit the job on cheaha.
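
The find/replace step can be done with sed instead of an editor. The following is a sketch only: the function name <code>retarget_index_scripts</code> is hypothetical, and it assumes the genome name contains only letters, digits and underscores, so that it is safe to place inside a sed expression.

```shell
# Sketch: replace every occurrence of "sacCer2" with a new genome name
# in the given script files, keeping their permissions.
# Assumption: the genome name has no "/" or other sed metacharacters.
retarget_index_scripts() {
    genome=$1
    shift
    for script in "$@"; do
        tmp="$script.tmp"
        # Copy first so the temporary file inherits the script's mode;
        # the redirect then overwrites the content but keeps that mode.
        cp -p "$script" "$tmp" &&
        sed "s/sacCer2/$genome/g" "$script" > "$tmp" &&
        mv "$tmp" "$script" || return 1
    done
}
```

For example, <code>retarget_index_scripts MY_GENOME reindex_bowtie_sacCer2.bsh submit_index_job</code> edits both copied files in place. This page does not say whether the .bsh file itself should also be renamed; that depends on how submit_index_job refers to it, so check the script before running it.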
 
The next step is to add the index to the bowtie configuration files in your personal galaxy project directory. For example, to add the treeshrew genome, add the line
<pre>
treeshrew62    treeshrew62    Tree Shrew Build 62    /lustre/project/public_datasets/derived/treeshrew62/bowtie/base/treeshrew6
</pre>
 
to the bottom of:
* tool-data/bowtie_indices.loc
and
* tool-data/bowtie_indices_color.loc
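
Appending the line by hand works, but a small helper avoids adding the same entry twice. This is a sketch with assumptions: the function name <code>add_loc_entry</code> is made up, and it assumes the four columns of the .loc file are separated by tabs (the example line above shows only whitespace).

```shell
# Sketch: append a four-column entry to a .loc file unless the identical
# line is already there.
# Assumption: columns are tab-separated (id, dbkey, display name, path).
add_loc_entry() {
    loc_file=$1
    id=$2
    dbkey=$3
    name=$4
    path=$5
    line=$(printf '%s\t%s\t%s\t%s' "$id" "$dbkey" "$name" "$path")
    # -F: fixed string, -x: whole-line match, -q: quiet.
    if [ -f "$loc_file" ] && grep -Fxq -e "$line" "$loc_file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$loc_file"
}
```

A call such as <code>add_loc_entry tool-data/bowtie_indices.loc treeshrew62 treeshrew62 "Tree Shrew Build 62" INDEX_PATH</code> would add the entry once, where INDEX_PATH stands for the index location shown in the example line.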
 
 
=== BWA Indices ===
Switch to your bwa directory and copy in the 2 bwa scripts from another directory, for example sacCer2.
 
<pre>
cd /lustre/project/public_datasets/derived/MY_GENOME/bwa
cp /lustre/project/public_datasets/derived/sacCer2/bwa/reindex_bwa_sacCer2.bsh .
cp /lustre/project/public_datasets/derived/sacCer2/bwa/submit_index_job .
</pre>
 
Find/Replace sacCer2 with the name of your genome (MY_GENOME) in both files. Run:
<pre>
./submit_index_job
</pre>
This will submit the job on cheaha.
 
 
The next step is to add the index to the bwa configuration files in your personal galaxy project directory. For example, to add the treeshrew genome, add the line
<pre>
treeshrew62    treeshrew62    Tree Shrew Build 62    /lustre/project/public_datasets/derived/treeshrew62/bwa/Tupaia_belangeri.TREESHREW.62.dna.nonchromosomal.fa
</pre>
 
to the bottom of:
* tool-data/bwa_indices.loc
and
* tool-data/bwa_indices_color.loc
 
=== Sam and FAIDX indices ===
Switch to your sam directory and copy in the 2 sam scripts from another directory, for example sacCer2.
 
<pre>
cd /lustre/project/public_datasets/derived/MY_GENOME/sam
cp /lustre/project/public_datasets/derived/sacCer2/sam/reindex_sam_sacCer2.bsh .
cp /lustre/project/public_datasets/derived/sacCer2/sam/submit_index_job .
</pre>
 
Find/Replace sacCer2 with the name of your genome (MY_GENOME) in both files. Run:
<pre>
./submit_index_job
</pre>
This will submit the job on cheaha.
 
 
The next step is to add the index to the sam configuration files in your personal galaxy project directory. For example, to add the treeshrew genome, add the line
<pre>
index  treeshrew62    /lustre/project/public_datasets/derived/treeshrew62/sam/Tupaia_belangeri.TREESHREW.62.dna.nonchromosomal.fa
</pre>
 
to the bottom of:
* tool-data/sam_fa_indices.loc
 
Also add the index to the srma configuration file (again tree shrew example)
<pre>
treeshrew62    treeshrew62    Tree Shrew Build 62    /lustre/project/public_datasets/derived/treeshrew62/sam/Tupaia_belangeri.TREESHREW.62.dna.nonchromosomal.fa
</pre>
in the file
* tool-data/srma_index.loc
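
Before testing in Galaxy, it can help to confirm that every FASTA path named in the sam and srma .loc files actually exists. The check below is a sketch: the function name <code>check_loc_paths</code> is hypothetical, and it assumes tab-separated columns with the path in the last column and "#" marking comment lines. It is meant for the sam and srma files, where the last column is a file. In the bowtie example above, the last column appears to be an index base name rather than a file, so the check would not apply there.

```shell
# Sketch: report paths in a .loc file that do not exist on disk.
# Assumptions: tab-separated columns, path in the last column,
# lines starting with "#" are comments.
check_loc_paths() {
    loc_file=$1
    # Take the last field of every non-empty, non-comment line.
    awk -F '\t' '!/^#/ && NF { print $NF }' "$loc_file" |
    while IFS= read -r path; do
        [ -e "$path" ] || printf 'missing: %s\n' "$path"
    done
}
```

Empty output from <code>check_loc_paths tool-data/sam_fa_indices.loc</code> means every listed path was found.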
 
=== Final Steps ===
Log in to your local galaxy and see if you can run your job. If it all works out, contact Shantanu and push the changes to production.
 
== Available datasets ==
 
{| border="1"
|+
! Genome !! Downloaded !! Blast Database !! BWA Index !! Bowtie Index !! PerM Index !! Sam Index !! SRMA Dict
|-
|-
! hg19 (by chromosome)
|eschColi_EC4115 || Escherichia coli EC4115 || chr=5572075,plasmid_pO157=94644,plasmid_pEC4115=37452
| Yes || Yes || No || Yes || Yes || Yes || Yes
|-
|-
! Mouse (mm9)
|eschColi_K12 || Escherichia coli K12 || chr=4639675
| Yes || Yes || No || Yes || Yes || Yes || Yes
|-
|-
! Vaccinia Western Reserve
|eschColi_EDL993 || Escherichia coli O157:H7 EDL933 || NC_007414=92077,NC_002655=5528445
| Yes || Yes || Yes || Yes || Yes || Yes || Yes
|-
|-
! Mycoplasma pneumoniae (M129)
|eschColi_O157H7 || Escherichia coli O157:H7 EDL933 || NC_007414=92077,NC_002655=5528445
| Yes || Yes || No || Yes || Yes || Yes || Yes
|-
|-
! Mycoplasma pneumoniae (FH)
|eschColi_TW14359 || Escherichia coli TW14359 || chr=5528136,plasmid_pO157=94601
| Yes || Yes || No || Yes || Yes || Yes || Yes
|-
|-
! Chromosome 11 Mouse Contigs
| Yes || Yes || No || Yes || Yes || Yes || Yes
|}
|}


= Public instance =
== Additional Genomes that can be quickly installed ==
A public instance of Galaxy maintained by Penn State University is at http://usegalaxy.org/
These are pre-indexed genomes we can easily download from Penn State's [http://wiki.galaxyproject.org/Admin/Data%20Integration Galaxy Data-Cache]
 
=== Organisms ===
* AaegL1
* Acropora_digitifera
* AgamP3
* Arabidopsis_thaliana_TAIR10
* Arabidopsis_thaliana_TAIR9
* Araly1
* Bombyx_mori_p50T_2.0
* CpipJ1
* Homo_sapiens_AK1
* Homo_sapiens_nuHg19_mtrCRS
* Hydra_JCVI
* IscaW1
* PhumU1
* Physcomitrella_patens_patens
* Ptrichocarpa_156
* Saccharomyces_cerevisiae_S288C_SGD2010
* Schizosaccharomyces_pombe_1.1
* Spur_v2.6
* Sscrofa9.58
* Tcacao_1.0
* Tcas_3.0
* Theobroma_cocoa
* Zea_mays_B73_RefGen_v2
* ailMel1
* anoCar1
* anoCar2
* anoGam1
* apiMel1
* apiMel2
* apiMel3
* apiMel4.5
* aplCal1
* bighorn_sheep
* borEut13
* bosTau2
* bosTau3
* bosTau4
* bosTau5
* bosTau6
* bosTau7
* bosTauMd3
* braFlo1
* caeJap1
* caePb1
* caePb2
* caeRem2
* caeRem3
* calJac1
* calJac3
* canFam1
* canFam2
* canFam3
* cavPor2
* cavPor3
* cb3
* ce10
* ce2
* ce3
* ce4
* ce5
* ce6
* ce7
* ce8
* ce9
* choHof1
* chrPic1
* ci2
* danRer2
* danRer3
* danRer4
* danRer5
* danRer6
* danRer7
* dasNov1
* dasNov2
* dipOrd1
* dm1
* dm2
* dm3
* dp3
* dp4
* droAna1
* droAna2
* droAna3
* droEre1
* droEre2
* droGri1
* droGri2
* droMoj1
* droMoj2
* droMoj3
* droPer1
* droSec1
* droSim1
* droVir1
* droVir2
* droVir3
* droWil1
* droYak1
* droYak2
* echTel1
* emf
* equCab1
* equCab2
* equCab2_chrM
* eriEur1
* felCat3
* felCat4
* fr1
* fr2
* fr3
* galGal2
* galGal3
* galGal4
* gasAcu1
* geoFor1
* gorGor1
* gorGor3
* hetGla1
* hetGla2
* hg16
* hg17
* hg18
* hg19
* hg_g1k_v37
* lMaj5
* lengths
* loxAfr3
* loxAfr4
* macEug1
* melGal1
* melUnd1
* micMur1
* mm10
* mm5
* mm6
* mm7
* mm8
* mm9
* monDom4
* monDom5
* myoLuc1
* myoLuc2
* nomLeu1
* nomLeu2
* ochPri2
* ornAna1
* oryCun1
* oryCun2
* oryLat1
* oryLat2
* oryza_sativa_japonica_nipponbare_IRGSP4.0
* otoGar1
* oviAri1
* pUC18
* panTro1
* panTro2
* panTro3
* papHam1
* petMar1
* phiX
* ponAbe2
* priPac1
* rheMac2
* rheMac3
* rn3
* rn4
* rn5
* sacCer1
* sacCer2
* sacCer3
* sarHar1
* sorAra1
* strPur2
* strPur3
* susScr2
* susScr3
* taeGut1
* tarSyr1
* tetNig1
* tetNig2
* triCas2
* tupBel1
* venter1
* xenTro1
* xenTro2
* xenTro3
 
=== Microbes ===
 
* Staphylococcus_aureus_aureus_USA300_FPR3757
* Xanthomonas_oryzae_PXO99A
* acidBact_ELLIN345
* acidCell_11B
* acidCryp_JF_5
* acidJS42
* acinSp_ADP1
* actiPleu_L20
* aerPer1
* aeroHydr_ATCC7966
* alcaBork_SK2
* alkaEhrl_MLHE_1
* anabVari_ATCC29413
* anaeDeha_2CP_C
* anapMarg_ST_MARIES
* aquiAeol
* archFulg1
* arthFB24
* azoaSp_EBN1
* azorCaul2
* baciAnth_AMES
* baciHalo
* baciSubt
* bactThet_VPI_5482
* bartHens_HOUSTON_1
* baumCica_HOMALODISCA
* bdelBact
* bifiLong
* blocFlor
* bordBron
* borrBurg
* bradJapo
* brucMeli
* buchSp
* burk383
* burkCeno_AU_1054
* burkCeno_HI2424
* burkCepa_AMMD
* burkMall_ATCC23344
* burkPseu_1106A
* burkThai_E264
* burkViet_G4
* burkXeno_LB400
* caldMaqu1
* caldSacc_DSM8903
* campFetu_82_40
* campJeju
* campJeju_81_176
* campJeju_RM1221
* candCars_RUDDII
* candPela_UBIQUE_HTCC1
* carbHydr_Z_2901
* caulCres
* chlaPneu_CWL029
* chlaTrac
* chloChlo_CAD3
* chloTepi_TLS
* chroSale_DSM3043
* chroViol
* clavMich_NCPPB_382
* colwPsyc_34H
* coryEffi_YS_314
* coxiBurn
* cytoHutc_ATCC33406
* dechArom_RCB
* dehaEthe_195
* deinGeot_DSM11300
* deinRadi
* desuHafn_Y51
* desuPsyc_LSV54
* desuRedu_MI_1
* desuVulg_HILDENBOROUG
* dichNodo_VCS1703A
* ehrlRumi_WELGEVONDEN
* ente638
* enteFaec_V583
* erwiCaro_ATROSEPTICA
* erytLito_HTCC2594
* eschColi_APEC_O1
* eschColi_CFT073
* eschColi_EC4115
* eschColi_EDL933
* eschColi_K12
* eschColi_MG1655
* eschColi_O157H7
* eschColi_TW14359
* flavJohn_UW101
* franCcI3
* franTula_TULARENSIS
* fusoNucl
* geobKaus_HTA426
* geobMeta_GS15
* geobSulf
* geobTher_NG80_2
* geobUran_RF4
* gloeViol
* glucOxyd_621H
* gramFors_KT0803
* granBeth_CGDNIH1
* haemInfl_KW20
* haemSomn_129PT
* haheChej_KCTC_2396
* halMar1
* haloHalo1
* haloHalo_SL1
* haloWals1
* heliAcin_SHEEBA
* heliHepa
* heliPylo_26695
* heliPylo_HPAG1
* heliPylo_J99
* hermArse
* hypeButy1
* hyphNept_ATCC15444
* idioLoih_L2TR
* jannCCS1
* lactLact
* lactPlan
* lactSali_UCC118
* lawsIntr_PHE_MN1_00
* legiPneu_PHILADELPHIA
* leifXyli_XYLI_CTCB0
* leptInte
* leucMese_ATCC8293
* listInno
* magnMC1
* magnMagn_AMB_1
* mannSucc_MBEL55E
* mariAqua_VT8
* mariMari_MCS10
* mculMari1
* mesoFlor_L1
* mesoLoti
* metAce1
* metMar1
* metaSedu
* methAeol1
* methBark1
* methBoon1
* methBurt2
* methCaps_BATH
* methFlag_KT
* methHung1
* methJann1
* methKand1
* methLabrZ_1
* methMari_C5_1
* methMari_C7
* methMaze1
* methPetr_PM1
* methSmit1
* methStad1
* methTher1
* methTherPT1
* methVann1
* moorTher_ATCC39073
* mycoGeni
* mycoTube_H37RV
* myxoXant_DK_1622
* nanEqu1
* natrPhar1
* neisGono_FA1090_1
* neisMeni_FAM18_1
* neisMeni_MC58_1
* neisMeni_Z2491_1
* neorSenn_MIYAYAMA
* nitrEuro
* nitrMult_ATCC25196
* nitrOcea_ATCC19707
* nitrWino_NB_255
* nocaFarc_IFM10152
* nocaJS61
* nostSp
* novoArom_DSM12444
* oceaIhey
* oenoOeni_PSU_1
* onioYell_PHYTOPLASMA
* orieTsut_BORYONG
* paraDeni_PD1222
* paraSp_UWE25
* pastMult
* pediPent_ATCC25745
* peloCarb
* peloLute_DSM273
* peloTher_SI
* photLumi
* photProf_SS9
* picrTorr1
* pireSp
* polaJS66
* polyQLWP
* porpGing_W83
* procMari_CCMP1375
* propAcne_KPA171202
* pseuAeru
* pseuHalo_TAC125
* psycArct_273_4
* psycIngr_37
* pyrAby1
* pyrAer1
* pyrFur2
* pyrHor1
* pyroArse1
* pyroCali1
* pyroIsla1
* ralsEutr_JMP134
* ralsSola
* rhizEtli_CFN_42
* rhodPalu_CGA009
* rhodRHA1
* rhodRubr_ATCC11170
* rhodSpha_2_4_1
* rickBell_RML369_C
* roseDeni_OCH_114
* rubrXyla_DSM9941
* saccDegr_2_40
* saccEryt_NRRL_2338
* saliRube_DSM13855
* saliTrop_CNB_440
* salmEnte_PARATYPI_ATC
* salmTyph
* salmTyph_TY2
* shewANA3
* shewAmaz
* shewBalt
* shewDeni
* shewFrig
* shewLoihPV4
* shewMR4
* shewMR7
* shewOnei
* shewPutrCN32
* shewW318
* shigFlex_2A
* siliPome_DSS_3
* sinoMeli
* sodaGlos_MORSITANS
* soliUsit_ELLIN6076
* sphiAlas_RB2256
* stapAure_MU50
* stapMari1
* streCoel
* strePyog_M1_GAS
* sulSol1
* sulfAcid1
* sulfToko1
* symbTher_IAM14863
* synePCC6
* syneSp_WH8102
* syntAcid_SB
* syntFuma_MPOB
* syntWolf_GOETTINGEN
* therAcid1
* therElon
* therFusc_YX
* therKoda1
* therMari
* therPend1
* therPetr_RKU_1
* therTeng
* therTher_HB27
* therTher_HB8
* therVolc1
* thioCrun_XCL_2
* thioDeni_ATCC25259
* thioDeni_ATCC33889
* trepPall
* tricEryt_IMS101
* tropWhip_TW08_27
* uncuMeth_RCI
* ureaUrea
* vermEise_EF01_2
* vibrChol1
* vibrChol_O395_1
* vibrFisc_ES114_1
* vibrPara1
* vibrVuln_CMCP6_1
* vibrVuln_YJ016_1
* wiggBrev
* wolbEndo_OF_DROSOPHIL
* woliSucc
* xantCamp
* xyleFast
* yersPest_CO92
* zymoMobi_ZM4


= Support =
In order to facilitate interaction among UAB Galaxy users, share experience, and provide peer-support we have established a galaxy-users group. To join this group and participate in email discussions please subscribe to the [http://vo.uabgrid.uab.edu/sympa/subscribe/galaxy-user galaxy-user group]. On-line archives of these discussions are available [https://vo.uabgrid.uab.edu/sympa/arc/galaxy-user here]. Please note, the email discussions are a public forum. You are advised to only post information you are authorized to share and comfortable with being public.


= References =
[[Category:Software]][[Category:Bioinformatics]][[Category:NGS]]
* [http://bitbucket.org/galaxy/galaxy-central/wiki/Home Galaxy Wiki]
* [http://bitbucket.org/galaxy/galaxy-central/wiki/ImplementationInfo Galaxy Architecture]
* [http://bitbucket.org/galaxy/galaxy-central/wiki/GetGalaxy Get Galaxy]
* [http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer Galaxy Advanced Config]

Latest revision as of 14:48, 13 December 2017

Overview

The UAB Galaxy platform for experimental biology and comparative genomics is designed to help you analyze multiple alignments, compare genomic annotations, profile metagenomic samples and more from your web browser. This platform is built on Galaxy, backed by the Cheaha compute cluster, and powered by UABgrid.

The primary uses of UAB Galaxy are to provide a simple web interface for NGS (short read sequencing) analysis for genomic and transcriptomic datasets, using tools like BWA, Bowtie, Tophat and Cufflinks, as well as simple sequence manipulation via the EMBOSS toolkit.

Using Galaxy / Tutorials

There are numerous general tutorials online at the Penn State public Galaxy site that are worth looking at.

There are also several UAB tutorials on NGS Analysis with Galaxy, created for HPC Boot Camp 2011 and a nice talk by Jeremy Goecks during Research Computing Day 2011.

Support

UAB galaxy-users list-serv: subscribe search.

UAB galaxy-help list-serv: email galaxy-help@listserv.uab.edu to contact admins of the UAB galaxy instance.

Privacy

Note that your data will be stored on the cluster filesystem, and while not accessible to ordinary users, it can be easily accessed by any of the galaxy or cluster administrators. It is not encrypted. Do not store sensitive information in this system.

Galaxy@UAB

The UAB Galaxy instance can be accessed at https://galaxy.uabgrid.uab.edu using BlazerID credentials. No account on the cluster is needed. However, the tools installed for galaxy (BWA, etc) can be accessed via the command line if you have an account on the cluster.

Loading Data

See Galaxy_File_Uploads.

Available Tools

Following is a partial list highlighting some of the important tools available. Additional tools can be installed upon request. To search for tools already integrated into the Galaxy system, see the Galaxy ToolShed.


Software                          Version    Information
bwa                               0.5.9-r26  Align genomic short reads to a reference genome
bowtie                            0.12.7     Align genomic short reads to a reference genome
tophat                            1.4.0      Align transcriptome short reads to a reference genome
cufflinks, cuffdiff, cuffcompare  1.3.0      Reconstruct and quantify transcript levels from tophat alignments.
samtools                          0.1.12a    Alignment (SAM/BAM file) manipulations
velvet                            1.1.03     De novo assembly
EMBOSS                            6.3.1      European Molecular Biology Open Software Suite - sequence manipulation and format conversion

Installed Genome Indexes

You can always use your own genome by uploading the .fasta into your history, but alignments against installed (pre-indexed) genomes run much more quickly. If you need an additional genome installed, please contact galaxy-help@vo.uabgrid.uab.edu.

dbkey             Genome                                                      Accessions
hg19              Human Feb. 2009 (GRCh37/hg19) (hg19)
hg18              Human Mar. 2006 (NCBI36/hg18) (hg18)
hg17              Human May 2004 (NCBI35/hg17) (hg17)
hg16              Human July 2003 (NCBI34/hg16) (hg16)
mm10              Mouse Dec. 2011 (GRCm38/mm10) (mm10)
mm9               Mouse July 2007 (NCBI37/mm9) (mm9)
mm8               Mouse Feb. 2006 (NCBI36/mm8) (mm8)
mm7               Mouse Aug. 2005 (NCBI35/mm7) (mm7)
mm6
mm5
sacCer3           S. cerevisiae Apr. 2011 (SacCer_Apr2011/sacCer3) (sacCer3)
sacCer2           S. cerevisiae June 2008 (SGD/sacCer2) (sacCer2)
ce10              C. elegans Oct. 2010 (WS220/ce10) (ce10)
rn5               Rat Mar. 2012 (RGSC 5.0/rn5) (rn5)
rn4               Rat Nov. 2004 (Baylor 3.4/rn4) (rn4)
danRer7           Zebrafish Jul. 2010 (Zv9/danRer7) (danRer7)
eschColi_APEC_O1  Escherichia coli APEC O1                                    chr=5082025
eschColi_CFT073   Escherichia coli CFT073                                     chr=5231428
eschColi_EC4115   Escherichia coli EC4115                                     chr=5572075,plasmid_pO157=94644,plasmid_pEC4115=37452
eschColi_K12      Escherichia coli K12                                        chr=4639675
eschColi_EDL993   Escherichia coli O157:H7 EDL933                             NC_007414=92077,NC_002655=5528445
eschColi_O157H7   Escherichia coli O157:H7 EDL933                             NC_007414=92077,NC_002655=5528445
eschColi_TW14359  Escherichia coli TW14359                                    chr=5528136,plasmid_pO157=94601

Additional Genomes that can be quickly installed

These are pre-indexed genomes we can easily download from Penn State's Galaxy Data-Cache

Organisms

  • AaegL1
  • Acropora_digitifera
  • AgamP3
  • Arabidopsis_thaliana_TAIR10
  • Arabidopsis_thaliana_TAIR9
  • Araly1
  • Bombyx_mori_p50T_2.0
  • CpipJ1
  • Homo_sapiens_AK1
  • Homo_sapiens_nuHg19_mtrCRS
  • Hydra_JCVI
  • IscaW1
  • PhumU1
  • Physcomitrella_patens_patens
  • Ptrichocarpa_156
  • Saccharomyces_cerevisiae_S288C_SGD2010
  • Schizosaccharomyces_pombe_1.1
  • Spur_v2.6
  • Sscrofa9.58
  • Tcacao_1.0
  • Tcas_3.0
  • Theobroma_cocoa
  • Zea_mays_B73_RefGen_v2
  • ailMel1
  • anoCar1
  • anoCar2
  • anoGam1
  • apiMel1
  • apiMel2
  • apiMel3
  • apiMel4.5
  • aplCal1
  • bighorn_sheep
  • borEut13
  • bosTau2
  • bosTau3
  • bosTau4
  • bosTau5
  • bosTau6
  • bosTau7
  • bosTauMd3
  • braFlo1
  • caeJap1
  • caePb1
  • caePb2
  • caeRem2
  • caeRem3
  • calJac1
  • calJac3
  • canFam1
  • canFam2
  • canFam3
  • cavPor2
  • cavPor3
  • cb3
  • ce10
  • ce2
  • ce3
  • ce4
  • ce5
  • ce6
  • ce7
  • ce8
  • ce9
  • choHof1
  • chrPic1
  • ci2
  • danRer2
  • danRer3
  • danRer4
  • danRer5
  • danRer6
  • danRer7
  • dasNov1
  • dasNov2
  • dipOrd1
  • dm1
  • dm2
  • dm3
  • dp3
  • dp4
  • droAna1
  • droAna2
  • droAna3
  • droEre1
  • droEre2
  • droGri1
  • droGri2
  • droMoj1
  • droMoj2
  • droMoj3
  • droPer1
  • droSec1
  • droSim1
  • droVir1
  • droVir2
  • droVir3
  • droWil1
  • droYak1
  • droYak2
  • echTel1
  • emf
  • equCab1
  • equCab2
  • equCab2_chrM
  • eriEur1
  • felCat3
  • felCat4
  • fr1
  • fr2
  • fr3
  • galGal2
  • galGal3
  • galGal4
  • gasAcu1
  • geoFor1
  • gorGor1
  • gorGor3
  • hetGla1
  • hetGla2
  • hg16
  • hg17
  • hg18
  • hg19
  • hg_g1k_v37
  • lMaj5
  • lengths
  • loxAfr3
  • loxAfr4
  • macEug1
  • melGal1
  • melUnd1
  • micMur1
  • mm10
  • mm5
  • mm6
  • mm7
  • mm8
  • mm9
  • monDom4
  • monDom5
  • myoLuc1
  • myoLuc2
  • nomLeu1
  • nomLeu2
  • ochPri2
  • ornAna1
  • oryCun1
  • oryCun2
  • oryLat1
  • oryLat2
  • oryza_sativa_japonica_nipponbare_IRGSP4.0
  • otoGar1
  • oviAri1
  • pUC18
  • panTro1
  • panTro2
  • panTro3
  • papHam1
  • petMar1
  • phiX
  • ponAbe2
  • priPac1
  • rheMac2
  • rheMac3
  • rn3
  • rn4
  • rn5
  • sacCer1
  • sacCer2
  • sacCer3
  • sarHar1
  • sorAra1
  • strPur2
  • strPur3
  • susScr2
  • susScr3
  • taeGut1
  • tarSyr1
  • tetNig1
  • tetNig2
  • triCas2
  • tupBel1
  • venter1
  • xenTro1
  • xenTro2
  • xenTro3

Microbes

  • Staphylococcus_aureus_aureus_USA300_FPR3757
  • Xanthomonas_oryzae_PXO99A
  • acidBact_ELLIN345
  • acidCell_11B
  • acidCryp_JF_5
  • acidJS42
  • acinSp_ADP1
  • actiPleu_L20
  • aerPer1
  • aeroHydr_ATCC7966
  • alcaBork_SK2
  • alkaEhrl_MLHE_1
  • anabVari_ATCC29413
  • anaeDeha_2CP_C
  • anapMarg_ST_MARIES
  • aquiAeol
  • archFulg1
  • arthFB24
  • azoaSp_EBN1
  • azorCaul2
  • baciAnth_AMES
  • baciHalo
  • baciSubt
  • bactThet_VPI_5482
  • bartHens_HOUSTON_1
  • baumCica_HOMALODISCA
  • bdelBact
  • bifiLong
  • blocFlor
  • bordBron
  • borrBurg
  • bradJapo
  • brucMeli
  • buchSp
  • burk383
  • burkCeno_AU_1054
  • burkCeno_HI2424
  • burkCepa_AMMD
  • burkMall_ATCC23344
  • burkPseu_1106A
  • burkThai_E264
  • burkViet_G4
  • burkXeno_LB400
  • caldMaqu1
  • caldSacc_DSM8903
  • campFetu_82_40
  • campJeju
  • campJeju_81_176
  • campJeju_RM1221
  • candCars_RUDDII
  • candPela_UBIQUE_HTCC1
  • carbHydr_Z_2901
  • caulCres
  • chlaPneu_CWL029
  • chlaTrac
  • chloChlo_CAD3
  • chloTepi_TLS
  • chroSale_DSM3043
  • chroViol
  • clavMich_NCPPB_382
  • colwPsyc_34H
  • coryEffi_YS_314
  • coxiBurn
  • cytoHutc_ATCC33406
  • dechArom_RCB
  • dehaEthe_195
  • deinGeot_DSM11300
  • deinRadi
  • desuHafn_Y51
  • desuPsyc_LSV54
  • desuRedu_MI_1
  • desuVulg_HILDENBOROUG
  • dichNodo_VCS1703A
  • ehrlRumi_WELGEVONDEN
  • ente638
  • enteFaec_V583
  • erwiCaro_ATROSEPTICA
  • erytLito_HTCC2594
  • eschColi_APEC_O1
  • eschColi_CFT073
  • eschColi_EC4115
  • eschColi_EDL933
  • eschColi_K12
  • eschColi_MG1655
  • eschColi_O157H7
  • eschColi_TW14359
  • flavJohn_UW101
  • franCcI3
  • franTula_TULARENSIS
  • fusoNucl
  • geobKaus_HTA426
  • geobMeta_GS15
  • geobSulf
  • geobTher_NG80_2
  • geobUran_RF4
  • gloeViol
  • glucOxyd_621H
  • gramFors_KT0803
  • granBeth_CGDNIH1
  • haemInfl_KW20
  • haemSomn_129PT
  • haheChej_KCTC_2396
  • halMar1
  • haloHalo1
  • haloHalo_SL1
  • haloWals1
  • heliAcin_SHEEBA
  • heliHepa
  • heliPylo_26695
  • heliPylo_HPAG1
  • heliPylo_J99
  • hermArse
  • hypeButy1
  • hyphNept_ATCC15444
  • idioLoih_L2TR
  • jannCCS1
  • lactLact
  • lactPlan
  • lactSali_UCC118
  • lawsIntr_PHE_MN1_00
  • legiPneu_PHILADELPHIA
  • leifXyli_XYLI_CTCB0
  • leptInte
  • leucMese_ATCC8293
  • listInno
  • magnMC1
  • magnMagn_AMB_1
  • mannSucc_MBEL55E
  • mariAqua_VT8
  • mariMari_MCS10
  • mculMari1
  • mesoFlor_L1
  • mesoLoti
  • metAce1
  • metMar1
  • metaSedu
  • methAeol1
  • methBark1
  • methBoon1
  • methBurt2
  • methCaps_BATH
  • methFlag_KT
  • methHung1
  • methJann1
  • methKand1
  • methLabrZ_1
  • methMari_C5_1
  • methMari_C7
  • methMaze1
  • methPetr_PM1
  • methSmit1
  • methStad1
  • methTher1
  • methTherPT1
  • methVann1
  • moorTher_ATCC39073
  • mycoGeni
  • mycoTube_H37RV
  • myxoXant_DK_1622
  • nanEqu1
  • natrPhar1
  • neisGono_FA1090_1
  • neisMeni_FAM18_1
  • neisMeni_MC58_1
  • neisMeni_Z2491_1
  • neorSenn_MIYAYAMA
  • nitrEuro
  • nitrMult_ATCC25196
  • nitrOcea_ATCC19707
  • nitrWino_NB_255
  • nocaFarc_IFM10152
  • nocaJS61
  • nostSp
  • novoArom_DSM12444
  • oceaIhey
  • oenoOeni_PSU_1
  • onioYell_PHYTOPLASMA
  • orieTsut_BORYONG
  • paraDeni_PD1222
  • paraSp_UWE25
  • pastMult
  • pediPent_ATCC25745
  • peloCarb
  • peloLute_DSM273
  • peloTher_SI
  • photLumi
  • photProf_SS9
  • picrTorr1
  • pireSp
  • polaJS66
  • polyQLWP
  • porpGing_W83
  • procMari_CCMP1375
  • propAcne_KPA171202
  • pseuAeru
  • pseuHalo_TAC125
  • psycArct_273_4
  • psycIngr_37
  • pyrAby1
  • pyrAer1
  • pyrFur2
  • pyrHor1
  • pyroArse1
  • pyroCali1
  • pyroIsla1
  • ralsEutr_JMP134
  • ralsSola
  • rhizEtli_CFN_42
  • rhodPalu_CGA009
  • rhodRHA1
  • rhodRubr_ATCC11170
  • rhodSpha_2_4_1
  • rickBell_RML369_C
  • roseDeni_OCH_114
  • rubrXyla_DSM9941
  • saccDegr_2_40
  • saccEryt_NRRL_2338
  • saliRube_DSM13855
  • saliTrop_CNB_440
  • salmEnte_PARATYPI_ATC
  • salmTyph
  • salmTyph_TY2
  • shewANA3
  • shewAmaz
  • shewBalt
  • shewDeni
  • shewFrig
  • shewLoihPV4
  • shewMR4
  • shewMR7
  • shewOnei
  • shewPutrCN32
  • shewW318
  • shigFlex_2A
  • siliPome_DSS_3
  • sinoMeli
  • sodaGlos_MORSITANS
  • soliUsit_ELLIN6076
  • sphiAlas_RB2256
  • stapAure_MU50
  • stapMari1
  • streCoel
  • strePyog_M1_GAS
  • sulSol1
  • sulfAcid1
  • sulfToko1
  • symbTher_IAM14863
  • synePCC6
  • syneSp_WH8102
  • syntAcid_SB
  • syntFuma_MPOB
  • syntWolf_GOETTINGEN
  • therAcid1
  • therElon
  • therFusc_YX
  • therKoda1
  • therMari
  • therPend1
  • therPetr_RKU_1
  • therTeng
  • therTher_HB27
  • therTher_HB8
  • therVolc1
  • thioCrun_XCL_2
  • thioDeni_ATCC25259
  • thioDeni_ATCC33889
  • trepPall
  • tricEryt_IMS101
  • tropWhip_TW08_27
  • uncuMeth_RCI
  • ureaUrea
  • vermEise_EF01_2
  • vibrChol1
  • vibrChol_O395_1
  • vibrFisc_ES114_1
  • vibrPara1
  • vibrVuln_CMCP6_1
  • vibrVuln_YJ016_1
  • wiggBrev
  • wolbEndo_OF_DROSOPHIL
  • woliSucc
  • xantCamp
  • xyleFast
  • yersPest_CO92
  • zymoMobi_ZM4