Box.com bulk transfer

From UABgrid Documentation

Revision as of 12:30, 16 December 2015

UAB has an Enterprise contract with Box.com, which is currently in BETA.

This page describes what we have learned about doing bulk-transfers of data.

Warning: size limitations

Box.com claims to have a 5 GB maximum file size limit.

There was a rumor that this would be increased in early 2016. Officially, this is all we know: https://community.box.com/t5/Managing-Your-Content/What-s-the-maximum-file-size-I-can-upload/ta-p/307

If you need to work around this, you can use the Linux "split" utility to cut a large file into pieces that stay below the limit, and record checksums of the original and the pieces:

# chop file into 4000 MiB pieces (just under 4 GiB, below the 5 GB limit)
split \
    --bytes=4000m \
    big_file.fastq.gz \
    big_file.fastq.gz.split4g.

# record checksums of original and chunks
md5sum \
    big_file.fastq.gz \
    big_file.fastq.gz.split4g.* \
    > big_file.fastq.gz.md5
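
After the pieces have been downloaded again, they have to be concatenated and checked. The following is a minimal sketch of that step, assuming the piece names and the .md5 file produced by the commands above; the function name reassemble_and_verify is made up for this example.

```shell
#!/bin/sh
# Reassemble split pieces and verify them against the recorded checksums.
# Assumes the naming used above: <name>.split4g.aa, <name>.split4g.ab, ...
# and a checksum file <name>.md5 in the current directory.
reassemble_and_verify() {
    name="$1"
    # the glob expands in sorted order (aa, ab, ...), the order split wrote them
    cat "$name".split4g.* > "$name" || return 1
    # md5sum -c re-checks every file listed in the checksum file,
    # i.e. the reassembled original and each piece
    md5sum -c "$name.md5"
}

# usage: reassemble_and_verify big_file.fastq.gz
```

md5sum -c prints one line per file and exits non-zero if any checksum does not match, so the exit status can be used in scripts.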

Warning: time stamps

When an FTP client is used to upload data, it is easy to lose both modification and creation timestamps. Many clients can (optionally) preserve the modification time, but few can preserve the creation time.

FTP client   platform              preserves modification time   preserves creation time
SmartFTP     GUI/Win only/$$       yes                           can be enabled
lftp         cmd_line/linux/free   yes                           no
FileZilla    GUI/linux+win/free    can be enabled                no
ftp_ssl      cmd_line/linux/free   yes                           no
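
If the client in use cannot preserve modification times, one possible workaround is to record them in a manifest before the upload and re-apply them after a later download. The sketch below is an assumption on our part, not something Box.com or any of the clients above provides: it relies on GNU find, touch and the function names save_mtimes and restore_mtimes, which are made up for this example. Creation times cannot be restored this way, and file names containing newlines are not handled.

```shell
#!/bin/sh
# save_mtimes DIR MANIFEST: write "<epoch seconds> <path>" for every file in DIR
save_mtimes() {
    # %T@ is the modification time in seconds since the epoch, %p the path (GNU find)
    find "$1" -type f -printf '%T@ %p\n' > "$2"
}

# restore_mtimes MANIFEST: set each listed file back to its recorded time
# (run from the same directory save_mtimes was run from)
restore_mtimes() {
    while read -r ts path; do
        # drop the fractional part; whole seconds are enough here
        touch -d "@${ts%.*}" "$path"
    done < "$1"
}

# usage:
#   save_mtimes local_src_dir local_src_dir.mtimes
#   restore_mtimes local_src_dir.mtimes
```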

FileZilla on create times

  • request closed, no plans to fix: https://trac.filezilla-project.org/ticket/2347

lftp mirror -R example

lftp mirror

  • "mirror" copies directory hierarchies DOWN from box.com to local
  • "mirror -R" copies directory hierarchies UP from local to box.com

error handling

  • the Box server frequently loses the connection (fails) on particular files
  • just re-run the same "mirror -R" command; it will upload only new/failed files

Interactive lftp

lftp ftp.box.com

> user BLAZERID@uab.edu users_BOX_external_password
> mirror --parallel=10 -R local_src_dir box_dest_dir


scripted lftp

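# write the lftp commands to a script file; the quoted 'EOF' keeps the shell
# from expanding characters such as $ in the password
cat > box_test.lftp << 'EOF'
open ftp.box.com
user BLAZERID@uab.edu users_BOX_external_password
mirror --parallel=10 -R local_src_dir /box_dest_dir
EOF
# the file contains a password: make it accessible to the owner only
chmod 600 box_test.lftp
# run the script and report lftp's exit status
lftp -f box_test.lftp ; echo lftp_RC=$?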
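
Because failed uploads are handled by simply re-running the mirror, the re-run can be automated. The following is a rough sketch of a generic retry wrapper. It assumes that lftp exits with a non-zero status when the mirror did not complete cleanly, which is worth confirming against the lftp_RC value printed above. The function name retry and the RETRY_DELAY variable are made up for this example.

```shell
#!/bin/sh
# retry MAX CMD [ARGS...]: run CMD until it exits 0, at most MAX times.
# RETRY_DELAY is the pause in seconds between attempts (default 30).
retry() {
    max="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        # pause before the next attempt, but not after the last one
        if [ "$attempt" -le "$max" ]; then
            sleep "${RETRY_DELAY:-30}"
        fi
    done
    return 1
}

# usage: retry 5 lftp -f box_test.lftp
```

Each re-run only uploads new/failed files, so the later attempts are normally much shorter than the first.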
