Box.com bulk transfer
Revision as of 15:53, 4 May 2018
UAB has an Enterprise contract with Box.com, which is currently in BETA.
This page describes what we have learned about doing bulk-transfers of data.
Warning: size limitations
Box.com claims to have a 5G max filesize limit
- Now 15G max filesize (2/18/2016; private email), but "that file size limit is still considered to be in a beta phase"
There was a rumor this would be increased in early 2016.
Officially, this is all we know: https://community.box.com/t5/Managing-Your-Content/What-s-the-maximum-file-size-I-can-upload/ta-p/307
If you need to work around this, you can use the Linux "split" utility (http://ss64.com/bash/split.html):
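    # chop file into 4G pieces
    split \
      --bytes=4000m \
      big_file.fastq.gz \
      big_file.fastq.gz.split4g.

    # record checksums of original and chunks
    md5sum \
      big_file.fastq.gz \
      big_file.fastq.gz.split4g.* \
      > big_file.fastq.gz.md5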
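Once the chunks have been downloaded again, the original file can be rebuilt by concatenating them in name order and re-checking the recorded checksums. The following is a minimal sketch, assuming GNU coreutils and the file naming used above; the function name `reassemble_and_verify` is made up for illustration.

```shell
#!/bin/bash
# reassemble_and_verify is a hypothetical helper, not part of any Box tooling.
# Usage: reassemble_and_verify big_file.fastq.gz
# Expects big_file.fastq.gz.split4g.* and big_file.fastq.gz.md5 in the current directory.
reassemble_and_verify() {
    local name="$1"
    # split names its chunks ...aa, ...ab, ... so the shell glob expands them in order
    cat "$name".split4g.* > "$name" || return 1
    # re-check the rebuilt original and every chunk against the recorded checksums
    md5sum --check --quiet "$name.md5"
}
```

A non-zero return means either a chunk or the rebuilt file does not match its checksum, in which case the affected chunk should be transferred again.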
Warning: time stamps
When using an FTP client to transfer data up, it is easy to lose both modification and creation timestamps. In particular, many clients will (optionally) preserve modification time, but few will (optionally) preserve creation date.
FTP client | cost | platform | preserve mod_date | preserve create_date |
---|---|---|---|---|
lftp | free | linux/cmd_line | yes | no |
SmartFTP | $$ | Win Only/GUI | yes | can be enabled |
FileZilla | free | linux & win/GUI_only | can be enabled | no |
ftp_ssl | free | linux/cmd_line | yes | no |
FileZilla on creation times
- request closed, no plans to fix: https://trac.filezilla-project.org/ticket/2347
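If a client does drop modification times, one possible workaround is to record them in a manifest before the upload and re-apply them after a later download. The sketch below assumes GNU find and GNU touch and file names without newlines; the helper names `save_mtimes` and `restore_mtimes` are hypothetical. It does nothing for creation dates, which generally cannot be set this way on Linux.

```shell
#!/bin/bash
# Hypothetical helpers; they assume GNU find and GNU touch.

# save_mtimes DIR MANIFEST: record "epoch_seconds<TAB>relative_path" for every file under DIR
save_mtimes() {
    local dir="$1" manifest="$2"
    ( cd "$dir" && find . -type f -printf '%T@\t%p\n' ) > "$manifest"
}

# restore_mtimes DIR MANIFEST: re-apply the recorded modification times under DIR
restore_mtimes() {
    local dir="$1" manifest="$2" stamp path
    while IFS=$'\t' read -r stamp path; do
        if [ -e "$dir/$path" ]; then
            touch -d "@$stamp" "$dir/$path"
        fi
    done < "$manifest"
}
```

The manifest is a plain text file, so it can be uploaded next to the data and used by whoever downloads the directory later.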
Warning: Shared-to-you folders can't be moved
If someone creates a folder and shares it with you (as a reader, editor, co-owner, etc.), it will live in your top-level directory, and you will NOT be able to move it into any subfolder.
"Currently, users can't rearrange their own view of folders they are invited to collaborate within. As you note, when someone invites you to collaborate in a folder that you have never had access to before, you will see that folder on your root level." [box.com]
Workaround: if your collaborator makes you the full owner of the folder, then you will be able to move it.
Fix timeframe: "We've heard requests that people be able to rearrange their views before, and this is being considered as part of a larger product experience change next year" [box.com]
lftp mirror -R examples (UPload)
lftp mirror
- "mirror" copies directory hierarchies DOWN from box.com to local
- "mirror -R" copies directory hierarchies UP from local to box.com
error handling
- the Box server frequently loses the connection (fails) on particular files
- just re-run the "mirror -R" and it will upload only new/failed files.
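The re-run advice above can be automated with a small retry wrapper. This is only a sketch: `retry` is a hypothetical helper, and it assumes that lftp signals a failed transfer through a non-zero exit code, which should be confirmed for your lftp version and script.

```shell
#!/bin/bash
# retry is a hypothetical helper: run a command until it exits 0, at most MAX_TRIES times.
# Usage: retry MAX_TRIES DELAY_SECONDS command [args...]
retry() {
    local max_tries="$1" delay="$2" try=1
    shift 2
    while true; do
        "$@" && return 0
        if [ "$try" -ge "$max_tries" ]; then
            echo "giving up after $try attempts: $*" >&2
            return 1
        fi
        try=$((try + 1))
        sleep "$delay"
    done
}

# Example (assumes box_upload.lftp exists and lftp reports failure via its exit code):
#   retry 5 60 lftp -f box_upload.lftp
```

Because "mirror -R" skips files that are already uploaded, each retry only has to handle what failed in the previous pass.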
Interactive lftp
    lftp ftp.box.com
    > user BLAZERID@uab.edu users_BOX_external_password
    > mirror --parallel=10 -R local_src_dir box_dest_dir
Single-line lftp (non-shared box)
Warning: This form makes the password visible to "ps", so it should only be used on personal machines.

    lftp -u BLAZERID@uab.edu,users_BOX_external_password ftp.box.com << EOF
    mirror --parallel=10 -R local_src_dir box_dest_dir
    EOF
scripted lftp
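    cat > box_upload.lftp << EOF
    open ftp.box.com
    user BLAZERID@uab.edu users_BOX_external_password
    mirror --parallel=10 -R local_src_dir /box_dest_dir
    EOF
    chmod 700 box_upload.lftp
    lftp -f box_upload.lftp ; echo lftp_RC=$?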
scripted lftp - externalize password
Store your Box external password once in ~/.netrc (works for wget, lftp, etc.):

    cat >> ~/.netrc << EOF
    machine ftp.box.com
    login BLAZERID@uab.edu
    password user_Box_External_PW

    EOF

    chmod 700 ~/.netrc
Then, for each transfer, create a local .lftp file without a password. This keeps the password out of every script and leaves a single place to update it.
    cat > box_upload.lftp << EOF
    open ftp.box.com
    mirror -R local_src_dir /box_dest_dir
    EOF

    lftp -f box_upload.lftp ; echo lftp_RC=$?
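Because ~/.netrc holds the password in plain text, it may be worth confirming that the file is not accessible to group or others before starting a transfer. A minimal sketch, assuming GNU stat; `netrc_is_private` is a hypothetical helper name.

```shell
#!/bin/bash
# netrc_is_private is a hypothetical helper; it assumes GNU stat (-c '%a').
# Succeeds only if the file exists and grants no permissions to group or others.
netrc_is_private() {
    local file="${1:-$HOME/.netrc}" mode
    mode=$(stat -c '%a' "$file") || return 1
    case "$mode" in
        *00) return 0 ;;   # e.g. 600 or 700
        *)   echo "$file has mode $mode; consider: chmod 600 $file" >&2
             return 1 ;;
    esac
}
```

It could be called at the top of a transfer script, for example `netrc_is_private || exit 1`.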
lftp mirror examples (DOWNload)
scripted lftp
Arguments
- --loop: repeat the mirror until there are no new files left to download; helps if someone else is uploading to that directory while you are downloading it.
- -v: verbose level 1; includes bytes transferred and transfer speed.
- --parallel=10: transfer up to 10 files concurrently (much faster).
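    cat > box_download.lftp << EOF
    open ftp.box.com
    user BLAZERID@uab.edu users_BOX_external_password
    mirror --loop -v --parallel=10 /box_remote_src_dir local_dest_dir
    EOF
    chmod 700 box_download.lftp
    lftp -f box_download.lftp ; echo lftp_RC=$?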
Issues to resolve
- Routing over Internet2
  - we see our traffic randomly going over the commodity internet
Linux Support
Unfortunately, Box doesn't provide a Linux client (is it on the road map?).