File and data transfer

Research Computing offers two primary mechanisms for data transfer: SSH-based transfer (scp and sftp), and GridFTP, including access through Globus Connect.

SSH file transfer (SCP / SFTP)

SSH-based file transfer is slower than the Globus and GridFTP method detailed below, but it is still commonly used because it relies on the same SSH software that is already used to interact with Research Computing resources.

Command-line secure copy (SCP)

Files can be transferred to and from the Research Computing environment using the scp command from a Unix or Linux command-line (including Mac OS X).

$ scp ${local_filename} rc_username@login.rc.colorado.edu:/path/to/target-directory
(Modify the above command with the appropriate paths and your RC username.)
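The same command works in the other direction and for whole directories. The following is a minimal sketch rather than an official recipe: the local directory ./results and the remote file output.log are hypothetical names, and the run helper is an illustrative wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration); replace with your own.
rc_user="rc_username"
rc_host="login.rc.colorado.edu"
remote_dir="/path/to/target-directory"

# By default this sketch only prints each command. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Upload a whole directory (hypothetical ./results); -r copies recursively.
run scp -r ./results "${rc_user}@${rc_host}:${remote_dir}"

# Download a single file (hypothetical output.log) into the current directory.
run scp "${rc_user}@${rc_host}:${remote_dir}/output.log" .
```

In scp, whichever argument carries the user@host: prefix is the remote side, so swapping the argument order reverses the direction of the copy.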
Tutorial: When transferring files using scp, it is particularly useful to use OpenSSH ControlMaster to reduce the number of one-time password (OTP) entries.
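As a rough illustration of what a ControlMaster setup can look like, the fragment below could be added to ~/.ssh/config on your local machine. ControlMaster, ControlPath, and ControlPersist are standard OpenSSH client options, but the socket path and the four-hour lifetime shown here are assumptions chosen for the example, not Research Computing recommendations.

```
# ~/.ssh/config (sketch; the path and timeout values are assumptions)
Host login.rc.colorado.edu
    # Reuse one authenticated connection for later ssh, scp, and sftp sessions
    ControlMaster auto
    # Socket used to share the connection (%r = remote user, %h = host, %p = port)
    ControlPath ~/.ssh/control-%r@%h:%p
    # Keep the shared connection open in the background for up to four hours
    ControlPersist 4h
```

With a configuration along these lines, the first ssh, scp, or sftp connection to the host prompts for the password, and later connections reuse it while the master connection is alive. Running ssh -O check login.rc.colorado.edu reports whether a master connection is active, and ssh -O exit login.rc.colorado.edu closes it.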

Interactive clients

Files can also be transferred to and from the Research Computing environment using a number of interactive file transfer applications. The most basic of these is sftp, available from a Unix or Linux command line (including Mac OS X).

$ sftp rc_username@login.rc.colorado.edu

Once an SFTP connection is established, files can be transferred using the get and put commands. More information can be accessed using the help command.
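A short session might look like the following sketch. The file names results.tar.gz and output.log are hypothetical, and the remote path is the same placeholder used in the scp example.

```
sftp> cd /path/to/target-directory
sftp> put results.tar.gz
sftp> get output.log
sftp> help
sftp> bye
```

Here cd changes the remote directory, put uploads a local file, get downloads a remote file, and bye closes the connection. The lcd command changes the local directory in the same way.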

Alternative, graphical file-transfer clients are available for Windows, including:

  • FileZilla, a multi-protocol, multi-platform file-transfer application.
  • WinSCP, a basic SCP/SFTP file-transfer application for Windows.

Graphical file-transfer applications often retain passwords for automatic authentication for later transfers. Because Research Computing uses one-time passwords for authentication, you must disable password retention / saving in your file-transfer client. Failure to do so may cause your account to be temporarily disabled after the client attempts and fails to authenticate repeatedly in the background.

File-system and application integration

SSH-accessible filesystems can be directly integrated into compatible applications and operating systems.

Configuration of these OS and application-level integrations is currently outside the scope of this guide.

Globus

Globus addresses deficiencies in secure copy requests by automating large data transfers, by resuming failed transfers, and by simplifying the implementation of high-performance transfers between computing centers.

Globus.org and Globus Online

Globus Online is a Software as a Service (SaaS) deployment of the Globus Toolkit that provides end users with a browser interface to initiate data transfers between endpoints registered with the Globus Alliance. Globus Online allows registered users to “drag and drop” files from one endpoint to another. Endpoints are the sources and destinations of transfers; they can be laptops, supercomputers, or anything in between. The servers at Globus.org act as intermediaries, negotiating, monitoring, and optimizing transfers through firewalls and across network address translation (NAT). Under certain circumstances, with high-performance hardware, transfer rates exceeding 1 GB/s are possible.

We recommend reading through Globus.org's overview documentation.

Getting an account

To use Globus Online, you will first need to sign up for an account at Globus.org. If you prefer to use your CU Identikey account for authentication to Globus Online, choose "University of Colorado Boulder" from the dropdown menu on the login page and follow the instructions from there.

Transferring data to/from a local workstation

You can use Globus Online to transfer data between your local workstation (e.g., your laptop or desktop) and Research Computing. In this workflow, you configure your local workstation as a Globus endpoint using Globus Connect.

  1. Log in to Globus.org
  2. Use the Manage Endpoints interface to “add Globus Connect Personal” as an endpoint. (More information at Globus.org support.)
  3. Transfer Files, using your new workstation endpoint for one side of the transfer and the Research Computing endpoint (CU-Boulder Research Computing) for the other side. (You will be required to authenticate to the Research Computing endpoint using your RC account and OTP or Duo.)

Transferring data between two remote endpoints

Globus.org can also be used to transfer data between two remote Globus endpoints (e.g., between your local compute center's Globus endpoint and the Research Computing endpoint).

  1. Log in to Globus.org
  2. Transfer Files, using the Research Computing endpoint (CU-Boulder Research Computing) for one side of the transfer, and another endpoint of your choice for the other side. (You will be required to authenticate to the Research Computing endpoint using your RC account and OTP. The other endpoint may require its own authentication as well.)

Globus Connect command-line interface

Globus.org provides a command-line interface (CLI) as an alternative to its web interface. This command-line interface is provided over an SSH connection to a Globus.org server.

  1. Use the Manage Identities interface at Globus.org to upload your SSH public key.
  2. Connect to Globus.org using an SSH client.

    $ ssh -l globus_username cli.globusonline.org
  3. The Globus.org command-line interface can start and manage transfers, manage files on an endpoint, and configure endpoints associated with your account. Use the help command for more information on the commands available, or visit the Globus.org support system.