AWS CLI download via S3
Overview | Installing AWS CLI | Configuring AWS CLI | Downloading using AWS CLI without saving a configuration | Improving download speeds
Overview
AWS CLI is a command-line application that you can use to download files from the Data Access Portal (DAP). It is available for Windows, macOS and Linux.
AWS CLI (external site)
Installing AWS CLI
Refer to the AWS CLI documentation for installation (external site).
Many Unix-like distributions will have a package that you can install. Refer to your operating system’s documentation for guidance on how to install via your package manager.
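After installing, it can help to confirm that the executable is on your PATH before going further. The following is a minimal sketch for Linux/Unix/macOS shells; the helper name check_aws_cli is hypothetical, and the only AWS CLI command it relies on is aws --version.

```shell
# Report whether the aws executable is on PATH and, if so, print its version.
# check_aws_cli is an illustrative helper name, not part of AWS CLI.
check_aws_cli() {
    if command -v aws >/dev/null 2>&1; then
        aws --version
    else
        echo "aws not found on PATH"
        return 1
    fi
}

# Run the check; "|| true" keeps a surrounding script going if aws is missing.
check_aws_cli || true
```

If the check reports that aws is not found, revisit the installation steps or open a new terminal so that PATH changes take effect.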
Configuring AWS CLI
If you don’t use AWS CLI for anything else you can skip this section.
If you already use AWS CLI and have configured it for connecting to other systems, you may need to ensure your default settings do not cause conflicts.
For example, if you regularly use another third-party S3 service (not Amazon Web Services) that uses self-signed certificates, you may have specified your local certificate bundle in your default configuration. Your configuration files are stored at the following location:
Operating System | Path |
---|---|
Linux/Unix/macOS | ~/.aws/ |
Windows | %USERPROFILE%/.aws/ |
If you don’t have a .aws folder, it’s not necessary to create one when downloading from the DAP.
If you do have a file called config in your .aws folder, look at the contents to see if you have any default settings, e.g.
Linux/Unix/macOS:
# Or use your text editor of choice
vi ~/.aws/config
Windows:
notepad %USERPROFILE%/.aws/config
Anything you have in your [default] profile could cause problems when downloading from the DAP. You should consider configuring specific profiles for the different connections you wish to make. Refer to the AWS CLI documentation (external site) for guidance on how to use configuration files.
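To see only what is in the [default] profile rather than reading the whole file, something like the following sketch may help on Linux/Unix/macOS. The function name show_default_profile is hypothetical, and the sketch assumes the section header is written exactly as [default] with no trailing spaces.

```shell
# Print only the [default] section of an AWS CLI config file.
# Usage: show_default_profile [path]   (path defaults to ~/.aws/config)
# show_default_profile is an illustrative helper name, not part of AWS CLI.
show_default_profile() {
    config_file="${1:-$HOME/.aws/config}"
    if [ ! -f "$config_file" ]; then
        echo "No config file at $config_file"
        return 0
    fi
    # Each line starting with "[" opens a new section; print lines only
    # while the current section is [default].
    awk '
        /^\[/ { in_default = ($0 == "[default]") }
        in_default { print }
    ' "$config_file"
}

show_default_profile
```

Settings such as a certificate bundle or a custom endpoint that appear in this output are the ones most likely to interfere with a DAP download.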
Downloading using AWS CLI without saving a configuration
Every time you request access to a DAP collection via S3 you will receive a different set of credentials that only apply to that collection. These credentials only remain valid for around 48 hours. Depending on your needs you may not want to go through the process of saving a configuration.
Downloading files without saving a configuration involves three steps:
- Retrieving the connection information.
- Entering access keys as environment variables.
- Running a download command.
First you need to request the S3 access details for the DAP collection you want (follow the steps listed up to but not including where you click “Open S3 Client”).
Note: The Download Information is unique to a download request for a collection. It is provided here to illustrate the download process and copying this information exactly will cause an error.
You can click the “copy” icons to copy the values you need to your clipboard.
Save the Access Key and Secret Access Key as environment variables:
# These are example access keys that will not work
export AWS_ACCESS_KEY_ID=ABCDEFGHIJK123456789
export AWS_SECRET_ACCESS_KEY=Aa1234+567/BbCcDcEeFfGgHhIiJjKkLlMmNnOoP
Windows PowerShell sets environment variables differently from the Windows Command Prompt.
PowerShell:
# These are example access keys that will not work
$env:AWS_ACCESS_KEY_ID = "ABCDEFGHIJK123456789"
$env:AWS_SECRET_ACCESS_KEY = "Aa1234+567/BbCcDcEeFfGgHhIiJjKkLlMmNnOoP"
Command Prompt:
REM These are example access keys that will not work
SET AWS_ACCESS_KEY_ID=ABCDEFGHIJK123456789
SET AWS_SECRET_ACCESS_KEY=Aa1234+567/BbCcDcEeFfGgHhIiJjKkLlMmNnOoP
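Because the keys live only in the current shell session, it can be useful to confirm they are set before downloading and to remove them afterwards. The sketch below is for Linux/Unix/macOS shells; both function names are hypothetical helpers, and neither prints the key values.

```shell
# Report whether both access key variables are set, without printing them.
# dap_credentials_status is an illustrative helper name.
dap_credentials_status() {
    if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "credentials set"
    else
        echo "credentials missing"
        return 1
    fi
}

# Remove the temporary keys from the current shell session.
# clear_dap_credentials is an illustrative helper name.
clear_dap_credentials() {
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
}
```

Calling clear_dap_credentials once the download has finished avoids the expired DAP keys being picked up by a later, unrelated AWS CLI command in the same terminal.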
An optional step is to create a folder that you want to download the files to. If you skip this step and specify a folder that you have not already created, AWS CLI will create the folder for you. In the example below, modify the path to an appropriate place on your file system.
mkdir ~/Downloads/my_dap_download_example
Note that if your local folder has spaces in it you will need to enclose the local folder path in quotes.
md %USERPROFILE%\Downloads\my_dap_download_example
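On Linux/Unix/macOS the same quoting rule applies to paths with spaces, with one extra catch: a ~ inside quotes is not expanded by the shell, so $HOME is the safer choice. A minimal sketch, using a hypothetical helper name and a hypothetical folder name:

```shell
# Create a download folder and print its path.
# make_download_dir is an illustrative helper name.
make_download_dir() {
    # Quoting "$1" keeps a path containing spaces as a single argument.
    # -p creates missing parent folders and does not fail if the folder exists.
    mkdir -p "$1" && printf '%s\n' "$1"
}

# Example with a hypothetical folder name containing spaces.
# "$HOME" is used because a quoted "~" would be taken literally:
# make_download_dir "$HOME/Downloads/my dap download example"
```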
Use the Endpoint URL and the Remote Directory values to run a download command, e.g.
# aws s3 --endpoint-url {endpoint url value} --recursive cp s3://{remote directory value} {local path}
aws s3 --endpoint-url https://s3.data.csiro.au --recursive cp s3://dapprd/000005588v002/ ~/Downloads/my_dap_download_example/
Note that if your local folder has spaces in it you will need to enclose the local folder path in quotes.
aws s3 --endpoint-url https://s3.data.csiro.au --recursive cp s3://dapprd/000005588v002/ %USERPROFILE%/Downloads/my_dap_download_example/
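For repeated downloads on Linux/Unix/macOS, the three steps can be wrapped in a small shell function. This is a sketch under stated assumptions: dap_download is a hypothetical helper name, the options are written after cp in the ordering commonly shown in the AWS CLI documentation, and any extra arguments are passed through so that the AWS CLI --dryrun option can be used to preview the transfer without copying anything.

```shell
# dap_download ENDPOINT_URL REMOTE_DIR LOCAL_DIR [extra aws s3 cp options]
# dap_download is an illustrative helper name, not part of AWS CLI.
dap_download() {
    if [ "$#" -lt 3 ]; then
        echo "usage: dap_download ENDPOINT_URL REMOTE_DIR LOCAL_DIR [options]" >&2
        return 2
    fi
    endpoint_url="$1"
    remote_dir="$2"
    local_dir="$3"
    shift 3

    # Refuse to run if the temporary access keys have not been exported.
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
        return 1
    fi

    # Quoting the local path handles folders with spaces.
    # Remaining arguments (for example --dryrun) are passed through to aws.
    aws s3 cp "s3://$remote_dir" "$local_dir" \
        --recursive --endpoint-url "$endpoint_url" "$@"
}

# Preview first, then download (values taken from the example above):
# dap_download https://s3.data.csiro.au dapprd/000005588v002/ "$HOME/Downloads/my_dap_download_example/" --dryrun
# dap_download https://s3.data.csiro.au dapprd/000005588v002/ "$HOME/Downloads/my_dap_download_example/"
```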
Improving download speeds
AWS CLI does not have many options that will help you optimise the speed of your downloads. If you are on a fast network connection and expect to achieve faster speeds, rclone is a command-line tool that can download multiple files concurrently, which can improve speeds.
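As an illustration of what a concurrent rclone download might look like on Linux/Unix/macOS, the sketch below reuses the access keys already exported as environment variables. It has not been tested against the DAP endpoint: the helper name dap_rclone_copy, the choice of "Other" as the S3 provider and the default of 8 parallel transfers are all assumptions, so check the flags against the rclone documentation before relying on them.

```shell
# dap_rclone_copy ENDPOINT_URL REMOTE_DIR LOCAL_DIR [TRANSFERS]
# dap_rclone_copy is an illustrative helper name, not part of rclone.
dap_rclone_copy() {
    endpoint_url="$1"
    remote_dir="$2"
    local_dir="$3"
    transfers="${4:-8}"   # assumed default; tune for your connection

    # ":s3:" defines an S3 remote on the command line, so no rclone
    # configuration file is needed. --s3-env-auth reads the keys from
    # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
    rclone copy ":s3:$remote_dir" "$local_dir" \
        --s3-provider Other \
        --s3-env-auth \
        --s3-endpoint "$endpoint_url" \
        --transfers "$transfers" \
        --progress
}

# Example using the values from the download command above:
# dap_rclone_copy https://s3.data.csiro.au dapprd/000005588v002/ "$HOME/Downloads/my_dap_download_example/" 16
```

Raising the number of transfers tends to help most for collections made up of many small files; for a few very large files the gain may be smaller.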