S3 commandline client s5cmd
This page documents the s5cmd S3 client.
s5cmd
The tool s5cmd lets you parallelise workloads such as data transfers. This is convenient when you want to copy a whole directory and its contents to an S3 bucket, or vice versa. More information can be found at https://github.com/peak/s5cmd, https://joshua-robinson.medium.com/s5cmd-for-high-performance-object-storage-7071352cc09d and https://aws.amazon.com/blogs/opensource/parallelizing-s3-workloads-s5cmd/.
The key benefit of s5cmd is its much better performance compared to clients such as s3cmd and the AWS CLI.
Installation
Binaries for Windows, Mac and Linux can be downloaded from: https://github.com/peak/s5cmd/releases
Authentication
You can populate the environment with the proper values, after which you do not need to pass anything on the command line:
export S3_ENDPOINT_URL=https://objectstore.surf.nl
export AWS_REGION=default
export AWS_ACCESS_KEY_ID=<access key>
export AWS_SECRET_ACCESS_KEY=<secret key>
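Before running a transfer it can help to confirm that all four variables are actually set. The following is a minimal sketch in POSIX shell; the function name check_s5cmd_env is hypothetical and not part of s5cmd.

```shell
# Hypothetical helper (not part of s5cmd): report which of the variables
# listed above are unset or empty in the current environment.
check_s5cmd_env() {
    missing=""
    for name in S3_ENDPOINT_URL AWS_REGION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
    return 0
}
```

It could be used as: check_s5cmd_env && s5cmd ls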
Or you can create a configuration file, similar to the one used by the awscli client. The file must be located at ~/.aws/credentials:
[default]
aws_access_key_id = <access key>
aws_secret_access_key = <secret key>
region = default
In the latter case, you will need to specify the service endpoint like so:
s5cmd --endpoint-url https://objectstore.surf.nl ls
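To avoid typing the endpoint on every call, the flag can be wrapped in a small shell function. This is only a sketch: the function name s5surf and the override variable S5CMD_ENDPOINT are hypothetical names chosen for illustration.

```shell
# Hypothetical wrapper: always pass the service endpoint to s5cmd.
# S5CMD_ENDPOINT is an illustrative override variable, not one s5cmd reads.
s5surf() {
    s5cmd --endpoint-url "${S5CMD_ENDPOINT:-https://objectstore.surf.nl}" "$@"
}
```

With this definition in your shell profile, s5surf ls is equivalent to the command shown above.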
Upload/Download an object to/from a bucket
An object can be uploaded to a bucket by the following command:
s5cmd cp <file name> s3://mybucket/myobject
It can be downloaded by:
s5cmd cp s3://mybucket/myobject <file name>
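The two commands can be combined into a simple round-trip check: upload a file, download it again to a temporary location, and compare the copies byte for byte. This is a sketch only. The function name roundtrip_check is hypothetical, and it assumes the endpoint and credentials are already configured through the environment.

```shell
# Hypothetical helper: upload a file, fetch it back, and compare the two copies.
# Assumes the endpoint and credentials are already set in the environment.
roundtrip_check() {
    local_file=$1
    remote_url=$2
    tmp_copy=$(mktemp) || return 1
    s5cmd cp "$local_file" "$remote_url" &&
        s5cmd cp "$remote_url" "$tmp_copy" &&
        cmp -s "$local_file" "$tmp_copy"
    status=$?
    rm -f "$tmp_copy"
    return $status
}
```

For example: roundtrip_check myfile s3://mybucket/myobject && echo "upload verified"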
Upload a folder with contents to a bucket
s5cmd cp /path/to/my/folder s3://mybucket
Download a bucket with contents to a directory
s5cmd cp 's3://mybucket/*' /path/to/my/folder/.
Large files
Important
By default, s5cmd spawns 256 workers to run its tasks in parallel, which makes it well suited to transferring a large number of small files. For larger files (>= 1GB) we have found it beneficial to reduce the number of workers, for example to 20, in order to reduce the load on the client side. To do that, use the command-line flag --numworkers <value>. An example is shown below:
s5cmd --numworkers 20 cp /path/to/my/folder/with/big/files s3://mybucket
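The choice of worker count can also be scripted. The sketch below picks 20 workers if a folder contains at least one file at or above a size threshold, and the default of 256 otherwise. The function name pick_numworkers is hypothetical, and reading "1GB" as 1073741824 bytes (1 GiB) is an assumption. The values 20 and 256 are taken from the text above.

```shell
# Hypothetical helper: suggest a --numworkers value for a folder.
# Prints 20 if any file is >= the threshold (assumed default: 1 GiB), else 256.
pick_numworkers() {
    dir=$1
    threshold_bytes=${2:-1073741824}
    # find's "+N" means strictly greater than N bytes, so use threshold - 1 for ">=".
    if [ -n "$(find "$dir" -type f -size "+$((threshold_bytes - 1))c" -print)" ]; then
        echo 20
    else
        echo 256
    fi
}
```

It could then be combined with the command above, for example: s5cmd --numworkers "$(pick_numworkers /path/to/my/folder/with/big/files)" cp /path/to/my/folder/with/big/files s3://mybucket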