Time-out value
It is important to set the `--timeout` option high enough. As a rule of thumb, allow 10 minutes for every GB of the biggest file in a collection. So if the biggest file you want to upload in a collection is 10 GB, set `--timeout 100m`.
This may look ridiculously large, but it provides a safe margin against timeout problems.
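The rule of thumb can be sketched as a small calculation. The file size, local path, and remote name below are hypothetical placeholders; the command is echoed rather than executed:

```shell
# Rule of thumb: 10 minutes of timeout per GB of the biggest file.
BIGGEST_FILE_GB=10                       # hypothetical: largest file in the collection
TIMEOUT_MIN=$((BIGGEST_FILE_GB * 10))    # 10 GB -> 100 minutes
echo "rclone copy --timeout ${TIMEOUT_MIN}m ./mydata remote:mydata"
```

Here a 10 GB largest file yields `--timeout 100m`, matching the example above.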
Cookies
The use of the `--use-cookies` flag in rclone is highly recommended, bordering on mandatory. Without it, your uploads are spread over multiple backends, which causes race conditions. For example, one transfer creates a directory while another process tries to place a file in it. If the second process is quicker than the first, the directory does not exist yet and the upload fails with an error. With the `--use-cookies` flag you force all your transfers through the same backend, which fixes this problem.
The only scenario where you do not need to set `--use-cookies` is when you upload a completely flat directory structure, for example 10,000 files in a single directory.
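A minimal sketch of an upload command with the flag set. The local path and the remote name `mydrive:` are hypothetical placeholders, and the command is echoed rather than executed:

```shell
# Force all transfers through the same backend with session cookies.
# "mydrive:" and the paths are hypothetical placeholders.
CMD="rclone copy --use-cookies --transfers 4 ./collection mydrive:collection"
echo "$CMD"   # shown instead of run in this sketch
```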
The impact of file size
File size has a big impact on performance. It is not a problem to migrate a large number of small files, but keep in mind that because every transfer has some overhead, the throughput for smaller files will be lower. For comparison, running the test with 24 parallel processes:
| Total transfer | File size | Number of files | Seconds to complete |
|---|---|---|---|
| 50 MB | 10 kB | 5000 | 350 |
| 50 MB | 100 kB | 500 | 45 |
| 50 MB | 500 kB | 100 | 25 |
| 50 MB | 1 MB | 50 | 15 |
| 50 MB | 2 MB | 25 | 10 |
| 50 MB | 5 MB | 10 | 10 |
| 50 MB | 10 MB | 5 | 10 |
| 50 MB | 25 MB | 2 | 15 |
As you can see, the number of files in the transfer has a big influence on the total transfer time. Set your expectations accordingly.
A point about parallelization
A great deal of performance can be gained by running more uploads in parallel. In rclone this defaults to 4 and can be set with the `--transfers` argument. However, setting it too high causes problems: don't go above 24, as performance degrades beyond that point.
Rule of thumb
If the files in your collection are all smaller than 5 GB, just set `--transfers 24`.
If your collection has files of 5 GB or bigger, calculate the optimum as follows:
100 / largest filesize in collection (in GB) = number of parallel processes (max 24)
So, if the largest files in your collection are 5 GB, you can run a maximum of 20 parallel processes:
100 / 5 = 20
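This rule of thumb can be sketched in shell arithmetic; the 5 GB largest-file size is the example value from above:

```shell
# Sketch: derive --transfers from the largest file size in the collection.
LARGEST_GB=5                      # example: largest file is 5 GB
TRANSFERS=$((100 / LARGEST_GB))   # 100 / 5 = 20
if [ "$TRANSFERS" -gt 24 ]; then TRANSFERS=24; fi   # cap at 24
echo "--transfers $TRANSFERS"
```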
If you have outliers, you can generally ignore them. So if you have a large collection of 5 GB files but a single one of 20 GB, you can ignore the latter. If you do run into problems uploading, recalculate like this:
(100 - size of outlier in GB) / typical largest filesize in collection (in GB) = number of parallel processes (max 24)
So that would be 16 parallel processes:
(100 - 20) / 5 = 80 / 5 = 16
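The outlier recalculation above can be sketched the same way; the 20 GB outlier and 5 GB typical size are the example values from the text:

```shell
# Sketch: recalculate --transfers when a single outlier causes upload problems.
OUTLIER_GB=20    # example: one 20 GB outlier file
TYPICAL_GB=5     # example: typical largest file size
TRANSFERS=$(( (100 - OUTLIER_GB) / TYPICAL_GB ))    # (100 - 20) / 5 = 16
if [ "$TRANSFERS" -gt 24 ]; then TRANSFERS=24; fi   # cap at 24
echo "--transfers $TRANSFERS"
```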