Parallel Download Optimization
Motivation
We want to download a file of size
Derivation
The main idea is that the downloading is fastest when the most threads possible are running in parallel. Therefore our new method will calculate the threads sizes in a following way: - We start a new thread as soon as possible (after
This is the optimum way, since there will be running as many threads as possible for as long time as possible.
Denote
We have a limit for the speed:
Since we start a new thread after
We can calculate the total amount of time is the time until the last thread stops downloading. The first thread is downloading for
Thus the total amount of time of the download is
The best way is to split it into as many pieces as possible while
Using the inequality
There must also hold that
Let’s answer the second question: having
we can calculate
, we can calculate
Example
Suppose we want to calculate optimal number of threads and later corresponding splits for the following parameters.
What is the best
Therefore the final optimal number of threads to choose
Using this
We can at last check, that
The best way to split the files is to let the first thread download 16 MB, the second 10 MB and the last thread 4 MB for a total time of 160 s.
Comparison to a naive method
Let’s compare it to a naive method of splitting. The naive method splits the file into
and the total length
The total time spent by downloading
and thus
Our method had
If we compare them, then
So if we would start the downloading using both methods at the same time using the same number of threads, then our method would end