Conversation
Hi @azr. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 2531c18 to 0748b0f
@azr Thanks for the PR, this looks promising. I wonder if you were able to get any memory usage data from your tests? A previous effort to use the ECR containerd resolver, which has a similar multipart layer download, showed that it can take up a disproportionate amount of memory, especially when we increase the number of parallel chunks (without providing a significant benefit to latency). The high memory utilization was mainly from the […].

Also, can you share some information about your test image? Number of layers? Size of individual layers?
/ok-to-test
Hey @swagatbora90, of course! The theory in my mind is that this should use, worst case, roughly `parallelism * chunk_size` of memory (e.g. with parallelism 110 and 32MB chunks, that's about 3.4GB). I think memory usage would be better if we were to write in parallel directly into a file at different positions, with 'holes', and sort of tell our progress to the checksumer with no-op writers that report where we are, etc. (Downloads actually were so much faster this way in a test program I did, but it was not doing any unpacking, etc.)

I also think it could be nice to have a per-registry parallelism setting, because not all registries are S3-backed, and docker.io seems to throttle things at 60MB/s.

Topology of images:
- ~8GB image (from dive infos)
- ~27GB image (from dive infos)

Here are memory usages, recorded periodically:
- ~27GB image pull, max_concurrent_downloads: 2, 0 parallelism (before)
- ~27GB image pull, max_concurrent_downloads: 2, 110 parallelism, 32mb chunks

GC traces:
- 8GB image with `GODEBUG=gctrace=1`, parallelism set to 110 and chunksize set to 32
- 8GB image with `GODEBUG=gctrace=1`, parallelism set to 0 (existing code)

Grpc tracing screenshots from the same run (8GB image with `GODEBUG=gctrace=1`, parallelism set to 110 and chunksize set to 32):

Screenshot from another run for a ~27GB image: after a while, all chunks seem to take the same amount of time, ~22s; we've probably reached the write-speed burst limit and are slowly taking more time to do things.
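A rough, runnable sketch of the chunked-download scheme being discussed (illustrative names and structure, not the PR's actual code): ranged GETs run in parallel, but completed chunks are written out strictly in order so the digest verifier can stream as usual; the writer loop is the one releasing semaphore slots, which is what keeps buffered data near `parallelism * chunk_size`:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// parallelFetch downloads [0, size) in chunkSize ranges with at most
// `parallelism` chunks outstanding, writing them to w strictly in order.
func parallelFetch(w io.Writer, url string, size, chunkSize int64, parallelism int) error {
	type result struct {
		data []byte
		err  error
	}
	sem := make(chan struct{}, parallelism)
	queue := make(chan chan result, parallelism)

	go func() { // dispatcher: one fetcher goroutine per chunk, in order
		defer close(queue)
		for off := int64(0); off < size; off += chunkSize {
			ch := make(chan result, 1)
			sem <- struct{}{} // slot acquired in chunk order; released by the writer
			queue <- ch
			go func(off int64, ch chan result) {
				end := min(off+chunkSize-1, size-1)
				req, _ := http.NewRequest(http.MethodGet, url, nil)
				req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, end))
				resp, err := http.DefaultClient.Do(req)
				if err != nil {
					ch <- result{err: err}
					return
				}
				defer resp.Body.Close()
				data, err := io.ReadAll(resp.Body)
				ch <- result{data: data, err: err}
			}(off, ch)
		}
	}()

	var firstErr error
	for ch := range queue { // consume in order; drain fully so no goroutine leaks
		r := <-ch
		<-sem // chunk consumed: a new fetch may start
		if firstErr == nil {
			firstErr = r.err
		}
		if firstErr == nil {
			if _, err := w.Write(r.data); err != nil { // in-order writes feed the digester
				firstErr = err
			}
		}
	}
	return firstErr
}

func main() {
	payload := strings.Repeat("containerd!", 100_000) // ~1.1 MB test blob
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeContent(w, r, "blob", time.Time{}, strings.NewReader(payload)) // honors Range
	}))
	defer srv.Close()

	var buf bytes.Buffer
	if err := parallelFetch(&buf, srv.URL, int64(len(payload)), 64<<10, 4); err != nil {
		panic(err)
	}
	fmt.Println("bytes:", buf.Len(), "match:", buf.String() == payload)
}
```

Error handling is simplified here: on failure the sketch keeps draining instead of cancelling in-flight requests, which real code would do via a context.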
Force-pushed from c13969f to 8fc47db
@azr Thanks for adding the performance numbers. I ran some tests as well using your patch, and the memory usage looks better than what I saw in the htcat implementation, especially with a high parallelism count. However, I do observe that increasing parallelism does not yield better latency and may lead to higher memory usage (there are a number of other factors to consider here, mainly the type of instance used for testing and the network bandwidth). I tried to limit the test to a single image with a single layer, fixing the chunk size to 20 MB. A lower parallelism count (3 or 4) may be preferable to setting parallelism upwards of 10. I used a c7.12xlarge instance to pull a 3GB single-layer image from an ECR private repo.

Also, the network download time was much faster (see Network Pull time, ~15sec), while containerd took an additional ~20secs to complete the pull (before it started unpacking). I calculated the network download time by periodically calling […].
core/remotes/resolver.go
```diff
 // All content fetched from the returned fetcher will be
 // from the namespace referred to by ref.
-Fetcher(ctx context.Context, ref string) (Fetcher, error)
+Fetcher(ctx context.Context, ref string, opts ...FetcherOpt) (Fetcher, error)
```
I don't think we should make this interface change. I was trying to think of some better options. Previously we had another interface appended with WithOptions, but I don't think that should be necessary here. This configuration could just be given directly as resolver options. The resolver could have a SetOptions function defined and type-asserted to in the transfer case. I think we can avoid changing the interface here, especially since these settings are global and not defined per call.
Okay, updated. Thanks for the review, I think this looks better, WDYT?
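A minimal, self-contained sketch of the type-assertion pattern suggested above (the names `FetchConfig`, `optionSetter`, and `dockerResolver` are illustrative, not containerd's actual API):

```go
package main

import "fmt"

// FetchConfig is a hypothetical bundle of global fetch settings.
type FetchConfig struct {
	ConcurrentLayerFetchParallelism int
}

// optionSetter is the optional capability a resolver may implement.
type optionSetter interface {
	SetOptions(FetchConfig)
}

type dockerResolver struct {
	cfg FetchConfig
}

func (r *dockerResolver) SetOptions(c FetchConfig) { r.cfg = c }

// configureResolver type-asserts instead of widening the Fetcher
// interface, so resolvers without the setter are simply left alone.
func configureResolver(resolver any, c FetchConfig) {
	if s, ok := resolver.(optionSetter); ok {
		s.SetOptions(c)
	}
}

func main() {
	r := &dockerResolver{}
	configureResolver(r, FetchConfig{ConcurrentLayerFetchParallelism: 4})
	fmt.Println("parallelism:", r.cfg.ConcurrentLayerFetchParallelism)
}
```

The point of the assertion is that the `Fetcher` interface stays unchanged, and these global settings never appear in the per-call signature.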
/retest
Hey @dmcgowan & @djdongjin, I updated the code to only add one setting: […]. I also applied @dmcgowan's review to configure these settings (…).

In the meantime, I just realised that while one pull at a time was much faster than before, pulling two things at a time made for a longer total pull than if I were to pull them at two different times, which doesn't sound like the best. Will investigate a bit.
re: […]

Turns out it's not the case on tmpfs, so I think this is okay. So yeah, this PR looks quite good to me; thanks for the many reviews, and please let me know if anything needs to change. Will try to get the tests to pass now.
dmcgowan
left a comment
I think these two conditions need to be checked to avoid a leak and a panic, but they are hard to simulate in tests.
core/remotes/docker/fetcher.go
```go
case <-stopChan:
	return errors.New("another worker failed")
default:
	close(stopChan)
```
This is still racy: multiple goroutines may hit this condition and cause a panic. There is no guarantee that one goroutine selects the default case and closes the channel before another goroutine also selects the default case.
Ditto here. I put the close in a sync.Once to be sure, and also made the stop cancel the context. LMK if anything else needs changing.
Ah, made it even better: just using the context with cancel and its done chan.
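A small illustration of that final approach (simplified, with hypothetical helpers, not the PR's actual code): `ctx.Done()` can be selected on by any number of goroutines, and `cancel()` is idempotent, so there is no double-close panic to worry about:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// fetchChunks runs n workers; the first failure cancels the shared
// context, which every other worker observes via ctx.Done().
func fetchChunks(ctx context.Context, n int) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // safe to call repeatedly, unlike close() on a channel

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			select {
			case <-ctx.Done(): // another worker already failed; bail out
				return
			default:
			}
			if err := fetchChunk(ctx, i); err != nil {
				once.Do(func() { firstErr = err })
				cancel() // tell the remaining workers to stop
			}
		}(i)
	}
	wg.Wait()
	return firstErr
}

// fetchChunk stands in for one ranged download; chunk 3 fails on purpose.
func fetchChunk(ctx context.Context, i int) error {
	if i == 3 {
		return errors.New("simulated chunk failure")
	}
	return nil
}

func main() {
	fmt.Println(fetchChunks(context.Background(), 8))
}
```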
Signed-off-by: Adrien Delorme <[email protected]>
Co-Authored-By: Corentin REGAL <[email protected]>
/test pull-containerd-k8s-e2e-ec2
dmcgowan
left a comment
We should just get this in; there are still maybe a few interface tweaks we can make before the final release that won't affect the functionality.
```go
r.Release(1)
for range parallelism {
	go func() {
		for i := range queue { // first in first out
```
Not sure if there is a race condition in the http2 stdlib around window updating.
Will check this part later; it's not a blocker.
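For context, a runnable toy version of the pattern in the excerpt above (simplified; `queue` and `parallelism` come from the diff, the rest is illustrative): a fixed pool of workers drains a FIFO channel of chunk indices:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const parallelism = 4
	queue := make(chan int) // chunk indices, consumed first in first out
	var wg sync.WaitGroup

	for range parallelism { // Go 1.22+: range over an integer
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range queue { // each worker takes the next queued chunk
				fmt.Println("fetching chunk", i)
			}
		}()
	}

	for i := 0; i < 16; i++ {
		queue <- i // enqueue in order; an idle worker picks it up
	}
	close(queue) // lets the workers' range loops end
	wg.Wait()
}
```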
@azr thank you for all the hard work on this, it's an awesome improvement! 🙏

Aw thanks! ❤️ You're welcome!! 🚀







TLDR: this makes pulls of big images ~2x faster (edit: and a bit more in the latest iteration), and closes #9922.
cc: #8160, #4989
Hello Containerd People, I have this draft PR I would like to get your eyes on.
It basically makes pulls faster while trying not to have too big a memory impact, by fetching consecutive chunks of each layer and immediately pushing them into the pipe (the one that writes to a file and feeds the digest/checksum verification).

I noticed it made pulls ~2x faster when using the right settings.
The settings have a big impact, so I did a bunch of perf tests with different settings. Here are some results on a ~8GB image using an `r6id.4xlarge` instance, pulling it from S3. Gains are somewhat similar on a ~27GB and a ~100GB image (with a tiny bit of slowdown).

I also tried on nvme and EBS drives; they are of course slower, but the gains are still the same.

Metrics on an `r6id.4xlarge`, timing `crictl pull` of an 8.6GB image. The first one, with 13 tries, is with 0 parallelism (the current code). The rest are tries with different settings.

tmpfs tests: […]

nvme (885GB) tests: […]
Observations: I wrote a little Go program to multipart-download big files directly into a file, at different positions, with different requests, and that was much faster than piping single-threadedly into a file. containerd pipes through a checksumer and then pipes into a file; I think this can, in some conditions, create some sort of thrashing, hence why the parameters are very important here.

That simple Go program had pretty bad perf with one connection, but I was able to saturate the network with multiple connections, with better or on-par perf versus aws-crt.

I think that for maximum perf we could try to re-architecture things a bit: e.g. concurrently write directly into the temp file, then tell the checksumer our progress so that it can do its work in parallel, and then carry on as usual.
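To make that last idea concrete, here is a rough sketch of the "write at offsets" variant (my illustrative naming, not containerd code): each worker streams its range straight into the file through an offset writer, so nothing is funneled through a single pipe; digest verification would then have to track contiguous progress separately, which is exactly the re-architecture trade-off described above.

```go
// Package fetchsketch: parallel ranged GETs written directly at their
// final offsets via io.NewOffsetWriter (Go 1.20+). Error handling is
// simplified; real code would cancel in-flight requests via a context.
package fetchsketch

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"sync"
)

func downloadAt(url string, f *os.File, size, chunkSize int64, parallelism int) error {
	sem := make(chan struct{}, parallelism) // bounds in-flight requests
	errCh := make(chan error, 1)            // keeps only the first error
	var wg sync.WaitGroup

	for off := int64(0); off < size; off += chunkSize {
		wg.Add(1)
		go func(off int64) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()

			end := min(off+chunkSize-1, size-1)
			req, _ := http.NewRequest(http.MethodGet, url, nil)
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, end))
			resp, err := http.DefaultClient.Do(req)
			if err == nil {
				// Stream the chunk straight to its offset: no pipe, no
				// per-chunk buffer; the file just fills in with 'holes'.
				_, err = io.Copy(io.NewOffsetWriter(f, off), resp.Body)
				resp.Body.Close()
			}
			if err != nil {
				select {
				case errCh <- err:
				default:
				}
			}
		}(off)
	}
	wg.Wait()

	select {
	case err := <-errCh:
		return err
	default:
		return nil
	}
}
```

Compared to the in-order pipe approach, buffered memory here stays flat regardless of parallelism; the cost is that the digest can no longer be computed as a straight stream.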