fix container discarding event #9652
Conversation
Hi @ningmingxiao. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected by the label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Signed-off-by: ningmingxiao <[email protected]>
@ningmingxiao can you add an issue link? Thanks.
	containerEventsDroppedCount.Inc()
	log.G(ctx).Debugf("containerEventsChan is full, discarding event %+v", event)
}
eq.enqueue(event)
What does the queue add over just

go func(e runtime.ContainerEventResponse) {
	c.containerEventsChan <- e
}(event)

Neither approach seems to guarantee the order, if that is the intent.
If order preservation is important, could also just have a single dequeue go routine which runs when there are items in the queue, like
func (eq *eventQueue) enqueue(c chan<- runtime.ContainerEventResponse, event runtime.ContainerEventResponse) {
eq.lock.Lock()
defer eq.lock.Unlock()
eq.items = append(eq.items, event)
if len(eq.items) == 1 {
go func() {
var event *runtime.ContainerEventResponse
for {
eq.lock.Lock()
if event != nil {
eq.items = eq.items[1:]
}
if len(eq.items) == 0 {
eq.lock.Unlock()
return
}
event = &eq.items[0]
eq.lock.Unlock()
c <- *event
}
}()
}
}
If only one event is generated:

for {
	eq.lock.Lock()
	// here len(eq.items) = 1 and event is not nil
	if event != nil {
		// after the next step, len(eq.items) = 0
		eq.items = eq.items[1:]
	}
	// here it will return: len(eq.items) = 0, and an event will be lost
	if len(eq.items) == 0 {
		eq.lock.Unlock()
		return
	}
	event = &eq.items[0]
	eq.lock.Unlock()
	c <- *event
}
How about adding a lock to the criService struct:

c.mu.Lock()
eq.enqueue(event)
c.mu.Unlock()

to keep events in order?
@dmcgowan
This should be looked at for a better overall solution rather than just adding more locking/queuing. There are already TODOs around the publishing of these events, and it is using a channel with a large buffer, which is not a good pattern for avoiding lost messages. It probably makes more sense to use a queue, but to avoid leaking memory when no one is calling GetContainerEvents, the queue should have an expiration if nothing is removing items from it. I don't think a quick fix that just adds a queue is going to work, as it may lead to a memory leak.
PR needs rebase.
This PR is stale because it has been open 90 days with no activity. This PR will be closed in 7 days unless new comments are made or the stale label is removed. |
This PR was closed because it has been stalled for 7 days with no activity. |
Use crictl to create 1000 containers and containerEventsChan will become full. Then, if GetContainerEvents is not called to consume events, events will be lost and the log will show "containerEventsChan is full, discarding event ...".

issue: #8892