This repository was archived by the owner on Jan 30, 2020. It is now read-only.
agent,engine: fix bugs in rescheduling for replaced units #1698
Merged
dongsupark merged 4 commits into coreos:master on Nov 7, 2016
added 3 commits
November 4, 2016 10:52
Until now, decide() has returned a bool that only says whether a unit can be scheduled, so rescheduling was not handled properly. Introduce a tri-state job.JobAction with the values JobActionSchedule, JobActionUnschedule, and JobActionReschedule, and return that tri-state value instead of a bool. That way AbleToRun() can correctly distinguish the results of HasConflict() and HasReplace().
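As a rough illustration of the tri-state decision described above, here is a minimal, self-contained sketch. It is not the fleet implementation: the `unit` and `agentState` types, the `firstMatch` helper, the string values of the constants, exact-name matching, and the order of the two checks are all assumptions made for the example.

```go
package main

import "fmt"

// JobAction is a simplified stand-in for the tri-state described above.
// The string values are illustrative, not taken from the fleet source.
type JobAction string

const (
	JobActionSchedule   JobAction = "schedule"
	JobActionUnschedule JobAction = "unschedule"
	JobActionReschedule JobAction = "reschedule"
)

// unit is a hypothetical, pared-down unit description.
type unit struct {
	Name      string
	Conflicts []string // names this unit cannot share a machine with
	Replaces  []string // names this unit is allowed to evict
}

// agentState is a hypothetical view of the units running on one machine.
type agentState struct {
	running []string
}

// firstMatch returns the first running unit (other than self) whose name
// appears in names. Exact matching is an assumption of this sketch.
func (as agentState) firstMatch(self string, names []string) (string, bool) {
	for _, r := range as.running {
		if r == self {
			continue
		}
		for _, n := range names {
			if r == n {
				return r, true
			}
		}
	}
	return "", false
}

// ableToRun returns one of three actions instead of a bool, so the caller
// can tell "cannot run here" apart from "can run here after a replacement".
func ableToRun(as agentState, u unit) JobAction {
	if _, ok := as.firstMatch(u.Name, u.Conflicts); ok {
		return JobActionUnschedule
	}
	if _, ok := as.firstMatch(u.Name, u.Replaces); ok {
		return JobActionReschedule
	}
	return JobActionSchedule
}

func main() {
	as := agentState{running: []string{"replace.0"}}
	kick := unit{Name: "replace-kick0", Replaces: []string{"replace.0"}}
	fmt.Println(ableToRun(as, kick))
}
```

With a bool, the conflict case and the replace case would both collapse into "false"; the third value is what lets the caller treat them differently.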
Although the 'Replaces' option has been supported since coreos#1572, the engine never actually unscheduled the units to be replaced, which was a bug. Implement GetReplacedUnit() to expose the replaced unit from AgentState to the engine reconciler, and make the reconciler unschedule the replaced unit and schedule the current one. The engine scheduler's decision structure also needs a helper for the rescheduling case, which simply schedules the replaced unit to a free machine.
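The reconciler steps described above (unschedule the replaced unit, schedule the current unit in its place, then move the replaced unit to a free machine) can be sketched with a toy cluster model. Everything here is hypothetical: the `cluster` type, `freeMachine`, and `replaceUnit` are illustration-only names, and "free" is taken to mean "running no units", which may be stricter than what the engine uses.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical model: a stable list of machine IDs and a
// map from unit name to the machine it is scheduled on.
type cluster struct {
	machines  []string
	placement map[string]string
}

// freeMachine returns the first machine that currently runs no unit.
func (c *cluster) freeMachine() (string, bool) {
	for _, m := range c.machines {
		busy := false
		for _, pm := range c.placement {
			if pm == m {
				busy = true
				break
			}
		}
		if !busy {
			return m, true
		}
	}
	return "", false
}

// replaceUnit unschedules the replaced unit, schedules newUnit on the
// machine it occupied, and then reschedules the replaced unit on a free
// machine if one exists. Otherwise the replaced unit stays unscheduled.
func (c *cluster) replaceUnit(newUnit, replaced string) error {
	target, ok := c.placement[replaced]
	if !ok {
		return errors.New("replaced unit is not scheduled")
	}
	delete(c.placement, replaced)
	c.placement[newUnit] = target
	if m, ok := c.freeMachine(); ok {
		c.placement[replaced] = m
	}
	return nil
}

func main() {
	c := &cluster{
		machines:  []string{"m0", "m1"},
		placement: map[string]string{"replace.0": "m0"},
	}
	if err := c.replaceUnit("replace-kick0", "replace.0"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.placement["replace-kick0"], c.placement["replace.0"])
}
```

The order matters in this sketch: the new unit takes the old machine before the search for a free machine starts, so the replaced unit can never land back on the machine it was evicted from.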
Now that AbleToRun() returns a JobAction instead of a bool, the unit tests need to change accordingly.
Author: Updated.
Author: Oops, I didn't mean to close. It was just a bug in the GitHub UI, which displayed nothing even after I wrote a comment.
To test this correctly, TestScheduleReplace() now does the following:

* Start 4 units: replace.0 on m0, replace.[12] on m1, and replace-kick0 on m0. This triggers a unit replacement.
* Ensure that the machine of the kick0 unit is not the same as that of unit 0, and that the machine of kick0 is the original machine m0.
* Use WaitForState() to avoid races with the unit state publisher.
So far in the engine reconciler, decide() returns a bool variable to determine whether it's possible to schedule or not. Reschedule was not handled in a proper way.

So let's create a new tri-state for job.JobAction, i.e. JobActionSchedule, JobActionUnschedule, and JobActionReschedule, and return the tri-state variable instead of a bool. That way AbleToRun() can check return values from HasConflict() and HasReplace() in a correct manner.

With that, we can fix bugs in rescheduling for replaced units. While the 'Replaces' option has been supported since #1572, the engine didn't actually unschedule units to be replaced. It was a bug.

So let's implement GetReplacedUnit() to expose the replaced unit from AgentState to the engine reconciler, and make the engine reconciler unschedule the replaced unit and schedule the current unit. The engine scheduler's decision structure also needs to track whether the unit needs to be rescheduled, for use by the scheduling path.