@oli.campeau Ok, with this change we have SIGHUP being handled in two places. Should SIGHUP only be added if upgradesEnabled is false here?
It looks like there is a legit failure: https://gitlab.com/gitlab-org/gitlab/-/jobs/13520496109
1) User state machine and default attributes raises errors by default
Failure/Error: expect { model }.to raise_error(ZeroDivisionError)
expected ZeroDivisionError but nothing was raised
# ./spec/models/user_spec.rb:9989:in `block (3 levels) in <main>'
# ./spec/spec_helper.rb:595:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:487:in `block (3 levels) in <top (required)>'
# ./lib/gitlab/sidekiq_sharding/validator.rb:42:in `enabled'
# ./spec/spec_helper.rb:486:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:481:in `block (3 levels) in <top (required)>'
# ./spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:471:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:467:in `block (3 levels) in <top (required)>'
# ./lib/gitlab/application_context.rb:109:in `with_raw_context'
# ./spec/spec_helper.rb:467:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:438:in `block (3 levels) in <top (required)>'
# ./lib/gitlab/ci/config/feature_flags.rb:38:in `ensure_correct_usage'
# ./spec/spec_helper.rb:437:in `block (2 levels) in <top (required)>'
# ./spec/support/system_exit_detected.rb:7:in `block (2 levels) in <main>'
# ./spec/support/fast_quarantine.rb:22:in `block (2 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:107:in `block (3 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:61:in `with_cross_joins_prevented'
# ./spec/support/database/prevent_cross_joins.rb:107:in `block (2 levels) in <main>'
Finished in 1 minute 3.56 seconds (files took 1 minute 13.14 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/models/user_spec.rb:9988 # User state machine and default attributes raises errors by default
Stan Hu (069a7433) at 17 Mar 05:55
Add GLIBC version health check post-build hook
Also now that we validate cert and key must be together, I don't think this is necessary?
I just did this in 0c81b044.
Stan Hu (0c81b044) at 17 Mar 05:20
Add a config check to ensure TLS types are correct
I think Duo started by adding cert = in the config file, and I asked it to revise. Instead I got this without looking carefully. Thanks.
Oh, ancestor_inherited: nil on this version. I may have to update.
On my test instance, I see this warning whenever I run gitlab-rails console:
$ sudo gitlab-rails c
Both Namespace and its :state machine have defined a different default for "state". Use only one or the other for defining defaults to avoid unexpected behaviors.
If I run sudo gitlab-psql and \d+ namespaces, I see the default is 0:
state | smallint | | | 0 | plain | | |
state_machine :state, initial: :ancestor_inherited do
Which is supposed to be 0? https://gitlab.com/gitlab-org/gitlab/-/blob/8c174babdee0a46b8a6f5c8866c5cff5cb95de53/app/models/concerns/namespaces/stateful.rb#L17
Oh, it's called in spec/models/every_model_spec.rb. Does it fail with Namespace?
Do not check column default in state machine initialization
The upstream implementation currently checks the column default, only so that it can warn that there is an inconsistency between the column default and the state machine default.
Also the way state_machines-activerecord gets this column default is inconsistent, it only gets it if the class already has a previous DB connection.
So we monkey-patch this warning out of the state machine initialization code.
We still keep the warning code so that we can call it from a spec, for every model that use state_machine.
| Before | After |
|---|---|
Compare to the code in https://rubygems.org/gems/state_machines/versions/0.100.4, the monkey-patch should be functionally the same. Except for the code to warn about differing defaults
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
I asked Claude Code to recommend a plan to upstream:
owner_class_attribute_default Checkinitial_state= is called at class load time (during Machine#initialize). It calls
owner_class_attribute_default, which the ActiveRecord integration overrides to query
owner_class.column_defaults[attribute.to_s]. Even the guard
owner_class.connected? && owner_class.table_exists? can trigger a connection attempt
in some environments (e.g. GitLab/Sidekiq startup), because connected? itself may
probe the connection pool.
The check exists only to emit a warning — it has no effect on correctness. So it is safe to defer.
Related issue: #586582 (closed)
machine/integration.rb)Move the inline logic from configuration.rb into two new protected methods:
# Gets the initial attribute value defined by the owner class (outside of
# the machine's definition). By default, this is always nil.
def owner_class_attribute_default
nil
end
# Checks whether the given state matches the attribute default specified
# by the owner class.
def owner_class_attribute_default_matches?(state)
state.matches?(owner_class_attribute_default)
end
# Checks for a conflicting attribute default between the owner class and
# the machine. Called from initial_state= but can be deferred by integrations.
def check_conflicting_attribute_default
initial_state = states.detect(&:initial)
has_owner_default = !owner_class_attribute_default.nil?
has_conflicting_default = dynamic_initial_state? || !owner_class_attribute_default_matches?(initial_state)
return unless has_owner_default && has_conflicting_default
warn(
"Both #{owner_class.name} and its #{name.inspect} machine have defined " \
"a different default for \"#{attribute}\". Use only one or the other for " \
'defining defaults to avoid unexpected behaviors.'
)
end
# Schedules or runs the conflicting attribute default check.
# Override in integrations to defer the check (e.g., until after the DB is ready).
def schedule_conflicting_attribute_default_check
check_conflicting_attribute_default
end
configuration.rb
Replace the inline conflict check in initial_state= with a call to the new hook:
def initial_state=(new_initial_state)
@initial_state = new_initial_state
add_states([@initial_state]) unless dynamic_initial_state?
# Update all states to reflect the new initial state
states.each { |state| state.initial = (state.name == @initial_state) }
# Check for a conflicting default on the owner class attribute.
# May be deferred by integrations (e.g. ActiveRecord) to avoid
# DB queries during the Rails/Sidekiq initializer phase.
schedule_conflicting_attribute_default_check
end
state_machines-activerecord
Override schedule_conflicting_attribute_default_check in the ActiveRecord integration
to defer the check until after ActiveRecord is fully initialized:
def schedule_conflicting_attribute_default_check
machine = self
# Defer until AR is fully loaded to avoid touching the DB during
# the Rails/Sidekiq initializer phase.
ActiveSupport.on_load(:active_record) do
machine.check_conflicting_attribute_default
end
end
def owner_class_attribute_default
# The connection safety is now handled by the scheduler above.
# This guard is kept as a defensive fallback.
return unless owner_class.connected? && owner_class.table_exists?
owner_class.column_defaults[attribute.to_s]
end
| Current | Proposed | |
|---|---|---|
| DB query timing | Eagerly at class load | Deferred to on_load(:active_record)
|
state_machines gem change |
— | Extracts inline logic into overridable hook |
| AR integration change | Guard still triggers connected?
|
Defers entire check; guard becomes a safety net |
| Base class behavior | Returns nil (check is a no-op) |
Unchanged |
The key property of schedule_conflicting_attribute_default_check is that it gives the
AR integration a clean seam to hook into without state_machines itself needing to
know anything about AR or Rails loading. The base implementation runs immediately,
which is safe because owner_class_attribute_default always returns nil in the base
class. Only the AR integration needs to defer.
ActiveSupport.on_load(:active_record) is still too early for some setups (e.g.
Sidekiq workers that establish a DB connection only when a job runs), the AR
integration could fall back to Rails.application.config.after_initialize or wrap
check_conflicting_attribute_default in a rescue block. on_load is the right
first choice.check_conflicting_attribute_default must be public (or at least
protected) so that the deferred block can call it on the captured machine
reference.Is warn_conflicting_initial_state called anywhere yet? The message in #586582 (comment 3151687114) seems like it could be useful in some CI job.
Notice that ruby-io-event is in the job log, and epoll_pwait2 is not present:
$ objdump -t ./opt/gitlab/embedded/lib/ruby/gems/3.4.0/gems/io-event-1.12.1/lib/IO_Event.so | grep epoll_
0000000000000000 F *UND* 0000000000000000 epoll_create1@GLIBC_2.9
0000000000000000 F *UND* 0000000000000000 epoll_ctl@GLIBC_2.3.2
0000000000000000 F *UND* 0000000000000000 epoll_wait@GLIBC_2.3.2
Notice that ruby-io-event is in the job log, and epoll_pwait2 is not present:
$ objdump -t ./opt/gitlab/embedded/lib/ruby/gems/3.4.0/gems/io-event-1.12.1/lib/IO_Event.so | grep epoll_
0000000000000000 F *UND* 0000000000000000 epoll_create1@GLIBC_2.9
0000000000000000 F *UND* 0000000000000000 epoll_ctl@GLIBC_2.3.2
0000000000000000 F *UND* 0000000000000000 epoll_wait@GLIBC_2.3.2
Notice that ruby-io-event is NOT in the log, and epoll_pwait2 is there:
$ objdump -t ./opt/gitlab/embedded/lib/ruby/gems/3.4.0/gems/io-event-1.12.1/lib/IO_Event.so | grep epoll_
0000000000000000 F *UND* 0000000000000000 epoll_create1@GLIBC_2.9
0000000000000000 F *UND* 0000000000000000 epoll_ctl@GLIBC_2.3.2
0000000000000000 F *UND* 0000000000000000 epoll_wait@GLIBC_2.3.2
0000000000000000 F *UND* 0000000000000000 epoll_pwait2@GLIBC_2.35
I had that at first, but I was concerned this might break existing installations that might need python3.8 for some reason. It's probably okay given this recipe installs python3 now, but we should probably test that. I might have to spin up a local VM to test that, or we can just try it with an existing staging environment.
Fix corrupted logs and spurious errors in Workhorse file cleanup
The cleanup goroutine in newLocalFile was logging all removal errors, including "file not found" errors that occur when the file is already deleted (e.g., when context is cancelled before upload completes).
This caused two issues:
Now we:
| Before | After |
|---|---|
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.