Skip to content

add support for --detect-loaded-modules#2228

Merged
boegel merged 14 commits intoeasybuilders:developfrom
boegel:detect_loaded_modules
Jun 22, 2017
Merged

add support for --detect-loaded-modules#2228
boegel merged 14 commits intoeasybuilders:developfrom
boegel:detect_loaded_modules

Conversation

@boegel
Copy link
Copy Markdown
Member

@boegel boegel commented May 19, 2017

This makes EasyBuild check early on whether any EasyBuild-generated modules are loaded in the current environment (fix for #153).

Via --detect-loaded-modules, you can configure EasyBuild to either print a big fat warning (warn, done by default), raise an error (fail) or don't complain (ignore, equivalent to current behaviour).

The detection is done by checking whether any $EBROOT* environment variables are defined, which are then mapped to the loaded modules that define them.

Through --allow-loaded-modules, a list of software names for which modules are allowed to be loaded can be specified.

Still WIP because I'd like to add support for trying to automagically fix the environment by either running module purge or trying to unload all the loaded non-ignored EasyBuild modules...

Thoughts on this @wpoely86, @vanzod, @migueldiascosta, @ocaisa, @damianam?

@boegel boegel added this to the 3.3.0 milestone May 19, 2017
Comment thread easybuild/tools/modules.py Outdated
# warn about $EBROOT* environment variables without matching loaded module
if eb_module_keys:
tup = (len(eb_module_keys), ROOT_ENV_VAR_NAME_PREFIX, '$' + ', $'.join(eb_module_keys))
print_warning("Found %d defined $%s* environment variables without matching loaded module: %s" % tup)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be fatal?

Copy link
Copy Markdown
Member Author

@boegel boegel May 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, who knows what's in the environment...
It's definitely worth warning about it though.

Also, I wouldn't be surprised if people already trick EB into doing this by defining $EBROOT* variables manually.

Which makes me realise, I need to check how this plays together with the support for using external modules... I know we define $EBROOT* variables for external modules that get loaded (cfr. http://easybuild.readthedocs.io/en/latest/Using_external_modules.html#handling-of-provided-metadata), but I need to check where we do that exactly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @wpoely86. Warnings tend to be ignored, but more importantly, a warning allows to proceed. So the door is open to break something unintentionally.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it may also be totally harmless, so making it fatal seems overly harsh to me...

I guess we could a configuration option for it:

--check-ebroot-env-vars={ignore,warn,fail}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it simply depends on how you want to treat "may be harmless" vs "won't be harmless" vs "will be harmless". It looks safer to me to err on the safe side and decline to proceed it is is not sure that it will be harmless. But in any case, I have no strong objection against keeping it as warning.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess you're right, how this should be treated is site-specific.

I've added an option for this now, default is to issue a warning, but you can also get an error, totally ignore it, or just unset them and continue (with a warning).

@boegel boegel changed the title add support for --detect-loaded-modules (WIP) add support for --detect-loaded-modules Jun 6, 2017
@boegel
Copy link
Copy Markdown
Member Author

boegel commented Jun 6, 2017

With the last update, I consider this ready. As opposed to what I proposed earlier, I think it makes sense to make the possible values for --detect-loaded-modules mutually exclusive.

In the current implementation, when using purge or unload, a warning will be printed to inform the user that purge is run or that the detected loaded modules are unloaded, so it doesn't make much sense to combine this with warn. Combining it with fail doesn't make sense either, what's the point in purging or unloading if we're going to error out anyway...

Thoughts @wpoely86, @vanzod, @damianam?

@damianam
Copy link
Copy Markdown
Member

@boegel, like we discussed in the last telco, I think it would be a good idea to add to the warning or error messages a tip on how to move forward.

Also, I think at the moment you are simply checking for EB generated modules. But taking a look at the external modules being used might be also desirable. Wouldn't you agree? I think simply appending to eb_module_keys the list of external modules included in the .cfg file would be enough (I am thinking about Cray here, I am not sure about other use cases).

@easybuilders easybuilders deleted a comment from boegelbot Jun 19, 2017
@boegel
Copy link
Copy Markdown
Member Author

boegel commented Jun 19, 2017

@damianam I'd like to get feedback from @gppezzi on the latter, especially w.r.t. Cray where you typically have a bunch of modules loaded by default, which are likely going to include ones that EasyBuild leverages as external modules.

Comment thread test/framework/modules.py Outdated

def check_loaded_modules():
"Helper function to run check_loaded_modules and check on stdout/stderr."
# there should be no errors/warnings by default is no (EasyBuild-generated) modules are loaded
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no instead of is no?

Comment thread test/framework/modules.py

# default action is to print a clear warning message
stderr = check_loaded_modules()
patterns = [
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the message in a variable in modules.py and read it from there? Otherwise you have to rewrite the message twice for any change on it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the reason for copy-pasting the message in the tests is so that we know when the message gets changed (potentially by accident), so this is intentional.

Comment thread test/framework/modules.py
"To specify action to take when loaded modules are detected, use "
"--detect-loaded-modules={error,ignore,purge,unload,warn}",
]
for pattern in patterns:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to a single check? Slicing the pattern in subpatterns is kind of weird and will produce also a weird output with too many "Pattern found" messages. Alternatively, maybe check for patterns, but print just the ones that do not match?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only when one of the patterns isn't found, this will produce output (as a failing test).

The check is an assertion, so you'll see something like:

AssertionError: Pattern ... found in ...

So, the assertion is that a certain pattern should be found, and if not, the assertion fails and you get an assertion error.

This is done consistently across all tests. It looks backwards, but it's not if you interpret it right.

I could make this one big pattern, sure, but that makes spotting a mismatch visually a lot harder, so I prefer to break it down in parts. Speedup of the tests is going to be negligible, we should optimise for human effort here.

Comment thread easybuild/tools/modules.py Outdated
self.log.info("Found non-ignored loaded (EasyBuild-generated) modules, but ignoring it as configured")

elif action == PURGE:
msg = "Found non-ignored loaded (EasyBuild-generated) modules, running 'module purge': %s"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msg = "Found non-ignored loaded (EasyBuild-generated) modules (%s), running 'module purge'"?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll change that if you feel it's clearer, thanks for the suggestion!

Comment thread test/framework/modules.py Outdated
init_config(build_options=build_options)

# error mentioning 1 non-ignored module (OpenMPI), both GCC and hwloc loaded modules are ignored
error_pattern = "Found one or more non-ignored loaded .* module.*\n\* OpenMPI/1.6.4-GCC-4.6.4"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before? Sync this automatically with the message in modules.py instead of rewriting it here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this pattern also match if GCC or hwloc are incorrectly not ignored?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding of the message is deliberate.

I'll make sure that also checking for just a single module name (OpenMPI) being mentioned is checked (by injecting a $ to enforce that the OpenMPI line is the last and only module being mentioned).

Comment thread test/framework/modules.py
init_config(build_options=build_options)
self.assertEqual(check_loaded_modules(), '')

# error if any $EBROOT* environment variables are defined that don't match a loaded module
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a separate test?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is for the check_loaded_modules method in easybuild/tools/modules.py, which also checks the orphaned $EBROOT* env vars, so it makes sense to keep it as a single test?

eb_module_keys.remove(key)

# warn about $EBROOT* environment variables without matching loaded module
if eb_module_keys:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flush this out to a different method?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for checking for orphaned $EBROOT* env vars is because that's we rely on for checking for loaded (EasyBuild-generated) modules, so I prefer keeping this together in the same method.

If I flesh it out, we would need to call boths methods in sequence anyway, and we'd have to duplicate some of the code, i.e. collecting defined $EBROOT* env vars, and tying the knot with the loaded modules...

@boegel
Copy link
Copy Markdown
Member Author

boegel commented Jun 21, 2017

docs update @ easybuilders/easybuild#344

@boegel
Copy link
Copy Markdown
Member Author

boegel commented Jun 22, 2017

Thanks for the review @vanzod!

@damianam I'll go ahead and merge this in to be included with EasyBuild v3.3.0, but do let me know if there's anything you want to get back to w.r.t. my replies to your review.

@boegel boegel merged commit c9fa70e into easybuilders:develop Jun 22, 2017
@boegel boegel deleted the detect_loaded_modules branch June 22, 2017 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants