Skip to content

add --check-contrib CLI option#2551

Merged
damianam merged 10 commits intoeasybuilders:developfrom
boegel:check_contrib
Sep 3, 2018
Merged

add --check-contrib CLI option#2551
damianam merged 10 commits intoeasybuilders:developfrom
boegel:check_contrib

Conversation

@boegel
Copy link
Copy Markdown
Member

@boegel boegel commented Aug 22, 2018

--check-contrib is an extension of the already existing --check-style: it checks whether the specified easyconfigs conform to the expected style, and also runs additional checks that are relevant when contributing back easyconfigs.

Currently the only additional check being done is whether SHA256 checksums are in place in all specified easyconfig files, but other checks can be added easily (for example to ensure that easyconfigs using the PythonPackage easyblock have download_dep_fail = True or use_pip = True set, when we want to start enforcing that, or potentially also whether the dependency versions are what is expected when a common toolchain is used).

The plan is to leverage check_sha256_checksums function being added here in easybuilders/easybuild-easyconfigs#5005 to also enforce SHA256 checksums in all easyconfigs that are touched by PRs.

@migueldiascosta
Copy link
Copy Markdown
Member

tested it on a couple of easyconfigs, with and without errors and missing checksums, for both sources an patches, also tried empty checksums, seems to work as advertised

@boegel boegel requested a review from damianam August 22, 2018 13:07
@damianam
Copy link
Copy Markdown
Member

I wonder if it makes sense to have --check-style and --check-contrib as 2 different options. They are too similar to be considered 2 different options, IMO. Are there test cases you can think of, where one might want to check the style, and just the style?

@boegel
Copy link
Copy Markdown
Member Author

boegel commented Aug 22, 2018

@damianam I was wondering about that for a while when working on this, and concluded (after consulting @migueldiascosta on it) that adding --check-contrib was the right way to go forward.

What we intend to do with --check-contrib is way broader than what --check-style was intended for. Whether or not SHA256 checkums are in place is not really a style concern (it's more of a security concern), and the same goes for other things we plan to include in --check-contrib (the PythonPackage specific checks mentioned in the description, checking the dependency versions in easyconfigs using a (recent) common toolchain, etc.).

Also, --check-style has been around for a while, so we can't just rename it to --check-contrib either, since we need to retain backward compatibility.

What we can do, on top of what this PR does already, is to deprecate --check-style (i.e. mark it to be removed in EasyBuild v4.0), and suggest using --check-contrib instead.

But maybe some people use --check-style to try and maintain a consistent style in their own easyconfigs repository, and they don't want to enforce all the other things we'll be adding in --check-contrib...

In short, --check-contrib is broader than --check-style, which we shouldn't remove in EasyBuild v3.x (and there's very little value in doing so anyway, imho).

@damianam
Copy link
Copy Markdown
Member

What we can do, on top of what this PR does already, is to deprecate --check-style (i.e. mark it to be removed in EasyBuild v4.0), and suggest using --check-contrib instead.

That would make sense to me. It is a pity to be "hold hostage" of backwards compatibility. Because this PR is a bit more complicated than it could be, just to deal with that situation.

@akesandgren
Copy link
Copy Markdown
Contributor

I haven't tried check-style yet, but by the sound of it I would prefer to keep it, unless check-contrib will be able to take arguments like --check-contrib=style so one can run a fast style check without the rest during development...

@migueldiascosta
Copy link
Copy Markdown
Member

I think @akesandgren just answered the question of if it ever makes sense to check for style but not other things.

If the problem is only backward compatibility, we can leave --check-style merely as an alias to Ake's --check-contrib=style?

@boegel
Copy link
Copy Markdown
Member Author

boegel commented Aug 23, 2018

Based on @akesandgren's feedback, it seems like keeping --check-style (and hence not deprecating it) makes sense.

I started looking into making --check-contrib accept an optional list of checks to run, so you can do --check-contrib=style, but the fact that it's optional has a nasty side-effect due to a limitation in the option parser (sort of)...

When running something like eb --check-style bzip2.eb, the bzip2.eb would be considered an argument to --check-contrib, equivalent to using --check-style=bzip2.eb, which would then of course fail since bzip2.eb is not a known type of check (or because no easyconfig files are specified, which would be an even more confusing error to the end user)...

It's not that this can't be implemented, it's just that you'd always have to run eb bzip2-1.0.6.eb --check-style rather than eb --check-style bzip2-1.0.6.eb to avoid running into an error like "bzip2-1.0.6.eb' is not a known check type".

We have the same problem with other options, for example --inject-checksums and --robot, which also optionally take an argument (either a checksum type like md5 or sha256, or a (list of) directories, resp.), and I'm not inclined to add more situations like this, because I end up shooting myself in the foot with these options (running into an error) over and over again (muscle memory, I guess)...

By making run_contrib_check a bit smarter, we could see if a specified unknown check type corresponds to an existing (easyconfig) file, and then promote the not-a-check-type-but-a-file, but the implementation would be quite messy since we need a parsed easyconfig file, not a filepath, so... ugh.

Long story short: just keeping --check-style instead is a lot easier imho, since the work is already done... ;)

@boegel
Copy link
Copy Markdown
Member Author

boegel commented Aug 28, 2018

@damianam @akesandgren Any reason to hold this back?

This PR is a requirement to start automatic checking of SHA256 checksums in easyconfig PRs...

Comment thread easybuild/framework/easyconfig/style.py Outdated
style_check_passed = True
for path in paths:
# if an EasyConfig instance is provided, just grab the corresponding file path
if isinstance(path, EasyConfig):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that doing this check is bad, but it is confusing that paths is a list of paths, yet we assume that EasyConfig objects can be passed here. Shouldn't we change the parameter description and name accordingly?

Comment thread easybuild/framework/easyconfig/tools.py Outdated
for (fn, chksum) in checksums:
# check that there's only one checksum specified per file (i.e. a string, not a list/tuple)
if not isinstance(chksum, basestring):
msg = "Exactly one (SHA256) checksum should be specified for %s in %s" % (fn, ec_fn)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the message to "One string containing a SHA256 checksum should be specified for %s in %s" % (fn, ec_fn)? That is clearer if someone mistakenly passes a string within a list/tuple.

Copy link
Copy Markdown
Contributor

@akesandgren akesandgren Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't "A string ..." be more correct language wise?
And ""must be specified" perhaps?

Comment thread easybuild/framework/easyconfig/tools.py Outdated
def run_contrib_checks(ecs):
"""Run contribution check on specified easyconfigs."""

def print_result(result, label):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename the result parameter to passed? Seems clearer to me.

Comment thread easybuild/framework/easyconfig/tools.py Outdated
print_msg("\nChecking for SHA256 checksums in %d easyconfig(s)...\n" % len(ecs), prefix=False)
sha256_checksums_ok = True
for ec in ecs:
sha256_res = check_sha256_checksums([ec])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename sha256_res to sha256_issues?

Comment thread easybuild/main.py Outdated
return res


def check_contrib_or_style(ecs, check_contrib, check_style):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current naming seems to indicate "Check if either contrib or style checks were done", as in "it is not possible to not run none of them" check_checks seems to be more consistent to me, but admittedly not ideal......

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, I'll rename it run_contrib_style_checks

self.assertTrue(res)
# multiple checksums listed for source tarball, while exactly one (SHA256) checksum is expected
self.assertTrue(res[-1].startswith("Exactly one (SHA256) checksum should be specified for toy-0.0.tar.gz"))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test of bundles is missing

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, very good point, as I was fixing this I noticed the part where we verify that checksums for Bundle components didn't actually work... :)

I did have to copy-paste the bundle.py easyblock under test/ though in order to write a test for this...

elif isinstance(ec, basestring):
path = ec
else:
raise EasyBuildError("Value of unknown type encountered in cmdline_easyconfigs_style_check: %s (type: %s)",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined name 'EasyBuildError'

elif isinstance(ec, basestring):
path = ec
else:
raise EasyBuildError("Value of unknown type encountered in cmdline_easyconfigs_style_check: %s (type: %s)",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined name 'EasyBuildError'

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@boegel boegel changed the title add --check-contrib CLI option add --check-contrib CLI option (WIP) Aug 31, 2018
@boegel
Copy link
Copy Markdown
Member Author

boegel commented Aug 31, 2018

I've marked this WIP again because I'm not happy at all with the way in which I'm dealing with checking whether Bundle components have checksums...

The "separation of concerns" principle is not honored by having to make the EasyBuild framework aware of the guts of a particular easyblock (i.e. the comp_cfgs bit), I'll look into a better way of dealing with this...

@boegel boegel changed the title add --check-contrib CLI option (WIP) add --check-contrib CLI option Aug 31, 2018
@easybuilders easybuilders deleted a comment from boegelbot Aug 31, 2018
@boegel
Copy link
Copy Markdown
Member Author

boegel commented Aug 31, 2018

I've reworked this significantly, the implementation is a lot cleaner now since it supports leaving the stuff specific to the Bundle easyblock there, see easybuilders/easybuild-easyblocks#1496 .

@damianam, @akesandgren Please re-review?

…asyBlock, which can be extended in easyblocks (e.g. Bundle)
Comment thread easybuild/framework/easyblock.py
else:
raise EasyBuildError("Checksum verification for %s using %s failed.", fil['path'], fil['checksum'])

def check_checksums_for(self, ent, sub='', source_cnt=None):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source_cnt parameter seems buggy to me. It is used to limit the number of sources in extensions, right? But it actually doesn't do that if I understand the code correctly. You can have an extension with 2 sources, a single checksum, and it will be all fine, since we are passing source_cnt = 1, so the count is artificially limited to 1, but in reality 2 sources are listed. Then, there is just a single checksum, but we should have 2. Maybe it makes more sense to have a is_extension = False parameter, and if it is explicitly set to True, verify that the calculated source_cnt is exactly 1?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a hard assumption in easyblock.py currently that each extension only has a single source file. The extension handling mechanism doesn't pick up on sources (so you can't specify a list of sources), it only provides source_tmpl to tweak the name of the source tarball for the extension (and it has a proper default for common type of extensions like Python packages, etc.).

That's why the source_cnt=1 thing is being used here, since there simply isn't a list of sources to count for extensions...

Comment thread easybuild/framework/easyconfig/tools.py Outdated
elif not sha256_regex.match(chksum):
msg = "Checksum '%s' for %s in %s must be a SHA256 checksum" % (chksum, fn, ec_fn)
checksum_issues.append(msg)
klass = get_easyblock_class(ec['easyblock'], name=ec['name'])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb_class?

@damianam damianam merged commit c33b497 into easybuilders:develop Sep 3, 2018
@boegel boegel deleted the check_contrib branch September 4, 2018 07:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants