add --check-contrib CLI option#2551
Conversation
…syconfigs/tools.py (+ tests)
…t to --check-style for now
…tyle aspects of --check-contrib
|
tested it on a couple of easyconfigs, with and without errors and missing checksums, for both sources an patches, also tried empty checksums, seems to work as advertised |
|
I wonder if it makes sense to have |
|
@damianam I was wondering about that for a while when working on this, and concluded (after consulting @migueldiascosta on it) that adding What we intend to do with Also, What we can do, on top of what this PR does already, is to deprecate But maybe some people use In short, |
That would make sense to me. It is a pity to be "hold hostage" of backwards compatibility. Because this PR is a bit more complicated than it could be, just to deal with that situation. |
|
I haven't tried check-style yet, but by the sound of it I would prefer to keep it, unless check-contrib will be able to take arguments like --check-contrib=style so one can run a fast style check without the rest during development... |
|
I think @akesandgren just answered the question of if it ever makes sense to check for style but not other things. If the problem is only backward compatibility, we can leave |
|
Based on @akesandgren's feedback, it seems like keeping I started looking into making When running something like It's not that this can't be implemented, it's just that you'd always have to run We have the same problem with other options, for example By making Long story short: just keeping |
|
@damianam @akesandgren Any reason to hold this back? This PR is a requirement to start automatic checking of SHA256 checksums in easyconfig PRs... |
| style_check_passed = True | ||
| for path in paths: | ||
| # if an EasyConfig instance is provided, just grab the corresponding file path | ||
| if isinstance(path, EasyConfig): |
There was a problem hiding this comment.
Not that doing this check is bad, but it is confusing that paths is a list of paths, yet we assume that EasyConfig objects can be passed here. Shouldn't we change the parameter description and name accordingly?
| for (fn, chksum) in checksums: | ||
| # check that there's only one checksum specified per file (i.e. a string, not a list/tuple) | ||
| if not isinstance(chksum, basestring): | ||
| msg = "Exactly one (SHA256) checksum should be specified for %s in %s" % (fn, ec_fn) |
There was a problem hiding this comment.
Change the message to "One string containing a SHA256 checksum should be specified for %s in %s" % (fn, ec_fn)? That is clearer if someone mistakenly passes a string within a list/tuple.
There was a problem hiding this comment.
Wouldn't "A string ..." be more correct language wise?
And ""must be specified" perhaps?
| def run_contrib_checks(ecs): | ||
| """Run contribution check on specified easyconfigs.""" | ||
|
|
||
| def print_result(result, label): |
There was a problem hiding this comment.
Rename the result parameter to passed? Seems clearer to me.
| print_msg("\nChecking for SHA256 checksums in %d easyconfig(s)...\n" % len(ecs), prefix=False) | ||
| sha256_checksums_ok = True | ||
| for ec in ecs: | ||
| sha256_res = check_sha256_checksums([ec]) |
There was a problem hiding this comment.
rename sha256_res to sha256_issues?
| return res | ||
|
|
||
|
|
||
| def check_contrib_or_style(ecs, check_contrib, check_style): |
There was a problem hiding this comment.
The current naming seems to indicate "Check if either contrib or style checks were done", as in "it is not possible to not run none of them" check_checks seems to be more consistent to me, but admittedly not ideal......
There was a problem hiding this comment.
I see your point, I'll rename it run_contrib_style_checks
| self.assertTrue(res) | ||
| # multiple checksums listed for source tarball, while exactly one (SHA256) checksum is expected | ||
| self.assertTrue(res[-1].startswith("Exactly one (SHA256) checksum should be specified for toy-0.0.tar.gz")) | ||
|
|
There was a problem hiding this comment.
Yeah, very good point, as I was fixing this I noticed the part where we verify that checksums for Bundle components didn't actually work... :)
I did have to copy-paste the bundle.py easyblock under test/ though in order to write a test for this...
| elif isinstance(ec, basestring): | ||
| path = ec | ||
| else: | ||
| raise EasyBuildError("Value of unknown type encountered in cmdline_easyconfigs_style_check: %s (type: %s)", |
| elif isinstance(ec, basestring): | ||
| path = ec | ||
| else: | ||
| raise EasyBuildError("Value of unknown type encountered in cmdline_easyconfigs_style_check: %s (type: %s)", |
|
I've marked this The "separation of concerns" principle is not honored by having to make the EasyBuild framework aware of the guts of a particular easyblock (i.e. the |
|
I've reworked this significantly, the implementation is a lot cleaner now since it supports leaving the stuff specific to the @damianam, @akesandgren Please re-review? |
…asyBlock, which can be extended in easyblocks (e.g. Bundle)
| else: | ||
| raise EasyBuildError("Checksum verification for %s using %s failed.", fil['path'], fil['checksum']) | ||
|
|
||
| def check_checksums_for(self, ent, sub='', source_cnt=None): |
There was a problem hiding this comment.
The source_cnt parameter seems buggy to me. It is used to limit the number of sources in extensions, right? But it actually doesn't do that if I understand the code correctly. You can have an extension with 2 sources, a single checksum, and it will be all fine, since we are passing source_cnt = 1, so the count is artificially limited to 1, but in reality 2 sources are listed. Then, there is just a single checksum, but we should have 2. Maybe it makes more sense to have a is_extension = False parameter, and if it is explicitly set to True, verify that the calculated source_cnt is exactly 1?
There was a problem hiding this comment.
There's a hard assumption in easyblock.py currently that each extension only has a single source file. The extension handling mechanism doesn't pick up on sources (so you can't specify a list of sources), it only provides source_tmpl to tweak the name of the source tarball for the extension (and it has a proper default for common type of extensions like Python packages, etc.).
That's why the source_cnt=1 thing is being used here, since there simply isn't a list of sources to count for extensions...
| elif not sha256_regex.match(chksum): | ||
| msg = "Checksum '%s' for %s in %s must be a SHA256 checksum" % (chksum, fn, ec_fn) | ||
| checksum_issues.append(msg) | ||
| klass = get_easyblock_class(ec['easyblock'], name=ec['name']) |
--check-contribis an extension of the already existing--check-style: it checks whether the specified easyconfigs conform to the expected style, and also runs additional checks that are relevant when contributing back easyconfigs.Currently the only additional check being done is whether SHA256 checksums are in place in all specified easyconfig files, but other checks can be added easily (for example to ensure that easyconfigs using the
PythonPackageeasyblock havedownload_dep_fail = Trueoruse_pip = Trueset, when we want to start enforcing that, or potentially also whether the dependency versions are what is expected when a common toolchain is used).The plan is to leverage
check_sha256_checksumsfunction being added here in easybuilders/easybuild-easyconfigs#5005 to also enforce SHA256 checksums in all easyconfigs that are touched by PRs.