Skip to content

Ruby CI update#2845

Merged
danbri merged 23 commits intoschemaorg:mainfrom
gkellogg:ruby-ci-update
Jul 2, 2021
Merged

Ruby CI update#2845
danbri merged 23 commits intoschemaorg:mainfrom
gkellogg:ruby-ci-update

Conversation

@gkellogg
Copy link
Contributor

Fixes #2842.

Note, creates a new "pending-failures.txt" file that lists files known to fail. Any failures not in this file will cause the build to fail.

@gkellogg
Copy link
Contributor Author

Also, if listed files are fixed, they should be removed from the pending-failures file, otherwise the build will detect that they were expected to fail, but didn't.

@gkellogg
Copy link
Contributor Author

Odd that it passed on my repo, but failed here. I'll work on it some more.

@RichardWallis
Copy link
Contributor

Checking the difference between this and a working one is this:

    </tr>
                ^
       13:1: ERROR: Character tokens aren't legal here
           ...
       ^

I remember seeing this when I got Ruby wrong in the past.

The example it is barfing on is this:

    </tr>
    <tr>
      <td>John Adams (1797-1801)</td>
      <td>Federalist</td>
    </tr>
    ...
  </table>

Don't know why though .

@gkellogg
Copy link
Contributor Author

So, it seems that these errors are real, and the examples were introduced since the base of my PR. I'll add them to the pending list, but this shows why this is useful, to catch such errors before they get merged to main.

@gkellogg
Copy link
Contributor Author

By the way, a useful local workflow can be derived from the GitHub Action:

Run the following:

  • software/util/buildsite.py -f RDFExport.nquads Context
  • mkdir software/site/releases/LATEST
  • (cd software/site/releases/software/util/schemaversion.py; cp -r * ../LATEST)

Set up Ruby (requires Ruby 3.0 installed locally, typically using either rbenv or rvm):

  • cd software/site/scripts
  • bundle install
  • bundle exec rake clean
  • bundle exec rake context
  • bundle exec rake vocab

Then you can iterate on bundle exec rake examples and bundle exec rspec spec/examples_spec.rb

The rake examples step parses the example files to create spec/examples_spec.rb, then the rspec step executes those tests.

@gkellogg
Copy link
Contributor Author

I'll leave this as is. The failure is because a test marked pending has been fixed:

Examples issue-2681-examples.txt[15] - CreativeWork-educationRequirements-OccupationalExperienceRequirements-experienceInPlaceOfEducation-credentialCategory-JobPosting-445 (jsonld) FIXED
4355
     Expected pending 'fix' to fail. No error was raised.

Removing the appropriate entry from pending-failures.txt would solve this, but a PR is something of a moving target when changes get made underneath.

@gkellogg
Copy link
Contributor Author

Checking the difference between this and a working one is this:

    </tr>
                ^
       13:1: ERROR: Character tokens aren't legal here
           ...
       ^

I remember seeing this when I got Ruby wrong in the past.

The example it is barfing on is this:

    </tr>
    <tr>
      <td>John Adams (1797-1801)</td>
      <td>Federalist</td>
    </tr>
    ...
  </table>

Don't know why though .

That's the HTML content model. The error actually comes from the Gumbo parser. The model for the Table element must not allow text nodes. The W3C HTML validator gives an equivalent error:

Error: Misplaced non-space characters inside a table.
From line 11, column 8; to line 13, column 8
d>↩  </tr>↩  ...↩</table>↩

@gkellogg
Copy link
Contributor Author

You could always use <!-- ... --> markup if you want to insert otherwise non-valid text into HTML markup. Unfortunately, there's really no way to do this for JSON.

The JSON-LD spec uses a special encoding (#### and ****) inside of the JSON, which is stripped before checking, which is how we provide some way to highlight or annotate JSON and still validate it. We could certainly implement something similar if you want to use such a convention. This would handle HTML, too. OTOH, if the purpose of examples is to show valid markup that could be copied and pasted, it would be promoting invalid markup as a best practice.

@gkellogg
Copy link
Contributor Author

Not sure why that test passes when it’s expected to fail. It uses an undefined property “monthsOfExperience”.

@gkellogg
Copy link
Contributor Author

@RichardWallis what are your plans for this PR? Do you need me to do more? I could fix the example which is causing the (intermittent) failure, but it seemed that it was best to not conflate the two issues.

@RichardWallis
Copy link
Contributor

@gkellogg Sorry life got in the way!

My intention was/is to check through all your comments and then plan to merge it in, probably post V12.0 release in case there are any ramifications.

I should get a chance in the next few days.

I note the build is failing at the moment is that the intermittent failure you refer to?

@gkellogg
Copy link
Contributor Author

Yes, and example that should fail, and does when I run it locally, passes on GH Actions, but can't quite see why. As it's expected to fail, it is an error for it to pass (rspec logic). When the updated main branch is merged into the PR, it could be that others will fail, as either new examples are introduced which have errors, or something's been fixed.

If you like, I can fix examples in this branch (after 12.0 is released anyway) and get it to pass cleanly so that it won't (shouldn't) start failing things improperly when merged. This could be an opportunity to work on some of those long-standing examples.

Another thought would be to exclude those examples from the pending directory, which arguably aren't important to check until they're no longer pending. Otherwise, it comes down to opinionated direction on markup errors and inappropriate term use, which I haven't tried to solve directly, only by adding them to the expected errors.

@gkellogg
Copy link
Contributor Author

Okay, should be more repeatable now.

  • It was duplicating the work that the python setup code did in identifying examples; I modified the CI setup to make examples and now it just takes them from LATEST/schemaorg-all-examples.txt.
  • The code that was naming examples had some violated assumptions, and provisions for needing a fake name, which is no longer the case. All generated example files now come from the text (e.g., "eg-xxx-jsonld.html").

Note that the examples included do seem to include pending examples, which might be a candidate for improvement in the setup scripts.

@gkellogg
Copy link
Contributor Author

Tests eg-0132 and 0133 use "May" as the value for scheduledTime. This has a range of schema:DateTime, which is an ISO formatted data. The description in https://schema.org/DateTime is [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. https://schema.org/Date instead just references ISO date format, which includes many different notations, including the --05 I attempted to use. Arguably, this could be acceptable, at least if the value was schema:Date, although my tools don't at this point.

It may be that both schema:Date, schema:Time, and schema:DateTime should all reference the ISO date format, and allow any representation defined there. In this case, the existing example is certainly invalid, but not clear what the proper correction is.

@gkellogg
Copy link
Contributor Author

This fixes all the syntax and markup issues in tests, along with some minor semantics improvements. Some of these were hiding semantics issues with inappropriate/wrong properties or types.

It leaves 45 tests with semantic issues which require analysis.

@RichardWallis
Copy link
Contributor

Great work.

It would probably make sense to merge this in now, which I'll attend to.

Then raise a new issue for the outstanding 45 which I will assign to me. Is the list available to attach to the issue?

1 similar comment
@RichardWallis
Copy link
Contributor

Great work.

It would probably make sense to merge this in now, which I'll attend to.

Then raise a new issue for the outstanding 45 which I will assign to me. Is the list available to attach to the issue?

@gkellogg
Copy link
Contributor Author

The list of known failures, which lead to the pending status, is in data/pending-failures.txt, and is effectively just the output summary of the test run, itself.

When fixing a test (or adding a new failure), just modify that file. Only the bit after the " - " is significant, and lines starting with '#' are ignored.

We will probably need to discuss how to interpret the linter output for different classes of failures.

@gkellogg
Copy link
Contributor Author

Remaining list of known failures:

eg-0122 (jsonld)
eg-0123 (jsonld)
eg-0124 (jsonld)
eg-0125 (jsonld)
eg-0126 (jsonld)
eg-0132 (jsonld)
eg-0133 (jsonld)
eg-0144 (jsonld)
eg-0208 (rdfa)
eg-0222 (jsonld)
eg-0223 (jsonld)
eg-0223 (microdata)
eg-0223 (rdfa)
eg-0225 (jsonld)
eg-0225 (microdata)
eg-0225 (rdfa)
eg-0227 (jsonld)
eg-0229 (jsonld)
eg-0232 (jsonld)
eg-0232 (microdata)
eg-0232 (rdfa)
eg-0233 (jsonld)
eg-0233 (microdata)
eg-0233 (rdfa)
eg-0234 (jsonld)
eg-0234 (microdata)
eg-0234 (rdfa)
eg-0256 (jsonld)
eg-0256 (microdata)
eg-0256 (rdfa)
eg-0270 (jsonld)
eg-0271 (jsonld)
eg-0279 (jsonld)
eg-0290 (jsonld)
eg-0291 (jsonld)
eg-0292 (jsonld)
eg-0293 (jsonld)
eg-0309 (jsonld)
eg-0311 (jsonld)
eg-0312 (jsonld)
eg-0327 (jsonld)
eg-0356 (jsonld)
eg-0356 (microdata)
eg-0356 (rdfa)
eg-0463 (jsonld)

@danbri
Copy link
Contributor

danbri commented Apr 7, 2021

Are we good for merging here?

@RichardWallis
Copy link
Contributor

Yes - go for it.

I will then raise an umbrella issue to address examples that fail as listed in the data/pending-failures.txt file.

@gkellogg
Copy link
Contributor Author

gkellogg commented Apr 8, 2021

It's gotten down to a pretty management set, and some of them arguably could be improvements to the SDL.

@gkellogg
Copy link
Contributor Author

Yes - go for it.

I will then raise an umbrella issue to address examples that fail as listed in the data/pending-failures.txt file.

Ping! I’d hate to miss another release cycle.

@github-actions
Copy link

This pull request is being tagged as Stale due to inactivity.

@github-actions github-actions bot added the no-pr-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!). label Jun 15, 2021
@gkellogg gkellogg mentioned this pull request Jul 1, 2021
@alex-jansen alex-jansen requested a review from danbri July 2, 2021 08:10
@alex-jansen
Copy link
Contributor

alex-jansen commented Jul 2, 2021

@danbri Can this still be squeezed into 13? (see #2914)

@danbri danbri merged commit 6c12a4d into schemaorg:main Jul 2, 2021
@danbri
Copy link
Contributor

danbri commented Jul 2, 2021

care to send a PR for releases.html too?

@gkellogg gkellogg deleted the ruby-ci-update branch July 2, 2021 14:55
@gkellogg
Copy link
Contributor Author

gkellogg commented Jul 2, 2021

care to send a PR for releases.html too?

Is there an issue for this? I could take a look.

@alex-jansen
Copy link
Contributor

Created pull request #2917

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-pr-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Review and update Ruby CI tests

4 participants