Conversation
PR SummaryThis pull request introduces several enhancements to the environment configuration and execution of Jupyter Notebooks within the project. Key changes include:
These changes improve the flexibility and security of environment management and streamline the process of executing and monitoring Jupyter Notebooks. Test Suggestions
|
nrichers
left a comment
There was a problem hiding this comment.
LGTM! Thank you for switching over to using .env files generated on the fly for the different workflows and fixing/improving a whole slew of other things as well — the make execute action is a great touch! 🚀🚀🚀
This PR actually opens up other possibilities for the future as well, such as executing more notebooks or surfacing code cells from notebooks that require access to the VM library to work.
Some general comments follow — these are NOT requests to make changes now, just some observations or suggestions we can look into for the future.
macOS runners
Related to the Polars deadlock specifically, we could switch our GitHub runners over to macOS, say, macos-latest. It currently includes Python 3.13.0.
We've already noticed some other issues in our docs where the behavior on Linux is subtly different compared to how content builds locally on macOS ... (Test descriptions, anyone?)
However, we might run into other subtle differences related to Python environments with this change, so it's definitely something we would want to test further before making a decision.
Disabling warnings in notebooks
Related to warnings in notebooks more generally, it looks like there is a built-in Python package called warnings that could be imported and then used to filter out warnings at the notebook level:
import warnings
warnings.filterwarnings('ignore')
More info in https://docs.python.org/3/library/warnings.html. I might experiment with this package the next time I record videos and report back.
What's the fallback when things fail?
Since this PR and the previous, related PR change how things work, we should figure out what the process is when rendering the docs site locally with the intent to deploy, e.g. to hot-fix a docs site issue where our GitHub workflow is not working or GitHub is down. It's unlikely we'd be doing this very often but it has happened.
I think we would run the following:
make get-sourcequarto render(there's technicallymake docs-sitethat includes step 1. but it currently uses thedevelopmentprofile)make executemake deploy-prod(disable the branch protection rules or run from the localprodbranch after merging in the latest changes)
Does that match your expectation?
Yes, but the Linux environment was actually a lifesaver here. SO much smoother — on MacOS you run into this edge case with the dependencies: https://www.notion.so/validmind/Documented-Issues-Troubleshooting-bc9a38fc9ab34afd88251a92265c5a6f?pvs=4
Interesting! I think they're OK in this one — sometimes the output being "imperfect" is sort of part of the experience, kind of like when product tour videos are a bit outdated. It has some organic/human element to it. In this case it still runs all the way through, so I will leave it for now but I will definitely play around with that package at some point just to see what it can do.
Yes, I was going to log a Story for next Sprint for putting the Miro board and some written internal process docs for our our infra works today, and that workaround would go on it! @nrichers Can you pretty please review/approve validmind/validmind-library#251 so I can call |
Taking a look now, @validbeck! |
|
A PR preview is available: Preview URL |
|
@nrichers Not quite there... for example, when merging into I think I know why the comparison is failing — our branches essentially match between merges, so we actually have to compare the PR being merged to the |
Internal Notes for Reviewers
Updated, updated Miro board:
Workflows updates
.envfile now created on the flydocumentation/.github/workflows/validate-docs-site.yaml
Line 43 in 4568aa8
documentation/.github/workflows/deploy-docs-staging.yaml
Line 44 in 4568aa8
documentation/.github/workflows/deploy-docs-prod.yaml
Line 44 in 4568aa8
Execution step now passes in the generated
.envfile into the composite actionsdocumentation/.github/workflows/validate-docs-site.yaml
Line 55 in 4568aa8
documentation/.github/workflows/deploy-docs-staging.yaml
Line 56 in 4568aa8
documentation/.github/workflows/deploy-docs-prod.yaml
Line 56 in 4568aa8
Notebook filters now check against the incoming branch instead of against
mainwhen applicableThis is to get around the default condition that caused this push to
prodto filter incorrectly: https://github.com/validmind/documentation/actions/runs/12039272835/job/33566716893documentation/.github/workflows/validate-docs-site.yaml
Line 37 in 4568aa8
documentation/.github/workflows/deploy-docs-staging.yaml
Line 38 in 4568aa8
documentation/.github/workflows/deploy-docs-prod.yaml
Line 38 in 4568aa8
Actions updates
Actions now receive the generated
.envfile as an inputdocumentation/.github/actions/demo-notebook/action.yml
Line 4 in 4568aa8
documentation/.github/actions/prod-notebook/action.yml
Line 4 in 4568aa8
documentation/.github/actions/staging-notebook/action.yml
Line 4 in 4568aa8
Step to double-check
.envis actually available within the composite action where expecteddocumentation/.github/actions/demo-notebook/action.yml
Line 32 in 4568aa8
documentation/.github/actions/prod-notebook/action.yml
Line 32 in 4568aa8
documentation/.github/actions/staging-notebook/action.yml
Line 32 in 4568aa8
Only execute the notebook if
.envis found, and pass the.envin for the notebook from the rootdocumentation/.github/actions/demo-notebook/action.yml
Line 41 in 4568aa8
documentation/.github/actions/prod-notebook/action.yml
Line 41 in 4568aa8
documentation/.github/actions/staging-notebook/action.yml
Line 41 in 4568aa8
Makefile updates
New
executecommand.envcredentials and any notebook dependencies installed:documentation/site/Makefile
Line 125 in 59d5694
.env.examplehas been updated as well to accommodate local execution:documentation/.env.example
Line 3 in 3e3d6e8
get-sourcemakes a copy ofintro_for_model_developers.ipynbInstead of reverting
notebooks/tutorials/intro_for_model_developers_EXECUTED.ipynbto the copy on main, now it is made on the fly by copying the source and renaming the file, ELIMINATING OUR EDGE CASE UPKEEP IN THE SOURCE!!! 🎉documentation/site/Makefile
Line 47 in 59d5694
Docs updates
Store model credentials in
.envfilesLIVE PREVIEW
Enable monitoring
LIVE PREVIEW
Added the
.envedge case here as well.Expected behaviour
PR >
mainmain>stagingmain, the workflow merging intostagingfrommainwill trigger the notebook execution: WORKFLOW LINK PLACEHOLDERstaging>prodprod, the workflow merging intostagingfromprodwill trigger the notebook execution: WORKFLOW LINK PLACEHOLDERWorkflow environment warnings
I couldn't figure out how to get rid of these without modifying the notebook source. Ugh. Anything I tried in the workflow actions to shift the dependencies or spawn method didn't work.
Polars deadlock
Affects "Run the model evaluation tests" & "Run the model evaluation tests" sections:
LangChain uses pydantic v2 internally
Affects the "Verify & preview the documentation template" section:
SHAPGlobalImportance error
Juan confirmed he is still looking into this; I passed him some more details today for sc-7456.
External Release Notes
ValidMind Academy
Jupyter Notebooks
.envfile in our Jupyter Notebook samples.