Skip to content

Live execution of intro_for_model_developers_EXECUTED.ipynb#528

Merged
validbeck merged 52 commits intomainfrom
beck/sc-7406/replace-executed-intro-for-model-developers
Nov 26, 2024
Merged

Live execution of intro_for_model_developers_EXECUTED.ipynb#528
validbeck merged 52 commits intomainfrom
beck/sc-7406/replace-executed-intro-for-model-developers

Conversation

@validbeck
Copy link
Copy Markdown
Collaborator

@validbeck validbeck commented Nov 14, 2024

Internal Notes for Reviewers

sc-7406

ValidMind Library updates

This PR pulls in the latest changes from updated validmind-library via make get-source. I can confirm that the edits show up in the rendered files.

For example: LIVE API REFERENCE PREVIEW

Live execution of intro_for_model_developers_EXECUTED.ipynb

I've used Quarto's native JupyterLab functionality to render this notebook on the fly, and it only cost me some of my sanity — what a bargin.

  • Now, the executed notebook RENDERS THE MISSING WIDGET TEMPLATES which are now fully interactive!!!
  • It is also now set to page-layout: full to cover more of the negative-space left by the missing sidenav.
Old New
Screenshot 2024-11-15 at 12 58 13 PM Screenshot 2024-11-15 at 12 57 40 PM
OLD VERSION NEW LIVE PREVIEW

How does this work? Well:

Global Jupyter Notebook execution settings

In _quarto.yml, there are now some global execute settings determining behaviour for what Quarto should do when it renders .ipynb files:

# General Jupyter notebooks settings

  • Execution for notebooks is disabled by default
  • Notebooks shouldn't be re-executed during re-renders

intro_for_model_developers_EXECUTED.ipynb front matter

There is now a raw cell at the top of this version of the notebook that determines the granular execution behaviour for rendering this file:

Screenshot 2024-11-15 at 2 00 01 PM

Local output for executed notebook added to .gitignore

If you preview or render the files locally, this notebook will execute and we don't want the output cruft (since the workflows take care of the execution for our hosted sites):

site/_freeze/

  • Quarto rendered version of the execution ignored: site/_freeze/
  • Actual output of that particular notebook ignored: site/notebooks/tutorials/my_tests/

Secure api_key & api_secret

Screenshot 2024-11-15 at 2 23 06 PM

New "Repository secrets"
  1. PLATFORM_API_KEY
  2. PLATFORM_API_SECRET
  • These secrets are linked to this ValidMind Academy model: cm3j9lu5208mf21ifij2tp6bk
    Screenshot 2024-11-15 at 2 01 21 PM
  • Obviously, the code in the notebook references both the GH workflow secrets and your local .env secrets so that it works in both environments — see below.
.env.example
  • Locally, I also have the secrets added to my .env file. WHEN YOU RUN THE NEXT PREVIEW OR RENDER LOCALLY, you'll need to replace this with the Academy org's model's key and secret so that the notebook executes properly, as well as add yourself as a developer to the model, etc.
  • I've updated the .env.example as well:

# API INFO TO RUN /notebooks/tutorials/intro_for_model_developers_EXECUTED.ipynb

Updated GitHub workflows

Screenshot 2024-11-15 at 1 30 45 PM

Some new jobs have been added to all three of our "render docs" workflows:

  • Install python3 for Jupyter Notebooks: This installs the version of python3 required for ValidMind & upgrades pip for use in Check dependencies.
  • Install validmind for notebook execution: This installs all the validmind packages and dependencies required to run the "Intro to Model Developers" notebook with minimal warnings and errors. (There are a few weird ones, I went on quite a journey.)
  • Check dependencies: This installs pycairo required for check and then runs the dependency check.

The following job has been updated to include the repository env secrets:

  • Render docs site:

n.b.

This PR won't be merged until #529 is approved and merged as they are intertwined.

Local execution of intro_for_model_developers_EXECUTED.ipynb

EDIT: We discussed this and we will add this officially to the .gitignore in a follow-up PR so we can just leave this file untouched and not override it accidentally with local changes:

Screenshot 2024-11-15 at 4 28 35 PM

  • You'll either have to add the Academy model's valid key and secrets as the model's developer so that the notebook renders/previews fine locally (as long as you have most of the packages/dependencies sorted..... which they might not be (WARNING: 🐰 🕳️ ))
  • OR, comment out the following in the front matter and temporarily add this file to your .gitignore so that the changes don't get sent up to remote:
    code-tools: true
    include-after-body: 
      - https://unpkg.com/@jupyter-widgets/html-manager@*/dist/embed-amd.js
execute: 
  enabled: true
  freeze: auto

SHAPGlobalImportance.py error

  • This test is returning some errors but the rest of the tests are still being logged to the platform.
  • Same error under two sections:
  1. "Run the model evaluation tests"
  2. "Viewing and updating the configuration for the entire model documentation template > Update the config"
ERROR(validmind.vm_models.test_suite.test): Failed to run test 'validmind.model_validation.sklearn.SHAPGlobalImportance': (TypeError) loop of ufunc does not support argument 0 of type float which has no callable rint method
  • This didn't happen the last time I executed the notebook and I spent about half a day fiddling with dependencies and script edits to no avail, so I've just logged a Shortcut Story on advice of Andres: sc-7456
  • The good thing is, since the notebook is being executed on the fly, in theory if it is an issue with the Library or the test, once those updates are pulled in the errors in the rendered notebook should resolve (I think).

Developer Fundamentals iframe

  • This is still linking to the TOP-LEVEL docs-demo rendered version, because for some reason the rendered relative links in the iframe are finicky. This feels similar to the issue with the original relative link buttons in the release previews for the python docs.
  • BUT, since the executed/rendered version is fine outside of that relative link in the preview, it should likely be fine once we send the new executed notebook up to staging and prod.
  • You can inspect this yourself by going into the training, and then on the first slide that has the relative executed notebook as the background, click View Frame Source:
    Screenshot 2024-11-15 at 1 53 37 PM

Added render run-time

Yeah, I know. I'm sorry. For local versions, you can use the .gitignore hack for the specific notebook to get around it. I think this is pretty cool functional trade-off though.

Edit: Variables are messed up on mobile view & breadcrumbs

This applies to titles and sidebar links in _quarto.yml:

Screenshot 2024-11-19 at 10 40 55 AM

Applied a workaround for now (just using the plain-text term).

@validbeck validbeck added DO NOT MERGE PR is not ready to be merged internal Not to be externalized in the release notes labels Nov 14, 2024
@validbeck validbeck self-assigned this Nov 14, 2024
@validbeck validbeck removed the DO NOT MERGE PR is not ready to be merged label Nov 14, 2024
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@validbeck validbeck changed the title Testing the online render of the executed notebook file Testing the online render of the executed notebook file — DO NOT MERGE YET Nov 14, 2024
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

2 similar comments
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@validbeck validbeck added the DO NOT MERGE PR is not ready to be merged label Nov 14, 2024
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

2 similar comments
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

New excuted copy without edits

WIP quarto render to execute the notebook so it displays widgets

Renaming executed file

Switching to notebook front matter

Hmm

Tweaked workflow?

Removed _freeze test files & commented out checkout in Makefile

Ugh trying again

Called make get-source for latest vm-library files

Reverting makefile

Changing API keys to secret variables

New intro_for_model_developers_EXECUTED.ipynb -secrets
@validbeck validbeck force-pushed the beck/sc-7406/replace-executed-intro-for-model-developers branch from 870153f to e24ffaf Compare November 14, 2024 23:35
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@validbeck validbeck force-pushed the beck/sc-7406/replace-executed-intro-for-model-developers branch from 3674d6a to c6949a7 Compare November 21, 2024 03:32
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

@validbeck
Copy link
Copy Markdown
Collaborator Author

validbeck commented Nov 21, 2024

Holy forking shirtballs, I figured it out. Updated Miro workflow:

New Docs Infra Workflow

Proof that it works conditionally based on what files are included in the PR: #536

tl;dr Now the execution only runs on each workflow if there are site/notebook/ files included in the PR.

Some other fun additions:

  • Hid the navbar for the notebook so it looks better in training:

  • Downloading & installing Quarto is now single-sourced (Setup for some future stable image of Quarto here??):

  • Separate execution profiles for demo, staging, and prod so that heap tracking is correct on all three:

quarto render --profile exe-demo notebooks/tutorials/intro_for_model_developers_EXECUTED.ipynb &> render_errors.log || {

quarto render --profile exe-staging notebooks/tutorials/intro_for_model_developers_EXECUTED.ipynb &> render_errors.log || {

quarto render --profile exe-prod notebooks/tutorials/intro_for_model_developers_EXECUTED.ipynb &> render_errors.log || {

Part 2

sc-7505

  • Update the relevant source notebooks in validmind-library repo to use the store credentials in .env file example already used in the 1st draft to get rid of the _EXECUTED notebook edgecase
  • Pull in the latest source from validmind-library and test that the updated workflows trigger only on notebook updates (make get-source)
  • Have the GitHub workflow create the .env file on the fly so we're able to to continue to store the credentials on GH AND get rid of the _EXECUTED notebook edgecase
  • Confirm that the .env file created by GitHub works and the notebook executes properly after the adjustments

nts

  • Retool the current conditional execution steps referencing the execution actions to create the .env file on the fly first
  • Only call the execution step if the .env file creation isn't skipped
# See if site/notebooks/ has updates
- name: Filter changed files
  uses: dorny/paths-filter@v2
  id: filter
  with:
    filters: |
      notebooks:
        - 'site/notebooks/**'

# If yes then create the .env file for use in execution step
- name: Create .env file
  if: steps.filter.outputs.notebooks == 'true'
  id: create_env
  run: |
    echo "APP_ENV=production" >> .env
    echo "PLATFORM_API_HOST=${{ secrets.VM_API_HOST }}" >> .env
    echo "PLATFORM_API_KEY=${{ secrets.VM_API_KEY }}" >> .env
    echo "PLATFORM_API_SECRET=${{ secrets.VM_API_SECRET }}" >> .env
    echo "PLATFORM_DEV_MODEL =${{ secrets.VM_API_MODEL }}" >> .env

- name: Display .env file for debugging, remove on prod
  run: cat .env

# Only execute the demo notebook if .env file creation isn't skipped
- name: Execute demo Intro for Model Developers notebook
  if: ${{ steps.create_env.outcome != 'skipped' }}
  uses: ./.github/actions/demo-notebook
  id: execute-demo-notebook
  with:
    env_file: .env

Nice to have

  • Look into this GitHub workflow environment warning:
In addition, using fork() with Python in general is a recipe for mysterious
deadlocks and crashes.

The most likely reason you are seeing this error is because you are using the
multiprocessing module on Linux, which uses fork() by default. This will be
fixed in Python 3.14. Until then, you want to use the "spawn" context instead.

See https://docs.pola.rs/user-guide/misc/multiprocessing/ for details.
  • Add make action to Makefile for local execution

@validbeck validbeck requested a review from nrichers November 21, 2024 04:29
@github-actions
Copy link
Copy Markdown
Contributor

A PR preview is available: Preview URL

Copy link
Copy Markdown
Collaborator

@nrichers nrichers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holy forking shirtballs, I figured it out.

Holy mackarel, you’re not kidding. What did we unleash by asking you to tackle the switch to profiles, Beck?!

The single-sourced actions are a great touch.

I can't even ... Spent about an hour poking around in this PR and I've accepted that we simply need to merge this. (Also, I don't think I have a reliable way to test if this will actually deploy to staging and prod without serious shenanigans, so let's just merge and watch for potential issues.)

@validbeck validbeck merged commit 741dda1 into main Nov 26, 2024
@validbeck validbeck deleted the beck/sc-7406/replace-executed-intro-for-model-developers branch November 26, 2024 01:18
@validbeck
Copy link
Copy Markdown
Collaborator Author

@nrichers
Copy link
Copy Markdown
Collaborator

@nrichers And staging works!!!

Yup, that looks GREAT! I think this makes the fully executed notebook with the complete output the preferred approach to how we surface the stuff in training — it can be the reference point against which people can check their own work.

Applause!

@validbeck
Copy link
Copy Markdown
Collaborator Author

validbeck commented Nov 26, 2024

@nrichers There seems to be not QUITE success for merging into prod: https://github.com/validmind/documentation/actions/runs/12039272835/job/33566716893

It looks like there is some additional logic comparing main to prod (which makes sense) which means that the filter step returns false... hmmmm.

EDIT: Hopefully these fix it...

@nrichers
Copy link
Copy Markdown
Collaborator

@nrichers There seems to be not QUITE success for merging into prod: https://github.com/validmind/documentation/actions/runs/12039272835/job/33566716893

It looks like there is some additional logic comparing main to prod (which makes sense) which means that the filter step returns false... hmmmm.

Let's take a look when we meet later today? But looking at that run, the workflow likely needs to compare staging against prod, though even then a local git diff I run doesn't seem to pick up the changed notebooks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal Not to be externalized in the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants