Link

Refresh Jupyter Notebooks With Papermill

papermill is a great project for running notebooks programatically. You can pass parameters to notebooks, use different kernels, etc.

Example: refresh notebooks every 6 hours

Put a shell script at the location_action_files/run_notebooks.sh with the following contents:


#!/bin/sh
set -e
cd $(dirname "$0")/..
cd _notebooks/

ERRORS=""

# Loop through all notebooks and run them with papermill
for file in *.ipynb
do
    if papermill --kernel python3 "${file}" "${file}"; then
        echo "Sucessfully refreshed ${file}\n\n\n\n"
    else
        echo "ERROR Refreshing ${file}"
        ERRORS="${ERRORS}, ${file}"
    fi
done

# Emit Errors If Exists So Downstream Task Can Open An Issue
if [ -z "$ERRORS" ]
then
    echo "::set-output name=error_bool::false"
else
    echo "These files failed to update properly: ${ERRORS}"
    echo "::set-output name=error_bool::true"
    echo "::set-output name=error_str::${ERRORS}"
fi

In the location .github/workflows/update-nb.yaml define an Actions workflow with the following contents

name: Update Notebooks And Refresh Page
on:
  schedule:
    - cron:  '0 */6 * * *'

jobs:
  update-notebooks:
    runs-on: ubuntu-latest
    steps:
  

    - name: Copy Repository Contents
      uses: actions/checkout@v2

    - name: Set up Python
      uses: actions/setup-python@v1
      with:
        python-version: 3.6

    - name: install dependencies
      run: |
        pip3 install -r ./_notebooks/requirements.txt
        python3 -m ipykernel install --user --name python3
        sudo chmod -R 777 .

    - name: update notebooks
      id: update_nb
      run: |
        ./_action_files/run_notebooks.sh
        
    - name: Create an issue if notebook update failure occurs
      if: steps.update_nb.outputs.error_bool == 'true'
      uses: actions/github-script@0.6.0
      with:
        github-token: $
        script: |
          var err = process.env.ERROR_STRING;
          var run_id = process.env.RUN_ID;
          github.issues.create({
            owner: context.repo.owner,
            repo: context.repo.repo,
            title: "Error updating notebooks",
            body: `These are the notebooks that failed to update properly: \n${err}\n\n See run [${run_id}](https://github.com/github/covid19-dashboard/actions/runs/${run_id}) for more details.`
          })
      env:
        ERROR_STRING: $
        RUN_ID: $

This example is based on covid19dashboards.com, which uses fastpages and papermill to refresh notebooks and serve them as dasbhoards. You can see browse the workflow files of this project to see how papermill is used.