107 changes: 99 additions & 8 deletions .github/actions/check-warnings/README.md
@@ -25,6 +25,57 @@ This GitHub Action scans HTML files for Python warnings and optionally fails the
uses: QuantEcon/meta/.github/actions/check-warnings@main
```

### Advanced Usage with PR Mode

The action supports a "PR mode" that checks only the HTML files corresponding to `.md` files changed in a pull request. This is useful for large documentation repositories where you want warnings reported only for the lectures being modified:

```yaml
- name: Check for Python warnings in changed lectures only
  uses: QuantEcon/meta/.github/actions/check-warnings@main
  with:
    html-path: './_build/html'
    pr-mode: 'true'
    fail-on-warning: 'true'
```

When `pr-mode` is enabled, the action will:
1. Detect which `.md` files have changed in the current PR or push
2. Map each changed `.md` file to its corresponding `.html` file in the build directory
3. Only scan those specific HTML files for warnings

This significantly reduces noise from warnings in unrelated lectures and helps focus on the changes being made.
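
As a rough illustration of step 1, the sketch below filters the output of `git diff --name-only` down to `.md` paths. It is not the action's own script: the function names `changed_markdown_files` and `git_changed_paths` and the `origin/main` base ref are assumptions made for this example.

```python
import subprocess


def changed_markdown_files(diff_output: str) -> list[str]:
    """Return the .md paths listed in `git diff --name-only` output."""
    paths = (line.strip() for line in diff_output.splitlines())
    return [p for p in paths if p.endswith(".md")]


def git_changed_paths(base_ref: str = "origin/main") -> str:
    """Ask git for the paths changed between base_ref and HEAD.

    The default base ref is a hypothetical choice; a real workflow would
    pass the pull request's base branch instead.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Inside a checked-out repository, `changed_markdown_files(git_changed_paths())` would yield the list of changed lecture sources. Keeping the filter separate from the git call makes the filtering logic easy to test without a repository.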

**Benefits of PR Mode:**
- **Faster CI runs**: Only scans relevant files instead of the entire documentation build
- **Focused feedback**: Only reports warnings related to your changes
- **Reduced noise**: Avoids confusion from pre-existing warnings in other files
- **Better developer experience**: Clearer feedback on what needs to be fixed

### Example: Normal Mode vs PR Mode

**Normal Mode (scans all files):**
```
Scanning HTML files in: ./_build/html
Found 50 HTML files
❌ Found 15 warnings across all files
```

**PR Mode (scans only changed files):**
```
Running in PR mode - detecting changed .md files...
Found 2 changed .md file(s):
lectures/optimization.md
exercises/exercise1.md
Mapped lectures/optimization.md -> _build/html/lectures/optimization.html
Mapped exercises/exercise1.md -> _build/html/exercises/exercise1.html
Will check 2 HTML file(s) in PR mode
❌ Found 1 warning in changed files
```

In this example, PR mode surfaces the 1 warning in your changes instead of all 15 warnings across the entire documentation build.

### Advanced Usage with PR Comments

```yaml
@@ -211,6 +262,28 @@ You can enable both issue creation and artifact generation simultaneously:

This action specifically searches for Python warnings within HTML elements that have `cell_output` in their class attribute. This approach prevents false positives that would occur if warnings like "FutureWarning" or "DeprecationWarning" are mentioned in the text content of documentation pages.
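
The idea of restricting the search to `cell_output` elements can be sketched with Python's standard `html.parser` module. This is an illustrative alternative, not the regex-based script the action ships; the names `CellOutputScanner` and `warnings_in_cell_outputs` are hypothetical, and the depth tracking assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CellOutputScanner(HTMLParser):
    """Collect text that appears inside elements whose class contains `cell_output`."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside a cell_output element; 0 means outside
        self.chunks = []  # text fragments found inside cell_output elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = dict(attrs).get("class") or ""
        if self.depth or "cell_output" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def warnings_in_cell_outputs(html, warning_names):
    """Return the warning names that occur inside cell_output elements."""
    scanner = CellOutputScanner()
    scanner.feed(html)
    scanner.close()
    text = "".join(scanner.chunks)
    return [name for name in warning_names if name in text]
```

With this approach, a page that merely discusses `FutureWarning` in its prose produces no match, while the same word inside a rendered cell output does.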

### PR Mode

When `pr-mode` is enabled, the action performs these additional steps:

1. **Detect Changed Files**: Uses git to identify which `.md` files have changed in the current PR or push
2. **Map to HTML Files**: For each changed `.md` file, finds the corresponding `.html` file in the build directory using these strategies:
   - Direct mapping: `lecture.md` → `_build/html/lecture.html`
   - Path-preserving mapping: `path/to/lecture.md` → `_build/html/path/to/lecture.html`
   - Recursive search: Finds `lecture.html` anywhere in the build directory
3. **Focused Scanning**: Only scans the mapped HTML files instead of all HTML files in the directory

This is particularly valuable for large documentation repositories where:
- You have many lecture files but only modify a few in each PR
- You want to avoid reporting warnings from unrelated lectures
- You want faster CI runs by scanning fewer files

### File Mapping Examples

- `introduction.md` → `_build/html/introduction.html`
- `lectures/chapter1.md` → `_build/html/lectures/chapter1.html`
- `advanced/optimization.md` → `_build/html/advanced/optimization.html`
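
The three mapping strategies described above could be combined along the following lines. This is a minimal sketch, not the action's implementation: the function name `map_md_to_html` is hypothetical, and the order in which the strategies are tried (direct, then path-preserving, then recursive search) is an assumption.

```python
from pathlib import Path


def map_md_to_html(md_path, html_root):
    """Return the built HTML file for md_path, or None if none is found."""
    root = Path(html_root)
    rel_html = Path(md_path).with_suffix(".html")

    candidates = [
        root / rel_html.name,  # direct mapping: lecture.md -> <root>/lecture.html
        root / rel_html,       # path-preserving: a/b/lecture.md -> <root>/a/b/lecture.html
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # Recursive search: take the first matching file anywhere under the build directory.
    matches = sorted(p for p in root.rglob(rel_html.name) if p.is_file())
    return matches[0] if matches else None
```

Returning `None` for an unmapped file lets the caller decide whether to skip it or report it, for example when a changed `.md` file is not part of the built book.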

### Example HTML Structure

The action will detect warnings in this structure:
@@ -259,6 +332,7 @@ If you're only using the basic warning check functionality, only `contents: read
| `create-artifact` | Whether to create a workflow artifact with the warning report | No | `false` |
| `artifact-name` | Name for the workflow artifact containing the warning report | No | `warning-report` |
| `notify` | GitHub username(s) to assign to the created issue (comma-separated for multiple users) | No | `` |
| `pr-mode` | When enabled, only check HTML files corresponding to changed `.md` files in the PR (requires git repository context) | No | `false` |

## Outputs

@@ -272,7 +346,7 @@ If you're only using the basic warning check functionality, only `contents: read

## Example Workflow

Here's a complete example of how to use this action in a workflow:
Here's a complete example of how to use this action in a workflow with PR mode:

```yaml
name: Build and Check Documentation
@@ -295,6 +369,9 @@ jobs:

    steps:
      - uses: actions/checkout@v4
        with:
          # Fetch full history for proper PR mode operation
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v4
@@ -309,18 +386,32 @@ jobs:
run: |
jupyter-book build .

- name: Check for Python warnings
# For PRs: only check changed files to reduce noise
- name: Check for Python warnings (PR mode)
if: github.event_name == 'pull_request'
uses: QuantEcon/meta/.github/actions/check-warnings@main
with:
html-path: './_build/html'
# Uses comprehensive default warnings (all Python warning types)
fail-on-warning: ${{ github.event_name == 'push' }} # Fail on push, warn on PR
create-issue: ${{ github.event_name == 'push' }} # Create issues for main branch
notify: 'maintainer1,reviewer2' # Assign issues to team members
create-artifact: 'true' # Always create artifacts
artifact-name: 'warning-report'
pr-mode: 'true'
fail-on-warning: 'true'

# For pushes to main: check all files
- name: Check for Python warnings (full check)
if: github.event_name == 'push'
uses: QuantEcon/meta/.github/actions/check-warnings@main
with:
html-path: './_build/html'
pr-mode: 'false'
fail-on-warning: 'false'
create-issue: 'true'
notify: 'maintainer1,reviewer2'
```

This workflow demonstrates:
- **PR mode for pull requests**: Only checks files related to changes, provides quick feedback
- **Full mode for main branch**: Comprehensive checking with issue creation for tracking
- **Different failure behavior**: Strict for PRs, informational for main branch

## Use Case

This action is particularly useful for:
202 changes: 16 additions & 186 deletions .github/actions/check-warnings/action.yml
@@ -39,6 +39,10 @@ inputs:
description: 'GitHub username(s) to assign to the created issue (comma-separated for multiple users)'
required: false
default: ''
pr-mode:
description: 'When enabled, only check HTML files corresponding to changed .md files in the PR. If no .md files changed, exits early with no warnings found (requires git repository context)'
required: false
default: 'false'

outputs:
warnings-found:
@@ -50,6 +54,9 @@ outputs:
warning-details:
description: 'Details of warnings found'
value: ${{ steps.check.outputs.warning-details }}
detailed-report:
description: 'Detailed markdown report of warnings found'
value: ${{ steps.check.outputs.detailed-report }}
issue-url:
description: 'URL of the created GitHub issue (if create-issue is enabled)'
value: ${{ steps.create-issue.outputs.issue-url }}
@@ -63,192 +70,15 @@ runs:
- name: Check for warnings
id: check
shell: bash
env:
INPUT_HTML_PATH: ${{ inputs.html-path }}
INPUT_WARNINGS: ${{ inputs.warnings }}
INPUT_EXCLUDE_WARNING: ${{ inputs.exclude-warning }}
INPUT_FAIL_ON_WARNING: ${{ inputs.fail-on-warning }}
INPUT_PR_MODE: ${{ inputs.pr-mode }}
run: |
# Parse inputs
HTML_PATH="${{ inputs.html-path }}"
WARNINGS="${{ inputs.warnings }}"
EXCLUDE_WARNINGS="${{ inputs.exclude-warning }}"
FAIL_ON_WARNING="${{ inputs.fail-on-warning }}"

echo "Scanning HTML files in: $HTML_PATH"
echo "Looking for warnings: $WARNINGS"

# Convert comma-separated warnings to array
IFS=',' read -ra WARNING_ARRAY <<< "$WARNINGS"

# Handle exclude-warning parameter
if [ -n "$EXCLUDE_WARNINGS" ]; then
echo "Excluding warnings: $EXCLUDE_WARNINGS"
# Convert comma-separated exclude warnings to array
IFS=',' read -ra EXCLUDE_ARRAY <<< "$EXCLUDE_WARNINGS"

# Create a new array with warnings not in exclude list
FILTERED_WARNING_ARRAY=()
for warning in "${WARNING_ARRAY[@]}"; do
# Remove leading/trailing whitespace from warning
warning=$(echo "$warning" | xargs)
exclude_warning=false

# Check if this warning should be excluded
for exclude in "${EXCLUDE_ARRAY[@]}"; do
# Remove leading/trailing whitespace from exclude warning
exclude=$(echo "$exclude" | xargs)
if [ "$warning" = "$exclude" ]; then
exclude_warning=true
break
fi
done

# Add to filtered array if not excluded
if [ "$exclude_warning" = false ]; then
FILTERED_WARNING_ARRAY+=("$warning")
fi
done

# Replace WARNING_ARRAY with filtered array
WARNING_ARRAY=("${FILTERED_WARNING_ARRAY[@]}")

# Show final warning list
if [ ${#WARNING_ARRAY[@]} -eq 0 ]; then
echo "⚠️ All warnings have been excluded. No warnings will be checked."
else
echo "Final warning list after exclusions: ${WARNING_ARRAY[*]}"
fi
fi

# Initialize counters
TOTAL_WARNINGS=0
WARNING_DETAILS=""
WARNINGS_FOUND="false"
DETAILED_REPORT=""

# Find all HTML files
if [ ! -e "$HTML_PATH" ]; then
echo "Error: HTML path '$HTML_PATH' does not exist"
exit 1
fi

# Determine if we're dealing with a file or directory
if [ -f "$HTML_PATH" ]; then
# Single file
if [[ "$HTML_PATH" == *.html ]]; then
echo "Checking single HTML file: $HTML_PATH"
FILES=("$HTML_PATH")
else
echo "Error: '$HTML_PATH' is not an HTML file"
exit 1
fi
else
# Directory - find all HTML files
mapfile -d '' FILES < <(find "$HTML_PATH" -name "*.html" -type f -print0)
fi

# Create temporary Python script for parsing HTML
echo 'import re' > /tmp/check_warnings.py
echo 'import sys' >> /tmp/check_warnings.py
echo 'import os' >> /tmp/check_warnings.py
echo '' >> /tmp/check_warnings.py
echo 'def find_warnings_in_cell_outputs(file_path, warning_text):' >> /tmp/check_warnings.py
echo ' try:' >> /tmp/check_warnings.py
echo ' with open(file_path, "r", encoding="utf-8") as f:' >> /tmp/check_warnings.py
echo ' content = f.read()' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' # Find all HTML elements with cell_output in the class attribute' >> /tmp/check_warnings.py
echo ' pattern = r"<([^>]+)\s+class=\"[^\"]*cell_output[^\"]*\"[^>]*>(.*?)</\1>"' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' matches = []' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' # Search for cell_output blocks' >> /tmp/check_warnings.py
echo ' for match in re.finditer(pattern, content, re.DOTALL | re.IGNORECASE):' >> /tmp/check_warnings.py
echo ' block_content = match.group(2)' >> /tmp/check_warnings.py
echo ' block_start = match.start()' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' # Count line number where this block starts' >> /tmp/check_warnings.py
echo ' block_line = content[:block_start].count("\\n") + 1' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' # Search for warning within this block' >> /tmp/check_warnings.py
echo ' if warning_text in block_content:' >> /tmp/check_warnings.py
echo ' # Find specific lines within the block that contain the warning' >> /tmp/check_warnings.py
echo ' block_lines = block_content.split("\\n")' >> /tmp/check_warnings.py
echo ' for i, line in enumerate(block_lines):' >> /tmp/check_warnings.py
echo ' if warning_text in line:' >> /tmp/check_warnings.py
echo ' actual_line_num = block_line + i' >> /tmp/check_warnings.py
echo ' # Clean up the line for display (remove extra whitespace, HTML tags)' >> /tmp/check_warnings.py
echo ' clean_line = re.sub(r"<[^>]+>", "", line).strip()' >> /tmp/check_warnings.py
echo ' if clean_line: # Only add non-empty lines' >> /tmp/check_warnings.py
echo ' matches.append(f"{actual_line_num}:{clean_line}")' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' # Output results' >> /tmp/check_warnings.py
echo ' for match in matches:' >> /tmp/check_warnings.py
echo ' print(match)' >> /tmp/check_warnings.py
echo ' ' >> /tmp/check_warnings.py
echo ' except Exception as e:' >> /tmp/check_warnings.py
echo ' print(f"Error processing file: {e}", file=sys.stderr)' >> /tmp/check_warnings.py
echo ' sys.exit(1)' >> /tmp/check_warnings.py
echo '' >> /tmp/check_warnings.py
echo 'if __name__ == "__main__":' >> /tmp/check_warnings.py
echo ' file_path = sys.argv[1]' >> /tmp/check_warnings.py
echo ' warning_text = sys.argv[2]' >> /tmp/check_warnings.py
echo ' find_warnings_in_cell_outputs(file_path, warning_text)' >> /tmp/check_warnings.py

# Search for warnings in HTML files within cell_output elements
for file in "${FILES[@]}"; do
echo "Checking file: $file"

# Skip warning check if no warnings to check for
if [ ${#WARNING_ARRAY[@]} -eq 0 ]; then
echo "No warnings to check for in $file (all excluded)"
continue
fi

for warning in "${WARNING_ARRAY[@]}"; do
# Remove leading/trailing whitespace from warning
warning=$(echo "$warning" | xargs)

# Run the Python script and capture results
matches=$(python3 /tmp/check_warnings.py "$file" "$warning" 2>/dev/null || true)

if [ -n "$matches" ]; then
WARNINGS_FOUND="true"
count=$(echo "$matches" | wc -l)
TOTAL_WARNINGS=$((TOTAL_WARNINGS + count))

echo "⚠️ Found $count instance(s) of '$warning' in $file:"
echo "$matches"

# Add to basic details
if [ -n "$WARNING_DETAILS" ]; then
WARNING_DETAILS="$WARNING_DETAILS\n"
fi
WARNING_DETAILS="$WARNING_DETAILS$file: $count instance(s) of '$warning'"

# Add to detailed report
DETAILED_REPORT="$DETAILED_REPORT## $warning in $file\n\n"
DETAILED_REPORT="$DETAILED_REPORT**Found $count instance(s):**\n\n"
DETAILED_REPORT="$DETAILED_REPORT\`\`\`\n"
DETAILED_REPORT="$DETAILED_REPORT$matches\n"
DETAILED_REPORT="$DETAILED_REPORT\`\`\`\n\n"
fi
done
done

# Set outputs
echo "warnings-found=$WARNINGS_FOUND" >> $GITHUB_OUTPUT
echo "warning-count=$TOTAL_WARNINGS" >> $GITHUB_OUTPUT
echo "warning-details<<EOF" >> $GITHUB_OUTPUT
echo -e "$WARNING_DETAILS" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
echo "detailed-report<<EOF" >> $GITHUB_OUTPUT
echo -e "$DETAILED_REPORT" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT

# Summary
if [ "$WARNINGS_FOUND" = "true" ]; then
echo "❌ Found $TOTAL_WARNINGS warning(s) in HTML files"
echo "::error::Found $TOTAL_WARNINGS Python warning(s) in HTML output"
else
echo "✅ No warnings found in HTML files"
fi
# Run the check-warnings script
${{ github.action_path }}/check-warnings.sh

- name: Post PR comment with warning report
if: inputs.fail-on-warning == 'true' && steps.check.outputs.warnings-found == 'true' && github.event_name == 'pull_request'
@@ -475,4 +305,4 @@ runs:

branding:
icon: 'alert-triangle'
color: 'orange'
color: 'orange'