relativityone/github-reporting

GitHub Direct Access Permissions Reporter

This repository contains a Python script and GitHub Actions pipeline for fetching and reporting on direct user permissions and team access across repositories in a GitHub organization using the GraphQL API.

Recent Updates

Team Permission Enhancement (Latest)

  • GitHub CLI Integration: The script now uses GitHub CLI (gh) to fetch team permission data for enhanced accuracy
  • Improved Team Detection: Teams are retrieved using GitHub's REST API via CLI, providing more reliable permission levels
  • Fallback Mechanism: If GitHub CLI is not available or authentication fails, the script gracefully continues without team data
  • Direct Access Focus: The tool focuses on direct collaborators and teams, avoiding inherited permissions for cleaner reporting

Previous Improvements

  • Enhanced Error Handling: Implemented exponential backoff retry logic for GraphQL API calls
  • Complete Data Retrieval: Added pagination support to ensure all collaborators are fetched
  • PAT Support: Integrated Personal Access Token support via REL_TOKEN secret for enhanced API access
  • Reliability Improvements: Added comprehensive error handling and graceful degradation
  • Direct Access Focus: Fetches only direct collaborators and teams (excludes inherited organization permissions)
  • Efficient GraphQL API: Uses GitHub's GraphQL API for faster data retrieval compared to REST API
  • Enhanced Team Detection: Uses GitHub CLI for accurate team permissions (falls back to GraphQL if not available)
  • Comprehensive Reports: Generates detailed CSV reports with user permissions, team access, summaries, and repository information
  • Team Support: Includes team-level permissions alongside individual user access
  • Automated Pipeline: GitHub Actions workflow for scheduled and manual execution
  • Flexible Configuration: Support for including/excluding archived repositories and custom organizations
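
The exponential backoff retry mentioned above could look roughly like the following. This is a minimal sketch, not the script's actual code: the helper name `retry_with_backoff`, its parameters, and the delay values are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponentially growing waits.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting. A production version would probably retry only on transient failures (timeouts, HTTP 502/503) rather than on every exception.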

Files

  • fetch_user_permissions_graphql.py - Main Python script that fetches user permissions
  • .github/workflows/fetch-user-permissions.yml - GitHub Actions pipeline
  • requirements.txt - Python dependencies

Generated Reports

The script generates three CSV files:

  1. {organization}_direct_permissions_graphql.csv - Detailed direct access permissions mapping (users and teams)
  2. {organization}_direct_summary_graphql.csv - Summary of each user's and team's access across repositories
  3. {organization}_repository_summary_graphql.csv - Summary of each repository with direct collaborator counts
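
To illustrate the naming pattern above, here is a small sketch that builds the three report paths and writes one of them with the standard `csv` module. The function names and the column names are hypothetical; the script's real columns may differ.

```python
import csv
from pathlib import Path


def report_paths(organization: str, out_dir: Path = Path(".")) -> dict[str, Path]:
    """Build the three report file paths for an organization."""
    suffixes = ("direct_permissions", "direct_summary", "repository_summary")
    return {s: out_dir / f"{organization}_{s}_graphql.csv" for s in suffixes}


def write_csv(path: Path, rows: list[dict[str, str]], columns: list[str]) -> None:
    """Write rows to a CSV file with a header line (columns are illustrative)."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `report_paths("relativityone")["direct_permissions"]` resolves to `relativityone_direct_permissions_graphql.csv` in the current directory.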

Direct Access Focus

This tool reports DIRECT access only:

  • Included: Users explicitly added as repository collaborators
  • Included: Teams explicitly granted repository access (with accurate permissions via GitHub CLI)
  • Excluded: Permissions inherited from organization membership (organization-wide base permissions)
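
One way to request only direct collaborators through GraphQL is the `affiliation: DIRECT` argument on a repository's `collaborators` connection, combined with cursor pagination. The sketch below assumes a caller-supplied `run_query(query, variables)` function that returns the `data` part of the response; that function and `iter_direct_collaborators` are hypothetical names, and the script's actual query may be shaped differently.

```python
from typing import Callable, Iterator, Optional

COLLABORATORS_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    collaborators(affiliation: DIRECT, first: 100, after: $cursor) {
      edges { permission node { login } }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def iter_direct_collaborators(
    run_query: Callable[[str, dict], dict], owner: str, name: str
) -> Iterator[tuple[str, str]]:
    """Yield (login, permission) for every direct collaborator, page by page."""
    cursor: Optional[str] = None
    while True:
        data = run_query(
            COLLABORATORS_QUERY, {"owner": owner, "name": name, "cursor": cursor}
        )
        connection = data["repository"]["collaborators"]
        if connection is None:
            return  # the token cannot list collaborators for this repository
        for edge in connection["edges"]:
            yield edge["node"]["login"], edge["permission"]
        if not connection["pageInfo"]["hasNextPage"]:
            return
        cursor = connection["pageInfo"]["endCursor"]
```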

Team Permission Enhancement: When GitHub CLI is installed and authenticated, the tool uses REST API endpoints to fetch accurate team permissions. Without GitHub CLI, team detection falls back to GraphQL with limited accuracy.

This provides a cleaner view of intentional, repository-specific access grants.
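
The GitHub CLI path described above can be sketched as follows, assuming the REST endpoint `repos/{owner}/{repo}/teams` is called through `gh api`. The function name and the `runner`/`which` parameters are hypothetical and exist mainly so the logic can be exercised without `gh` installed.

```python
import shutil
import subprocess
from typing import Callable, Optional


def fetch_team_permissions(
    owner: str,
    repo: str,
    runner: Callable = subprocess.run,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[dict[str, str]]:
    """Return {team_slug: permission}, or None when gh is missing or fails."""
    if which("gh") is None:
        return None  # GitHub CLI not installed: caller decides how to degrade
    result = runner(
        [
            "gh", "api", "--paginate", f"repos/{owner}/{repo}/teams",
            # Emit one "slug<TAB>permission" line per team across all pages
            "--jq", ".[] | [.slug, .permission] | @tsv",
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # e.g. not authenticated, or no access to the repository
    teams: dict[str, str] = {}
    for line in result.stdout.splitlines():
        if line.strip():
            slug, permission = line.split("\t", 1)
            teams[slug] = permission
    return teams
```

Returning `None` instead of raising lets the caller continue without team data, or switch to a GraphQL-based lookup, when the CLI is unavailable.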

GitHub Actions Pipeline

Triggers

The pipeline runs on:

  • Manual trigger (workflow_dispatch) with optional parameters
  • Scheduled execution (weekly on Mondays at 8 AM UTC)
  • Code changes to the script or workflow file

Manual Execution

To run the pipeline manually:

  1. Go to the Actions tab in your GitHub repository
  2. Select "Fetch GitHub User Permissions" workflow
  3. Click "Run workflow"
  4. Configure parameters:
    • Organization: GitHub organization name (default: "relativityone")
    • Include archived: Whether to include archived repositories (default: false)

Required Permissions

The workflow can use either:

  1. Personal Access Token (PAT) - Recommended for full organization access
  2. Default GITHUB_TOKEN - Limited permissions, may miss private repositories

PAT Setup (Recommended)

For comprehensive organization reporting, set up a Personal Access Token:

  1. Create PAT: Follow the PAT Setup Guide
  2. Required scopes: repo, read:org, read:user
  3. Add to secrets: Store as REL_TOKEN in repository secrets

Default Token (Limited)

If the REL_TOKEN secret is not configured, the workflow falls back to the default GITHUB_TOKEN, which has restricted access and may not see all repositories.
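
The token preference described in this section might be implemented along these lines. The sketch assumes the workflow exposes the REL_TOKEN secret to the script as a `GITHUB_PAT` environment variable, the name used in Local Development; that mapping and the `resolve_token` helper are assumptions, not confirmed details of the script.

```python
import os
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str] = os.environ) -> tuple[Optional[str], str]:
    """Return (token, source), preferring a PAT over the default token."""
    # Assumed variable names: GITHUB_PAT for the PAT, GITHUB_TOKEN as fallback
    for name in ("GITHUB_PAT", "GITHUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value, name
    return None, "none"
```

Reporting the source alongside the token makes it easy to log a warning when the run is using the more limited default token.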

Artifacts

Generated CSV files are automatically uploaded as workflow artifacts with:

  • Name: github-permissions-report-{run_number}
  • Retention: 30 days

Local Development

Prerequisites

  • Python 3.9+
  • GitHub Personal Access Token (recommended), or a token obtained through GitHub CLI
  • GitHub CLI (gh) - Optional, but needed for accurate team permission detection

Setup

# Clone the repository
git clone <your-repo-url>
cd github-reporting

# Install dependencies
pip install -r requirements.txt

# Install and authenticate GitHub CLI (recommended for team data)
# Homebrew shown here; use your platform's package manager if Homebrew is unavailable
brew install gh
gh auth login

# Set up GitHub token (Option 1: PAT - Recommended)
export GITHUB_PAT="your_personal_access_token_here"

# Set up GitHub token (Option 2: reuse the token from GitHub CLI)
export GITHUB_TOKEN=$(gh auth token)

💡 For comprehensive results: Use both a PAT with repo, read:org, read:user scopes AND GitHub CLI authentication. See PAT Setup Guide for detailed instructions.

Running Locally

python fetch_user_permissions_graphql.py

The script will:

  1. Use the organization defined in the script (default: "relativityone")
  2. Generate CSV files in the current directory
  3. Display progress and summary statistics

Configuration

Edit the script to modify:

  • organization - Target GitHub organization
  • include_archived - Whether to include archived repositories
  • Output file names and paths
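
As a rough illustration of how these settings could be grouped and applied, the sketch below uses a small dataclass and filters repositories on the GraphQL `isArchived` field. `ReportConfig` and `select_repositories` are hypothetical names; the script may simply use module-level variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportConfig:
    """Illustrative container for the settings listed above."""
    organization: str = "relativityone"
    include_archived: bool = False


def select_repositories(repos: list[dict], config: ReportConfig) -> list[dict]:
    """Drop archived repositories unless the config asks to keep them."""
    if config.include_archived:
        return list(repos)
    return [repo for repo in repos if not repo.get("isArchived", False)]
```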

Performance

  • GraphQL Efficiency: ~50-200 queries vs 3000+ REST API calls
  • Rate Limiting: Automatic handling of GitHub API rate limits
  • Processing Time: Typically 10-30 minutes for large organizations
  • Memory Usage: Optimized for large datasets
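
Rate-limit handling of the kind mentioned above is often driven by the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers that GitHub returns. The following sketch computes how long to pause; the helper name, the `min_remaining` threshold, and the one-second safety margin are assumptions for illustration.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(
    headers: Mapping[str, str],
    min_remaining: int = 50,
    now: Optional[float] = None,
) -> float:
    """Return how long to sleep before the next call (0.0 if budget remains).

    Assumes lower-case header keys; a case-insensitive mapping such as the one
    provided by the `requests` library works as well.
    """
    remaining = int(headers.get("x-ratelimit-remaining", min_remaining + 1))
    if remaining > min_remaining:
        return 0.0
    reset_at = float(headers.get("x-ratelimit-reset", 0))  # epoch seconds
    current = time.time() if now is None else now
    # Wait until the window resets, plus a one-second safety margin
    return max(0.0, reset_at - current) + 1.0
```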

Output Format

Direct Permissions CSV

Contains detailed permission mappings between users or teams and repositories, with columns for user or team info, repository details, and access levels.

Direct Summary CSV

Aggregated view showing each user's and team's total repository access by permission level and repository characteristics.

Repository Summary CSV

Repository-focused view showing collaborator counts and permission distribution per repository.
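
The summary reports are aggregations over the detailed rows. A minimal sketch of the per-user or per-team count, assuming each detailed row carries hypothetical `name` and `permission` keys:

```python
from collections import Counter, defaultdict


def summarize_by_principal(rows: list[dict[str, str]]) -> dict[str, dict[str, int]]:
    """Count repositories per permission level for each user or team."""
    summary: dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        summary[row["name"]][row["permission"]] += 1
    return {name: dict(counts) for name, counts in summary.items()}
```

Grouping on the repository name instead of `name` would give the repository-focused counts in the same way.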

Troubleshooting

Common Issues

  1. Authentication Errors

    • Ensure the GITHUB_PAT or GITHUB_TOKEN environment variable is set
    • Verify token has necessary scopes for the target organization
  2. Rate Limiting

    • The script automatically handles rate limits with wait periods
    • GraphQL API has different limits than REST API
  3. Large Organizations

    • Processing may take significant time for organizations with thousands of repositories
    • Monitor workflow logs for progress updates
  4. Permission Errors

    • Ensure the token has access to the target organization
    • Some private repositories may not be accessible

Monitoring

The workflow provides:

  • Real-time progress logs
  • Summary statistics upon completion
  • Artifact upload confirmation
  • Error reporting and debugging information

Security Considerations

  • Works with GitHub's built-in GITHUB_TOKEN; a PAT stored as the REL_TOKEN secret is recommended for full organization coverage
  • No hardcoded credentials in the code
  • Artifacts are automatically cleaned up after 30 days
  • Follows GitHub's API best practices for authentication and rate limiting

About

GitHub user permissions reporting tool using GraphQL API with automated CI/CD pipeline
