Add automated template submission pipeline #799

Draft
jongio wants to merge 5 commits into Azure:main from jongio:template-submission-v2

Conversation

@jongio
Member

@jongio jongio commented Mar 29, 2026

Summary

Adds an automated template submission pipeline that mirrors the existing extension submission model. Trusted partners can submit templates via an issue form instead of manually editing templates.json.

Flow: Issue form -> Workflow validates repo -> Updates templates.json -> Creates PR -> Human reviews and approves

What's included

Workflow & Issue Form

  • .github/ISSUE_TEMPLATE/template-submission.yml - 12-field issue form
  • .github/workflows/template-submission.yml - 174-line YAML orchestration (no inline JS)

Standalone Scripts (testable, reusable)

  • website/scripts/validate-template.js - URL validation with SSRF protection (IPv4 + IPv6 + DNS rebinding defense)
  • website/scripts/parse-template-issue.js - Issue body parser
  • website/scripts/update-templates-json.js - JSON updater (imports shared validation, no duplication)

Tests

  • 129 tests across 5 suites (75 validation + 20 parsing + 34 JSON update)

Docs

  • Updated website/docs/contribute.md with automated submission instructions

Security hardening

  • SHA-pinned GitHub Actions (not mutable tags)
  • HTTPS-only URL validation
  • SSRF protection: private IP blocking (IPv4, IPv6, mapped), DNS rebinding prevention
  • HTML stripping with unclosed tag handling
  • Field length limits (title: 200, description: 500, author: 100)
  • Tag character allowlist + count limits
  • All user input via env vars (no expression injection in run blocks)
  • Concurrency group prevents race conditions on templates.json
  • Redirect rejection (2xx only, no 3xx)
  • Duplicate detection via canonical URL (strips query/fragment/trailing slash/.git)
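
As an illustration of the HTML stripping and field length limits listed above, here is a minimal sketch. The helper name `sanitizeField` and the choice to truncate (rather than reject) over-long values are assumptions made for this example; the PR's scripts may behave differently.

```javascript
// Hypothetical helper; the limits mirror the ones listed in the PR description.
const FIELD_LIMITS = { title: 200, description: 500, author: 100 };

function sanitizeField(name, value) {
  const stripped = String(value ?? "")
    // Remove complete tags as well as a trailing unclosed tag such as "<img src=x".
    .replace(/<[^>]*>?/g, "")
    .trim();
  const limit = FIELD_LIMITS[name];
  // Assumption: over-long values are truncated instead of rejected.
  return limit ? stripped.slice(0, limit) : stripped;
}

console.log(sanitizeField("title", "Hi <b>there</b><img src=x"));
```

Note that this regex also removes everything after a bare `<` in ordinary text (for example `a < b`), which is a deliberate trade-off in favor of stripping unclosed tags.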

What's NOT included (by design)

  • No auto-merge workflow
  • No trusted-publisher labels or detection
  • No complexity beyond the extension submission model

Closes #784

Issue form + workflow that validates repos, updates templates.json, and
creates PRs automatically. Mirrors the extension submission model.

- Issue form with 12 fields (source repo, title, description, author, etc.)
- Workflow with SHA-pinned actions, SSRF protection, input sanitization
- Standalone scripts: parse-template-issue.js, update-templates-json.js,
  validate-template.js (shared validation library)
- 129 tests across 5 test suites
- No auto-merge, no trusted-publisher labels
- Human reviews and approves the PR

Closes Azure#784

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds an issue-form–driven automation path for adding new azd templates to the gallery, similar in spirit to the existing automated extension flows, by validating a submitted repo URL and generating a PR that updates website/static/templates.json.

Changes:

  • Introduces a new GitHub Issue Form + workflow (template-submission) that parses issue/dispatch inputs, validates the repo URL, updates templates.json, and opens a PR.
  • Adds standalone Node scripts for issue parsing, URL/SSRF validation, and updating templates.json.
  • Adds Jest tests covering URL canonicalization/SSRF checks, issue parsing, and JSON updating; updates Jest config.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| .github/workflows/template-submission.yml | New workflow to parse submission inputs, validate repo, update templates.json, and create a PR + issue comments. |
| .github/ISSUE_TEMPLATE/template-submission.yml | New issue form for template submissions (collects repo/title/desc/tags/etc.). |
| website/scripts/validate-template.js | URL validation + SSRF protections and a HEAD reachability check. |
| website/scripts/parse-template-issue.js | Parses GitHub issue body / workflow_dispatch inputs into structured fields. |
| website/scripts/update-templates-json.js | Adds a new template entry to templates.json with duplicate detection and sanitization. |
| website/test/validate-template.test.ts | Unit tests for URL canonicalization, SSRF checks, and validation behavior. |
| website/test/parse-template-issue.test.ts | Unit tests for issue parsing and required-field detection. |
| website/test/update-templates-json.test.ts | Unit tests for sanitization, tag parsing, duplicate detection, and JSON writing. |
| website/jest.config.js | Updates Jest config (adds transformIgnorePatterns). |
| website/docs/contribute.md | Documents the new automated submission path and links to the issue form. |

Comment on lines +65 to +74
function isPrivateHost(hostname) {
  if (/^localhost$/i.test(hostname)) return true;

  // Strip brackets for IPv6 literal check
  const bare = hostname.replace(/^\[|\]$/g, "");
  if (net.isIP(bare)) return isPrivateIP(bare);

  // IPv4 patterns (URL parser normalises hex/octal to dotted decimal)
  if (PRIVATE_IPV4_RANGES.some((re) => re.test(hostname))) return true;

Copilot AI Mar 29, 2026

isPrivateHost applies IPv4 private-range regexes to any hostname string (e.g., 10.example.com), which will incorrectly classify some legitimate public hostnames as private. Since net.isIP(bare) already detects real IP literals (including URL parser normalization), consider removing the PRIVATE_IPV4_RANGES hostname regex check and relying on IP-literal detection + safeLookup for DNS results.

Copilot uses AI. Check for mistakes.
Member Author

Fixed. Removed the hostname regex check in `isPrivateHost`: `net.isIP(bare)` already catches real IP literals, and `safeLookup` handles DNS-resolved private IPs for non-literal hostnames. This eliminates false positives like `10.example.com`.
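
A minimal sketch of the IP-literal-only check described in this reply. The `isPrivateIP` body here is a simplified stand-in covering only a few ranges, not the PR's implementation, and DNS-resolved addresses are assumed to be checked elsewhere (by `safeLookup`).

```javascript
const net = require("node:net");

// Simplified stand-in for the PR's isPrivateIP (assumption: only a few ranges shown).
function isPrivateIP(ip) {
  if (net.isIPv4(ip)) return /^(127\.|10\.|192\.168\.|169\.254\.)/.test(ip);
  return ip === "::1";
}

function isPrivateHost(hostname) {
  if (/^localhost$/i.test(hostname)) return true;
  // Strip brackets so IPv6 literals such as "[::1]" are recognised.
  const bare = hostname.replace(/^\[|\]$/g, "");
  // Only real IP literals are range-checked; names like 10.example.com fall through.
  if (net.isIP(bare)) return isPrivateIP(bare);
  return false;
}

console.log(isPrivateHost("10.example.com"), isPrivateHost("10.0.0.1"));
```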

Comment on lines +3 to +67
on:
  issues:
    types: [labeled]
  workflow_dispatch:
    inputs:
      source_repo:
        description: "GitHub repository URL"
        required: true
      template_title:
        description: "Template title"
        required: true
      description:
        description: "1-2 sentence description"
        required: true
      author:
        description: "Author name"
        required: true
      author_url:
        description: "Author GitHub URL"
        required: true
      author_type:
        description: "Microsoft or Community"
        required: true
        default: "Community"
        type: choice
        options:
          - Community
          - Microsoft
      preview_image:
        description: "Preview image URL (optional)"
        required: false
      iac_provider:
        description: "IaC provider"
        required: true
        default: "Bicep"
        type: choice
        options:
          - Bicep
          - Terraform
          - Both
      languages:
        description: "Comma-separated language tags (optional)"
        required: false
      frameworks:
        description: "Comma-separated framework tags (optional)"
        required: false
      azure_services:
        description: "Comma-separated Azure service tags (optional)"
        required: false

permissions:
  contents: write
  pull-requests: write
  issues: write

concurrency:
  group: template-submission
  cancel-in-progress: false

jobs:
  process-template:
    if: >-
      github.event_name == 'workflow_dispatch' ||
      contains(github.event.issue.labels.*.name, 'template-submission')
    runs-on: ubuntu-latest

Copilot AI Mar 29, 2026

The workflow listens for issues: types: [labeled], but the issue form applies the template-submission label at creation time. In practice, this can result in submissions not being processed until someone re-labels the issue. Either (a) change the trigger to include opened (and keep the label check), or (b) remove the default label from the issue template and require a maintainer to apply the label to start processing.

Member Author

Fixed. Added `opened` to the trigger types alongside `labeled`. The label check in the `if` condition still gates processing, but now submissions are reliably processed at creation time.

name: "Submit an azd Template"
description: "Add your azd template to the Awesome azd gallery"
title: "[Template]: "
labels: ["template-submission"]

Copilot AI Mar 29, 2026

This issue template auto-applies the template-submission label. Combined with a workflow that can create branches/PRs, this removes the maintainer gate and can enable automated PR spam unless additional submitter allowlisting is implemented. If the intention is to require maintainers to opt-in per issue, consider removing the default label and letting a maintainer apply it when ready to process.

Suggested change
labels: ["template-submission"]

Member Author

Keeping the auto-label by design: the workflow only creates a PR (never auto-merges), so the maintainer gate is the PR review itself. The `concurrency` group limits processing to one in-flight run. Added the `opened` trigger (see the comment above) as defense-in-depth.

Comment on lines +19 to +24
const regex = new RegExp(`### ${fieldName}\\s*\\n\\s*([^\\n]+)`, "i");
const match = body.match(regex);
if (!match) return "";
const value = match[1].trim();
// Treat GitHub's default placeholder as empty
return value === "_No response_" ? "" : value;

Copilot AI Mar 29, 2026

extractField only captures the first non-empty line after a heading. In the issue form, Description is a textarea and can contain newlines, so this will silently truncate user input. Consider capturing until the next ### heading (or end of body) so multi-line values are parsed correctly, and update tests to cover this case.

Suggested change
- const regex = new RegExp(`### ${fieldName}\\s*\\n\\s*([^\\n]+)`, "i");
- const match = body.match(regex);
- if (!match) return "";
- const value = match[1].trim();
- // Treat GitHub's default placeholder as empty
- return value === "_No response_" ? "" : value;
+ // Capture everything after the heading up to the next heading (`### `) or end of body
+ const regex = new RegExp(
+   `### ${fieldName}\\s*\\n([\\s\\S]*?)(?=\\n### |$)`,
+   "i"
+ );
+ const match = body.match(regex);
+ if (!match) return "";
+ const value = match[1].trim();
+ // Treat GitHub's default placeholder as empty
+ if (!value || value === "_No response_") {
+   return "";
+ }
+ return value;

Member Author

Fixed. `extractField` now captures everything between headings (until the next `### ` or end of body), so multi-line descriptions are preserved correctly.
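
A self-contained sketch of the multi-line capture, following the suggested regex. The `escapeRegExp` helper and the `(body, fieldName)` parameter order are additions for this example and are not taken from the PR.

```javascript
// Escape regex metacharacters in the heading text (an addition, not from the PR).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function extractField(body, fieldName) {
  // Capture lazily from the heading up to the next "### " heading or end of body.
  const regex = new RegExp(
    `### ${escapeRegExp(fieldName)}\\s*\\n([\\s\\S]*?)(?=\\n### |$)`,
    "i"
  );
  const match = body.match(regex);
  if (!match) return "";
  const value = match[1].trim();
  // Treat GitHub's default placeholder as empty.
  return value === "_No response_" ? "" : value;
}

const sampleBody =
  "### Description\n\nLine one\nLine two\n\n### Author\n\n_No response_\n";
console.log(JSON.stringify(extractField(sampleBody, "Description")));
```

With the sample body above, the description keeps both lines, while the untouched Author field comes back empty.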

Comment on lines +85 to +91
 * Write key=value pairs to the GITHUB_OUTPUT file.
 * @param {string} outputPath - File path from $GITHUB_OUTPUT
 * @param {Record<string, string>} fields
 */
function writeOutputs(outputPath, fields) {
  const lines = Object.entries(fields)
    .map(([k, v]) => `${k}=${v}`)

Copilot AI Mar 29, 2026

writeOutputs writes key=value lines directly to $GITHUB_OUTPUT. If any parsed field contains a newline (issue body edits, API-submitted workflow_dispatch inputs), this can inject additional outputs and change later step conditions. Use the multi-line output format (key<<EOF ... EOF) or strip/replace \r/\n in values before writing.

Suggested change
-  * Write key=value pairs to the GITHUB_OUTPUT file.
-  * @param {string} outputPath - File path from $GITHUB_OUTPUT
-  * @param {Record<string, string>} fields
-  */
- function writeOutputs(outputPath, fields) {
-   const lines = Object.entries(fields)
-     .map(([k, v]) => `${k}=${v}`)
+  * Sanitize a value so it can be safely written as a GitHub Actions output.
+  * Newlines and carriage returns are replaced to prevent output injection.
+  *
+  * @param {unknown} value
+  * @returns {string}
+  */
+ function sanitizeOutputValue(value) {
+   if (value === null || value === undefined) {
+     return "";
+   }
+   return String(value).replace(/[\r\n]+/g, " ");
+ }
+ /**
+  * Write key=value pairs to the GITHUB_OUTPUT file.
+  * @param {string} outputPath - File path from $GITHUB_OUTPUT
+  * @param {Record<string, string>} fields
+  */
+ function writeOutputs(outputPath, fields) {
+   const lines = Object.entries(fields)
+     .map(([k, v]) => `${k}=${sanitizeOutputValue(v)}`)

Member Author

Fixed. Added `sanitizeOutputValue()`, which strips CR/LF characters before writing to `$GITHUB_OUTPUT`. Applied to all three scripts that write outputs (parse-template-issue.js, update-templates-json.js, merge-extracted-metadata.js).
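
The review comment also mentions GitHub's multi-line output format (`key<<DELIMITER`, the value, then the delimiter on its own line) as an alternative to stripping newlines. A rough sketch of that alternative follows; `formatOutput` and the random delimiter scheme are illustrative assumptions, not what the PR adopted.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: format one output using the multi-line delimiter syntax.
function formatOutput(key, value, delimiter = `ghadelimiter_${crypto.randomUUID()}`) {
  const text = String(value ?? "");
  // A value containing the delimiter could terminate the block early, so refuse it.
  if (text.includes(delimiter)) {
    throw new Error(`Output value for "${key}" contains the delimiter`);
  }
  return `${key}<<${delimiter}\n${text}\n${delimiter}\n`;
}

// An embedded "valid=true" line stays inside the description value.
console.log(formatOutput("description", "line one\nvalid=true", "EOF"));
```

Unlike the replace-with-space approach, this preserves newlines in the value, at the cost of having to pick a delimiter the value cannot contain.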

Comment on lines +144 to +149
 * @param {string} outputPath
 * @param {Record<string, string>} outputs
 */
function writeOutputs(outputPath, outputs) {
  const lines = Object.entries(outputs)
    .map(([k, v]) => `${k}=${v}`)

Copilot AI Mar 29, 2026

writeOutputs appends raw key=value lines to $GITHUB_OUTPUT. Because title/description/author are only HTML-stripped (newlines are preserved), an attacker can inject new outputs via newline characters and influence downstream workflow logic. Use GitHub’s multi-line output format or normalize values to a single line (e.g., replace CR/LF with spaces) before writing.

Suggested change
-  * @param {string} outputPath
-  * @param {Record<string, string>} outputs
-  */
- function writeOutputs(outputPath, outputs) {
-   const lines = Object.entries(outputs)
-     .map(([k, v]) => `${k}=${v}`)
+  * Normalizes values to a single line to prevent output injection.
+  * @param {string} outputPath
+  * @param {Record<string, string>} outputs
+  */
+ function writeOutputs(outputPath, outputs) {
+   const lines = Object.entries(outputs)
+     .map(([k, v]) => {
+       const safeValue = String(v).replace(/[\r\n]+/g, " ");
+       return `${k}=${safeValue}`;
+     })

Member Author

Fixed. Same newline sanitization applied here and also in merge-extracted-metadata.js.

Comment on lines +8 to +14
const PRIVATE_IPV4_RANGES = [
  /^127\./,
  /^10\./,
  /^172\.(1[6-9]|2\d|3[01])\./,
  /^192\.168\./,
  /^169\.254\./,
  /^0\./,

Copilot AI Mar 29, 2026

PRIVATE_IPV4_RANGES is missing several reserved/non-routable ranges commonly blocked in SSRF defenses (e.g., 100.64.0.0/10, 192.0.2.0/24, 198.18.0.0/15, multicast 224.0.0.0/4, etc.). Since this file advertises “private/reserved” filtering, consider expanding coverage to RFC 6890 reserved ranges (and add tests for the newly blocked ranges).

Suggested change
- const PRIVATE_IPV4_RANGES = [
-   /^127\./,
-   /^10\./,
-   /^172\.(1[6-9]|2\d|3[01])\./,
-   /^192\.168\./,
-   /^169\.254\./,
-   /^0\./,
+ const PRIVATE_IPV4_RANGES = [
+   // Loopback
+   /^127\./,
+   // RFC1918 private space
+   /^10\./,
+   /^172\.(1[6-9]|2\d|3[01])\./,
+   /^192\.168\./,
+   // Link-local
+   /^169\.254\./,
+   // "This" network
+   /^0\./,
+   // Carrier-Grade NAT (RFC 6598) 100.64.0.0/10
+   /^100\.(6[4-9]|7\d|8\d|9\d|1[01]\d|12[0-7])\./,
+   // IETF protocol assignments & special-use (covers 192.0.0.0/24 including 192.0.0.170/31, 192.0.0.8/29, etc.)
+   /^192\.0\.0\./,
+   // Deprecated 6to4 relay anycast 192.88.99.0/24
+   /^192\.88\.99\./,
+   // Documentation ranges (RFC 5737)
+   /^192\.0\.2\./,
+   /^198\.51\.100\./,
+   /^203\.0\.113\./,
+   // Benchmarking (RFC 2544) 198.18.0.0/15
+   /^198\.(1[89])\./,
+   // Multicast 224.0.0.0/4
+   /^(22[4-9]|23\d)\./,
+   // Reserved for future use 240.0.0.0/4 (includes 255.255.255.255 broadcast)
+   /^(24\d|25[0-5])\./,

Member Author

Fixed. Expanded `PRIVATE_IPV4_RANGES` to cover RFC 6890 reserved ranges: CGNAT (100.64/10), documentation (192.0.2, 198.51.100, 203.0.113), benchmarking (198.18/15), IETF assignments (192.0.0/24), 6to4 relay (192.88.99), multicast (224/4), and future-use (240/4).
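
For comparison, the same ranges can be expressed as CIDR subnets with Node's built-in `net.BlockList` instead of hand-written regexes. This is an alternative sketch, not the approach the PR takes; `isReservedIPv4` is a name made up for the example.

```javascript
const net = require("node:net");

// CIDR form of the ranges listed in the reply above.
const reserved = new net.BlockList();
for (const [prefix, bits] of [
  ["0.0.0.0", 8], ["10.0.0.0", 8], ["100.64.0.0", 10], ["127.0.0.0", 8],
  ["169.254.0.0", 16], ["172.16.0.0", 12], ["192.0.0.0", 24], ["192.0.2.0", 24],
  ["192.88.99.0", 24], ["192.168.0.0", 16], ["198.18.0.0", 15],
  ["198.51.100.0", 24], ["203.0.113.0", 24], ["224.0.0.0", 4], ["240.0.0.0", 4],
]) {
  reserved.addSubnet(prefix, bits, "ipv4");
}

// Hypothetical helper: true only for IPv4 literals inside a reserved subnet.
function isReservedIPv4(ip) {
  return net.isIPv4(ip) && reserved.check(ip, "ipv4");
}

console.log(isReservedIPv4("100.64.0.1"), isReservedIPv4("100.128.0.1"));
```

Prefix lengths such as /10 and /15 do not fall on octet boundaries, which is where regex alternations are easiest to get wrong; the subnet form sidesteps that.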

Comment on lines +95 to +129
const canonicalSource = canonicalizeUrl(sourceRepo);
const duplicate = templates.find(
  (t) => canonicalizeUrl(t.source) === canonicalSource
);
if (duplicate) {
  return {
    skipped: true,
    skipReason: `Template with source ${sourceRepo} already exists ("${duplicate.title}")`,
  };
}

let iac;
if (iacProvider === "Both") {
  iac = ["bicep", "terraform"];
} else if (iacProvider === "Terraform") {
  iac = ["terraform"];
} else {
  iac = ["bicep"];
}

const tags =
  authorType === "Microsoft" ? ["msft", "new"] : ["community", "new"];

const generateId = uuidGenerator || (() => crypto.randomUUID());

const entry = {
  title,
  description,
  preview: previewImage || "templates/images/default-template.png",
  authorUrl,
  author,
  source: sourceRepo,
  tags,
  IaC: iac,
  id: generateId(),

Copilot AI Mar 29, 2026

updateTemplatesJson detects duplicates using canonicalizeUrl(sourceRepo) but stores the original sourceRepo string in the new entry. This can leave .git, query/fragment, trailing slashes, or inconsistent casing in templates.json, even though canonicalization exists. Consider storing the canonicalized URL (and possibly trimming) in entry.source so the file stays normalized and future duplicate checks are simpler.

Member Author

Fixed. `entry.source` now stores the canonicalized URL (lowercased, no .git suffix, no trailing slash, no query/fragment) for consistent templates.json entries.
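
A minimal sketch of a canonicalization with the properties listed in this reply. The PR's actual `canonicalizeUrl` is not shown in the conversation, so this body is an assumption.

```javascript
// Sketch only: lowercase, drop query/fragment, trailing slashes, and a ".git" suffix.
function canonicalizeUrl(raw) {
  const url = new URL(raw.trim());
  const path = url.pathname.replace(/\/+$/, "").replace(/\.git$/i, "");
  // Rebuilding from protocol, host, and path discards the query string and fragment.
  return `${url.protocol}//${url.host}${path}`.toLowerCase();
}

console.log(canonicalizeUrl("https://GitHub.com/Azure/Repo.git/?tab=readme#top"));
```

Two submissions that differ only in casing, a `.git` suffix, or a trailing slash then compare equal during duplicate detection.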

Comment on lines +112 to +125
- name: Comment on validation failure
  if: steps.validate.outputs.valid == 'false' && github.event_name != 'workflow_dispatch'
  uses: actions/github-script@7888e9ef07d6bca382a90e753d77e014ff4fd756 # v8
  env:
    ERRORS: ${{ steps.validate.outputs.errors }}
  with:
    script: |
      await github.rest.issues.createComment({
        owner: context.repo.owner,
        repo: context.repo.repo,
        issue_number: context.payload.issue.number,
        body: `❌ **Template validation failed**\n\nPlease check your repository URL and try again.\n\n\`\`\`\n${process.env.ERRORS}\n\`\`\``
      });

Copilot AI Mar 29, 2026

PR description says the workflow uses “no inline JS”, but this workflow relies on actions/github-script with inline scripts for commenting (and thus executes inline JavaScript). Either update the PR description, or replace these steps with a non-JS alternative (e.g., gh api / curl to the REST API).

Member Author

Fixed. Replaced all three `actions/github-script` steps with `gh issue comment` CLI calls, so the workflow no longer contains any inline JS.

Comment on lines +121 to +137
function validateUrl(value, label) {
  if (!value) return;
  let parsed;
  try {
    parsed = new URL(value);
  } catch {
    throw new Error(`Invalid ${label} URL: "${value}"`);
  }
  if (parsed.protocol !== "https:") {
    throw new Error(
      `${label} URL must use HTTPS (got "${parsed.protocol}")`
    );
  }
  if (isPrivateHost(parsed.hostname)) {
    throw new Error(`${label} URL points to a private/reserved address`);
  }
}

Copilot AI Mar 29, 2026

validateUrl does not reject URLs containing userinfo (e.g., https://user:token@github.com/org/repo). If a submitter accidentally includes credentials, they could be persisted into templates.json and echoed in workflow logs. Consider explicitly rejecting parsed.username/parsed.password (and optionally non-default ports) for all user-supplied URLs.

Member Author

Fixed. `validateUrl` now rejects URLs containing userinfo (`parsed.username` or `parsed.password`), preventing accidental credential leakage into templates.json or workflow logs.
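
The userinfo check can be sketched in isolation as below. `assertNoUserinfo` is a hypothetical standalone helper; in the PR the check lives inside `validateUrl`. The error message deliberately omits the URL so the credentials are not echoed into logs.

```javascript
// Hypothetical helper: reject URLs such as https://user:token@github.com/org/repo.
function assertNoUserinfo(value, label) {
  const parsed = new URL(value);
  if (parsed.username || parsed.password) {
    // Do not include the raw value here; it may contain a secret.
    throw new Error(`${label} URL must not contain credentials`);
  }
  return parsed;
}

console.log(assertNoUserinfo("https://github.com/org/repo", "Source").hostname);
```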

@jongio jongio marked this pull request as draft March 29, 2026 14:26
jongio and others added 4 commits March 29, 2026 07:45
Simplify the submission form to only require the repo URL.
Metadata (title, description, author, languages, Azure services,
IaC provider) is auto-extracted from the GitHub API, azure.yaml,
and README.md. Form fields become optional overrides.

- New: extract-template-metadata.js (safe YAML, size limits, SSRF protection)
- New: merge-extracted-metadata.js (form values override extracted)
- Updated: issue form and workflow_dispatch — only source_repo required
- Updated: parse-template-issue.js — relaxed required fields
- 180 tests passing across 6 suites
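
The "form values override extracted" rule can be sketched as follows. The function name `mergeMetadata` and the treatment of blank strings as "not provided" are assumptions for this example, not the contents of merge-extracted-metadata.js.

```javascript
// Sketch: start from auto-extracted metadata, then apply non-empty form overrides.
function mergeMetadata(extracted, form) {
  const merged = { ...extracted };
  for (const [key, value] of Object.entries(form)) {
    // Assumption: an empty or whitespace-only form field means "no override".
    if (typeof value === "string" && value.trim() !== "") {
      merged[key] = value.trim();
    }
  }
  return merged;
}

console.log(mergeMetadata({ title: "Auto", author: "Org" }, { title: " Custom ", author: "" }));
```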

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Node 24's https.request passes {all:true} to the lookup function,
causing dns.lookup to return an array of {address, family} objects
instead of a plain string. Return the array format back to the
caller after validating each resolved address against the private
IP blocklist.
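
A sketch of a lookup wrapper that handles both callback shapes described in this commit. `makeSafeLookup`, its injected `isBlocked` predicate, and the injectable `lookupImpl` are illustrative; the PR's `safeLookup` may be structured differently.

```javascript
const dns = require("node:dns");

// Build a lookup function that rejects hostnames resolving to blocked addresses.
function makeSafeLookup(isBlocked, lookupImpl = dns.lookup) {
  return function safeLookup(hostname, options, callback) {
    if (typeof options === "function") {
      callback = options;
      options = {};
    } else if (typeof options === "number") {
      options = { family: options }; // legacy integer form
    }
    const opts = options || {};
    // Always resolve every address so each one can be validated.
    lookupImpl(hostname, { ...opts, all: true }, (err, addresses) => {
      if (err) return callback(err);
      if (addresses.length === 0 || addresses.some((a) => isBlocked(a.address))) {
        return callback(new Error(`Refusing to connect to ${hostname}`));
      }
      // Reply in the shape the caller asked for: array with all:true, else (address, family).
      if (opts.all) return callback(null, addresses);
      return callback(null, addresses[0].address, addresses[0].family);
    });
  };
}
```

The returned function could then be passed as the `lookup` option of `https.request`, so the address that was validated is the one the connection uses.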

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extract first non-badge image from README as template preview,
converting relative paths to raw.githubusercontent.com URLs.

Detect frameworks from dependency files (package.json, requirements.txt,
pyproject.toml, pom.xml, build.gradle, Gemfile) and GitHub topics.
Covers all 22 framework tags: FastAPI, Flask, Django, React, Vue,
Angular, Next.js, NestJS, Spring, Quarkus, LangChain, and more.

README fetch increased to 50KB, shared between title and image
extraction. All dependency fetches run in parallel, non-fatal on 404.

211 tests passing across 6 suites.
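
Framework detection from a dependency file might look roughly like this for package.json. The package-to-tag mapping and the function name are hypothetical; the gallery's real tag identifiers and the PR's detection rules are not shown in this conversation.

```javascript
// Hypothetical mapping from npm package name to a framework label.
const NPM_FRAMEWORKS = {
  react: "React",
  vue: "Vue",
  "@angular/core": "Angular",
  next: "Next.js",
  "@nestjs/core": "NestJS",
};

function detectNpmFrameworks(packageJsonText) {
  let pkg;
  try {
    pkg = JSON.parse(packageJsonText);
  } catch {
    return []; // non-fatal: unparseable files simply yield no tags
  }
  if (!pkg || typeof pkg !== "object") return [];
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  return Object.keys(NPM_FRAMEWORKS)
    .filter((name) => name in deps)
    .map((name) => NPM_FRAMEWORKS[name]);
}

console.log(detectNpmFrameworks('{"dependencies":{"react":"^18","next":"14"}}'));
```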

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… 6890 ranges

- Remove hostname regex check in isPrivateHost (false positives on e.g. 10.example.com)
- Expand PRIVATE_IPV4_RANGES to cover RFC 6890 (CGNAT, documentation, benchmarking, multicast)
- Reject URLs with userinfo (credentials) in validateUrl
- Fix extractField to capture multi-line values (description textarea)
- Sanitize newlines in writeOutputs across all 3 scripts (prevents output injection)
- Store canonicalized URL in templates.json entry.source
- Add 'opened' to workflow trigger types (defensive, alongside 'labeled')
- Replace actions/github-script with gh CLI (no more inline JS in workflow)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Automate template submissions for trusted publishers (mirror extension pipeline)

2 participants