🚨 CRITICAL: Fix NPM Publishing CI/CD - Permanent Solution for Automated Versioning #831

@nickna

🚨 CRITICAL: NPM Publishing is Completely Broken

The Current Situation

Our CI/CD pipeline for npm publishing has been failing repeatedly. The "NPM Publish" job fails every time code is pushed to master with:

npm error 403 Forbidden - You cannot publish over the previously published versions: 0.2.0

This failure keeps recurring despite multiple attempted fixes. We need a permanent, robust solution.

Root Cause Analysis

What's Actually Happening

  1. The CI workflow (ci.yml) attempts to publish packages on every push to master
  2. It tries to create timestamped versions like 0.2.0-next.1234567890
  3. BUT the workflow has a critical bug: it runs npm publish BEFORE the version in package.json has been updated
  4. The npm version command does update package.json, but only after the publish step has already run, so the publish goes out with the OLD version (0.2.0) and is rejected

Evidence

  • @knn_labs/conduit-common only has version 0.2.0 published (stuck)
  • @knn_labs/conduit-core-client has various versions but inconsistent patterns
  • The CI has been failing on master since August 28, 2025

Why Previous Fixes Failed

  1. Order of operations bug: Version update happens AFTER publish attempt
  2. No version checking: CI doesn't check if version already exists before publishing
  3. Inconsistent strategy: Different versioning approaches for different branches
  4. No state management: CI doesn't track what's already published
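
The missing "check before publish" step from item 2 could look roughly like the sketch below. It assumes the list of published versions has already been fetched as JSON text (for example with npm view <package> versions --json). The function name version_in_list is hypothetical and not part of the current CI.

```shell
# Hypothetical helper: succeed if a version appears in a JSON list of versions.
# $1 = JSON text, e.g. '["0.1.0","0.2.0"]'
# $2 = version to look for
version_in_list() {
  local versions_json="$1" version="$2"
  # Match the quoted string exactly, so 0.2.0 does not match 0.2.0-next.abc1234
  printf '%s\n' "$versions_json" | grep -qF -- "\"${version}\""
}
```

A publish step could then call version_in_list "$published" "$CURRENT_VERSION" and only generate a new version when it succeeds.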

📋 Permanent Solution Plan

Core Principles

  1. NEVER manually bump versions - Everything automated
  2. NEVER publish duplicate versions - Check before publish
  3. ALWAYS have unique versions - Use commit SHA for uniqueness
  4. SEPARATE concerns - Different strategies for dev/master/releases
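
One caveat with principle 3: under Semantic Versioning 2.0.0, a purely numeric prerelease identifier must not have a leading zero, so a short SHA that happens to be all digits and starts with 0 (for example 0123456) would not form a valid version. Below is a minimal sketch of a guard. The helper name safe_sha_id and the "g" prefix are assumptions chosen for illustration.

```shell
# Hypothetical helper: turn a commit SHA into a prerelease identifier that is valid semver.
# $1 = full or short commit SHA
safe_sha_id() {
  local short
  short=$(printf '%s' "$1" | cut -c1-7)
  case $short in
    *[!0-9]*) printf '%s\n' "$short" ;;   # contains a non-digit: alphanumeric, always valid
    0?*)      printf 'g%s\n' "$short" ;;  # all digits with a leading zero: prefix a letter
    *)        printf '%s\n' "$short" ;;   # all digits without a leading zero: valid as-is
  esac
}
```

With this in place, a version could be built as "${CURRENT_VERSION}-next.$(safe_sha_id "$GITHUB_SHA")".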

Implementation Strategy

1. Fix the Immediate Bug (Quick Win)

# In ci.yml, lines 182-186 need to be:
cd Common
CURRENT_VERSION=$(node -p "require('./package.json').version")
# TIMESTAMP must be set before use (Unix epoch seconds); drop this line if the step already defines it
TIMESTAMP=$(date +%s)
NEW_VERSION="${CURRENT_VERSION}-next.${GITHUB_SHA:0:7}.${TIMESTAMP}"
# Update package.json BEFORE publishing
npm version "$NEW_VERSION" --no-git-tag-version --allow-same-version
# CRITICAL: Re-read the version after update
PUBLISH_VERSION=$(node -p "require('./package.json').version")
echo "Publishing version: $PUBLISH_VERSION"
npm publish --tag next --access public

2. Implement Proper Version Strategy

For dev branch:

# Version format: X.Y.Z-dev.COMMIT_SHA.TIMESTAMP
# Tag: @dev
# Example: 0.2.1-dev.abc1234.1234567890

For master branch:

# Version format: X.Y.Z-next.COMMIT_SHA
# Tag: @next  
# Example: 0.2.1-next.def5678

For release tags (v*):

# Version format: X.Y.Z (clean semver)
# Tag: @latest
# Example: 0.2.1
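
The three formats above can be sketched as a single function that maps a Git ref to a version and a dist-tag. This is a sketch under stated assumptions: the name plan_version is hypothetical, and taking the release version from the tag name (rather than from package.json) is an assumption, since the issue does not say which source wins. Note also that the npm-publish job proposed in this issue only triggers on dev and master, so the release branch of this function would need its own trigger.

```shell
# Hypothetical helper: print "<version> <dist-tag>" for a given build.
# $1 = base version from package.json (X.Y.Z)
# $2 = Git ref (refs/heads/dev, refs/heads/master, or refs/tags/vX.Y.Z)
# $3 = commit SHA
# $4 = Unix timestamp (only used on dev)
plan_version() {
  local base="$1" ref="$2" ts="${4:-}" sha
  sha=$(printf '%s' "$3" | cut -c1-7)
  case $ref in
    refs/heads/dev)    printf '%s-dev.%s.%s dev\n' "$base" "$sha" "$ts" ;;
    refs/heads/master) printf '%s-next.%s next\n' "$base" "$sha" ;;
    refs/tags/v*)      printf '%s latest\n' "${ref#refs/tags/v}" ;;  # assumption: tag name is the version
    *)                 echo "Unexpected ref: $ref" >&2; return 1 ;;
  esac
}
```

For example, plan_version 0.2.1 refs/heads/master def5678abc "$(date +%s)" prints 0.2.1-next.def5678 next, matching the master example above.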

3. Add Pre-Publish Safety Checks

Create a new script scripts/safe-npm-publish.sh:

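#!/bin/bash
# Publish a package only under a version that is not yet on the registry.
# Usage: safe-npm-publish.sh <package-name> <package-path> <npm-tag>
set -euo pipefail

PACKAGE_NAME=$1
PACKAGE_PATH=$2
NPM_TAG=$3

cd "$PACKAGE_PATH"

# Succeeds if the given version of this package is already on the registry.
# Tests the output rather than the exit code, because npm view can exit 0
# with empty output when the package exists but the version does not.
version_exists() {
  [[ -n "$(npm view "${PACKAGE_NAME}@$1" version 2>/dev/null || true)" ]]
}

# Get current version from package.json
CURRENT_VERSION=$(node -p "require('./package.json').version")

# Check if this exact version already exists
if version_exists "$CURRENT_VERSION"; then
  echo "⚠️  Version ${CURRENT_VERSION} already exists for ${PACKAGE_NAME}"
  echo "📦 Generating unique version..."

  # Generate unique version based on branch
  if [[ "${GITHUB_REF:-}" == "refs/heads/dev" ]]; then
    NEW_VERSION="${CURRENT_VERSION}-dev.${GITHUB_SHA:0:7}.$(date +%s)"
  elif [[ "${GITHUB_REF:-}" == "refs/heads/master" ]]; then
    NEW_VERSION="${CURRENT_VERSION}-next.${GITHUB_SHA:0:7}"
  else
    echo "❌ Unexpected branch: ${GITHUB_REF:-<unset>}"
    exit 1
  fi

  # A re-run on the same master commit yields the same version; skip instead of failing
  if version_exists "$NEW_VERSION"; then
    echo "ℹ️  ${PACKAGE_NAME}@${NEW_VERSION} is already published, nothing to do"
    exit 0
  fi

  npm version "$NEW_VERSION" --no-git-tag-version --allow-same-version
  CURRENT_VERSION=$(node -p "require('./package.json').version")
fi

echo "✅ Publishing ${PACKAGE_NAME}@${CURRENT_VERSION} with tag ${NPM_TAG}"
npm publish --tag "$NPM_TAG" --access public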

4. Update CI Workflow

Replace the npm-publish job in ci.yml:

npm-publish:
  name: NPM Publish
  runs-on: ubuntu-latest
  needs: validate
  if: github.event_name == 'push' && (github.ref == 'refs/heads/master' || github.ref == 'refs/heads/dev')
  
  steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: ${{ env.NODE_VERSION }}
        registry-url: 'https://registry.npmjs.org'
    
    - name: Determine NPM Tag
      id: npm-tag
      run: |
        if [[ "${{ github.ref }}" == "refs/heads/dev" ]]; then
          echo "tag=dev" >> "$GITHUB_OUTPUT"
        elif [[ "${{ github.ref }}" == "refs/heads/master" ]]; then
          echo "tag=next" >> "$GITHUB_OUTPUT"
        fi
    
    - name: Build SDKs
      run: |
        cd SDKs/Node
        npm ci
        npm run build
    
    - name: Publish Common Package
      run: |
        ./scripts/safe-npm-publish.sh \
          "@knn_labs/conduit-common" \
          "SDKs/Node/Common" \
          "${{ steps.npm-tag.outputs.tag }}"
      env:
        NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
        GITHUB_SHA: ${{ github.sha }}
        GITHUB_REF: ${{ github.ref }}
    
    - name: Update Dependencies and Publish Admin
      run: |
        cd SDKs/Node/Admin
        # Update to use latest Common from npm
        COMMON_VERSION=$(npm view @knn_labs/conduit-common version --tag ${{ steps.npm-tag.outputs.tag }})
        npm install "@knn_labs/conduit-common@${COMMON_VERSION}" --save-exact
        cd ../../..
        
        ./scripts/safe-npm-publish.sh \
          "@knn_labs/conduit-admin-client" \
          "SDKs/Node/Admin" \
          "${{ steps.npm-tag.outputs.tag }}"
      env:
        NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
        GITHUB_SHA: ${{ github.sha }}
        GITHUB_REF: ${{ github.ref }}
    
    - name: Update Dependencies and Publish Core
      run: |
        cd SDKs/Node/Core
        # Update to use latest Common from npm
        COMMON_VERSION=$(npm view @knn_labs/conduit-common version --tag ${{ steps.npm-tag.outputs.tag }})
        npm install "@knn_labs/conduit-common@${COMMON_VERSION}" --save-exact
        cd ../../..
        
        ./scripts/safe-npm-publish.sh \
          "@knn_labs/conduit-core-client" \
          "SDKs/Node/Core" \
          "${{ steps.npm-tag.outputs.tag }}"
      env:
        NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
        GITHUB_SHA: ${{ github.sha }}
        GITHUB_REF: ${{ github.ref }}
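
The Admin and Core steps query the registry for the Common version immediately after Common is published. If the registry takes a moment to reflect the new dist-tag, a small retry wrapper may help. Whether such a delay actually occurs here is an assumption, and the name retry and its arguments are hypothetical.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# $1 = max attempts, $2 = delay in seconds, remaining args = command to run
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

The wrapped command should fail until the expected version is visible, for instance a small function that checks whether the npm view output is non-empty.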

5. Add Version Reset Script

Create scripts/reset-sdk-versions.sh for emergency fixes:

#!/bin/bash
# Emergency version reset if things get out of sync
# Usage: reset-sdk-versions.sh [version]   (defaults to 0.3.0; run from the repo root)
set -euo pipefail

VERSION="${1:-0.3.0}"

for pkg in Common Admin Core; do
  # Subshell keeps each cd local, so a failed cd cannot bump the wrong package
  (
    cd "SDKs/Node/${pkg}"
    npm version "$VERSION" --no-git-tag-version --allow-same-version
  )
done

echo "✅ Reset all SDK versions to ${VERSION}"
echo "📝 Commit these changes to fix version conflicts"
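
After a reset, it may be worth confirming that all three packages really ended up on the same version before committing. The sketch below reads the version with sed so it runs without Node. The helper names read_version and versions_in_sync are hypothetical, and the sed pattern assumes "version" sits on its own line, as npm normally writes it.

```shell
# Hypothetical helper: print the first "version" field found in a package.json.
read_version() {
  sed -n 's/^[[:space:]]*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed only if every given package directory has the same version.
versions_in_sync() {
  local first="" v dir
  for dir in "$@"; do
    v=$(read_version "$dir/package.json")
    [ -n "$v" ] || { echo "No version found in $dir" >&2; return 1; }
    if [ -z "$first" ]; then first="$v"; fi
    [ "$v" = "$first" ] || { echo "Mismatch: $dir is $v, expected $first" >&2; return 1; }
  done
}
```

From the repo root this could be run as versions_in_sync SDKs/Node/Common SDKs/Node/Admin SDKs/Node/Core.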

🎯 Success Criteria

After implementing this fix:

  1. ✅ Push to dev → Automatically publishes X.Y.Z-dev.SHA.TIMESTAMP with tag @dev
  2. ✅ Push to master → Automatically publishes X.Y.Z-next.SHA with tag @next
  3. ✅ Create release tag → Publishes clean X.Y.Z with tag @latest
  4. ✅ NO manual version bumping required
  5. ✅ NO duplicate version errors
  6. ✅ Every commit gets a unique version

📊 Testing Plan

  1. Test on dev branch first

    • Push a commit to dev
    • Verify it publishes with -dev.SHA.TIMESTAMP suffix
    • Verify tag is @dev
  2. Test on master branch

    • Merge to master
    • Verify it publishes with -next.SHA suffix
    • Verify tag is @next
  3. Test duplicate handling

    • Push another commit quickly
    • Verify it handles existing versions gracefully
  4. Test release workflow

    • Create a tag v0.3.0
    • Verify it publishes clean 0.3.0
    • Verify tag is @latest
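
The "verify suffix" checks in this plan could be automated with a small pattern match. This is a sketch only: the helper name matches_expected_format is hypothetical, and the patterns assume a 7-character lowercase hex SHA and a plain X.Y.Z core version.

```shell
# Hypothetical helper: succeed if a version has the suffix expected for a dist-tag.
# $1 = published version, $2 = dist-tag (dev, next, or latest)
matches_expected_format() {
  local core='[0-9]+\.[0-9]+\.[0-9]+' sha='[0-9a-f]{7}' pattern
  case "$2" in
    dev)    pattern="^${core}-dev\.${sha}\.[0-9]+$" ;;
    next)   pattern="^${core}-next\.${sha}$" ;;
    latest) pattern="^${core}$" ;;
    *)      return 2 ;;  # unknown dist-tag
  esac
  printf '%s\n' "$1" | grep -Eq "$pattern"
}
```

The version to check can be taken from the registry for the tag under test and passed in together with that tag.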

🚀 Implementation Steps

  1. Create scripts/safe-npm-publish.sh with safety checks (commit it with the executable bit set, since the workflow invokes it directly)
  2. Update ci.yml with new npm-publish job
  3. Create scripts/reset-sdk-versions.sh for emergency use
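  4. Test on a feature branch with workflow_dispatch (this needs a workflow_dispatch trigger and a relaxed job condition, since the npm-publish job as written only runs on pushes to master or dev)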
  5. Deploy to dev branch and verify
  6. Deploy to master branch and verify
  7. Document the new workflow in CLAUDE.md

🔥 Why This Will Work

  1. Idempotent: Can run multiple times safely
  2. Unique versions: Every commit gets unique version via SHA
  3. No conflicts: Checks before publishing
  4. Branch-aware: Different strategies per branch
  5. Fail-safe: Falls back to unique versions if conflicts detected
  6. Observable: Clear logging of what's being published

📝 Notes

  • This has been a recurring problem that needs a permanent fix
  • Previous attempts failed due to not checking for existing versions
  • The current CI has the npm version and npm publish commands in wrong order
  • We need this to "just work" without manual intervention

Priority: 🔴 CRITICAL - Blocking all npm SDK updates

Labels: bug, ci/cd, critical, npm
