
Conversation

@sundasnoreen12
Contributor

INF-1819

Description

Blocks web crawlers for the auth MFE.

Merge Checklist

  • If your update includes visual changes, have they been reviewed by a designer? Send them a link to the Sandbox, if applicable.
  • Is there adequate test coverage for your changes?

Post-merge Checklist

  • Deploy the changes to prod after verifying on stage or ask @openedx/2u-infinity to do it.
  • 🎉 🙌 Celebrate! Thanks for your contribution.


codecov bot commented Mar 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 58.53%. Comparing base (55a72b3) to head (1a8874e).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1208   +/-   ##
=======================================
  Coverage   58.53%   58.53%           
=======================================
  Files         117      117           
  Lines        2320     2320           
  Branches      641      644    +3     
=======================================
  Hits         1358     1358           
  Misses        901      901           
  Partials       61       61           

☔ View full report in Codecov by Sentry.


Copilot AI left a comment


PR Overview

This PR adds webpack configuration files for both the development and production builds to block web crawlers via a robots.txt file.

  • Introduces webpack.dev.config.js and webpack.prod.config.js files.
  • Uses CopyPlugin to copy the robots.txt file from the public directory to the distribution folder.
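
For reference, the copying step usually takes the following shape in an Open edX MFE (a minimal sketch, assuming the config extends @openedx/frontend-build's createConfig helper and uses the standard copy-webpack-plugin API; the actual code in this PR may differ):

// webpack.dev.config.js (sketch; webpack.prod.config.js is analogous,
// built from 'webpack-prod')
const { createConfig } = require('@openedx/frontend-build');
const CopyPlugin = require('copy-webpack-plugin');

const config = createConfig('webpack-dev');

// Copy public/robots.txt into the dist folder so it is served from the
// site root.
config.plugins.push(
  new CopyPlugin({
    patterns: [{ from: './public/robots.txt', to: '.' }],
  }),
);

module.exports = config;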

Reviewed Changes

File                     Description
webpack.dev.config.js    Adds development config, including file copying for robots.txt.
webpack.prod.config.js   Adds production config, including file copying for robots.txt.

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

webpack.dev.config.js:8

  • Consider adding tests to verify that the robots.txt file is correctly copied to the dist directory in the development configuration.
config.plugins.push(

webpack.prod.config.js:8

  • Consider adding tests to ensure that the robots.txt file is correctly copied to the dist directory in the production configuration.
config.plugins.push(
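
A test along the lines Copilot suggests could assert that each exported config registers the copy pattern. A minimal Jest sketch (a hypothetical test file, not part of this PR; it assumes copy-webpack-plugin keeps its configuration on the instance's patterns property):

// webpack.config.test.js (hypothetical)
const config = require('./webpack.dev.config');

describe('webpack.dev.config', () => {
  it('copies robots.txt into the build output', () => {
    // CopyPlugin is the class name used by copy-webpack-plugin.
    const copyPlugin = config.plugins.find(
      (p) => p.constructor.name === 'CopyPlugin',
    );
    expect(copyPlugin).toBeDefined();
    expect(JSON.stringify(copyPlugin.patterns)).toContain('robots.txt');
  });
});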

@sundasnoreen12
Contributor Author

Hi @arbrandes,

We have received a high-priority request to prevent subdomains on edge.edx.org from appearing in Google search results. After discussing possible solutions with the SRE team, I have implemented a fix that blocks search-engine indexing by adding a noindex meta tag and updating the robots.txt file.
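
For reference, a robots.txt that blocks all crawlers needs only two directives, and the meta tag takes the form <meta name="robots" content="noindex"> (a sketch of the usual shape; the exact contents in this PR may differ):

# public/robots.txt
User-agent: *
Disallow: /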

If you have any alternative solutions in mind, please share them with us. Additionally, please let us know how much time will be required for the product discussion so we can update the requesting team accordingly.

Please note that this change applies to the master branch, rather than the 2u-main branch.

@arbrandes
Contributor


Thanks for the heads-up!

Much like the recent discussion around a potential addition of a static health-check endpoint to MFEs, I think the right answer here is that controversial changes to an MFE's static files that can't be made in a configurable or pluggable manner should instead be handled by the deployment mechanism.

As a matter of fact, this came up in a Maintenance Working Group meeting last October, and the consensus of the group at the time was that robots.txt should not be merged into the MFE.

In any case, it seems your deployment team has a way to add a /version.json, as per @jsnwesson's message in that thread (direct link). Would the same mechanism not be able to add a robots.txt, or a custom index.html?
