fix: blocked web crawlers for account MFE #1208
base: master
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage diff against master:

| | master | #1208 | +/- |
|---|---|---|---|
| Coverage | 58.53% | 58.53% | |
| Files | 117 | 117 | |
| Lines | 2320 | 2320 | |
| Branches | 641 | 644 | +3 |
| Hits | 1358 | 1358 | |
| Misses | 901 | 901 | |
| Partials | 61 | 61 | |

View full report in Codecov by Sentry.
PR Overview
This PR adds webpack configuration files for both the development and production builds so that web crawlers are blocked via a robots.txt file.
- Introduces webpack.dev.config.js and webpack.prod.config.js files.
- Uses CopyPlugin to copy the robots.txt file from the public directory to the distribution folder.
Reviewed Changes
| File | Description |
|---|---|
| webpack.dev.config.js | Adds development config including file copying for robots.txt. |
| webpack.prod.config.js | Adds production config including file copying for robots.txt. |
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
webpack.dev.config.js:8 (`config.plugins.push(`)
- Consider adding tests to verify that the robots.txt file is correctly copied to the dist directory in the development configuration.

webpack.prod.config.js:8 (`config.plugins.push(`)
- Consider adding tests to ensure that the robots.txt file is correctly copied to the dist directory in the production configuration.
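For context, the change those review comments refer to would look roughly like the sketch below. Only the use of CopyPlugin and the public/robots.txt source path are confirmed by the PR description; the `createConfig` import from `@openedx/frontend-build` and the copy-webpack-plugin options shape are assumptions based on typical Open edX MFE setups, not the actual diff.

```js
// webpack.dev.config.js (sketch; webpack.prod.config.js would presumably differ only
// in using the 'webpack-prod' preset)
const path = require('path');
const CopyPlugin = require('copy-webpack-plugin');
// Assumption: the MFE builds on the shared Open edX webpack presets.
const { createConfig } = require('@openedx/frontend-build');

const config = createConfig('webpack-dev');

// Copy public/robots.txt into the build output so crawlers receive it at /robots.txt.
config.plugins.push(
  new CopyPlugin({
    patterns: [
      { from: path.resolve(__dirname, 'public/robots.txt'), to: 'robots.txt' },
    ],
  }),
);

module.exports = config;
```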
Hi @arbrandes,

We have received a high-priority request to prevent subdomains on edge.edx.org from appearing in Google search results. After discussing possible solutions with the SRE team, I have implemented a change that discourages search-engine indexing by adding a noindex meta tag and updating the robots.txt file. If you have any alternative solutions in mind, please share them with us. Additionally, please let us know how much time the product discussion will require so we can update the requesting team accordingly.

Please note that this change targets the master branch rather than the 2u-main branch. |
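For reference, the two changes mentioned above usually amount to a robots.txt like the sketch below, plus a `<meta name="robots" content="noindex" />` tag in the MFE's index.html template. The actual file contents from this PR are not shown in the thread, so treat this as an illustration rather than the diff itself:

```text
# public/robots.txt (illustrative content, not the PR's actual file)
User-agent: *
Disallow: /
```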
arbrandes left a comment
Thanks for the heads-up!
Much like the recent discussion around a potential addition of a static health-check endpoint to MFEs, I think the right answer here is that controversial changes to an MFE's static files that can't be made in a configurable or pluggable manner should instead be made by the deployment mechanism.
As a matter of fact, this came up in a Maintenance Working Group meeting last October, and the consensus of the group at that time was that robots.txt should not be merged into the MFE.
In any case, it seems your deployment team has a way to add a /version.json, as per @jsnwesson's message in that thread (direct link). Would the same mechanism not be able to add a robots.txt, or a custom index.html?
INF-1819
Description
Blocked web crawlers for the account MFE.
Merge Checklist
Post-merge Checklist