USE 272 - Add response headers to output records #54
Purpose and background context
Now that output records include the full, rendered HTML for downstream systems to parse and use, it would be helpful to also include the associated HTTP response headers, which provide additional information. These can sometimes indicate when the website was last updated, point to alternate links, and so on. They may well go unused, but it is better to err on the side of including them.
How this addresses that need:
Adds a response_headers column to output records.
The following is an example of response headers in a single output record:
How can a reviewer manually see the effects of these changes?
Optionally, the instructions and crawl from this previously merged PR can be used, though the output above is essentially what you'll see after following the instructions below.
1- Run the crawl as outlined in PR 53
2- Observe a single record:
Includes new or updated dependencies?
NO
Changes expectations for external applications?
YES: Transmogrifier will have access to response headers if needed.
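As a rough illustration of how a downstream consumer might use these headers, the sketch below pulls a last-updated timestamp out of a set of response headers. It assumes the headers have already been loaded into a plain mapping of header name to value; the function name `last_modified` and the sample values are hypothetical, and how the response_headers column is actually serialized in the output record is not shown here.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def last_modified(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Last-Modified header as a datetime, or None if absent or unparseable."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("last-modified")
    if raw is None:
        return None
    try:
        return parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        # Malformed date strings raise TypeError or ValueError depending on Python version.
        return None


# Hypothetical header values for illustration only (not taken from a real crawl).
example_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT",
}

updated_at = last_modified(example_headers)
```

Returning None for a missing or malformed header keeps the consumer tolerant of sites that omit Last-Modified, which fits the "may not get used" framing above.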
What are the relevant tickets?
Code review