Can Man-in-the-Middle somehow implement cache-control? #366

@andynuss

I am saving interesting GET responses to the local file system with something like:

import * as fs from 'fs';
import Hero from '@ulixee/hero';

export function saveResources(hero: Hero) {
  const tab = hero.activeTab;
  tab.on('resource', async rsrc => {
    const { request, response, text, type, url, buffer, isRedirect } = rsrc;
    if (isRedirect) return;
    // only these types
    if (!['Stylesheet', 'Font', 'Image', 'Ico'].includes(type)) return;

    const { headers: inHeaders } = request;
    const { headers: outHeaders } = response;

    // the resources dir is found this way relative to the 'src' folder
    const filePrefix = '../resources/' + urlToB64(url);

    let ext: string;
    if (type === 'Stylesheet') {
      ext = '.css';
    } else {
      ext = '.' + type.toLowerCase();
    }

    if (type === 'Stylesheet') {
      fs.writeFile(filePrefix + ext, await text, 'utf8', () => {});
    } else {
      const buf = Buffer.from(await buffer); // convert to Node Buffer
      fs.writeFile(filePrefix + ext, buf, () => {});
    }

    const metaObj = { type, url, inHeaders, outHeaders };
    let metaStr: string;
    try {
      metaStr = JSON.stringify(metaObj, null, 2);
    } catch (e) {
      console.error('Error stringifying meta object', e);
      return;
    }
    fs.writeFile(filePrefix + '.meta', metaStr, 'utf8', () => {});
  });
}

Is there a way to intercept future requests for the same urls, check the metadata file on disk
(if it exists), and fulfill the request if the cache-control headers allow me to?

For example, here is the metadata for one Image that I could serve from the file system:

{
  "type": "Image",
  "url": "https://ads.pubmatic.com/AdServer/js/user_sync.html?p=156578&predirect=&gdpr=0&gdpr_consent=&google_gid=CAESEFKbak4YZB61SK4DBXGjinM&google_cver=1",
  "inHeaders": {
    ...
  },
  "outHeaders": {
    "last-modified": "Wed, 13 Nov 2024 05:14:24 GMT",
    "server": "Apache",
    "accept-ranges": "bytes",
    "content-encoding": "gzip",
    "p3p": "CP=\"NOI DSP COR LAW CUR ADMo DEVo TAIo PSAo PSDo IVAo IVDo HISo OTPo OUR SAMo BUS UNI COM NAV INT DEM CNT STA PRE LOC\", CP=\"NOI DSP COR LAW CUR ADMo DEVo TAIo PSAo PSDo IVAo IVDo HISo OTPo OUR SAMo BUS UNI COM NAV INT DEM CNT STA PRE LOC\"",
    "content-length": "6694",
    "content-type": "text/html",
    "cache-control": "max-age=117766",
    "expires": "Fri, 23 May 2025 00:54:40 GMT",
    "date": "Wed, 21 May 2025 16:11:54 GMT",
    "vary": "Accept-Encoding"
  }
}
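
To make "if the cache-control headers allow me to" concrete, here is a minimal sketch of a freshness check over the saved outHeaders. The names StoredHeaders, parseMaxAge, expiresAt and isFresh are hypothetical, and the logic is deliberately simplified: it only looks at max-age, Date and Expires, treats no-store and no-cache as "do not reuse", and ignores Age, Vary and revalidation.

```typescript
// Hypothetical helpers; only the header names and values come from the metadata above.
type StoredHeaders = Record<string, string | undefined>;

/** Extract N from a `max-age=N` directive, or null if the directive is absent. */
function parseMaxAge(cacheControl: string | undefined): number | null {
  if (!cacheControl) return null;
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

/** Expiry time in epoch milliseconds, or null if the response should not be reused. */
function expiresAt(headers: StoredHeaders): number | null {
  const cacheControl = headers['cache-control'];
  // Conservative simplification: no-cache really means "revalidate first",
  // but this sketch has no revalidation step, so it is treated as a miss.
  if (cacheControl && /\b(no-store|no-cache)\b/i.test(cacheControl)) return null;

  // max-age takes precedence over Expires and is counted from the Date header.
  const maxAge = parseMaxAge(cacheControl);
  const date = Date.parse(headers['date'] ?? '');
  if (maxAge !== null && !Number.isNaN(date)) return date + maxAge * 1000;

  const expires = Date.parse(headers['expires'] ?? '');
  return Number.isNaN(expires) ? null : expires;
}

/** True if the stored response is still fresh at time `now` (epoch milliseconds). */
function isFresh(headers: StoredHeaders, now: number = Date.now()): boolean {
  const expiry = expiresAt(headers);
  return expiry !== null && now < expiry;
}
```

For the metadata above, max-age=117766 counted from the date header (Wed, 21 May 2025 16:11:54 GMT) gives Fri, 23 May 2025 00:54:40 GMT, which is the same instant as the expires header, so either header leads to the same answer for this resource.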

Ideally, I would write a function like this:

export function serveFromCache(hero: Hero) {
  // some kind of use of mitm Agent(?) to intercept all requests
  // and check the folder built in the above function to see if
  // we already have the resource, and it is not expired, and then
  // serve it from there to Chrome, without ever going out to the internet
}
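
The disk-lookup half of that function does not depend on Hero at all, so it can be sketched separately. This is only a sketch under assumptions: urlToKey is a stand-in for my urlToB64 helper, lookupCached and the CachedMeta/CacheHit shapes are hypothetical names, and the freshness decision is passed in as a callback. How to hand the result back to Chrome from a MITM hook is exactly the part I don't know and is not shown here.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Shape of the .meta files written by saveResources.
interface CachedMeta {
  type: string;
  url: string;
  inHeaders: Record<string, string>;
  outHeaders: Record<string, string>;
}

interface CacheHit {
  meta: CachedMeta;
  body: Buffer;
}

// Assumption: stand-in for urlToB64; any reversible, file-system-safe encoding would do.
function urlToKey(url: string): string {
  return Buffer.from(url, 'utf8').toString('base64url');
}

// Same extension rule as saveResources: '.css' for stylesheets, else the lowercased type.
function extensionFor(type: string): string {
  return type === 'Stylesheet' ? '.css' : '.' + type.toLowerCase();
}

/** Return the cached metadata and body for `url`, or null on a miss or stale entry. */
function lookupCached(
  cacheDir: string,
  url: string,
  isFresh: (outHeaders: Record<string, string>) => boolean,
): CacheHit | null {
  const prefix = path.join(cacheDir, urlToKey(url));
  let meta: CachedMeta;
  try {
    meta = JSON.parse(fs.readFileSync(prefix + '.meta', 'utf8')) as CachedMeta;
  } catch {
    return null; // no metadata file, or unreadable JSON: treat as a miss
  }
  if (meta.url !== url || !isFresh(meta.outHeaders)) return null;
  try {
    return { meta, body: fs.readFileSync(prefix + extensionFor(meta.type)) };
  } catch {
    return null; // metadata exists but the body file is missing
  }
}
```

The synchronous reads keep the sketch short; inside an async hook, fs.promises.readFile would avoid blocking the event loop.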

I saw this in the tests:

agent.hook({
  async beforeHttpRequest(request: IHttpResourceLoadDetails): Promise<any> {
    if (request.url.pathname === '/intercept-post') {
      // NOTE: need to delete the content length (or set to correct value)
      delete request.requestHeaders['Content-Length'];
    }
  },
  async beforeHttpRequestBody(request: IHttpResourceLoadDetails): Promise<any> {
    if (request.url.pathname === '/intercept-post') {
      // drain first
      for await (const _ of request.requestPostDataStream) {
      }
      // send body. NOTE: we had to change out the content length before the body step
      request.requestPostDataStream = Readable.from(Buffer.from('Intercept request'));
    }
  },
  async beforeHttpResponse(response: IHttpResourceLoadDetails): Promise<any> {
    if (response.url.pathname === '/intercept-post') {
      response.responseHeaders['Content-Length'] = 'Intercepted'.length.toString();
    }
  },
  async beforeHttpResponseBody(response: IHttpResourceLoadDetails): Promise<any> {
    if (response.url.pathname === '/intercept-post') {
      for await (const _ of response.responseBodyStream) {
      }
      response.responseBodyStream = Readable.from(Buffer.from('Intercepted'));
    }
  },
});

But it is unclear to me whether beforeHttpRequest allows me to do what I want,
or how to even make this interoperate with my Hero objects.
