This test is super flaky. It's unclear whether it is flaking because of an actual defect or because it's a bad test.
Investigation should determine whether we should keep the test and fix the underlying issue it is observing, or remove it as a bad test.