A proposal to make the correlated t-test less conservative by NeeleKemper · Pull Request #1 · dpaetzel/cmpbayes

NeeleKemper · 2024-06-27T13:12:55Z

In the correlated t-test, the square root of the variance should be used as the scale parameter, because scipy.stats.t expects a standard deviation, not a variance.

See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html
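A minimal sketch of why the scale matters, assuming made-up values for the degrees of freedom, mean and corrected variance (none of these numbers come from the PR). `scipy.stats.t` uses `scale` as a standard-deviation-like quantity, so for `df > 2` the variance of the resulting distribution is `scale**2 * df / (df - 2)`. Passing the variance directly therefore squares it a second time.

```python
import numpy as np
from scipy import stats

# Hypothetical values, chosen only for illustration.
df = 9
mean = 0.02
corrected_var = 0.04

# Variance passed directly as the scale (the behaviour the PR changes).
wrong = stats.t(df, loc=mean, scale=corrected_var)
# Square root of the variance passed as the scale (the proposed fix).
right = stats.t(df, loc=mean, scale=np.sqrt(corrected_var))

# For df > 2, Var[X] = scale**2 * df / (df - 2).
expected_var = corrected_var * df / (df - 2)
print("scale=sqrt(var):", right.var(), "expected:", expected_var)
print("scale=var:      ", wrong.var())
```

With the square root, the distribution's variance is the corrected variance times the usual `df / (df - 2)` factor; without it, the variance scales with `corrected_var**2`.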

dpaetzel · 2024-07-03T08:20:44Z

src/cmpbayes/__init__.py

        rho = self.fraction_test

+        sigma_2_hat = ((x - x_over) ** 2).sum() / (n - 1)
+        corrected_sigma_2_hat = (1 / n + rho / (1 - rho)) * sigma_2_hat  # Corrected variance using Nadeau-Bengio's correction


Please put comments on the line above the code they refer to (to keep lines under 80 characters). Also, please end comments with a full stop.
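One way the requested layout could look, as a sketch only: the wrapper function `corrected_variance` is hypothetical, and it assumes that `x_over` is the sample mean of `x` and `n` is the number of samples, which the diff excerpt does not show.

```python
import numpy as np


def corrected_variance(x, rho):
    """Hypothetical helper wrapping the two lines under review."""
    x = np.asarray(x, dtype=float)
    # Assumption: n is the sample count and x_over the sample mean.
    n = len(x)
    x_over = x.mean()

    sigma_2_hat = ((x - x_over) ** 2).sum() / (n - 1)
    # Corrected variance using Nadeau-Bengio's correction.
    corrected_sigma_2_hat = (1 / n + rho / (1 - rho)) * sigma_2_hat
    return corrected_sigma_2_hat
```

The comment now sits above the statement it describes, ends with a full stop, and every line stays under 80 characters.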

heidmic · 2024-10-07T09:10:02Z

Note that this change makes the tests more conservative for variances < 1 (which would be quite common for losses or other error metrics, especially on standardized datasets), because the square root of a value below 1 is larger than the value itself. The tests only become less conservative for higher values. However, from my testing of this PR, the results feel more 'correct' for the smaller values, in the sense that they align better with my visual interpretation of plots of the individual runs.
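The direction of the effect can be checked with a small sketch. The helper `prob_positive` and all numbers below are hypothetical and only illustrate the argument; they are not taken from cmpbayes. A larger scale pulls the probability that the difference is positive towards 0.5, which is what "more conservative" means here.

```python
import numpy as np
from scipy import stats


def prob_positive(mean, var, df, use_sqrt):
    """P(difference > 0) under a Student's t with the chosen scale."""
    scale = np.sqrt(var) if use_sqrt else var
    return stats.t.sf(0.0, df, loc=mean, scale=scale)


# Hypothetical degrees of freedom and mean difference.
df, mean = 9, 0.05
for var in (0.01, 4.0):
    old = prob_positive(mean, var, df, use_sqrt=False)
    new = prob_positive(mean, var, df, use_sqrt=True)
    print(f"var={var}: scale=var -> {old:.3f}, scale=sqrt(var) -> {new:.3f}")
```

For the variance below 1, the square-root scale gives a probability closer to 0.5 than the old scale does; for the variance above 1, the opposite holds.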

fix scale parameter in student t test

b496ad6

dpaetzel requested changes Jul 3, 2024

NeeleKemper wants to merge 1 commit into dpaetzel:main from NeeleKemper:fix-corr-test
