When doing a linear fit, SciDavis seems to sort the (x, y) points (why?) but not the uncertainties, and thus gives wrong results #22

@puchs

I know you want users to file bug reports at https://sourceforge.net/p/scidavis/scidavis-bugs/, but I am not going to create an account on this crappy platform just to file a bug. So sorry, but I think reporting it here is better than not reporting it at all.

The bug itself:

Take the following data set (columns X, Y, and Yerr):
0.002083333333333333 0.03333333333333333 0.003333333333333334
0.00101010101010101 0.02105263157894737 0.001329639889196676
0.0006802721088435374 0.01754385964912281 0.0009233610341643582
0.0005076142131979696 0.01538461538461539 0.0007100591715976331
0.0002645502645502646 0.01282051282051282 0.0004930966469428008
0.000209643605870021 0.01219512195121951 0.000446162998215348
9.900990099009902e-05 0.01111111111111111 0.0003703703703703704
7.204610951008646e-05 0.0108695652173913 0.0003544423440453686
4.566210045662101e-05 0.01052631578947368 0.000332409972299169
1.464128843338214e-05 0.01025641025641026 0.0003155818540433925

Make a scatter plot, then run Analysis->Quick Fit->Linear Fit. The result is:

Linear Regression fit of dataset: Table3_2, using function: A*x+B
Y standard errors: Associated dataset (Table3_3)
From x = 1,46412884333821e-05 to x = 0,00208333333333333
B (y-intercept) = 0,00980827992195652 +/- 0,000236890270620361
A (slope) = 11,2640343431007 +/- 0,212197222336419

Chi^2 = 0,565129788016921
R^2 = 0,999799482369171

Now take the following set (columns X, Y, and Yerr):
1.464128843338214e-05 0.01025641025641026 0.003333333333333334
4.566210045662101e-05 0.01052631578947368 0.001329639889196676
7.204610951008646e-05 0.0108695652173913 0.0009233610341643582
9.900990099009902e-05 0.01111111111111111 0.0007100591715976331
0.000209643605870021 0.01219512195121951 0.0004930966469428008
0.0002645502645502646 0.01282051282051282 0.000446162998215348
0.0005076142131979696 0.01538461538461539 0.0003703703703703704
0.0006802721088435374 0.01754385964912281 0.0003544423440453686
0.00101010101010101 0.02105263157894737 0.000332409972299169
0.002083333333333333 0.03333333333333333 0.0003155818540433925

It is the same set, but with the X and Y columns sorted together by X. The Yerr column is left in its original order, so each uncertainty is now paired with a different point.

This gives:
Linear Regression fit of dataset: Table3_2, using function: A*x+B
Y standard errors: Associated dataset (Table3_3)
From x = 1,46412884333821e-05 to x = 0,00208333333333333
B (y-intercept) = 0,00980827992195652 +/- 0,000236890270620361
A (slope) = 11,2640343431007 +/- 0,212197222336419

Chi^2 = 0,565129788016854
R^2 = 0,999799482369171

So this gives exactly the same A and B as the first data set, although the Y errors are now attached to the points in reverse order.

It thus seems that, when fitting, SciDavis sorts the (x, y) points by x but leaves Yerr in its original order, so each uncertainty ends up paired with the wrong point. The result is therefore incorrect whenever the data points are not already sorted by x. The correct result for the first dataset should be:

Linear Regression fit of dataset: Table3_2, using function: A*x+B
Y standard errors: Associated dataset (Table3_3)
From x = 1,46412884333821e-05 to x = 0,00208333333333333
B (y-intercept) = 0,0100413916386931 +/- 0,00017921733049076
A (slope) = 10,8089294839438 +/- 0,755079620620623

Chi^2 = 0,268194440672359
R^2 = 0,998692920890892
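
For anyone who wants to check this outside SciDavis, here is a minimal sketch in plain Python. It assumes the fit is an ordinary weighted least-squares line with weights 1/Yerr^2 (I have not looked at the SciDavis source, so that weighting and the names `weighted_linear_fit` and `columns` are my own assumptions, not SciDavis code). It fits the first data set three ways: as entered, with whole rows sorted by x, and with only x and y sorted while Yerr stays in place, which is the suspected bug.

```python
import math

# (x, y, yerr) rows of the first data set, in the original unsorted order.
ROWS = [
    (0.002083333333333333, 0.03333333333333333, 0.003333333333333334),
    (0.00101010101010101, 0.02105263157894737, 0.001329639889196676),
    (0.0006802721088435374, 0.01754385964912281, 0.0009233610341643582),
    (0.0005076142131979696, 0.01538461538461539, 0.0007100591715976331),
    (0.0002645502645502646, 0.01282051282051282, 0.0004930966469428008),
    (0.000209643605870021, 0.01219512195121951, 0.000446162998215348),
    (9.900990099009902e-05, 0.01111111111111111, 0.0003703703703703704),
    (7.204610951008646e-05, 0.0108695652173913, 0.0003544423440453686),
    (4.566210045662101e-05, 0.01052631578947368, 0.000332409972299169),
    (1.464128843338214e-05, 0.01025641025641026, 0.0003155818540433925),
]


def weighted_linear_fit(x, y, yerr):
    """Closed-form weighted least squares for y = A*x + B.

    Assumes weights w = 1/yerr**2. Returns (A, B).
    """
    w = [1.0 / e**2 for e in yerr]
    s = math.fsum(w)
    sx = math.fsum(wi * xi for wi, xi in zip(w, x))
    sy = math.fsum(wi * yi for wi, yi in zip(w, y))
    sxx = math.fsum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = math.fsum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s * sxx - sx * sx
    slope = (s * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    return slope, intercept


def columns(rows):
    """Split a list of (x, y, yerr) rows into three column lists."""
    xs, ys, es = zip(*rows)
    return list(xs), list(ys), list(es)


x, y, yerr = columns(ROWS)
rows_by_x = sorted(ROWS, key=lambda row: row[0])

# 1. Data as entered: every yerr belongs to its own (x, y) point.
fit_unsorted = weighted_linear_fit(x, y, yerr)

# 2. Whole rows sorted by x: the pairing is preserved, so the fit must not change.
fit_rows_sorted = weighted_linear_fit(*columns(rows_by_x))

# 3. Suspected bug: x and y sorted by x, yerr left in its original order.
x_sorted, y_sorted, _ = columns(rows_by_x)
fit_mispaired = weighted_linear_fit(x_sorted, y_sorted, yerr)

print("as entered      A, B =", fit_unsorted)
print("rows sorted     A, B =", fit_rows_sorted)
print("yerr mis-paired A, B =", fit_mispaired)
```

Cases 1 and 2 should agree, since a least-squares fit does not depend on row order as long as each uncertainty stays with its point. If the hypothesis is right, case 3 is the one that matches what SciDavis prints for the unsorted table. Sorting is harmless in itself; the point is that it has to be applied to all three columns together.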

Desktop:
I am using SciDavis 2.4.0 on Windows 10.