
Conversation

@plazas (Contributor) commented Aug 12, 2025

No description provided.


@plazas requested a review from @bnord on August 12, 2025 at 18:11
@bnord (Contributor) commented Aug 21, 2025

Hi @plazas, thanks for your patience with my review.

This is an informative, instructive, clear, and well-written notebook.

The comments below are from reading the notebook; I haven't run it yet, but I will do that soon.

  • General:
    • Some section titles end with a "." and some don't.
  • Section 1: Introduction:
    • What do you think of adding a few comments about anomaly detection algorithms in general, and about others available in scikit-learn or elsewhere? Does the notebook guidance permit that, or is it out of scope? For example, how is an isolation forest different from PCA? Is Isolation Forest supervised or unsupervised?
  • Section 2.2: Run the isolation forest algorithm.
    • Could you say a little more about the anomaly score? Are there positive values, and what would they mean? What is the possible range in the negative direction: is it bounded, or the entire real line? Does the score have units? If so, should they appear in the x-axis label?
    • Are all of the objects plotted anomalous, or only those with the most negative scores?
    • Figure 1: Consider commenting on the log scale of the y-axis for the user. Is this a typical/common shape for the anomaly-score histogram? Is there a general rule or consideration for choosing the anomaly-score threshold beyond which objects merit further examination? The anomaly score seems analogous to the cross-entropy value in classification models, so the anomaly-score plot is perhaps similar to an ROC curve, where the chosen threshold is modulated by the user's tolerance for false positives.
    • It may be helpful to make a note for users that compares/distinguishes anomaly scores and classification scores; this may help them to have context for their interpretation. However, this may not be within scope.
  • Section 2.3: Visualize the Identified Anomalies
    • Figure 2:
      • Does notebook formatting permit/require units on the figure axes? Does r_psfFluxMean have units?
      • Are all the objects in the figure also in the histogram above?
    • Figure 3:
      • Could you comment on the interpretation of the curves?
      • Consider overlaying a theoretical light curve of some kind of non-anomalous object.
      • Consider plotting a light curve from one of the non-anomalous DIAObjects.
    • Are there other visualizations of the magnitudes that demonstrate anomalousness?
    • Are there other parameters that would demonstrate anomalousness?
    • Are there other parameters that may show what objects these items are similar to?
  • Section 4: "Exercisesfor the learner." --> "Exercises for the learner" (add space)
    • "Example sof" --> "Examples of"
    • Aside from additional major calculations or computations, is there anything that a user can do with the light curves or the anomaly scores?
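
Since several of the questions above concern the anomaly score's sign, range, and units, here is a minimal sketch of scikit-learn's conventions (illustrative toy data, not the notebook's code) that may be useful context:

```python
# Hedged sketch: toy data only, not the notebook's pipeline. It illustrates
# scikit-learn's IsolationForest score conventions referenced in the review.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 300 "normal" points in a tight cluster, plus 5 obvious outliers
X_inlier = rng.normal(0.0, 1.0, size=(300, 2))
X_outlier = rng.uniform(8.0, 10.0, size=(5, 2))
X = np.vstack([X_inlier, X_outlier])

clf = IsolationForest(contamination=0.02, random_state=0).fit(X)

# score_samples returns the *negated* anomaly score of the Liu et al. paper,
# so values lie strictly in (-1, 0); more negative = more anomalous. The
# score is unitless, so no units are needed on a histogram's x-axis.
scores = clf.score_samples(X)

# decision_function subtracts the offset implied by `contamination`, placing
# the anomaly threshold at 0: negative values are flagged as anomalies.
flags = clf.predict(X)  # -1 = anomaly, +1 = inlier

print("mean inlier score:", scores[:300].mean())
print("mean outlier score:", scores[300:].mean())
```

So only the objects below the contamination-implied threshold are labeled anomalous, even though every object receives a (negative) score; that distinction may be worth stating explicitly near Figure 1.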

