feat: implement ChemTEB #3

Open
kjappelbaum wants to merge 6 commits into main from embedding_metrics

@kjappelbaum kjappelbaum commented Jul 26, 2025

Summary by Sourcery

Add a Jupyter notebook implementing the ChemTEB baseline for chemical QA retrieval using OpenAI embeddings and evaluate performance with NDCG@10.

New Features:

  • Introduce a notebook for the ChemTEB pipeline that loads the ChemHotpot QA dataset and generates text embeddings via the OpenAI API
  • Implement a retrieval workflow using cosine similarity between query and corpus embeddings
  • Define DCG@k and NDCG@k functions and compute mean NDCG@10 for performance evaluation
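
For reference, DCG@k and NDCG@k with binary relevance can be sketched as below. This is a minimal illustration, not the notebook's code: the function names follow the reviewer's guide (`compute_dcg_at_k`, `compute_ndcg_at_k`), but the bodies, the log2 discount, and the optional `all_relevances` argument are assumptions.

```python
import numpy as np


def compute_dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked relevance scores."""
    rel = np.asarray(relevances, dtype=float)[:k]
    if rel.size == 0:
        return 0.0
    # The item at 1-based rank i is discounted by log2(i + 1).
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))


def compute_ndcg_at_k(relevances, k, all_relevances=None):
    """DCG@k divided by the DCG@k of the ideal ordering.

    `all_relevances` (hypothetical parameter) holds the relevance of every
    judged document for the query. If omitted, the ideal ordering is built
    from the retrieved list only, which overestimates the score whenever a
    relevant document was not retrieved.
    """
    pool = relevances if all_relevances is None else all_relevances
    ideal = compute_dcg_at_k(sorted(pool, reverse=True), k)
    if ideal == 0:
        return 0.0
    return compute_dcg_at_k(relevances, k) / ideal
```

A ranking that places the only relevant document first scores 1.0; placing it second scores 1 / log2(3), roughly 0.63.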

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@sourcery-ai

sourcery-ai bot commented Jul 26, 2025

Reviewer's Guide

Introduces a new Jupyter notebook implementing the ChemTEB retrieval and evaluation pipeline: it loads the ChemHotpotQARetrieval dataset, generates and persists embeddings via the OpenAI API, defines DCG/NDCG metrics, computes cosine-similarity–based retrieval, and reports mean NDCG@10.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Notebook scaffolding and dependency setup | Add pip install step for scikit-learn; import datasets, OpenAI client, dotenv, numpy, pandas, sklearn metrics | chemretrieval-bench/test.ipynb |
| Load ChemHotpotQARetrieval dataset | Load 'default', 'corpus', and 'queries' splits via load_dataset; cache dataset locally for reuse | chemretrieval-bench/test.ipynb |
| Embed texts using OpenAI embeddings API | Define get_embedding wrapper for text-embedding-3-small; generate and collect embeddings for corpus and queries | chemretrieval-bench/test.ipynb |
| Persist and reload embeddings | Implement save_embeddings with numpy.save; save corpus_embeddings.npy and queries_embeddings.npy; load embeddings via numpy.load for evaluation | chemretrieval-bench/test.ipynb |
| Implement DCG and NDCG evaluation metrics | Add compute_dcg_at_k computing discounted cumulative gain; add compute_ndcg_at_k normalizing against ideal DCG | chemretrieval-bench/test.ipynb |
| Compute retrieval performance and aggregate results | Convert embeddings to numpy arrays for vectorized operations; compute cosine similarities per query–corpus pair; select top-k results, assign binary relevances, compute NDCG@10; collect results in pandas DataFrame and compute mean score | chemretrieval-bench/test.ipynb |
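
The retrieval step in the last row can be illustrated with a short sketch. It assumes embeddings are non-zero row vectors in NumPy arrays; `top_k_by_cosine` and `binary_relevances` are hypothetical helper names and not taken from the notebook.

```python
import numpy as np


def top_k_by_cosine(query_emb, corpus_emb, k=10):
    """Return (indices, scores) of the k most similar corpus rows per query."""
    q = np.asarray(query_emb, dtype=float)
    c = np.asarray(corpus_emb, dtype=float)
    # L2-normalize rows so that a dot product equals cosine similarity.
    # Assumes no all-zero embedding, which would divide by zero.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    sims = q @ c.T  # shape: (n_queries, n_corpus)
    k = min(k, c.shape[0])
    top = np.argsort(-sims, axis=1)[:, :k]
    return top, np.take_along_axis(sims, top, axis=1)


def binary_relevances(top_indices, corpus_ids, relevant_ids):
    """Map one query's ranked corpus indices to 1 (relevant) or 0 (not)."""
    return [1 if corpus_ids[i] in relevant_ids else 0 for i in top_indices]
```

The relevance list for each query can then be passed to an NDCG@10 function, and the per-query scores averaged to obtain the mean.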



@sourcery-ai sourcery-ai bot left a comment


Hey @kjappelbaum - I've reviewed your changes - here's some feedback:

  • Strip the notebook outputs and migrate the core retrieval logic into reusable Python modules or scripts for better maintainability and reviewability.
  • Use batched embedding requests and vectorized numpy or sklearn cosine-similarity computations instead of per-item loops to improve performance and avoid API rate limits.
  • Fix the inconsistent filename when loading saved embeddings (e.g. ‘cqueries_embeddings.npy’) and consider parameterizing file paths rather than hard-coding them.
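
One way the second and third suggestions could look is sketched below, under stated assumptions. `embed_in_batches`, `embed_batch`, and `embedding_path` are hypothetical names; `save_embeddings` is named in the reviewer's guide, but its signature here is a guess. In the notebook, `embed_batch` would presumably wrap a single OpenAI embeddings request, which accepts a list of input strings.

```python
from pathlib import Path

import numpy as np


def embed_in_batches(texts, embed_batch, batch_size=64):
    """Embed texts with one embed_batch call per chunk of batch_size items.

    embed_batch is any callable mapping a list of strings to a list of
    equal-length vectors (assumption: it wraps the embeddings API call).
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(texts[start:start + batch_size]))
    return np.asarray(vectors, dtype=float)


def embedding_path(name, directory="."):
    """Build the file path in one place so save and load cannot disagree."""
    return Path(directory) / f"{name}_embeddings.npy"


def save_embeddings(name, embeddings, directory="."):
    """Write an embedding matrix to <directory>/<name>_embeddings.npy."""
    path = embedding_path(name, directory)
    np.save(path, embeddings)
    return path


def load_embeddings(name, directory="."):
    """Read back a matrix written by save_embeddings."""
    return np.load(embedding_path(name, directory))
```

Deriving both the save and the load path from the same helper avoids the kind of filename mismatch flagged above.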

