2 changes: 2 additions & 0 deletions .gitignore
@@ -207,3 +207,5 @@ marimo/_lsp/
__marimo__/

local

.DS_Store
139 changes: 137 additions & 2 deletions README.md
@@ -9,6 +9,7 @@
<a href="https://pypi.org/project/pyversity/"><img src="https://img.shields.io/pypi/v/pyversity?color=%23007ec6&label=pypi%20package" alt="Package version"></a>
<a href="https://app.codecov.io/gh/Pringled/pyversity">
<img src="https://codecov.io/gh/Pringled/pyversity/graph/badge.svg?token=2CV5W0ZT7T" alt="Codecov">
</a>
<a href="https://github.com/Pringled/pyversity/blob/main/LICENSE">
<img src="https://img.shields.io/badge/license-MIT-green" alt="License - MIT">
</a>
@@ -17,7 +18,9 @@

[Quickstart](#quickstart) •
[Supported Strategies](#supported-strategies) •
[Motivation](#motivation)
[Motivation](#motivation) •
[Examples](#examples) •
[References](#references)

</div>

@@ -71,6 +74,7 @@ The following table describes the supported strategies, how they work, their tim
| **MSD** (Max Sum of Distances) | Prefers items that are both relevant and far from *all* previous selections. | **O(k · n · d)** | Use when you want stronger spread, i.e. results that cover a wider range of topics or styles. |
| **DPP** (Determinantal Point Process) | Samples diverse yet relevant items using probabilistic “repulsion.” | **O(k · n · d + n · k²)** | Ideal when you want to eliminate redundancy or ensure diversity is built into the selection. |
| **COVER** (Facility-Location) | Ensures selected items collectively represent the full dataset’s structure. | **O(k · n²)** | Great for topic coverage or clustering scenarios, but slower for large `n`. |
| **SSD** (Sliding Spectrum Decomposition) | Sequence‑aware diversification: rewards novelty relative to recently shown items. | **O(k · n · d)** | Great for content feeds and infinite scroll (social, news, or product feeds where users consume items sequentially), as well as for conversational RAG to avoid repeating similar chunks within the recent window. |


## Motivation
@@ -82,10 +86,138 @@ Each new item is chosen not only because it’s relevant, but also because it ad

This improves exploration, user satisfaction, and coverage across many domains, for example:

- E-commerce: Show different product styles, not multiple copies of the same black pants.
- E-commerce: Show different product styles, not multiple copies of the same product.
- News search: Highlight articles from different outlets or viewpoints.
- Academic retrieval: Surface papers from different subfields or methods.
- RAG / LLM contexts: Avoid feeding the model near-duplicate passages.
- Recommendation feeds: Keep content diverse and engaging over time.

## Examples

The following examples illustrate how to apply different diversification strategies in various scenarios.

<details> <summary><b>Product / Web Search</b> — Simple diversification with MMR or DPP</summary> <br>

MMR and DPP are great general-purpose diversification strategies. They are fast, easy to use, and work well in many scenarios.
For example, in a product search setting where you want to show diverse items to a user, you can diversify the top results as follows:

```python
from pyversity import diversify, Strategy

# Suppose you have:
# - item_embeddings: embeddings of the retrieved products
# - item_scores: relevance scores for these products

# Re-rank with MMR
result = diversify(
    embeddings=item_embeddings,
    scores=item_scores,
    k=10,
    strategy=Strategy.MMR,
)
```
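
The returned result carries the selected indices (as also used in the SSD examples below), so you can map them back to your own objects. A minimal sketch, assuming a hypothetical `products` list aligned with `item_embeddings` and `item_scores`:

```python
# Reorder the retrieved products according to the diversified ranking.
# `products` is a hypothetical list of product records aligned with item_embeddings.
diversified_products = [products[i] for i in result.indices]
```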
</details>

<details> <summary><b>Literature Search</b> — Represent the full topic space with COVER</summary> <br>

COVER (Facility-Location) is well-suited for scenarios where you want to ensure that the selected items collectively represent the entire dataset’s structure. For instance, when searching for academic papers on a broad topic, you might want to cover various subfields and methodologies:

```python
from pyversity import diversify, Strategy

# Suppose you have:
# - paper_embeddings: embeddings of the retrieved papers
# - paper_scores: relevance scores for these papers

# Re-rank with COVER
result = diversify(
    embeddings=paper_embeddings,
    scores=paper_scores,
    k=10,
    strategy=Strategy.COVER,
)
```
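
Since COVER scales quadratically with the number of candidates, it can help to pre-filter to a smaller pool before diversifying. A minimal sketch that would run before the `diversify` call above, assuming a hypothetical `top_n` cutoff:

```python
import numpy as np

# Keep only the top_n highest-scoring papers as the candidate pool (hypothetical cutoff).
top_n = 500
top_idx = np.argsort(paper_scores)[::-1][:top_n]
paper_embeddings = paper_embeddings[top_idx]
paper_scores = paper_scores[top_idx]
```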
</details>

<details>
<summary><b>Conversational RAG</b> — Avoid redundant chunks with SSD</summary>
<br>

In retrieval-augmented generation (RAG) for conversational AI, it’s crucial to avoid feeding the model redundant or similar chunks of information within the recent conversation context. The SSD (Sliding Spectrum Decomposition) strategy is designed for sequence-aware diversification, making it ideal for this use case:

```python
import numpy as np
from pyversity import diversify, Strategy

# Suppose you have:
# - chunk_embeddings: embeddings of the chunks retrieved this turn
# - chunk_scores: relevance scores for these chunks
# - recent_chunk_embeddings: embeddings of chunks shown in the last few turns (oldest→newest)

# Re-rank with SSD (sequence-aware)
result = diversify(
    embeddings=chunk_embeddings,
    scores=chunk_scores,
    k=10,
    strategy=Strategy.SSD,
    recent_embeddings=recent_chunk_embeddings,
)

# Maintain the rolling context window for recent chunks
recent_chunk_embeddings = np.vstack([recent_chunk_embeddings, chunk_embeddings[result.indices]])
```
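
In practice you would likely also cap the rolling window so that only the most recent chunks influence diversification. A minimal sketch, assuming a hypothetical `window_size`:

```python
# Keep only the most recent window_size chunk embeddings (hypothetical cap).
window_size = 50
recent_chunk_embeddings = recent_chunk_embeddings[-window_size:]
```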
</details>


<details> <summary><b>Infinite Scroll / Recommendation Feed</b> — Sequence-aware novelty with SSD</summary> <br>

In content feeds or infinite scroll scenarios, users consume items sequentially. To keep the experience engaging, it’s important to introduce novelty relative to recently shown items. The SSD strategy is well-suited for this:

```python
import numpy as np
from pyversity import diversify, Strategy

# Suppose you have:
# - feed_embeddings: embeddings of candidate items for the feed
# - feed_scores: relevance scores for these items
# - recent_feed_embeddings: embeddings of recently shown items in the feed (oldest→newest)

# Sequence-aware re-ranking with Sliding Spectrum Decomposition (SSD)
result = diversify(
    embeddings=feed_embeddings,
    scores=feed_scores,
    k=10,
    strategy=Strategy.SSD,
    recent_embeddings=recent_feed_embeddings,
)

# Maintain the rolling context window for recent items
recent_feed_embeddings = np.vstack([recent_feed_embeddings, feed_embeddings[result.indices]])
```
</details>


<details> <summary><b>Single Long Document</b> — Pick diverse sections with MSD</summary> <br>

When summarizing or extracting information from a single long document, it’s beneficial to select sections that are both relevant and cover different parts of the document. The MSD strategy helps achieve this by preferring items that are far apart from each other:

```python
from pyversity import diversify, Strategy

# Suppose you have:
# - doc_chunk_embeddings: embeddings of document chunks
# - doc_chunk_scores: relevance scores for these chunks

# Re-rank with MSD
result = diversify(
    embeddings=doc_chunk_embeddings,
    scores=doc_chunk_scores,
    k=10,
    strategy=Strategy.MSD,
)
```
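
For summarization it often helps to restore document order for the selected chunks before building the prompt. A minimal sketch, assuming a hypothetical `doc_chunks` list of chunk texts aligned with `doc_chunk_embeddings`:

```python
# Put the selected chunks back into document order (doc_chunks is a hypothetical list of chunk texts).
selected_in_order = sorted(result.indices)
context = "\n\n".join(doc_chunks[i] for i in selected_in_order)
```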

</details>

## References

@@ -102,6 +234,9 @@ The implementations in this package are based on the following research papers:
- **DPP (efficient greedy implementation)**: Chen, L., Zhang, G., & Zhou, H. (2018). Fast greedy MAP inference for determinantal point process to improve recommendation diversity.
[Link](https://arxiv.org/pdf/1709.05135)

- **SSD**: Huang, Y., Wang, W., Zhang, L., & Xu, R. (2021). Sliding Spectrum Decomposition for Diversified
Recommendation. [Link](https://arxiv.org/pdf/2107.05204)

## Author

Thomas van Dongen
15 changes: 13 additions & 2 deletions src/pyversity/__init__.py
@@ -1,6 +1,17 @@
from pyversity.datatypes import DiversificationResult, Metric, Strategy
from pyversity.pyversity import diversify
from pyversity.strategies import cover, dpp, mmr, msd
from pyversity.strategies import cover, dpp, mmr, msd, ssd
from pyversity.version import __version__

__all__ = ["diversify", "Strategy", "Metric", "DiversificationResult", "mmr", "msd", "cover", "dpp", "__version__"]
__all__ = [
    "diversify",
    "Strategy",
    "Metric",
    "DiversificationResult",
    "mmr",
    "msd",
    "cover",
    "dpp",
    "ssd",
    "__version__",
]
1 change: 1 addition & 0 deletions src/pyversity/datatypes.py
@@ -11,6 +11,7 @@ class Strategy(str, Enum):
MSD = "msd"
COVER = "cover"
DPP = "dpp"
SSD = "ssd"


class Metric(str, Enum):
4 changes: 3 additions & 1 deletion src/pyversity/pyversity.py
@@ -3,7 +3,7 @@
import numpy as np

from pyversity.datatypes import DiversificationResult, Strategy
from pyversity.strategies import cover, dpp, mmr, msd
from pyversity.strategies import cover, dpp, mmr, msd, ssd


def diversify(
@@ -36,4 +36,6 @@ def diversify(
        return cover(embeddings, scores, k, diversity, **kwargs)
    if strategy == Strategy.DPP:
        return dpp(embeddings, scores, k, diversity, **kwargs)
    if strategy == Strategy.SSD:
        return ssd(embeddings, scores, k, diversity, **kwargs)
    raise ValueError(f"Unknown strategy: {strategy}")
3 changes: 2 additions & 1 deletion src/pyversity/strategies/__init__.py
@@ -2,5 +2,6 @@
from pyversity.strategies.dpp import dpp
from pyversity.strategies.mmr import mmr
from pyversity.strategies.msd import msd
from pyversity.strategies.ssd import ssd

__all__ = ["mmr", "msd", "cover", "dpp"]
__all__ = ["mmr", "msd", "cover", "dpp", "ssd"]
2 changes: 1 addition & 1 deletion src/pyversity/strategies/cover.py
@@ -14,7 +14,7 @@ def cover(
    normalize: bool = True,
) -> DiversificationResult:
    """
    Select a subset of items that balances relevance and coverage/diversity.
    Cover (Facility Location) selection.

    This strategy chooses `k` items by combining pure relevance with
    diversity-driven coverage using a concave submodular formulation.