[HELP WANTED] Bug: Qdrant semantic search fails due to missing index for "type" field #144

@raphael-intugle

name: Help Wanted
about: Request community help on a specific task or feature
title: "[HELP WANTED] Bug: Qdrant semantic search fails due to missing index for "type" field"
labels: 'help wanted', 'bug'
assignees: ''

What We Need Help With

We need help resolving an issue where semantic search fails on cloud-hosted Qdrant instances due to a missing index for the "type" payload field. Although vector points are successfully created, search operations result in a 400 Bad Request error.

Background & Context

When using Intugle's semantic search feature with a cloud-hosted Qdrant instance, the system successfully uploads vector points. However, subsequent semantic search queries fail with a Bad Request error, indicating that a keyword index is required but not found for the "type" field in the Qdrant collection. This suggests that the Qdrant collection is not being initialized with the necessary payload index for the "type" field, which is critical for filtering during search operations.

Current State

Vector points are successfully created in the cloud-hosted Qdrant collection. However, attempting to perform a semantic search results in the following error:

ERROR:intugle.core.semantic_search.semantic_search:Unexpected Response: 400 (Bad Request)
Raw response content:
b'{"status":{"error":"Bad request: Index required but not found for \"type\" of one of the following types: [keyword]. Help: Create an index for this key or use a different filter."}}"time":0.000026182}'
ERROR:intugle.core.semantic_search.semantic_search:[] Error while performing dense search
Reason: Unexpected Response: 400 (Bad Request)
Raw response content:
b'{"status":{"error":"Bad request: Index required but not found for \"type\" of one of the following types: [keyword]. Help: Create an index for this key or use a different filter."}}"time":0.000026182}'
ERROR:intugle.core.semantic_search.semantic_search:[] Error while performing late search
Reason: Unexpected Response: 400 (Bad Request)
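
One way to confirm the diagnosis is to inspect the collection's payload schema for a keyword index on "type". The sketch below assumes a qdrant-client QdrantClient instance is passed in; has_keyword_index is a hypothetical helper name, not something that exists in Intugle or qdrant-client.

```python
def has_keyword_index(client, collection_name, field_name="type"):
    """Return True if `field_name` has a keyword payload index.

    `client` is assumed to be a qdrant_client.QdrantClient (or any object
    whose get_collection() result exposes a `payload_schema` mapping).
    """
    info = client.get_collection(collection_name)
    schema = info.payload_schema or {}
    index = schema.get(field_name)
    if index is None:
        # No payload index of any kind exists for this field.
        return False
    data_type = index.data_type
    # data_type is normally a string-valued enum; accept a plain string too.
    return getattr(data_type, "value", data_type) == "keyword"
```

If this returns False for the affected collection, the missing index reported in the error message is confirmed.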

Desired Outcome

Semantic search should function correctly with cloud-hosted Qdrant instances. This requires ensuring that the Qdrant collection is created with the necessary keyword index for the "type" payload field, allowing search queries to execute without Bad Request errors.

Scope of Work

  • Reproduce the issue with a cloud-hosted Qdrant instance.
  • Investigate the Qdrant collection creation and initialization logic within Intugle's semantic search module.
  • Modify the code to ensure a keyword index is created for the "type" payload field when the Qdrant collection is initialized or updated.
  • Verify that semantic search queries now execute successfully without the indexing error.
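
As a starting point for the fix, the index can be created with qdrant-client's create_payload_index call. This is a minimal sketch, not Intugle's actual code: ensure_keyword_index is a hypothetical helper name, and where it should be called from in the semantic search module still needs to be worked out.

```python
def ensure_keyword_index(client, collection_name, field_name="type"):
    """Create a keyword payload index on `field_name`.

    `client` is assumed to be a qdrant_client.QdrantClient (or any object
    exposing the same create_payload_index method).
    """
    return client.create_payload_index(
        collection_name=collection_name,
        field_name=field_name,
        field_schema="keyword",  # string form of PayloadSchemaType.KEYWORD
    )


# Example usage (assumes QDRANT_URL and QDRANT_API_KEY are set, and that
# "my_collection" is a placeholder for the real collection name):
#
#   import os
#   from qdrant_client import QdrantClient
#   client = QdrantClient(url=os.environ["QDRANT_URL"],
#                         api_key=os.environ["QDRANT_API_KEY"])
#   ensure_keyword_index(client, "my_collection")
```

Calling a helper like this right after the collection is created would make the index available before any filtered search runs.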

Technical Details

Relevant Files/Modules

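  • src/intugle/semantic_search/semantic_search.py (the primary file for Qdrant client interaction and collection management; the logger name in the error output, intugle.core.semantic_search.semantic_search, suggests the module may live under src/intugle/core/, so confirm the path in the repository)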

Key Concepts/Technologies

  • Qdrant: Understanding collection creation, payload indexing, and filtering operations.
  • Vector Databases: Familiarity with how data is stored and indexed for semantic search.
  • Python qdrant-client library: How to interact with Qdrant from Python.

Skills Needed

  • Python development
  • Machine Learning (XGBoost, scikit-learn)
  • LLM/GenAI integration (LangChain, OpenAI)
  • Data engineering
  • SQL and database knowledge
  • Vector databases (Qdrant)
  • Databricks/Snowflake experience
  • Frontend/UI (Streamlit)
  • Documentation writing
  • Testing (pytest)

Getting Started

  1. Fork the repository.
  2. Set up your development environment (see CONTRIBUTING.md).
  3. Configure Intugle to use a cloud-hosted Qdrant instance (ensure QDRANT_URL and QDRANT_API_KEY are set correctly in your environment).
  4. Reproduce the error by running a semantic search operation.
  5. Implement the fix in src/intugle/semantic_search/semantic_search.py.

Testing Requirements

  • Unit tests: Confirm that the collection creation logic creates the "type" index.
  • Integration tests: Run semantic search against a Qdrant instance to ensure it works correctly.
  • Manual testing: Confirm the fix with a cloud-hosted Qdrant deployment.
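
A unit test for the first item could mock the Qdrant client and assert that the index call is made. In the sketch below, create_collection_with_index is a hypothetical stand-in for Intugle's real collection setup code, which may be structured differently. The test would need to target the actual function once the fix is in place.

```python
from unittest.mock import MagicMock


def create_collection_with_index(client, collection_name, vectors_config=None):
    # Hypothetical stand-in for the collection setup logic under test.
    client.create_collection(
        collection_name=collection_name,
        vectors_config=vectors_config,
    )
    client.create_payload_index(
        collection_name=collection_name,
        field_name="type",
        field_schema="keyword",
    )


def test_collection_setup_indexes_type_field():
    client = MagicMock()  # replaces the real QdrantClient, no network needed
    create_collection_with_index(client, "demo")
    client.create_collection.assert_called_once()
    client.create_payload_index.assert_called_once_with(
        collection_name="demo",
        field_name="type",
        field_schema="keyword",
    )
```

Because the client is mocked, this only checks that the call is issued. The integration and manual tests are still needed to confirm that a real Qdrant instance accepts filtered searches afterwards.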

Questions & Support

  • Ask questions in the comments below
  • Join our Discord for real-time help

We appreciate your interest in contributing to Intugle! 🙏
