Use UUIDs for document IDs in insert_documents method #18

Acuspeedster · 2025-05-31T18:07:40Z

One thing to be aware of: this change will affect any existing data in our Qdrant store. If we already have vectors in our collections that were inserted with sequential IDs and we now start inserting with UUIDs, we'll have a mix of ID formats. This won't cause errors, but it might be confusing when examining the data directly. If we need to work with existing data and maintain consistency, we might want to:
1. Create new collections.
2. Re-index all our data with UUIDs.
3. Switch to the new collections when complete.
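
As a rough illustration of the re-index step only, here is a sketch on plain Python dicts. It is not code from this PR: `reindex_with_uuids` and the `legacy_id` payload key are hypothetical names, and creating the new Qdrant collections and switching over to them are deliberately left out.

```python
import uuid


def reindex_with_uuids(old_points):
    """Return copies of old_points with fresh UUID string IDs.

    Each point is assumed to be a dict with "id", "vector" and "payload"
    keys, the same shape passed to upsert in load_data.py.
    The previous ID is kept in the payload under the hypothetical
    "legacy_id" key so older references can still be resolved.
    """
    new_points = []
    for point in old_points:
        payload = dict(point["payload"])  # copy so the input is not mutated
        payload["legacy_id"] = point["id"]
        new_points.append(
            {
                "id": str(uuid.uuid4()),
                "vector": list(point["vector"]),
                "payload": payload,
            }
        )
    return new_points


# Two points that were originally stored under sequential integer IDs.
old_points = [
    {"id": 1, "vector": [0.1, 0.2], "payload": {"name": "a"}},
    {"id": 2, "vector": [0.3, 0.4], "payload": {"name": "b"}},
]
new_points = reindex_with_uuids(old_points)
```

The resulting list could then be written to the new collection, so the old one stays untouched until the switch.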

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>

juntao · 2025-05-31T18:11:46Z

What about the point_id in load_data.py? Which point ID does it use when I run the load data script?

Acuspeedster · 2025-05-31T18:15:38Z

@juntao Looking at the load_data.py file, I can see that both the load_project_examples() and load_error_examples() functions are generating new UUIDs for each point they add to the vector database:

import uuid

# In load_project_examples()
# Store in the vector DB under a freshly generated UUID (not the filename)
point_id = str(uuid.uuid4())

vector_store.upsert(
    "project_examples",
    [{"id": point_id, "vector": embedding, "payload": example}],
)

# In load_error_examples()
# Store in the vector DB under a freshly generated UUID
point_id = str(uuid.uuid4())

vector_store.upsert(
    "error_examples",
    [{"id": point_id, "vector": embedding, "payload": example}],
)

juntao · 2025-05-31T18:27:11Z

I know this. I am just pointing to a code quality problem. We should not have the same code segments littered in different files. It is a maintenance issue.

Shouldn't load_data just call insert_documents instead of upsert in this case?
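
To make the suggestion concrete, here is a minimal sketch of routing both loaders through one method. It assumes insert_documents takes a collection name and a list of (embedding, payload) pairs; the real signature may differ. `InMemoryVectorStore` and `load_examples` are hypothetical stand-ins, not the project's classes.

```python
import uuid


class InMemoryVectorStore:
    """Hypothetical stand-in for the project's vector store wrapper."""

    def __init__(self):
        self.collections = {}

    def upsert(self, collection, points):
        # Store each point under its ID, overwriting any existing entry.
        bucket = self.collections.setdefault(collection, {})
        for point in points:
            bucket[point["id"]] = point

    def insert_documents(self, collection, documents):
        """Assign a UUID to each (embedding, payload) pair and upsert it.

        ID generation lives only here, so callers never build point IDs.
        Returns the generated IDs in input order.
        """
        points = [
            {"id": str(uuid.uuid4()), "vector": embedding, "payload": payload}
            for embedding, payload in documents
        ]
        self.upsert(collection, points)
        return [p["id"] for p in points]


def load_examples(store, collection, examples):
    """Single loader shared by project and error examples."""
    return store.insert_documents(collection, examples)


store = InMemoryVectorStore()
project_ids = load_examples(
    store, "project_examples", [([0.1, 0.2], {"kind": "project"})]
)
error_ids = load_examples(
    store, "error_examples", [([0.3, 0.4], {"kind": "error"})]
)
```

With this shape, the UUID logic exists in one place and load_data.py only decides which collection each example belongs to.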


Acuspeedster · 2025-05-31T19:14:09Z

@juntao I have improved the code's maintainability as far as I could.

Use UUIDs for document IDs in insert_documents method

b501fb1

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>

Acuspeedster self-assigned this May 31, 2025

Acuspeedster requested a review from juntao May 31, 2025 18:07

Acuspeedster added the enhancement (New feature or request) label May 31, 2025

Refactor loading of project and error examples into a unified function; enhance error handling and logging

3f642bf

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>

juntao approved these changes May 31, 2025


juntao merged commit 33f5e1e into cardea-mcp:main May 31, 2025
2 checks passed
