fix: Weaviate hybrid search #6408

Open: jun4027 wants to merge 3 commits into FlowiseAI:main from jun4027:fix/weaviate-hybrid-search

jun4027 commented May 19, 2026

Summary

This PR improves Weaviate search and retriever integration by fixing metadata filter handling, resolving Hybrid Search issues, and adding a customizable Weaviate Hybrid Search Retriever.

Problem

  • Weaviate Search Filter was not applied correctly during search requests.
  • Hybrid Search did not work even when the alpha value was configured.
  • Retriever Tool's Additional Metadata Filter was not converted properly for Weaviate filter format.
  • Existing retriever implementation did not support customizable Weaviate Hybrid Search behavior.

Fix

  • Fixed the issue where Weaviate Search Filter was not being applied correctly.
  • Fixed Hybrid Search execution so that configured alpha values are properly reflected in Weaviate Hybrid Search queries.
  • Updated Retriever Tool Additional Metadata Filter handling to support Weaviate-compatible filter conversion.
  • Added a customizable Weaviate Hybrid Search Retriever implementation.

gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces a new HybridSearchRetriever for Weaviate, enabling hybrid search capabilities that combine vector similarity with BM25 keyword search. It adds a dedicated Weaviate Retriever node and updates the existing Weaviate vector store node to support Similarity, MMR, and Hybrid search types. Key improvements include a new filtering utility for Weaviate-specific metadata filters and integration within the RetrieverTool. Review feedback identified critical issues regarding the initialization and usage of fusionType in the retriever, as well as the need for proper type parsing of input parameters like alpha and topK to ensure compatibility with the underlying search engine.

this.vectorStore = input.vectorStore
this.alpha = input.alpha
this.topK = input.topK
this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'

high

The assignment logic for fusionType is incorrect. It currently assigns this.fusionType (which is uninitialized and thus undefined) to itself when input.fusionType is truthy. It should be assigned from input.fusionType.

Suggested change
- this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'
+ this.fusionType = input.fusionType ?? 'RankedFusion'
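To make the difference concrete, here is a minimal sketch that contrasts the two assignment patterns. The class names, the `RetrieverInput` shape, and the `'RelativeScoreFusion'` value are placeholders for illustration, not taken from the PR.

```typescript
// Hypothetical input shape; only `fusionType` matters for this sketch.
interface RetrieverInput {
    fusionType?: string
}

// Pattern from the diff: the truthy branch reads this.fusionType,
// which has not been assigned yet, so a configured value is lost.
class BuggyRetriever {
    fusionType?: string
    constructor(input: RetrieverInput) {
        this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'
    }
}

// Suggested pattern: take the input value, fall back only on null/undefined.
class FixedRetriever {
    fusionType: string
    constructor(input: RetrieverInput) {
        this.fusionType = input.fusionType ?? 'RankedFusion'
    }
}

const configured: RetrieverInput = { fusionType: 'RelativeScoreFusion' } // placeholder value
const buggy = new BuggyRetriever(configured)
const fixed = new FixedRetriever(configured)
const defaulted = new FixedRetriever({})

console.log(buggy.fusionType, fixed.fusionType, defaulted.fusionType)
```

With the buggy pattern the configured value ends up `undefined`, while the fixed pattern keeps it and still defaults to `'RankedFusion'` when nothing is passed. Note that `??` falls back only on `null` or `undefined`, so an empty string from the UI would be kept; `||` would be the alternative if empty strings should also default.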

Comment on lines +28 to +32
const results = await this.vectorStore.hybridSearch(query, {
limit: this.topK,
alpha: this.alpha,
filters: this.filter
})

high

The fusionType property is not being passed to the hybridSearch method call. This means the user's configuration for result merging (Ranked vs Relative Score) will be ignored during the search execution.

        const results = await this.vectorStore.hybridSearch(query, {
            limit: this.topK,
            alpha: this.alpha,
            fusionType: this.fusionType,
            filters: this.filter
        })
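One way to guard against this kind of omission is a small test double that records the options it receives. The sketch below is an assumption-heavy illustration: `HybridOptions`, `HybridStore`, `RecordingStore`, and `SketchRetriever` are hypothetical names that only mirror the shape of the call in the diff, not the PR's actual classes.

```typescript
// Hypothetical option and store shapes, mirroring the call in the diff.
interface HybridOptions {
    limit: number
    alpha: number
    fusionType?: string
    filters?: unknown
}

interface HybridStore {
    hybridSearch(query: string, options: HybridOptions): Promise<string[]>
}

// Test double that records the options it receives.
class RecordingStore implements HybridStore {
    lastOptions?: HybridOptions
    hybridSearch(query: string, options: HybridOptions): Promise<string[]> {
        this.lastOptions = options
        return Promise.resolve([`stub result for "${query}"`])
    }
}

// Simplified stand-in for the retriever; forwards every configured option.
class SketchRetriever {
    store: HybridStore
    topK: number
    alpha: number
    fusionType: string
    filter?: unknown

    constructor(store: HybridStore, topK: number, alpha: number, fusionType: string, filter?: unknown) {
        this.store = store
        this.topK = topK
        this.alpha = alpha
        this.fusionType = fusionType
        this.filter = filter
    }

    retrieve(query: string): Promise<string[]> {
        return this.store.hybridSearch(query, {
            limit: this.topK,
            alpha: this.alpha,
            fusionType: this.fusionType,
            filters: this.filter
        })
    }
}

const recordingStore = new RecordingStore()
const sketchRetriever = new SketchRetriever(recordingStore, 4, 0.5, 'RelativeScoreFusion')
sketchRetriever.retrieve('weaviate').then((docs) => console.log(docs, recordingStore.lastOptions))
```

Because the stub stores the options before returning its promise, a test can check `recordingStore.lastOptions` to confirm that `fusionType` reached the search call.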

} else if ('hybrid' === searchType) {
return new HybridSearchRetriever({
vectorStore: vectorStore,
alpha: alpha,

high

The alpha parameter should be parsed to a float as it is retrieved as a string from the node inputs. Additionally, the fusionType configured in the node should be passed to the HybridSearchRetriever constructor.

Suggested change
- alpha: alpha,
+ alpha: alpha ? parseFloat(alpha) : 1,
+ fusionType: nodeData.inputs?.fusionType,
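A slightly more defensive variant of the same parsing could be sketched as follows. `parseAlpha` is a hypothetical helper, not part of the PR, and the extra `NaN` guard goes beyond what the suggestion does; the default of `1` is taken from the suggested change.

```typescript
// Hypothetical helper: node inputs arrive as strings (or are missing).
// Falls back when the value is absent, blank, or not a number.
function parseAlpha(raw: string | undefined, fallback = 1): number {
    if (raw === undefined || raw.trim() === '') return fallback
    const value = parseFloat(raw)
    return Number.isNaN(value) ? fallback : value
}

console.log(parseAlpha('0.25'), parseAlpha('0'), parseAlpha(undefined), parseAlpha('abc'))
```

The string `'0'` is truthy in JavaScript, so both this helper and the suggested ternary keep an explicit alpha of `0`. Only the helper also catches non-numeric input such as `'abc'`, which `parseFloat` would otherwise turn into `NaN`.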

const output = nodeData.outputs?.output as string
const searchType = nodeData.inputs?.searchType as string
const topK = nodeData.inputs?.topK as string
const k = topK ? parseFloat(topK) : 4

medium

topK represents the number of documents to retrieve and should be parsed as an integer rather than a float to ensure compatibility with the vector store's limit parameter.

Suggested change
- const k = topK ? parseFloat(topK) : 4
+ const k = topK ? parseInt(topK, 10) : 4
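The difference only shows up when the input is not a whole number. A short sketch, with `'4.5'` as an example input and `parseTopK` as a hypothetical helper that is not part of the PR:

```typescript
const rawTopK = '4.5' // example string input, as a user might type it

const asFloat = parseFloat(rawTopK) // fractional value, not a valid document count
const asInt = parseInt(rawTopK, 10) // truncated to a whole number

// Hypothetical helper: parse as base-10 integer, fall back when missing or invalid.
function parseTopK(raw: string | undefined, fallback = 4): number {
    if (!raw) return fallback
    const value = parseInt(raw, 10)
    return Number.isNaN(value) ? fallback : value
}

console.log(asFloat, asInt, parseTopK('10'), parseTopK(undefined))
```

Passing the explicit radix `10` avoids any ambiguity about how the string is interpreted.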
