fix: Weaviate hybrid search #6408
Code Review
This pull request introduces a new HybridSearchRetriever for Weaviate, enabling hybrid search capabilities that combine vector similarity with BM25 keyword search. It adds a dedicated Weaviate Retriever node and updates the existing Weaviate vector store node to support Similarity, MMR, and Hybrid search types. Key improvements include a new filtering utility for Weaviate-specific metadata filters and integration within the RetrieverTool. Review feedback identified critical issues regarding the initialization and usage of fusionType in the retriever, as well as the need for proper type parsing of input parameters like alpha and topK to ensure compatibility with the underlying search engine.
this.vectorStore = input.vectorStore
this.alpha = input.alpha
this.topK = input.topK
this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'
The assignment logic for fusionType is incorrect. It currently assigns this.fusionType (which is uninitialized and thus undefined) to itself when input.fusionType is truthy. It should be assigned from input.fusionType.
Suggested change:
- this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'
+ this.fusionType = input.fusionType ?? 'RankedFusion'
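To make the effect of this change concrete, here is a minimal sketch that contrasts the original assignment with the suggested one. The class names, the `RetrieverInput` shape, and the `'RelativeScoreFusion'` literal are assumptions made for illustration; only `'RankedFusion'` appears in the diff, and the real `HybridSearchRetriever` has more fields.

```typescript
// Hypothetical, simplified stand-ins for the PR's types; names are assumptions.
type FusionType = 'RankedFusion' | 'RelativeScoreFusion'

interface RetrieverInput {
    alpha?: number
    topK?: number
    fusionType?: FusionType
}

class BuggyRetriever {
    fusionType?: FusionType

    constructor(input: RetrieverInput) {
        // Bug: reads this.fusionType (still undefined) instead of input.fusionType.
        this.fusionType = input.fusionType ? this.fusionType : 'RankedFusion'
    }
}

class FixedRetriever {
    fusionType: FusionType

    constructor(input: RetrieverInput) {
        // Fall back to the default only when the caller passed null or undefined.
        this.fusionType = input.fusionType ?? 'RankedFusion'
    }
}

const buggy = new BuggyRetriever({ fusionType: 'RelativeScoreFusion' })
const fixed = new FixedRetriever({ fusionType: 'RelativeScoreFusion' })
const defaulted = new FixedRetriever({})
```

In this sketch the buggy constructor discards any value the caller supplies, while the fixed one keeps it and only defaults when nothing was passed.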
const results = await this.vectorStore.hybridSearch(query, {
    limit: this.topK,
    alpha: this.alpha,
    filters: this.filter
})
The fusionType property is not being passed to the hybridSearch method call. This means the user's configuration for result merging (Ranked vs Relative Score) will be ignored during the search execution.
const results = await this.vectorStore.hybridSearch(query, {
    limit: this.topK,
    alpha: this.alpha,
    fusionType: this.fusionType,
    filters: this.filter
})

} else if ('hybrid' === searchType) {
    return new HybridSearchRetriever({
        vectorStore: vectorStore,
        alpha: alpha,
The alpha parameter should be parsed to a float as it is retrieved as a string from the node inputs. Additionally, the fusionType configured in the node should be passed to the HybridSearchRetriever constructor.
Suggested change:
- alpha: alpha,
+ alpha: alpha ? parseFloat(alpha) : 1,
+ fusionType: nodeData.inputs?.fusionType,
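Taken together, the comments above ask for two things: string inputs from the node should be converted to numbers, and the configured fusion type should reach the search call. The sketch below shows one way that wiring could look. `HybridNodeInputs`, `HybridOptions`, `parseAlpha`, and `buildHybridOptions` are hypothetical names, not part of the PR; the fallback of `1` mirrors the suggested change, and the extra guard against non-numeric input is an assumption beyond what the reviewer proposed.

```typescript
// Hypothetical shapes; the real node and store types in the PR may differ.
interface HybridNodeInputs {
    alpha?: string
    fusionType?: string
}

interface HybridOptions {
    limit: number
    alpha: number
    fusionType: string
}

// Parse the alpha string from the node inputs; fall back when it is
// missing, blank, or not numeric.
function parseAlpha(raw: string | undefined, fallback = 1): number {
    if (raw === undefined || raw.trim() === '') return fallback
    const parsed = Number.parseFloat(raw)
    return Number.isNaN(parsed) ? fallback : parsed
}

// Build the options object so that fusionType is always forwarded.
function buildHybridOptions(inputs: HybridNodeInputs, topK: number): HybridOptions {
    return {
        limit: topK,
        alpha: parseAlpha(inputs.alpha),
        fusionType: inputs.fusionType ?? 'RankedFusion'
    }
}

const options = buildHybridOptions({ alpha: '0.25', fusionType: 'RelativeScoreFusion' }, 4)
const defaults = buildHybridOptions({}, 4)
```

Note that an input of `'0'` is kept as `0` here rather than replaced by the fallback, which matters if an alpha of zero is meant to select keyword-only search.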
const output = nodeData.outputs?.output as string
const searchType = nodeData.inputs?.searchType as string
const topK = nodeData.inputs?.topK as string
const k = topK ? parseFloat(topK) : 4
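The review comment attached to this snippet is not preserved here, but the review overview mentions that `topK` needs proper type parsing. As a sketch only, and assuming the search limit should be a positive integer, a helper along these lines would avoid passing a fractional or `NaN` value downstream. `parseTopK` is a hypothetical name; the default of `4` comes from the snippet above.

```typescript
// Hypothetical helper; the default of 4 mirrors the snippet above.
function parseTopK(raw: string | undefined, fallback = 4): number {
    // parseInt truncates fractional input such as '7.9' to 7.
    const parsed = Number.parseInt(raw ?? '', 10)
    // Reject NaN, zero, and negative values in favour of the fallback.
    return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback
}

const k = parseTopK('10')
```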
Summary
This PR improves Weaviate search and retriever integration by fixing metadata filter handling, resolving Hybrid Search issues, and adding a customizable Weaviate Hybrid Search Retriever.
Problem
alpha value was configured.
Fix
alpha values are properly reflected in Weaviate Hybrid Search queries.