From b6c6ad60e4970b0e26096c089c081ae2e28dcf11 Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Thu, 25 Jun 2026 16:44:12 +0530 Subject: [PATCH 1/8] Add Infino integration: registry entry + provider and vector store docs --- packages.yml | 5 + .../python/integrations/providers/infino.mdx | 69 +++++ .../integrations/vectorstores/infino.mdx | 249 ++++++++++++++++++ 3 files changed, 323 insertions(+) create mode 100644 src/oss/python/integrations/providers/infino.mdx create mode 100644 src/oss/python/integrations/vectorstores/infino.mdx diff --git a/packages.yml b/packages.yml index 3d4b7c1f06..24ba950bfe 100644 --- a/packages.yml +++ b/packages.yml @@ -797,3 +797,8 @@ packages: js: "n/a" downloads: 188 downloads_updated_at: '2026-06-15T00:31:43.789982+00:00' +- name: langchain-infino + repo: infino-ai/langchain-infino + js: "n/a" + downloads: 0 + downloads_updated_at: "2026-06-15T00:31:43.789982+00:00" diff --git a/src/oss/python/integrations/providers/infino.mdx b/src/oss/python/integrations/providers/infino.mdx new file mode 100644 index 0000000000..1bf476059f --- /dev/null +++ b/src/oss/python/integrations/providers/infino.mdx @@ -0,0 +1,69 @@ +--- +title: "Infino integrations" +description: "Integrate with Infino using LangChain Python." +--- + +>[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs +>SQL, full-text (BM25), vector, and hybrid (RRF) search over one copy of your +>data in Apache Parquet on object storage — no separate search cluster or +>vector store to keep in sync. + +The `langchain-infino` package surfaces that whole retrieval surface, not just +`similarity_search`. Infino never embeds: you bring a LangChain `Embeddings` +object and the integration supplies the vectors. + +## Installation and setup + + +```bash pip +pip install langchain-infino +``` + +```bash uv +uv add langchain-infino +``` + + +Infino runs in-process — there are no credentials or API keys. 
A connection is +a local path or an `s3://` URI for durable storage (`memory://` is ephemeral): + +```python +import infino + +connection = infino.connect("./data") +``` + +## Vector store + +`InfinoVectorStore` wraps a single Infino table — the text, its embedding, the +document id, declared metadata columns, and a JSON catch-all. Vector, filtered, +MMR, and hybrid retrieval all run over that one table. + +```python +from langchain_infino import InfinoVectorStore +``` + +For a detailed walkthrough, see the [InfinoVectorStore page](/oss/integrations/vectorstores/infino). + +## Retriever + +Beyond `as_retriever()` (vector), the store exposes lexical and fused +retrievers, plus a self-query translator that lowers an LLM's structured query +to a SQL `WHERE` over the declared metadata columns: + +```python +from langchain_infino import ( + InfinoBM25Retriever, + InfinoHybridRetriever, + InfinoTranslator, +) +``` + +## LLM cache + +`InfinoSemanticCache` caches model responses keyed by prompt meaning, backed by +one small Infino table: + +```python +from langchain_infino import InfinoSemanticCache +``` diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx new file mode 100644 index 0000000000..7a3b4f229e --- /dev/null +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -0,0 +1,249 @@ +--- +title: "Infino integration" +description: "Integrate with the Infino vector store using LangChain Python." +--- + +This guide provides a quick overview for getting started with the Infino [vector store](/oss/integrations/vectorstores#overview). For a detailed listing of all `InfinoVectorStore` features, parameters, and configurations, head to the [PyPI package page](https://pypi.org/project/langchain-infino/). + +[Infino](https://github.com/infino-ai/infino) runs SQL, full-text (BM25), vector, and hybrid (RRF) retrieval over one copy of your data in Apache Parquet on object storage. 
`InfinoVectorStore` surfaces that whole retrieval surface, not just `similarity_search`. + +## Setup + +Infino runs in-process, so there are no credentials or API keys for the engine itself. You only need an embeddings provider — Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. + +### Credentials + +This guide uses [OpenAI embeddings](/oss/integrations/text_embedding/openai), which need an API key: + +```python Set API key icon="key" +import getpass +import os + +if "OPENAI_API_KEY" not in os.environ: + os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ") +``` + +To enable automated tracing of your model calls, set your [LangSmith](/langsmith/observability) API key: + +```python Enable tracing icon="flask" +os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ") +os.environ["LANGSMITH_TRACING"] = "true" +``` + +### Installation + +The LangChain Infino integration lives in the `langchain-infino` package: + + + ```python pip + pip install -U langchain-infino langchain-openai + ``` + ```python uv + uv add langchain-infino langchain-openai + ``` + + +--- + +## Instantiation + +A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral). Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. + +```python Initialize vector store icon="database" +import infino +from langchain_infino import InfinoVectorStore +from langchain_openai import OpenAIEmbeddings + +connection = infino.connect("./data") +embeddings = OpenAIEmbeddings(model="text-embedding-3-small") # 1536-dim + +vector_store = InfinoVectorStore.from_texts( + [], + embeddings, + connection=connection, + table_name="docs", + dim=1536, +) +``` + +--- + +## Manage vector store + +### Add items + +`add_documents` returns the ids. 
Caller-supplied ids are upserted; omitted ids are generated. + +```python Add documents icon="folder-plus" +from langchain_core.documents import Document + +document_1 = Document(page_content="foo", metadata={"source": "https://example.com"}) +document_2 = Document(page_content="bar", metadata={"source": "https://example.com"}) +document_3 = Document(page_content="baz", metadata={"source": "https://example.com"}) +documents = [document_1, document_2, document_3] + +vector_store.add_documents(documents=documents, ids=["1", "2", "3"]) +``` + +### Update items + +`add_documents` is an idempotent upsert: re-adding an id overwrites it in place. + +```python Update document by ID icon="pencil" +updated_document = Document( + page_content="qux", metadata={"source": "https://another-example.com"} +) + +vector_store.add_documents(documents=[updated_document], ids=["1"]) +``` + +### Delete items + +```python Delete documents by IDs icon="trash" +vector_store.delete(ids=["3"]) +``` + +--- + +## Query vector store + +Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent. + +### Directly + +Performing a simple similarity search can be done as follows: + +```python Similarity search icon="folders" +results = vector_store.similarity_search( + query="thud", k=1, filter={"source": "https://another-example.com"} +) +for doc in results: + print(f"* {doc.page_content} [{doc.metadata}]") +``` + +If you want to execute a similarity search and receive the corresponding scores you can run: + +```python Similarity search with scores icon="star-half" +results = vector_store.similarity_search_with_score(query="thud", k=1) +for doc, score in results: + print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]") +``` + +Vector distance is *smaller is nearer*. 
To get a normalized `[0, 1]` relevance (higher = better) for the `cosine`, `l2`, and `l2sq` metrics, use `similarity_search_with_relevance_scores`. + +### By turning into retriever + +You can also transform the vector store into a retriever for easier usage in your chains. + +```python Create retriever icon="robot" +retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1}) +retriever.invoke("thud") +``` + +--- + +## Usage for retrieval-augmented generation + + + Guide on how to use this vector store for retrieval-augmented generation (RAG) + + +--- + +## Infino-specific functionality + +Infino is a single engine for SQL, BM25, vector, and hybrid retrieval, so the store exposes more than the vector slice. + +### Metadata filtering + +Promote the keys you want to filter on to real scalar columns at table creation; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. + +```python Filterable metadata columns icon="filter" +import pyarrow as pa + +store = InfinoVectorStore.from_texts( + ["a paper on optimizers", "a paper on transformers"], + embeddings, + connection=connection, + table_name="papers", + dim=1536, + metadata_columns=[ + pa.field("category", pa.large_utf8(), nullable=False), + pa.field("year", pa.int64(), nullable=False), + ], + metadatas=[{"category": "ml", "year": 2024}, {"category": "ml", "year": 2023}], +) + +store.similarity_search("optimizers", k=4, filter={"year": {"$gte": 2024}}) +store.similarity_search( + "optimizers", k=4, filter={"$or": [{"category": "ml"}, {"year": {"$lt": 2000}}]} +) +``` + +### Text-pushdown pre-filter + +For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back. 
It is reachable from any retriever via `search_kwargs`. + +```python Text pushdown icon="magnifying-glass" +store.similarity_search("cancel my plan", k=10, filter_query="subscription billing") +``` + +### Hybrid (RRF) retrieval + +BM25 and vector search fused by reciprocal-rank fusion in a single SQL call — no separate reranking round-trip. + +```python Hybrid retriever icon="git-merge" +retriever = store.as_hybrid_retriever(k=4) +retriever.invoke("neural network training") +``` + +### BM25 retrieval + +Pure lexical ranking over the FTS-indexed text column. `mode="and"` requires all query terms; `"or"` (default) matches any. + +```python BM25 retriever icon="font" +retriever = store.as_bm25_retriever(k=4, mode="and") +retriever.invoke("gradient descent") +``` + +### Self-query + +`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query to a SQL `WHERE` over the declared metadata columns — the full comparison and boolean surface, not a reduced DSL. + +```python Self-query icon="wand-magic-sparkles" +from langchain_classic.retrievers import SelfQueryRetriever +from langchain_infino import InfinoTranslator + +retriever = SelfQueryRetriever.from_llm( + llm, + store, + document_contents="research papers", + metadata_field_info=metadata_field_info, + structured_query_translator=InfinoTranslator(), +) +retriever.invoke("ML papers since 2023") +``` + +### SQL-native search + +The escape hatch for anything the typed methods don't cover — joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions. Project the store's columns and the rows map back to `Document`s. 
+ +```python SQL escape hatch icon="database" +qv = ",".join(map(str, embeddings.embed_query("fox"))) +store.search_by_sql(f""" + SELECT doc_id, page_content, _metadata_json, score + FROM hybrid_search('docs', 'page_content', 'fox', 'embedding', '{qv}', 10) + ORDER BY score DESC +""") +``` + +--- + +## API reference + +For detailed documentation of all `InfinoVectorStore` features and configurations, head to the [PyPI package page](https://pypi.org/project/langchain-infino/) and the [source repository](https://github.com/infino-ai/langchain-infino). From c78ec8b9de6eafa446f9981ed5686c7d9a99c7f8 Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Thu, 25 Jun 2026 16:55:11 +0530 Subject: [PATCH 2/8] Make Infino docs engaging: realistic corpus, shown outputs, hybrid-vs-vector demo --- .../python/integrations/providers/infino.mdx | 11 +- .../integrations/vectorstores/infino.mdx | 171 ++++++++++-------- 2 files changed, 106 insertions(+), 76 deletions(-) diff --git a/src/oss/python/integrations/providers/infino.mdx b/src/oss/python/integrations/providers/infino.mdx index 1bf476059f..5029352db3 100644 --- a/src/oss/python/integrations/providers/infino.mdx +++ b/src/oss/python/integrations/providers/infino.mdx @@ -4,13 +4,14 @@ description: "Integrate with Infino using LangChain Python." --- >[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs ->SQL, full-text (BM25), vector, and hybrid (RRF) search over one copy of your ->data in Apache Parquet on object storage — no separate search cluster or ->vector store to keep in sync. +>SQL, full-text (BM25), vector, and hybrid (RRF) search over **one copy** of +>your data in Apache Parquet on object storage — no separate vector database +>and search cluster to provision, sync, or keep consistent. The `langchain-infino` package surfaces that whole retrieval surface, not just -`similarity_search`. Infino never embeds: you bring a LangChain `Embeddings` -object and the integration supplies the vectors. 
+`similarity_search`: semantic search *and* exact-keyword BM25 *and* their +fusion, from a single in-process engine. Infino never embeds — you bring a +LangChain `Embeddings` object and the integration supplies the vectors. ## Installation and setup diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx index 7a3b4f229e..325e5e62e8 100644 --- a/src/oss/python/integrations/vectorstores/infino.mdx +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -3,13 +3,15 @@ title: "Infino integration" description: "Integrate with the Infino vector store using LangChain Python." --- -This guide provides a quick overview for getting started with the Infino [vector store](/oss/integrations/vectorstores#overview). For a detailed listing of all `InfinoVectorStore` features, parameters, and configurations, head to the [PyPI package page](https://pypi.org/project/langchain-infino/). +[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage — no separate vector database and search cluster to provision, sync, or keep consistent. -[Infino](https://github.com/infino-ai/infino) runs SQL, full-text (BM25), vector, and hybrid (RRF) retrieval over one copy of your data in Apache Parquet on object storage. `InfinoVectorStore` surfaces that whole retrieval surface, not just `similarity_search`. +Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid) — where pure-vector search misses an exact error code and hybrid retrieval catches it. + +This guide covers getting started. 
For the full API, see the [PyPI package page](https://pypi.org/project/langchain-infino/) and the [source repository](https://github.com/infino-ai/langchain-infino). ## Setup -Infino runs in-process, so there are no credentials or API keys for the engine itself. You only need an embeddings provider — Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. +Infino runs **in-process** — there is no server to deploy and no engine credentials. You only need an embeddings provider: Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. ### Credentials @@ -47,21 +49,21 @@ The LangChain Infino integration lives in the `langchain-infino` package: ## Instantiation -A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral). Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. +A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral — handy for tests). Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. ```python Initialize vector store icon="database" import infino from langchain_infino import InfinoVectorStore from langchain_openai import OpenAIEmbeddings -connection = infino.connect("./data") +connection = infino.connect("./data") # or "s3://my-bucket/kb", or "memory://" embeddings = OpenAIEmbeddings(model="text-embedding-3-small") # 1536-dim vector_store = InfinoVectorStore.from_texts( [], embeddings, connection=connection, - table_name="docs", + table_name="support_kb", dim=1536, ) ``` @@ -72,17 +74,27 @@ vector_store = InfinoVectorStore.from_texts( ### Add items -`add_documents` returns the ids. 
Caller-supplied ids are upserted; omitted ids are generated. +We'll use a small support knowledge base — a realistic mix of natural-language prose and exact identifiers (error codes, plan names, API endpoints). The same corpus drives every example below. `add_documents` returns the ids; caller-supplied ids are upserted (re-adding an id overwrites it), omitted ids are generated. ```python Add documents icon="folder-plus" from langchain_core.documents import Document -document_1 = Document(page_content="foo", metadata={"source": "https://example.com"}) -document_2 = Document(page_content="bar", metadata={"source": "https://example.com"}) -document_3 = Document(page_content="baz", metadata={"source": "https://example.com"}) -documents = [document_1, document_2, document_3] - -vector_store.add_documents(documents=documents, ids=["1", "2", "3"]) +documents = [ + Document(page_content="Reset your password from the Account → Security settings page.", + metadata={"source": "docs"}), + Document(page_content="Error E-4042 means your upload exceeded the 2 GB file size limit.", + metadata={"source": "kb"}), + Document(page_content="The Q3 outage was caused by a misconfigured load balancer.", + metadata={"source": "postmortem"}), + Document(page_content="The Pro plan includes priority support and unlimited seats.", + metadata={"source": "pricing"}), + Document(page_content="To export your data in bulk, call the bulk_export API endpoint.", + metadata={"source": "docs"}), + Document(page_content="Customers on the legacy Starter tier keep grandfathered pricing.", + metadata={"source": "pricing"}), +] + +vector_store.add_documents(documents=documents, ids=[f"doc-{i}" for i in range(len(documents))]) ``` ### Update items @@ -90,54 +102,59 @@ vector_store.add_documents(documents=documents, ids=["1", "2", "3"]) `add_documents` is an idempotent upsert: re-adding an id overwrites it in place. 
```python Update document by ID icon="pencil" -updated_document = Document( - page_content="qux", metadata={"source": "https://another-example.com"} +revised = Document( + page_content="Error E-4042 means your upload exceeded the 5 GB file size limit.", + metadata={"source": "kb"}, ) - -vector_store.add_documents(documents=[updated_document], ids=["1"]) +vector_store.add_documents(documents=[revised], ids=["doc-1"]) ``` ### Delete items ```python Delete documents by IDs icon="trash" -vector_store.delete(ids=["3"]) +vector_store.delete(ids=["doc-2"]) ``` --- ## Query vector store -Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent. - ### Directly -Performing a simple similarity search can be done as follows: +A similarity search encodes your query into an embedding and returns the nearest documents. Note there's no shared keyword between the query and the answer — semantic search bridges "login credentials" → "password": ```python Similarity search icon="folders" -results = vector_store.similarity_search( - query="thud", k=1, filter={"source": "https://another-example.com"} -) +results = vector_store.similarity_search("how do I change my login credentials", k=2) for doc in results: print(f"* {doc.page_content} [{doc.metadata}]") ``` -If you want to execute a similarity search and receive the corresponding scores you can run: +```text +* Reset your password from the Account → Security settings page. [{'source': 'docs'}] +* To export your data in bulk, call the bulk_export API endpoint. 
[{'source': 'docs'}] +``` + +To get the corresponding scores back, use `similarity_search_with_score`: ```python Similarity search with scores icon="star-half" -results = vector_store.similarity_search_with_score(query="thud", k=1) +results = vector_store.similarity_search_with_score("upgrade for more seats", k=1) for doc, score in results: print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]") ``` -Vector distance is *smaller is nearer*. To get a normalized `[0, 1]` relevance (higher = better) for the `cosine`, `l2`, and `l2sq` metrics, use `similarity_search_with_relevance_scores`. +```text +* [SIM=0.412879] The Pro plan includes priority support and unlimited seats. [{'source': 'pricing'}] +``` + +Vector distance is *smaller is nearer*. For a normalized `[0, 1]` relevance (higher = better) under the `cosine`, `l2`, and `l2sq` metrics, use `similarity_search_with_relevance_scores`. ### By turning into retriever -You can also transform the vector store into a retriever for easier usage in your chains. +Transform the store into a retriever for use in your chains and agents: ```python Create retriever icon="robot" retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1}) -retriever.invoke("thud") +retriever.invoke("how do I change my login credentials") ``` --- @@ -155,65 +172,75 @@ retriever.invoke("thud") --- -## Infino-specific functionality +## Vector vs. BM25 vs. hybrid + +This is where one engine over one copy of your data pays off. Consider a user searching for the exact error code **`E-4042`**. Embeddings are weak on rare tokens like error codes, so pure-vector search drifts to topically-related docs and can miss the exact match. BM25 nails the literal token. Hybrid (RRF) fuses both rankings, so you get semantic recall *and* exact-match precision — without a second datastore. + +```python Same query, three retrieval modes icon="scale-balanced" +query = "E-4042 upload failed" + +# Semantic only — may miss the rare exact token. 
+vector = vector_store.similarity_search(query, k=1) -Infino is a single engine for SQL, BM25, vector, and hybrid retrieval, so the store exposes more than the vector slice. +# Lexical only — locks onto the exact code. +bm25 = vector_store.as_bm25_retriever(k=1).invoke(query) -### Metadata filtering +# Fused (RRF) — semantic recall + exact-match precision, one SQL call. +hybrid = vector_store.as_hybrid_retriever(k=1).invoke(query) + +for label, docs in [("vector", vector), ("bm25", bm25), ("hybrid", hybrid)]: + print(f"{label:>7}: {docs[0].page_content}") +``` + +```text + vector: The Q3 outage was caused by a misconfigured load balancer. + bm25: Error E-4042 means your upload exceeded the 2 GB file size limit. + hybrid: Error E-4042 means your upload exceeded the 2 GB file size limit. +``` -Promote the keys you want to filter on to real scalar columns at table creation; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. +`as_bm25_retriever(mode="and")` requires all query terms; `"or"` (default) matches any. + +--- + +## Metadata filtering + +Promote the keys you want to filter on to real scalar columns at table creation — those become filterable; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. 
```python Filterable metadata columns icon="filter" import pyarrow as pa store = InfinoVectorStore.from_texts( - ["a paper on optimizers", "a paper on transformers"], + [d.page_content for d in documents], embeddings, connection=connection, - table_name="papers", + table_name="support_kb_filtered", dim=1536, - metadata_columns=[ - pa.field("category", pa.large_utf8(), nullable=False), - pa.field("year", pa.int64(), nullable=False), - ], - metadatas=[{"category": "ml", "year": 2024}, {"category": "ml", "year": 2023}], + metadata_columns=[pa.field("source", pa.large_utf8(), nullable=False)], + metadatas=[d.metadata for d in documents], ) -store.similarity_search("optimizers", k=4, filter={"year": {"$gte": 2024}}) +# Only search pricing docs. +store.similarity_search("what does the paid plan include", k=2, filter={"source": "pricing"}) + +# Boolean combinations. store.similarity_search( - "optimizers", k=4, filter={"$or": [{"category": "ml"}, {"year": {"$lt": 2000}}]} + "account help", k=3, filter={"$or": [{"source": "docs"}, {"source": "kb"}]} ) ``` -### Text-pushdown pre-filter +### Text-pushdown pre-filter -For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back. It is reachable from any retriever via `search_kwargs`. +For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back — no over-fetch, no under-return. 
Reachable from any retriever via `search_kwargs`: ```python Text pushdown icon="magnifying-glass" -store.similarity_search("cancel my plan", k=10, filter_query="subscription billing") +store.similarity_search("how do I cancel", k=5, filter_query="plan pricing") ``` -### Hybrid (RRF) retrieval - -BM25 and vector search fused by reciprocal-rank fusion in a single SQL call — no separate reranking round-trip. - -```python Hybrid retriever icon="git-merge" -retriever = store.as_hybrid_retriever(k=4) -retriever.invoke("neural network training") -``` - -### BM25 retrieval - -Pure lexical ranking over the FTS-indexed text column. `mode="and"` requires all query terms; `"or"` (default) matches any. - -```python BM25 retriever icon="font" -retriever = store.as_bm25_retriever(k=4, mode="and") -retriever.invoke("gradient descent") -``` +--- -### Self-query +## Self-query -`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query to a SQL `WHERE` over the declared metadata columns — the full comparison and boolean surface, not a reduced DSL. +`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query into a SQL `WHERE` over the declared metadata columns — the full comparison and boolean surface, not a reduced DSL. See the [self-query docs](/oss/integrations/retrievers/self_query) for the `metadata_field_info` setup. 
```python Self-query icon="wand-magic-sparkles" from langchain_classic.retrievers import SelfQueryRetriever @@ -222,22 +249,24 @@ from langchain_infino import InfinoTranslator retriever = SelfQueryRetriever.from_llm( llm, store, - document_contents="research papers", + document_contents="support knowledge base articles", metadata_field_info=metadata_field_info, structured_query_translator=InfinoTranslator(), ) -retriever.invoke("ML papers since 2023") +retriever.invoke("pricing articles about the Pro plan") ``` -### SQL-native search +--- + +## SQL-native search -The escape hatch for anything the typed methods don't cover — joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions. Project the store's columns and the rows map back to `Document`s. +The escape hatch for anything the typed methods don't cover — joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions directly. Project the store's columns and the rows map back to `Document`s: ```python SQL escape hatch icon="database" -qv = ",".join(map(str, embeddings.embed_query("fox"))) -store.search_by_sql(f""" +qv = ",".join(map(str, embeddings.embed_query("upload error"))) +vector_store.search_by_sql(f""" SELECT doc_id, page_content, _metadata_json, score - FROM hybrid_search('docs', 'page_content', 'fox', 'embedding', '{qv}', 10) + FROM hybrid_search('support_kb', 'page_content', 'E-4042', 'embedding', '{qv}', 5) ORDER BY score DESC """) ``` From 14d434fc06fe33300be50a368069c1ed98a62f1f Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Thu, 25 Jun 2026 17:08:37 +0530 Subject: [PATCH 3/8] Verify Infino docs against live engine: real outputs, honest hybrid demo Ran every snippet against a real embeddings backend. Corrected the hybrid demo (pure-vector does not miss the exact code; the genuine divergence is BM25 missing a keyword-free paraphrase), replaced guessed outputs with captured ones, and noted memory:// lacks delete/update. 
--- .../integrations/vectorstores/infino.mdx | 65 +++++++++++++------ 1 file changed, 45 insertions(+), 20 deletions(-) diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx index 325e5e62e8..a6dbb3e9b7 100644 --- a/src/oss/python/integrations/vectorstores/infino.mdx +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -5,7 +5,7 @@ description: "Integrate with the Infino vector store using LangChain Python." [Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage — no separate vector database and search cluster to provision, sync, or keep consistent. -Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid) — where pure-vector search misses an exact error code and hybrid retrieval catches it. +Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid) — where lexical search alone misses a paraphrased question, and hybrid retrieval recovers it while still guaranteeing exact-identifier matches. This guide covers getting started. For the full API, see the [PyPI package page](https://pypi.org/project/langchain-infino/) and the [source repository](https://github.com/infino-ai/langchain-infino). 
@@ -49,14 +49,14 @@ The LangChain Infino integration lives in the `langchain-infino` package: ## Instantiation -A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral — handy for tests). Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. +A connection is a local path or an `s3://` URI for durable storage. (`memory://` is also available but is read-mostly — it doesn't back the delete/update path, so use durable storage for the examples below.) Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. ```python Initialize vector store icon="database" import infino from langchain_infino import InfinoVectorStore from langchain_openai import OpenAIEmbeddings -connection = infino.connect("./data") # or "s3://my-bucket/kb", or "memory://" +connection = infino.connect("./data") # local path, or "s3://my-bucket/kb" embeddings = OpenAIEmbeddings(model="text-embedding-3-small") # 1536-dim vector_store = InfinoVectorStore.from_texts( @@ -131,7 +131,7 @@ for doc in results: ```text * Reset your password from the Account → Security settings page. [{'source': 'docs'}] -* To export your data in bulk, call the bulk_export API endpoint. [{'source': 'docs'}] +* Customers on the legacy Starter tier keep grandfathered pricing. [{'source': 'pricing'}] ``` To get the corresponding scores back, use `similarity_search_with_score`: @@ -143,7 +143,7 @@ for doc, score in results: ``` ```text -* [SIM=0.412879] The Pro plan includes priority support and unlimited seats. [{'source': 'pricing'}] +* [SIM=0.583733] The Pro plan includes priority support and unlimited seats. [{'source': 'pricing'}] ``` Vector distance is *smaller is nearer*. 
For a normalized `[0, 1]` relevance (higher = better) under the `cosine`, `l2`, and `l2sq` metrics, use `similarity_search_with_relevance_scores`. @@ -174,28 +174,43 @@ retriever.invoke("how do I change my login credentials") ## Vector vs. BM25 vs. hybrid -This is where one engine over one copy of your data pays off. Consider a user searching for the exact error code **`E-4042`**. Embeddings are weak on rare tokens like error codes, so pure-vector search drifts to topically-related docs and can miss the exact match. BM25 nails the literal token. Hybrid (RRF) fuses both rankings, so you get semantic recall *and* exact-match precision — without a second datastore. +This is where one engine over one copy of your data pays off — the two retrieval modes have complementary blind spots, and hybrid (RRF) covers both in a single SQL call, with no second datastore to sync. -```python Same query, three retrieval modes icon="scale-balanced" -query = "E-4042 upload failed" +Helper to compare all three modes on a query: -# Semantic only — may miss the rare exact token. -vector = vector_store.similarity_search(query, k=1) +```python Compare retrieval modes icon="scale-balanced" +def compare(query): + modes = { + "vector": vector_store.similarity_search(query, k=1), + "bm25": vector_store.as_bm25_retriever(k=1).invoke(query), + "hybrid": vector_store.as_hybrid_retriever(k=1).invoke(query), + } + for label, docs in modes.items(): + print(f"{label:>7}: {docs[0].page_content if docs else ''}") +``` + +**A paraphrased question with no shared keywords.** Lexical search has nothing to match on, so BM25 returns nothing; semantic search still finds the answer, and hybrid recovers it: -# Lexical only — locks onto the exact code. -bm25 = vector_store.as_bm25_retriever(k=1).invoke(query) +```python Paraphrase icon="comment" +compare("how do I change my login credentials") +``` + +```text + vector: Reset your password from the Account → Security settings page. 
+ bm25: + hybrid: Reset your password from the Account → Security settings page. +``` -# Fused (RRF) — semantic recall + exact-match precision, one SQL call. -hybrid = vector_store.as_hybrid_retriever(k=1).invoke(query) +**An exact identifier.** Embeddings can blur rare tokens like error codes; BM25 *guarantees* the literal match. Here all three agree — and hybrid means you never have to choose which mode to run: -for label, docs in [("vector", vector), ("bm25", bm25), ("hybrid", hybrid)]: - print(f"{label:>7}: {docs[0].page_content}") +```python Exact identifier icon="hashtag" +compare("E-4042") ``` ```text - vector: The Q3 outage was caused by a misconfigured load balancer. - bm25: Error E-4042 means your upload exceeded the 2 GB file size limit. - hybrid: Error E-4042 means your upload exceeded the 2 GB file size limit. + vector: Error E-4042 means your upload exceeded the 5 GB file size limit. + bm25: Error E-4042 means your upload exceeded the 5 GB file size limit. + hybrid: Error E-4042 means your upload exceeded the 5 GB file size limit. ``` `as_bm25_retriever(mode="and")` requires all query terms; `"or"` (default) matches any. @@ -264,11 +279,21 @@ The escape hatch for anything the typed methods don't cover — joins, custom `W ```python SQL escape hatch icon="database" qv = ",".join(map(str, embeddings.embed_query("upload error"))) -vector_store.search_by_sql(f""" +results = vector_store.search_by_sql(f""" SELECT doc_id, page_content, _metadata_json, score FROM hybrid_search('support_kb', 'page_content', 'E-4042', 'embedding', '{qv}', 5) ORDER BY score DESC """) +for doc in results: + print(f"* {doc.page_content}") +``` + +```text +* Error E-4042 means your upload exceeded the 5 GB file size limit. +* To export your data in bulk, call the bulk_export API endpoint. +* Reset your password from the Account → Security settings page. +* The Pro plan includes priority support and unlimited seats. +* Customers on the legacy Starter tier keep grandfathered pricing. 
``` --- From 33763f238388fed1958925bf042970c4ec41568b Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Fri, 26 Jun 2026 11:13:25 +0530 Subject: [PATCH 4/8] Mark langchain-infino as JS-available in package registry --- packages.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/packages.yml b/packages.yml index 24ba950bfe..14ac7588f0 100644 --- a/packages.yml +++ b/packages.yml @@ -799,6 +799,6 @@ packages: downloads_updated_at: '2026-06-15T00:31:43.789982+00:00' - name: langchain-infino repo: infino-ai/langchain-infino - js: "n/a" + js: "@infino-ai/langchain-infino" downloads: 0 downloads_updated_at: "2026-06-15T00:31:43.789982+00:00" From 3f6aedcf0aca20b0380b9e09cfb883ca10193570 Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Fri, 26 Jun 2026 11:20:52 +0530 Subject: [PATCH 5/8] Fix em-dash spacing to satisfy Vale DashesSpaces rule --- src/oss/python/integrations/providers/infino.mdx | 2 +- src/oss/python/integrations/vectorstores/infino.mdx | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/oss/python/integrations/providers/infino.mdx b/src/oss/python/integrations/providers/infino.mdx index 5029352db3..7ccd98e01a 100644 --- a/src/oss/python/integrations/providers/infino.mdx +++ b/src/oss/python/integrations/providers/infino.mdx @@ -10,7 +10,7 @@ description: "Integrate with Infino using LangChain Python." The `langchain-infino` package surfaces that whole retrieval surface, not just `similarity_search`: semantic search *and* exact-keyword BM25 *and* their -fusion, from a single in-process engine. Infino never embeds — you bring a +fusion, from a single in-process engine. Infino never embeds—you bring a LangChain `Embeddings` object and the integration supplies the vectors. 
## Installation and setup diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx index a6dbb3e9b7..8d26e97a39 100644 --- a/src/oss/python/integrations/vectorstores/infino.mdx +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -3,15 +3,15 @@ title: "Infino integration" description: "Integrate with the Infino vector store using LangChain Python." --- -[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage — no separate vector database and search cluster to provision, sync, or keep consistent. +[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage—no separate vector database and search cluster to provision, sync, or keep consistent. -Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid) — where lexical search alone misses a paraphrased question, and hybrid retrieval recovers it while still guaranteeing exact-identifier matches. +Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid)—where lexical search alone misses a paraphrased question, and hybrid retrieval recovers it while still guaranteeing exact-identifier matches. This guide covers getting started. 
For the full API, see the [PyPI package page](https://pypi.org/project/langchain-infino/) and the [source repository](https://github.com/infino-ai/langchain-infino). ## Setup -Infino runs **in-process** — there is no server to deploy and no engine credentials. You only need an embeddings provider: Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. +Infino runs **in-process**—there is no server to deploy and no engine credentials. You only need an embeddings provider: Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. ### Credentials @@ -49,7 +49,7 @@ The LangChain Infino integration lives in the `langchain-infino` package: ## Instantiation -A connection is a local path or an `s3://` URI for durable storage. (`memory://` is also available but is read-mostly — it doesn't back the delete/update path, so use durable storage for the examples below.) Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. +A connection is a local path or an `s3://` URI for durable storage. (`memory://` is also available but is read-mostly—it doesn't back the delete/update path, so use durable storage for the examples below.) Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. 
```python Initialize vector store icon="database" import infino From 9f90b8047e2bf0b2a08c8bd580fffc435a78a09c Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Fri, 26 Jun 2026 11:22:46 +0530 Subject: [PATCH 6/8] Add Infino to vector store integrations index --- .../integrations/vectorstores/index.mdx | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/src/oss/python/integrations/vectorstores/index.mdx b/src/oss/python/integrations/vectorstores/index.mdx index 98e4d25d23..2eb4bc33d3 100644 --- a/src/oss/python/integrations/vectorstores/index.mdx +++ b/src/oss/python/integrations/vectorstores/index.mdx @@ -578,6 +578,32 @@ vector_store = ElasticsearchStore( ) ``` + + + + +```bash pip +pip install -qU langchain-infino +``` + +```bash uv +uv add langchain-infino +``` + +```python +import infino +from langchain_infino import InfinoVectorStore + +connection = infino.connect("./data") # local path, or "s3://my-bucket/kb" + +vector_store = InfinoVectorStore.from_texts( + [], + embeddings, + connection=connection, + table_name="support_kb", + dim=1536, +) +``` From c5c6c474083602455c8e02767d320fdc9321ad03 Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Fri, 26 Jun 2026 11:31:05 +0530 Subject: [PATCH 7/8] Strip spaces around remaining em-dashes for Vale --- src/oss/python/integrations/providers/infino.mdx | 6 +++--- .../python/integrations/vectorstores/infino.mdx | 16 ++++++++-------- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/src/oss/python/integrations/providers/infino.mdx b/src/oss/python/integrations/providers/infino.mdx index 7ccd98e01a..99626dddcd 100644 --- a/src/oss/python/integrations/providers/infino.mdx +++ b/src/oss/python/integrations/providers/infino.mdx @@ -5,7 +5,7 @@ description: "Integrate with Infino using LangChain Python." 
>[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs >SQL, full-text (BM25), vector, and hybrid (RRF) search over **one copy** of ->your data in Apache Parquet on object storage — no separate vector database +>your data in Apache Parquet on object storage—no separate vector database >and search cluster to provision, sync, or keep consistent. The `langchain-infino` package surfaces that whole retrieval surface, not just @@ -25,7 +25,7 @@ uv add langchain-infino ``` -Infino runs in-process — there are no credentials or API keys. A connection is +Infino runs in-process—there are no credentials or API keys. A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral): ```python @@ -36,7 +36,7 @@ connection = infino.connect("./data") ## Vector store -`InfinoVectorStore` wraps a single Infino table — the text, its embedding, the +`InfinoVectorStore` wraps a single Infino table—the text, its embedding, the document id, declared metadata columns, and a JSON catch-all. Vector, filtered, MMR, and hybrid retrieval all run over that one table. diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx index 8d26e97a39..870e9d27c7 100644 --- a/src/oss/python/integrations/vectorstores/infino.mdx +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -74,7 +74,7 @@ vector_store = InfinoVectorStore.from_texts( ### Add items -We'll use a small support knowledge base — a realistic mix of natural-language prose and exact identifiers (error codes, plan names, API endpoints). The same corpus drives every example below. `add_documents` returns the ids; caller-supplied ids are upserted (re-adding an id overwrites it), omitted ids are generated. +We'll use a small support knowledge base—a realistic mix of natural-language prose and exact identifiers (error codes, plan names, API endpoints). The same corpus drives every example below. 
`add_documents` returns the ids; caller-supplied ids are upserted (re-adding an id overwrites it), omitted ids are generated. ```python Add documents icon="folder-plus" from langchain_core.documents import Document @@ -121,7 +121,7 @@ vector_store.delete(ids=["doc-2"]) ### Directly -A similarity search encodes your query into an embedding and returns the nearest documents. Note there's no shared keyword between the query and the answer — semantic search bridges "login credentials" → "password": +A similarity search encodes your query into an embedding and returns the nearest documents. Note there's no shared keyword between the query and the answer—semantic search bridges "login credentials" → "password": ```python Similarity search icon="folders" results = vector_store.similarity_search("how do I change my login credentials", k=2) @@ -174,7 +174,7 @@ retriever.invoke("how do I change my login credentials") ## Vector vs. BM25 vs. hybrid -This is where one engine over one copy of your data pays off — the two retrieval modes have complementary blind spots, and hybrid (RRF) covers both in a single SQL call, with no second datastore to sync. +This is where one engine over one copy of your data pays off—the two retrieval modes have complementary blind spots, and hybrid (RRF) covers both in a single SQL call, with no second datastore to sync. Helper to compare all three modes on a query: @@ -201,7 +201,7 @@ compare("how do I change my login credentials") hybrid: Reset your password from the Account → Security settings page. ``` -**An exact identifier.** Embeddings can blur rare tokens like error codes; BM25 *guarantees* the literal match. Here all three agree — and hybrid means you never have to choose which mode to run: +**An exact identifier.** Embeddings can blur rare tokens like error codes; BM25 *guarantees* the literal match. 
Here all three agree—and hybrid means you never have to choose which mode to run: ```python Exact identifier icon="hashtag" compare("E-4042") @@ -219,7 +219,7 @@ compare("E-4042") ## Metadata filtering -Promote the keys you want to filter on to real scalar columns at table creation — those become filterable; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. +Promote the keys you want to filter on to real scalar columns at table creation—those become filterable; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. ```python Filterable metadata columns icon="filter" import pyarrow as pa @@ -245,7 +245,7 @@ store.similarity_search( ### Text-pushdown pre-filter -For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back — no over-fetch, no under-return. Reachable from any retriever via `search_kwargs`: +For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back—no over-fetch, no under-return. 
Reachable from any retriever via `search_kwargs`: ```python Text pushdown icon="magnifying-glass" store.similarity_search("how do I cancel", k=5, filter_query="plan pricing") @@ -255,7 +255,7 @@ store.similarity_search("how do I cancel", k=5, filter_query="plan pricing") ## Self-query -`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query into a SQL `WHERE` over the declared metadata columns — the full comparison and boolean surface, not a reduced DSL. See the [self-query docs](/oss/integrations/retrievers/self_query) for the `metadata_field_info` setup. +`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query into a SQL `WHERE` over the declared metadata columns—the full comparison and boolean surface, not a reduced DSL. See the [self-query docs](/oss/integrations/retrievers/self_query) for the `metadata_field_info` setup. ```python Self-query icon="wand-magic-sparkles" from langchain_classic.retrievers import SelfQueryRetriever @@ -275,7 +275,7 @@ retriever.invoke("pricing articles about the Pro plan") ## SQL-native search -The escape hatch for anything the typed methods don't cover — joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions directly. Project the store's columns and the rows map back to `Document`s: +The escape hatch for anything the typed methods don't cover—joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions directly. 
Project the store's columns and the rows map back to `Document`s: ```python SQL escape hatch icon="database" qv = ",".join(map(str, embeddings.embed_query("upload error"))) From e63b5ac863eefe4c18e2a69faa211d346cdbb71b Mon Sep 17 00:00:00 2001 From: Pratyush Lokhande Date: Fri, 26 Jun 2026 11:42:03 +0530 Subject: [PATCH 8/8] Replace em-dashes with commas, colons, and semicolons --- .../python/integrations/providers/infino.mdx | 8 +++---- .../integrations/vectorstores/infino.mdx | 24 +++++++++---------- 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/src/oss/python/integrations/providers/infino.mdx b/src/oss/python/integrations/providers/infino.mdx index 99626dddcd..5ed41e4c5c 100644 --- a/src/oss/python/integrations/providers/infino.mdx +++ b/src/oss/python/integrations/providers/infino.mdx @@ -5,12 +5,12 @@ description: "Integrate with Infino using LangChain Python." >[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs >SQL, full-text (BM25), vector, and hybrid (RRF) search over **one copy** of ->your data in Apache Parquet on object storage—no separate vector database +>your data in Apache Parquet on object storage, with no separate vector database >and search cluster to provision, sync, or keep consistent. The `langchain-infino` package surfaces that whole retrieval surface, not just `similarity_search`: semantic search *and* exact-keyword BM25 *and* their -fusion, from a single in-process engine. Infino never embeds—you bring a +fusion, from a single in-process engine. Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. ## Installation and setup @@ -25,7 +25,7 @@ uv add langchain-infino ``` -Infino runs in-process—there are no credentials or API keys. A connection is +Infino runs in-process, so there are no credentials or API keys. 
A connection is a local path or an `s3://` URI for durable storage (`memory://` is ephemeral): ```python @@ -36,7 +36,7 @@ connection = infino.connect("./data") ## Vector store -`InfinoVectorStore` wraps a single Infino table—the text, its embedding, the +`InfinoVectorStore` wraps a single Infino table: the text, its embedding, the document id, declared metadata columns, and a JSON catch-all. Vector, filtered, MMR, and hybrid retrieval all run over that one table. diff --git a/src/oss/python/integrations/vectorstores/infino.mdx b/src/oss/python/integrations/vectorstores/infino.mdx index 870e9d27c7..90e325252d 100644 --- a/src/oss/python/integrations/vectorstores/infino.mdx +++ b/src/oss/python/integrations/vectorstores/infino.mdx @@ -3,15 +3,15 @@ title: "Infino integration" description: "Integrate with the Infino vector store using LangChain Python." --- -[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage—no separate vector database and search cluster to provision, sync, or keep consistent. +[Infino](https://github.com/infino-ai/infino) is a retrieval engine that runs vector, full-text (BM25), hybrid (RRF), and SQL search over **one copy** of your data in Apache Parquet on object storage, with no separate vector database and search cluster to provision, sync, or keep consistent. -Most "vector database" integrations expose only the vector slice of their engine. `InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid)—where lexical search alone misses a paraphrased question, and hybrid retrieval recovers it while still guaranteeing exact-identifier matches. +Most "vector database" integrations expose only the vector slice of their engine. 
`InfinoVectorStore` surfaces the whole surface: semantic search *and* exact-keyword BM25 *and* their fusion, from a single in-process engine. The payoff shows up below in [Vector vs. BM25 vs. hybrid](#vector-vs-bm25-vs-hybrid), where lexical search alone misses a paraphrased question, and hybrid retrieval recovers it while still guaranteeing exact-identifier matches. This guide covers getting started. For the full API, see the [PyPI package page](https://pypi.org/project/langchain-infino/) and the [source repository](https://github.com/infino-ai/langchain-infino). ## Setup -Infino runs **in-process**—there is no server to deploy and no engine credentials. You only need an embeddings provider: Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. +Infino runs **in-process**, so there is no server to deploy and no engine credentials. You only need an embeddings provider: Infino never embeds; you bring a LangChain `Embeddings` object and the integration supplies the vectors. ### Credentials @@ -49,7 +49,7 @@ The LangChain Infino integration lives in the `langchain-infino` package: ## Instantiation -A connection is a local path or an `s3://` URI for durable storage. (`memory://` is also available but is read-mostly—it doesn't back the delete/update path, so use durable storage for the examples below.) Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`. +A connection is a local path or an `s3://` URI for durable storage. (`memory://` is also available but is read-mostly: it does not back the delete/update path, so use durable storage for the examples below.) Use `from_texts` to create and populate a table; pass `[]` to start empty. The embedding dimension must match the table's declared `dim` and lie in the engine's supported range `[16, 4096]`.
```python Initialize vector store icon="database" import infino @@ -74,7 +74,7 @@ vector_store = InfinoVectorStore.from_texts( ### Add items -We'll use a small support knowledge base—a realistic mix of natural-language prose and exact identifiers (error codes, plan names, API endpoints). The same corpus drives every example below. `add_documents` returns the ids; caller-supplied ids are upserted (re-adding an id overwrites it), omitted ids are generated. +We'll use a small support knowledge base: a realistic mix of natural-language prose and exact identifiers (error codes, plan names, API endpoints). The same corpus drives every example below. `add_documents` returns the ids; caller-supplied ids are upserted (re-adding an id overwrites it), omitted ids are generated. ```python Add documents icon="folder-plus" from langchain_core.documents import Document @@ -121,7 +121,7 @@ vector_store.delete(ids=["doc-2"]) ### Directly -A similarity search encodes your query into an embedding and returns the nearest documents. Note there's no shared keyword between the query and the answer—semantic search bridges "login credentials" → "password": +A similarity search encodes your query into an embedding and returns the nearest documents. Note there's no shared keyword between the query and the answer, yet semantic search bridges "login credentials" → "password": ```python Similarity search icon="folders" results = vector_store.similarity_search("how do I change my login credentials", k=2) @@ -174,7 +174,7 @@ retriever.invoke("how do I change my login credentials") ## Vector vs. BM25 vs. hybrid -This is where one engine over one copy of your data pays off—the two retrieval modes have complementary blind spots, and hybrid (RRF) covers both in a single SQL call, with no second datastore to sync. 
+This is where one engine over one copy of your data pays off: the two retrieval modes have complementary blind spots, and hybrid (RRF) covers both in a single SQL call, with no second datastore to sync. Helper to compare all three modes on a query: @@ -201,7 +201,7 @@ compare("how do I change my login credentials") hybrid: Reset your password from the Account → Security settings page. ``` -**An exact identifier.** Embeddings can blur rare tokens like error codes; BM25 *guarantees* the literal match. Here all three agree—and hybrid means you never have to choose which mode to run: +**An exact identifier.** Embeddings can blur rare tokens like error codes; BM25 *guarantees* the literal match. Here all three agree, and hybrid means you never have to choose which mode to run: ```python Exact identifier icon="hashtag" compare("E-4042") @@ -219,7 +219,7 @@ compare("E-4042") ## Metadata filtering -Promote the keys you want to filter on to real scalar columns at table creation—those become filterable; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. +Promote the keys you want to filter on to real scalar columns at table creation: those become filterable; everything else round-trips losslessly through a JSON catch-all but isn't filterable. Filtering supports equality, `$eq` / `$ne` / `$gt` / `$gte` / `$lt` / `$lte`, `$in` / `$nin`, and `$and` / `$or` / `$not`. ```python Filterable metadata columns icon="filter" import pyarrow as pa @@ -245,7 +245,7 @@ store.similarity_search( ### Text-pushdown pre-filter -For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back—no over-fetch, no under-return. 
Reachable from any retriever via `search_kwargs`: +For a *text* predicate, push it into the kNN instead of post-filtering the top-k. The engine prunes to rows matching the full-text terms **before** ranking, so exactly `k` nearest *matching* rows come back, with no over-fetch and no under-return. Reachable from any retriever via `search_kwargs`: ```python Text pushdown icon="magnifying-glass" store.similarity_search("how do I cancel", k=5, filter_query="plan pricing") @@ -255,7 +255,7 @@ store.similarity_search("how do I cancel", k=5, filter_query="plan pricing") ## Self-query -`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query into a SQL `WHERE` over the declared metadata columns—the full comparison and boolean surface, not a reduced DSL. See the [self-query docs](/oss/integrations/retrievers/self_query) for the `metadata_field_info` setup. +`InfinoTranslator` plugs into LangChain's `SelfQueryRetriever`, lowering an LLM's structured query into a SQL `WHERE` over the declared metadata columns: the full comparison and boolean surface, not a reduced DSL. See the [self-query docs](/oss/integrations/retrievers/self_query) for the `metadata_field_info` setup. ```python Self-query icon="wand-magic-sparkles" from langchain_classic.retrievers import SelfQueryRetriever @@ -275,7 +275,7 @@ retriever.invoke("pricing articles about the Pro plan") ## SQL-native search -The escape hatch for anything the typed methods don't cover—joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions directly. Project the store's columns and the rows map back to `Document`s: +The escape hatch for anything the typed methods do not cover: joins, custom `WHERE`, or the `vector_search` / `hybrid_search` table functions directly. Project the store's columns and the rows map back to `Document`s: ```python SQL escape hatch icon="database" qv = ",".join(map(str, embeddings.embed_query("upload error")))