08. Self Query Retriever

SelfQueryRetriever is a retriever that can construct and run structured queries on its own.

Given a natural-language query from the user, it uses a query-constructing LLM chain to create a structured query. That structured query is then applied to the underlying vector store (VectorStore) to perform the search.

Through this process, SelfQueryRetriever goes beyond simply comparing the user's query with the content of the stored documents: it extracts a filter over the documents' metadata from the user's query and runs that filter to find relevant documents.
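
The two-step idea above can be sketched without any library. This is a toy model, not LangChain's API: the StructuredQuery dataclass, the run helper, and the sample documents are hypothetical, and the structured query is written by hand where the real retriever would ask an LLM to produce it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StructuredQuery:
    """Hand-written stand-in for what the LLM chain would produce."""
    query: str                      # text to match against document content
    filter: Callable[[dict], bool]  # condition on document metadata


# Hypothetical documents: (content, metadata) pairs.
DOCS = [
    ("hydrating serum with hyaluronic acid", {"year": 2024, "user_rating": 4.7}),
    ("matte lipstick with long-lasting color", {"year": 2023, "user_rating": 4.4}),
    ("hydrating cream for dry skin", {"year": 2022, "user_rating": 4.8}),
]


def run(sq: StructuredQuery) -> list[str]:
    """Apply the metadata filter first, then rank the survivors by word overlap."""
    words = set(sq.query.lower().split())
    kept = [(text, meta) for text, meta in DOCS if sq.filter(meta)]
    kept.sort(key=lambda d: len(words & set(d[0].split())), reverse=True)
    return [text for text, _ in kept]


# "hydrating products released in 2023 or later" splits into a content
# query plus a metadata condition.
sq = StructuredQuery(query="hydrating", filter=lambda m: m["year"] >= 2023)
results = run(sq)
print(results)
```

The 2022 cream mentions "hydrating" but is excluded by the metadata condition, which is the part a plain content comparison would miss.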

[Note]

  • For the list of self-query retrievers that LangChain supports, check the LangChain documentation.

# Configuration for managing API keys as environment variables.
from dotenv import load_dotenv

# Load the API key information.
load_dotenv()

True 

# Set up LangSmith tracing. https://smith.langchain.com
# !pip install langchain-teddynote
from langchain_teddynote import logging

# Enter a project name.
logging.langsmith("CH11-Retriever")

Sample data generation

We build a vector store that supports similarity search, based on the descriptions and metadata of cosmetic products.
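
As a rough illustration of the shape of such data, the sketch below stores a description plus metadata per product and searches it by similarity. Everything here is an assumption made for illustration: the products are invented, and the bag-of-words embed function stands in for a real embedding model and vector store.

```python
import math
from collections import Counter

# Hypothetical sample data: a description plus metadata per product.
docs = [
    {"text": "hydrating serum that delivers moisture deep into the skin",
     "metadata": {"year": 2024, "category": "skincare", "user_rating": 4.7}},
    {"text": "matte lipstick with vivid long lasting color",
     "metadata": {"year": 2023, "category": "makeup", "user_rating": 4.5}},
    {"text": "gentle cleansing oil that removes makeup and impurities",
     "metadata": {"year": 2023, "category": "cleansing", "user_rating": 4.8}},
]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Index" the documents once, as a vector store would.
index = [(embed(d["text"]), d) for d in docs]


def similarity_search(query: str, k: int = 2) -> list[dict]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [d for _, d in ranked[:k]]


top = similarity_search("serum for skin moisture", k=1)
print(top[0]["text"])
```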

SelfQueryRetriever

You can now instantiate the retriever. To do so, you must provide in advance the metadata fields that the documents support and a brief description of the document contents.

The AttributeInfo class is used to define information about the cosmetics metadata fields.

  • Category ( category ): String type; the category of the cosmetic product, one of ['skincare', 'makeup', 'cleansing', 'sun care'].

  • Year ( year ): Integer type; the year the cosmetic product was released.

  • User rating ( user_rating ): Float type; the user rating, in the range 1-5.
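
The three fields above can be modeled as plain data. The sketch below is not the LangChain AttributeInfo class: FieldInfo and check_metadata are hypothetical names, used only to show that each field carries a name, a description, and a type, and that a document's metadata can be checked against that declaration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical stand-in for a metadata field description."""
    name: str
    description: str
    type: str  # "string", "integer", or "float"


FIELDS = [
    FieldInfo("category", "Category of the cosmetic product", "string"),
    FieldInfo("year", "Year the cosmetic product was released", "integer"),
    FieldInfo("user_rating", "User rating in the range 1-5", "float"),
]

# Python types accepted for each declared type name.
PY_TYPES = {"string": str, "integer": int, "float": (int, float)}


def check_metadata(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata fits."""
    problems = []
    for f in FIELDS:
        if f.name not in metadata:
            problems.append(f"missing field: {f.name}")
        elif not isinstance(metadata[f.name], PY_TYPES[f.type]):
            problems.append(f"{f.name} should be {f.type}")
    return problems


ok = check_metadata({"category": "skincare", "year": 2024, "user_rating": 4.7})
bad = check_metadata({"category": "makeup", "year": "2023"})
print(ok)
print(bad)
```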

Create the retriever object using the SelfQueryRetriever.from_llm() method.

  • llm : Language model

  • vectorstore : Vector store

  • document_contents : Description of the contents of the documents

  • metadata_field_info : Metadata field information
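
To make the role of each of the four arguments concrete, here is a miniature of the wiring. It is a sketch under stated assumptions, not the LangChain class: ToySelfQueryRetriever, from_parts, and fake_llm are hypothetical, the "model" returns a canned JSON reply, and the store is a plain list.

```python
import json
from typing import Callable


class ToySelfQueryRetriever:
    """Hypothetical miniature of the wiring; not the LangChain class."""

    def __init__(self, llm: Callable[[str], str], store: list, prompt_header: str):
        self.llm = llm
        self.store = store
        self.prompt_header = prompt_header

    @classmethod
    def from_parts(cls, llm, store, document_contents: str, fields: dict):
        # The content description and the field list become part of the prompt.
        lines = [f"Documents: {document_contents}", "Filterable fields:"]
        lines += [f"- {name}: {desc}" for name, desc in fields.items()]
        return cls(llm, store, "\n".join(lines))

    def invoke(self, question: str) -> list:
        # The model is expected to answer with JSON: {"field": ..., "equals": ...}.
        reply = json.loads(self.llm(f"{self.prompt_header}\nQuestion: {question}"))
        return [d for d in self.store
                if d["metadata"].get(reply["field"]) == reply["equals"]]


def fake_llm(prompt: str) -> str:
    """Canned reply standing in for a real language model call."""
    return json.dumps({"field": "year", "equals": 2023})


store = [
    {"text": "matte lipstick", "metadata": {"year": 2023}},
    {"text": "hydrating serum", "metadata": {"year": 2024}},
]
retriever = ToySelfQueryRetriever.from_parts(
    llm=fake_llm,
    store=store,
    document_contents="Brief description of a cosmetic product",
    fields={"year": "Year the product was released"},
)
hits = retriever.invoke("Which products were released in 2023?")
print([d["text"] for d in hits])
```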

Query test

Run a search by entering a query that applies a filter.
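
The sketch below shows what a single-condition filter amounts to once the query has been structured. The product list, the comparator names, and the search helper are all hypothetical. In the real retriever the (attribute, comparator, value) triple is produced by the LLM rather than typed in.

```python
import operator

# Comparator names are illustrative; real query languages define their own set.
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

products = [
    {"name": "serum", "category": "skincare", "year": 2024, "user_rating": 4.7},
    {"name": "lipstick", "category": "makeup", "year": 2023, "user_rating": 4.5},
    {"name": "cleansing oil", "category": "cleansing", "year": 2023, "user_rating": 4.8},
]


def search(attribute: str, comparator: str, value) -> list[str]:
    """Return the names of products whose metadata satisfies one comparison."""
    test = COMPARATORS[comparator]
    return [p["name"] for p in products if test(p[attribute], value)]


# "products rated 4.8 or higher"
high_rated = search("user_rating", "gte", 4.8)
# "products released in 2023"
from_2023 = search("year", "eq", 2023)
# "products in the skincare category"
skincare = search("category", "eq", "skincare")
print(high_rated, from_2023, skincare)
```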

You can perform a search using complex filters.
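
A complex filter can be pictured as a small tree whose leaves are comparisons and whose inner nodes combine them with and/or. The sketch below evaluates such a tree against metadata. The dictionary layout, the matches function, and the products are hypothetical and only illustrate the idea.

```python
import operator

OPS = {"eq": operator.eq, "gte": operator.ge, "lte": operator.le}


def matches(node: dict, metadata: dict) -> bool:
    """Evaluate a filter tree: leaves are comparisons, inner nodes are and/or."""
    if "and" in node:
        return all(matches(child, metadata) for child in node["and"])
    if "or" in node:
        return any(matches(child, metadata) for child in node["or"])
    return OPS[node["op"]](metadata[node["attr"]], node["value"])


products = [
    {"name": "lipstick", "category": "makeup", "year": 2023, "user_rating": 4.5},
    {"name": "foundation", "category": "makeup", "year": 2022, "user_rating": 4.2},
    {"name": "serum", "category": "skincare", "year": 2024, "user_rating": 4.7},
]


def search(flt: dict) -> list[str]:
    """Return the names of all products that satisfy the filter tree."""
    return [p["name"] for p in products if matches(flt, p)]


# "makeup products rated 4.5 or higher"
makeup_top = search({"and": [
    {"attr": "category", "op": "eq", "value": "makeup"},
    {"attr": "user_rating", "op": "gte", "value": 4.5},
]})

# "skincare products, or makeup released in 2022 or earlier"
either = search({"or": [
    {"attr": "category", "op": "eq", "value": "skincare"},
    {"and": [
        {"attr": "category", "op": "eq", "value": "makeup"},
        {"attr": "year", "op": "lte", "value": 2022},
    ]},
]})
print(makeup_top, either)
```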

k is the number of documents to retrieve.

You can also specify k when using SelfQueryRetriever. To do so, pass enable_limit=True to the constructor.
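
The effect of a limit is simply to cap the number of results after filtering. A minimal sketch, assuming a hypothetical product list and a hypothetical retrieve helper (the k parameter here only mirrors the idea and is not the library's argument handling):

```python
from typing import Optional

products = [
    {"name": "lipstick", "year": 2023},
    {"name": "cleansing oil", "year": 2023},
    {"name": "sunscreen", "year": 2023},
    {"name": "serum", "year": 2024},
]


def retrieve(year: int, k: Optional[int] = None) -> list[str]:
    """Filter by year, then keep at most k results (all results if k is None)."""
    hits = [p["name"] for p in products if p["year"] == year]
    return hits if k is None else hits[:k]


all_2023 = retrieve(2023)       # every product released in 2023
top_two = retrieve(2023, k=2)   # the same search, capped at two results
print(all_2023)
print(top_two)
```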

There are three products released in 2023, but because we set "k" to 2, only two are returned.

However, instead of specifying search_kwargs explicitly in code, you can limit the search results by including a count in the query itself, such as "1 product" or "2 products".
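
In the real retriever it is the LLM that reads the count out of the question. The sketch below replaces that step with a regular expression purely for illustration: extract_limit, the number-word table, and the fixed result list are hypothetical, and a rule like this covers far fewer phrasings than a language model would.

```python
import re
from typing import Optional

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}


def extract_limit(query: str) -> Optional[int]:
    """Rule-based stand-in for the LLM: find a count like '2 products'."""
    m = re.search(r"\b(\d+|one|two|three)\s+(?:products?|items?)\b", query.lower())
    if not m:
        return None
    token = m.group(1)
    return int(token) if token.isdigit() else NUMBER_WORDS[token]


# Hypothetical results of a "released in 2023" filter.
products_2023 = ["lipstick", "cleansing oil", "sunscreen"]


def retrieve(query: str) -> list[str]:
    """Cap the results at the count found in the query, if there is one."""
    limit = extract_limit(query)
    return products_2023 if limit is None else products_2023[:limit]


print(retrieve("Recommend 2 products released in 2023"))
print(retrieve("Recommend one product released in 2023"))
print(retrieve("Recommend products released in 2023"))
```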

Going deeper

To see what happens under the hood and to gain more custom control, we can reconstruct the retriever from scratch.

This process starts by creating a query-construction chain.

Creating the query_constructor chain

Create the query_constructor chain, which generates structured queries. Use the get_query_constructor_prompt function to obtain the query constructor prompt.
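
Conceptually, a query constructor has two ends: a prompt that tells the model which attributes it may filter on, and a parser that turns the model's reply into a structured query object. The sketch below shows both ends with hypothetical names (build_prompt, parse_output, ToyStructuredQuery) and a hand-written reply in place of a model call. It is not what get_query_constructor_prompt returns, only an illustration of the idea.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyStructuredQuery:
    """Hypothetical container for the parsed model reply."""
    query: str
    filter: Optional[dict]


def build_prompt(document_contents: str, fields: list, question: str) -> str:
    """Assemble a prompt that tells the model what it may filter on."""
    lines = [
        "Turn the question into JSON with keys 'query' and 'filter'.",
        f"Document contents: {document_contents}",
        "Attributes:",
    ]
    lines += [f"- {f['name']} ({f['type']}): {f['description']}" for f in fields]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def parse_output(raw: str) -> ToyStructuredQuery:
    """Parse the model's JSON reply into a structured query object."""
    data = json.loads(raw)
    return ToyStructuredQuery(query=data["query"], filter=data.get("filter"))


fields = [
    {"name": "user_rating", "type": "float",
     "description": "User rating in the range 1-5"},
]
prompt = build_prompt(
    "Brief description of a cosmetic product",
    fields,
    "Recommend products rated 4.8 or higher",
)
# A reply a model might plausibly give; written by hand here.
raw_reply = '{"query": "products", "filter": {"attr": "user_rating", "op": "gte", "value": 4.8}}'
structured = parse_output(raw_reply)
print(prompt)
print(structured)
```

Tuning the prompt text, its examples, and the attribute descriptions changes what the model sends back, which is why the parser is kept separate from the prompt.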

Call the query_constructor.invoke() method to process a given query.

Let's check the generated query.

A key element of the self-query retriever is the query constructor. To build a great retrieval system, you need to make sure the query constructor works well.

To do this, you need to tune the prompt, the examples within the prompt, the attribute descriptions, and so on.

Converting structured queries with the structured query translator

The next important element is the structured query translator.

It is responsible for converting a generic StructuredQuery object into a metadata filter that fits the syntax of the vector store in use.
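
The translation step can be illustrated by rendering one generic filter tree into two different target syntaxes. Both targets are assumptions chosen for illustration: a "$"-prefixed dictionary in the style used by stores such as Chroma (the text does not name the store in use), and a SQL-like string. The tree layout and both functions are hypothetical and are not LangChain translator classes.

```python
def to_dollar_dict(node: dict) -> dict:
    """Convert a generic filter tree into a '$'-prefixed dictionary filter."""
    for logical in ("and", "or"):
        if logical in node:
            return {f"${logical}": [to_dollar_dict(child) for child in node[logical]]}
    return {node["attr"]: {f"${node['op']}": node["value"]}}


SQL_OPS = {"eq": "=", "gte": ">=", "lte": "<="}


def to_sql_like(node: dict) -> str:
    """Convert the same tree into a SQL-like condition string."""
    for logical in ("and", "or"):
        if logical in node:
            joiner = f" {logical.upper()} "
            return "(" + joiner.join(to_sql_like(child) for child in node[logical]) + ")"
    return f"{node['attr']} {SQL_OPS[node['op']]} {node['value']!r}"


# One generic filter: "makeup products rated 4.5 or higher".
generic = {"and": [
    {"attr": "category", "op": "eq", "value": "makeup"},
    {"attr": "user_rating", "op": "gte", "value": 4.5},
]}

dict_filter = to_dollar_dict(generic)
sql_filter = to_sql_like(generic)
print(dict_filter)
print(sql_filter)
```

Keeping the structured query generic and translating it last means the query constructor does not need to know which store sits behind the retriever.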

Use the retriever.invoke() method to retrieve documents relevant to a given question.
