10. Korean morphological analyzers (Kiwi, Kkma, Okt) + BM25 retriever
Define a helper function that prints the retrieved documents in a readable form.
def pretty_print(docs):
    # Print each document with its rank and, when present, its similarity score.
    for i, doc in enumerate(docs):
        if "score" in doc.metadata:
            print(f"[{i+1}] {doc.page_content} ({doc.metadata['score']:.4f})")
        else:
            print(f"[{i+1}] {doc.page_content}")

BM25Retriever with the Kiwi tokenizer
# Install required libraries
# !pip install -qU kiwipiepy konlpy langchain-teddynote

# Standard BM25Retriever, used as a baseline for comparison
from langchain_community.retrievers import BM25Retriever
# BM25Retriever using a custom-implemented Korean morphological analyzer (Kiwi)
from langchain_teddynote.retrievers import KiwiBM25Retriever

sample_texts = [
    "Financial insurance is a financial product designed for long-term asset management and risk management.",
    "Financial savings product insurance is a special financial product that has a long-term savings purpose as well as a livestock product provision function.",
    "Financial savings product insurance is a special financial product that has a long-term savings purpose as well as a livestock product provision function.",
    "Financial group bombing insurance is a product that focuses on risk management rather than savings. It is suitable for customers who are willing to take high risks.",
]
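BM25 matches a query against documents token by token, so the tokenizer decides what counts as a match. The sketch below illustrates this with a small pure-Python BM25 scorer that accepts any tokenizer function. It is not the implementation used by BM25Retriever or KiwiBM25Retriever: the scoring uses the Lucene-style IDF variant, `toy_particle_tokenize` is a crude hypothetical stand-in for a real morphological analyzer such as Kiwi, and the two Korean sentences are made-up examples.

```python
import math
from collections import Counter
from typing import Callable, List

Tokenizer = Callable[[str], List[str]]


def whitespace_tokenize(text: str) -> List[str]:
    # Baseline: split on whitespace only, so particles stay glued to nouns.
    return text.split()


def toy_particle_tokenize(text: str) -> List[str]:
    # Hypothetical stand-in for a morphological analyzer: split off a single
    # trailing particle. A real analyzer (Kiwi, Kkma, Okt) does far more.
    particles = ("은", "는", "이", "가", "을", "를")
    tokens: List[str] = []
    for word in text.split():
        word = word.strip(".,")
        if len(word) > 1 and word.endswith(particles):
            tokens.extend([word[:-1], word[-1]])
        elif word:
            tokens.append(word)
    return tokens


def bm25_scores(query: str, docs: List[str], tokenize: Tokenizer,
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    # Score every document against the query (Lucene-style BM25 variant).
    tokenized_docs = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized_docs)
    avgdl = sum(len(toks) for toks in tokenized_docs) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents each term appears.
    df: Counter = Counter()
    for toks in tokenized_docs:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized_docs:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Made-up example sentences: "금융보험" appears with the particle "은" attached.
demo_docs = ["금융보험은 장기 상품입니다.", "저축은 중요합니다."]
whitespace_result = bm25_scores("금융보험", demo_docs, whitespace_tokenize)
particle_result = bm25_scores("금융보험", demo_docs, toy_particle_tokenize)
print(whitespace_result)  # no token equals the query, so nothing matches
print(particle_result)    # the first document now matches the query
```

With whitespace splitting, the query term never equals the token "금융보험은", so the relevant document scores zero. Once the particle is separated, the same document matches. This is the gap a morphological analyzer is meant to close for Korean text.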
For cases that need it, the ability to compute similarity scores and store each one under the score key in the document's metadata has been added.
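The idea of carrying a score in metadata can be sketched as follows. This is a minimal illustration, not the retriever's actual code: `Doc` is a hypothetical stand-in for LangChain's Document class, `attach_scores` is a hypothetical helper, and the scores are made-up numbers.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Doc:
    # Hypothetical stand-in for a LangChain Document.
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def attach_scores(texts: Sequence[str], scores: Sequence[float]) -> List[Doc]:
    # Store each score under metadata["score"] and sort best-first.
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    docs = [Doc(text, {"score": float(score)}) for text, score in zip(texts, scores)]
    return sorted(docs, key=lambda d: d.metadata["score"], reverse=True)


# Made-up texts and scores, for illustration only.
ranked = attach_scores(["doc a", "doc b", "doc c"], [0.1, 0.7, 0.3])
for i, doc in enumerate(ranked):
    print(f"[{i+1}] {doc.page_content} ({doc.metadata['score']:.4f})")
```

Keeping the score in metadata means downstream code can read it with a plain dictionary lookup, which is what a printing helper that checks for "score" in doc.metadata relies on.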
Setting the k value
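The k value controls how many of the highest-scoring documents are returned. As a rough sketch of that selection step, assuming scores have already been computed, the hypothetical `top_k` helper below keeps only the k best (text, score) pairs. The default of 4 and the sample scores are illustrative choices, not values taken from the library.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k(texts: Sequence[str], scores: Sequence[float], k: int = 4) -> List[Tuple[str, float]]:
    # Return the k highest-scoring (text, score) pairs, best first.
    if k < 1:
        raise ValueError("k must be at least 1")
    return heapq.nlargest(k, zip(texts, scores), key=lambda pair: pair[1])


# Made-up texts and scores, for illustration only.
texts = ["a", "b", "c", "d", "e"]
scores = [0.2, 0.9, 0.1, 0.5, 0.7]
top2 = top_k(texts, scores, k=2)
print(top2)
```

A smaller k gives a shorter, more focused result list. A larger k trades precision for recall, which matters when the results feed a later reranking or ensemble step.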
BM25Retriever with KoNLPy (Kkma, Okt)
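KoNLPy analyzers such as Kkma and Okt expose a `morphs` method that splits text into morphemes, and BM25Retriever.from_texts accepts a `preprocess_func` that turns text into tokens. One way to connect the two is a small adapter, sketched below. `make_preprocess_func` and `WhitespaceAnalyzer` are hypothetical names introduced here. The stand-in analyzer only exists so the sketch runs without KoNLPy or a Java runtime, and the commented lines show the assumed usage with the real libraries.

```python
from typing import Callable, List, Protocol


class MorphAnalyzer(Protocol):
    # Anything with a KoNLPy-style morphs() method fits this protocol.
    def morphs(self, phrase: str) -> List[str]: ...


def make_preprocess_func(analyzer: MorphAnalyzer) -> Callable[[str], List[str]]:
    # Wrap an analyzer as a text -> tokens function, dropping blank tokens.
    def preprocess(text: str) -> List[str]:
        return [token for token in analyzer.morphs(text) if token.strip()]

    return preprocess


class WhitespaceAnalyzer:
    # Hypothetical stand-in so the sketch runs without KoNLPy installed.
    def morphs(self, phrase: str) -> List[str]:
        return phrase.split()


preprocess = make_preprocess_func(WhitespaceAnalyzer())
tokens = preprocess("financial savings insurance")
print(tokens)

# Assumed usage with the real libraries (KoNLPy needs a Java runtime):
#   from konlpy.tag import Kkma, Okt
#   from langchain_community.retrievers import BM25Retriever
#   okt_retriever = BM25Retriever.from_texts(
#       sample_texts, preprocess_func=make_preprocess_func(Okt())
#   )
#   kkma_retriever = BM25Retriever.from_texts(
#       sample_texts, preprocess_func=make_preprocess_func(Kkma())
#   )
```

Because the adapter depends only on the `morphs` method, swapping Kkma for Okt changes a single argument, which makes it easy to compare how each analyzer affects the BM25 ranking.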