08. Vald

Introduction to Vald

Vald is an open-source, highly scalable distributed vector search engine designed for real-time similarity search. Built on Kubernetes, it offers automatic scaling, fault tolerance, and efficient indexing for AI-driven applications, such as recommendation systems, semantic search, and retrieval-augmented generation (RAG).

Setting Up Vald

1. Installing Vald

To use Vald, deploy it on a Kubernetes cluster. You can set it up using Helm:

helm repo add vald https://vald.vdaas.org/charts
helm install vald vald/vald --namespace vald --create-namespace

This will deploy Vald on your Kubernetes cluster. You can check its status using:

kubectl get pods -n vald

2. Creating a Vald Client

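Once Vald is deployed, you can connect to it from Python. Vald exposes a gRPC API, and the vald-client-python package provides the generated stubs:

import grpc
from vald.v1.vald import search_pb2_grpc

# Open a gRPC channel to the Vald gateway; adjust host and port to your deployment.
channel = grpc.insecure_channel("localhost:8081")
client = search_pb2_grpc.SearchStub(channel)

If your Vald instance runs on a remote or cloud-hosted cluster, replace localhost with that endpoint. If the endpoint requires TLS, open the channel with grpc.secure_channel and the appropriate credentials instead.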
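
Before creating a client, it can help to confirm that the gateway port is reachable at all. The following is a minimal sketch using only the standard library. `is_port_open` is a hypothetical helper, and the sketch assumes the Vald gateway has been exposed on localhost:8081 (for example through a port-forward), which the setup steps above do not spell out.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port are assumptions; match them to your own deployment.
    print("Vald gateway reachable:", is_port_open("localhost", 8081))
```

A successful TCP connection only shows that something is listening on the port. It does not prove that the Vald services behind it are healthy.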

Integrating Vald with LangChain

LangChain ships a Vald vector store wrapper that handles embedding, inserting, and searching texts for you. The wrapper talks to Vald over gRPC and requires the vald-client-python package to be installed.

1. Creating a Vald Collection

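Vald has no collections or schemas to create up front; vectors are inserted dynamically. The index dimension, however, is fixed when Vald is deployed (the agent's NGT dimension setting in the Helm values), so it must match the output dimension of your embedding model. Ensure your Vald instance is running before proceeding.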
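
Even without a schema, a Vald index is built for one fixed vector dimension, so a mismatch between the embedding model and the deployment is a common source of insert errors. The sketch below shows a client-side check that could run before inserting. `check_dimensions` and `INDEX_DIMENSION` are hypothetical names, and the value 4 is a toy figure, not a Vald default.

```python
from typing import Sequence


def check_dimensions(vectors: Sequence[Sequence[float]], expected_dim: int) -> None:
    """Raise ValueError if any vector's length differs from expected_dim."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has dimension {len(vec)}, expected {expected_dim}"
            )


# Hypothetical value: use the dimension configured for your Vald agents.
INDEX_DIMENSION = 4

# Passes silently because both toy vectors have four components.
check_dimensions([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]], INDEX_DIMENSION)
```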

2. Storing Embeddings in Vald

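To store vectors, first create an embedding model (e.g., OpenAI or Hugging Face) and pass it to the Vald wrapper. The wrapper opens its own gRPC connection from the host and port you give it:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Vald

# OpenAIEmbeddings reads the API key from the OPENAI_API_KEY environment variable
embeddings = OpenAIEmbeddings()
# Host and port must point at your Vald gateway
vector_db = Vald(embedding=embeddings, host="localhost", port=8081)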

Now, store some text data in Vald:

documents = ["This is a sample document.", "LangChain makes working with LLMs easier."]
vector_db.add_texts(texts=documents)
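
For larger corpora, it may be preferable to send texts in smaller groups rather than in one call. The following sketch shows one way to do that. `batched`, `add_in_batches`, and `RecordingStore` are hypothetical names introduced for illustration. The only assumption about the real store is that it exposes `add_texts(texts=...)` as used above, and `RecordingStore` stands in for it so the example runs without a Vald cluster.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def add_in_batches(store, texts: Sequence[str], batch_size: int = 100) -> int:
    """Call store.add_texts once per batch and return the number of texts sent."""
    sent = 0
    for batch in batched(texts, batch_size):
        store.add_texts(texts=batch)
        sent += len(batch)
    return sent


class RecordingStore:
    """Stand-in for a vector store; it only records the batches it receives."""

    def __init__(self) -> None:
        self.calls: List[List[str]] = []

    def add_texts(self, texts: List[str]) -> None:
        self.calls.append(list(texts))


store = RecordingStore()
count = add_in_batches(store, ["a", "b", "c", "d", "e"], batch_size=2)
print(count, store.calls)
```

With a real deployment, `store` would be the `vector_db` object created earlier. A suitable batch size depends on the embedding provider's limits and is left as a tuning choice.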

3. Performing Similarity Search

Retrieve documents similar to a given query:

query = "How does LangChain help with LLMs?"
results = vector_db.similarity_search(query, k=2)

for result in results:
    print(result.page_content)

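This fetches the 2 documents most semantically similar to the query. Because Vald builds its index asynchronously, texts added moments ago may not appear in the results until indexing has finished.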
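
Conceptually, a similarity search ranks stored vectors by how close they are to the query vector. The sketch below does this exhaustively with cosine similarity on made-up three-dimensional vectors. Vald itself uses approximate nearest-neighbor search (NGT) with a configurable distance type, so this illustrates the idea and not Vald's algorithm. `cosine_similarity`, `top_k`, and the toy corpus are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    corpus: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every (text, vector) pair against query and return the k best."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings.
corpus = [
    ("doc about LangChain", [0.9, 0.1, 0.0]),
    ("doc about cooking", [0.0, 0.2, 0.9]),
    ("doc about LLM tooling", [0.8, 0.3, 0.1]),
]
results = top_k([1.0, 0.2, 0.0], corpus, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The exhaustive scan is linear in the corpus size, which is the cost that approximate indexes such as the one in Vald are designed to avoid.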

Best Practices and Optimization

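Efficient Indexing: Vald indexes inserted vectors automatically in the background. Tune the agent's auto-indexing settings to balance how quickly new vectors become searchable against indexing overhead.
Scalability: Run Vald on a Kubernetes cluster with enough agent replicas for your data volume, and scale them as the index grows.
Hybrid Search: Vald performs vector search only, so combine its results with a separate keyword retriever (such as BM25) when exact term matches matter.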
Cloud Deployment: Consider using a managed Kubernetes service to simplify deployment and maintenance.
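
One common way to combine keyword and vector results is reciprocal rank fusion (RRF), which merges ranked lists using only each document's position in each list. The sketch below is a generic RRF implementation, not a Vald or LangChain feature. It assumes the keyword ranking comes from a separate system, and the document IDs and the constant k=60 are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists; each appearance contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from a keyword retriever and from a vector search.
keyword_hits = ["doc_b", "doc_a", "doc_d"]
vector_hits = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists, here doc_a and doc_b, rise above those found by only one retriever.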

Conclusion

Vald is an open-source distributed vector search engine built for real-time similarity search on Kubernetes. Its LangChain integration handles storing and retrieving embeddings, which makes it a good fit for semantic search, recommendation, and retrieval-augmented generation (RAG) applications.
