04. LangSmith dataset generation
Let's find out how to build your own RAG evaluation dataset.
First, building a dataset requires understanding the three evaluation cases below.
Case: Evaluate whether the retrieved documents are relevant to the Question (Question - Retrieval)
Case: Evaluate whether the Answer is relevant to the Question
Case: Evaluate whether the Answer stays within the retrieved documents (hallucination check)
Therefore, three pieces of information are usually needed: Question, Retrieval, and Answer. In practice, however, building ground truth for Retrieval is difficult.
If ground truth for Retrieval exists, store all three and use them as the dataset; otherwise, build and use a dataset from Question and Answer only.
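The choice between the two dataset shapes can be sketched as a small helper. This is only an illustration: the function name `build_example`, the `inputs`/`outputs` layout, and the `contexts` key are assumptions made for this sketch, and the sample strings are placeholders.

```python
from typing import List, Optional


def build_example(question: str, answer: str,
                  retrieval: Optional[List[str]] = None) -> dict:
    """Build one dataset record; add retrieval ground truth only if it exists."""
    record = {
        "inputs": {"question": question},
        "outputs": {"answer": answer},
    }
    if retrieval is not None:
        # "contexts" is a hypothetical key name chosen for this sketch.
        record["outputs"]["contexts"] = list(retrieval)
    return record


# Placeholder text, invented for illustration only.
qa_only = build_example("placeholder question", "placeholder answer")
with_retrieval = build_example(
    "placeholder question",
    "placeholder answer",
    retrieval=["placeholder ground-truth passage"],
)
```

Without retrieval ground truth the record carries only the question and the answer, which matches the fallback described above.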
# Install
# !pip install -qU langsmith langchain-teddynote

# Configuration file for managing API keys as environment variables
from dotenv import load_dotenv
# Load API key information
load_dotenv()
Generate the dataset
Use inputs and outputs to generate the dataset.
The dataset consists of questions and answers.
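A minimal sketch of this step, using plain Python lists rather than any particular library. The question and answer strings are placeholders invented for illustration, and the `question` / `answer` key names are an assumption.

```python
# Placeholder question/answer pairs, invented for illustration only.
questions = [
    "placeholder question 1",
    "placeholder question 2",
]
answers = [
    "placeholder answer 1",
    "placeholder answer 2",
]

# Each question must have exactly one matching answer.
if len(questions) != len(answers):
    raise ValueError("questions and answers must have the same length")

# Parallel lists: inputs[i] and outputs[i] describe the same example.
inputs = [{"question": q} for q in questions]
outputs = [{"answer": a} for a in answers]
```

Keeping inputs and outputs as parallel lists makes it easy to pass them on to an upload call later.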

Alternatively, you can use the synthetic dataset generated in the previous tutorial.
The code below is an example that uses a dataset uploaded to HuggingFace. (Note) Update the datasets library before proceeding.
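As a rough sketch of that approach: the helper names below are hypothetical, and the column names `question` and `answer` are an assumption about the uploaded dataset's schema. The repository ID and split are not given here, so they are left as parameters.

```python
from typing import Dict, Iterable, List, Mapping


def rows_to_qa_pairs(rows: Iterable[Mapping[str, str]],
                     question_col: str = "question",
                     answer_col: str = "answer") -> List[Dict[str, str]]:
    """Keep only the question and answer columns of each row."""
    return [{"question": r[question_col], "answer": r[answer_col]} for r in rows]


def load_qa_pairs(repo_id: str, split: str) -> List[Dict[str, str]]:
    """Download one split of a Hub dataset and reduce it to question/answer pairs."""
    # Imported lazily so rows_to_qa_pairs works without the datasets library.
    from datasets import load_dataset

    dataset = load_dataset(repo_id, split=split)  # iterating yields dict rows
    return rows_to_qa_pairs(dataset)


# Offline check with one row shaped like the assumed schema (placeholder text).
sample_rows = [{"question": "Q1", "answer": "A1", "contexts": "unused"}]
pairs = rows_to_qa_pairs(sample_rows)
```

Calling `load_qa_pairs` needs network access and the repository ID and split of your own uploaded dataset.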

Creating a dataset for LangSmith testing
Create a new dataset under Datasets & Testing.
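Programmatically, this could look roughly like the sketch below. It assumes the `langsmith` client's `list_datasets`, `create_dataset`, and `create_examples` methods; the helper names and the dataset name in the usage comment are hypothetical.

```python
def get_or_create_dataset(client, dataset_name, description=None):
    """Return the dataset named dataset_name, creating it if it is missing."""
    existing = next(iter(client.list_datasets(dataset_name=dataset_name)), None)
    if existing is not None:
        return existing
    return client.create_dataset(dataset_name=dataset_name, description=description)


def upload_qa_pairs(client, dataset, questions, answers):
    """Upload question/answer pairs as examples of the given dataset."""
    client.create_examples(
        inputs=[{"question": q} for q in questions],
        outputs=[{"answer": a} for a in answers],
        dataset_id=dataset.id,
    )


# Usage (needs the langsmith package and a LangSmith API key in the environment):
# from langsmith import Client
# client = Client()
# dataset = get_or_create_dataset(client, "MY_RAG_EVAL_DATASET")  # hypothetical name
# upload_qa_pairs(client, dataset, ["question text"], ["answer text"])
```

Looking the dataset up by name first avoids an error when the same cell is run twice.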
You can also create a dataset directly in the LangSmith UI from a CSV file.
Please refer to the LangSmith documentation for details.
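For the CSV route, the file itself could be prepared with the standard library, as in this sketch. The column names `question` and `answer`, the helper name `write_qa_csv`, and the file name in the comment are assumptions; the rows are placeholders.

```python
import csv
import io

# Placeholder rows, invented for illustration only.
rows = [
    {"question": "placeholder question 1", "answer": "placeholder answer 1"},
    {"question": "placeholder question 2", "answer": "placeholder answer 2"},
]


def write_qa_csv(rows, stream):
    """Write question/answer rows to a text stream, with a header line."""
    writer = csv.DictWriter(stream, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; a real file works the same way.
buffer = io.StringIO()
write_qa_csv(rows, buffer)
csv_text = buffer.getvalue()

# To produce a file for upload (hypothetical file name):
# with open("rag_eval_dataset.csv", "w", newline="", encoding="utf-8") as f:
#     write_qa_csv(rows, f)
```

One column then serves as the input and the other as the output when the file is imported.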
You can add an example to the dataset later.
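A hedged sketch of appending examples to an existing dataset, assuming the `langsmith` client's `create_examples` and `read_dataset` methods. The helper `add_examples` and the dataset name in the usage comment are hypothetical.

```python
def add_examples(client, dataset_id, questions, answers):
    """Append question/answer pairs to an existing dataset; return how many were sent."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    client.create_examples(
        inputs=[{"question": q} for q in questions],
        outputs=[{"answer": a} for a in answers],
        dataset_id=dataset_id,
    )
    return len(questions)


# Usage (needs the langsmith package and a LangSmith API key in the environment):
# from langsmith import Client
# client = Client()
# dataset = client.read_dataset(dataset_name="MY_RAG_EVAL_DATASET")  # hypothetical name
# add_examples(client, dataset.id, ["new question"], ["new answer"])
```

The length check catches a mismatched pair before anything is sent to the server.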
Congratulations! The dataset is now ready.