04. LangSmith dataset generation

Let's find out how to build your own RAG evaluation dataset.

First, building a dataset requires a solid understanding of the three evaluation cases below.

Case: Evaluate whether the Retrieval is relevant to the Question (Question - Retrieval)

Case: Evaluate whether the Answer is relevant to the Question

Case: Check whether the Answer stays within the retrieved documents (hallucination check)

Therefore, evaluation generally needs three pieces of information: Question, Retrieval, and Answer. In practice, however, building ground truth for Retrieval is difficult.

If ground truth for Retrieval exists, store all three and use them as the dataset; otherwise, you can build and use a dataset from Question and Answer only.
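
The rule above can be sketched as a small helper that builds one dataset record and attaches the retrieval ground truth only when it exists. The name `build_record`, the `contexts` key, and the sample strings are illustrative assumptions, not part of LangSmith or of the original tutorial.

```python
from typing import Optional


def build_record(question: str, answer: str, contexts: Optional[list[str]] = None) -> dict:
    """Build one evaluation record (hypothetical helper).

    `contexts` is the retrieval ground truth; leave it as None when unavailable.
    """
    record = {"inputs": {"question": question}, "outputs": {"answer": answer}}
    if contexts:
        # Store retrieval ground truth next to the answer only when it exists.
        record["outputs"]["contexts"] = list(contexts)
    return record


# Made-up sample strings, used only to show the two shapes.
qa_only = build_record("What is RAG?", "Retrieval-augmented generation.")
with_retrieval = build_record(
    "What is RAG?",
    "Retrieval-augmented generation.",
    contexts=["RAG combines retrieval with generation."],
)
```

The question-and-answer-only shape matches what the rest of this page uploads.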

# Install
# !pip install -qU langsmith langchain-teddynote

# Configuration file for managing API keys as environment variables
from dotenv import load_dotenv

# Load the API key information
load_dotenv()

 True

# Set up LangSmith tracing. https://smith.langchain.com
# !pip install -qU langchain-teddynote
from langchain_teddynote import logging

# Enter the project name.
logging.langsmith("CH16-Evaluations")

 Start tracking LangSmith. 
[Project name] 
CH16-Evaluations

Generate the dataset

Use inputs and outputs to generate the dataset.

The dataset consists of questions and answers.

import pandas as pd

# List of questions
inputs = [
    "삼성전자가 만든 생성형 AI의 이름은 무엇인가요?",
    "미국 바이든 대통령이 안전하고 신뢰할 수 있는 AI 개발과 사용을 보장하기 위한 행정명령을 발표한 날은 언제인가요?",
    "코히어의 데이터 출처 탐색기에 대해서 간략히 말해주세요.",
]

# List of answers to the questions
outputs = [
    "삼성전자가 만든 생성형 AI의 이름은 삼성 가우스 입니다.",
    "2023년 10월 30일 미국 바이든 대통령이 행정명령을 발표했습니다.",
    "코히어의 데이터 출처 탐색기는 AI 모델 훈련에 사용되는 데이터셋의 출처와 라이선스 상태를 추적하고 투명성을 확보하기 위한 플랫폼입니다. 12개 기관과 협력하여 2,000여 개 데이터셋의 출처 정보를 제공하며, 개발자들이 데이터의 구성과 계보를 쉽게 파악할 수 있게 돕습니다.",
]

# Build question-answer pairs
qa_pairs = [{"question": q, "answer": a} for q, a in zip(inputs, outputs)]

# Convert to a DataFrame
df = pd.DataFrame(qa_pairs)

# Display the DataFrame
df.head()
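
One thing to watch in the pairing step: `zip` stops at the shorter list, so a missing answer would silently drop a question. A minimal sketch of a length check before pairing follows; `make_qa_pairs` is a hypothetical helper name and the strings are placeholders.

```python
def make_qa_pairs(questions: list[str], answers: list[str]) -> list[dict]:
    """Pair questions with answers, refusing mismatched lengths (hypothetical helper)."""
    if len(questions) != len(answers):
        raise ValueError(
            f"Got {len(questions)} questions but {len(answers)} answers"
        )
    return [{"question": q, "answer": a} for q, a in zip(questions, answers)]


# Placeholder strings, not real dataset content.
pairs = make_qa_pairs(["q1", "q2"], ["a1", "a2"])
```

With the lists defined above, `make_qa_pairs(inputs, outputs)` could stand in for the list comprehension.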

Alternatively, you can use the synthetic dataset generated in the previous tutorial.

The code below is an example that uses the dataset uploaded to HuggingFace. Note: uncomment and run the line below to update the datasets library before proceeding.

# !pip install -qU datasets

import pandas as pd
from datasets import load_dataset, Dataset
import os

# Download the dataset from HuggingFace by repo_id
dataset = load_dataset(
    "teddylee777/rag-synthetic-dataset",  # dataset name
    token=os.environ["HUGGINGFACEHUB_API_TOKEN"],  # required if the dataset is private
)

# Look up the data by split name
huggingface_df = dataset["korean_v1"].to_pandas()
huggingface_df.head()
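
To feed such a table into LangSmith, its rows have to be split into the `inputs` and `outputs` lists used later on this page. The sketch below works on plain row dictionaries. The column names `question` and `ground_truth` are assumptions about the synthetic dataset, so check `huggingface_df.columns` before relying on them. `rows_to_examples` is a hypothetical helper.

```python
def _is_text(value) -> bool:
    """True only for non-empty strings (filters out None and NaN)."""
    return isinstance(value, str) and bool(value.strip())


def rows_to_examples(rows, question_col="question", answer_col="ground_truth"):
    """Split row dicts into LangSmith-style inputs and outputs lists."""
    inputs, outputs = [], []
    for row in rows:
        question, answer = row.get(question_col), row.get(answer_col)
        if not (_is_text(question) and _is_text(answer)):
            continue  # skip rows missing either field
        inputs.append({"question": question})
        outputs.append({"answer": answer})
    return inputs, outputs


# Stand-in rows; with the real data, pass huggingface_df.to_dict("records").
sample_rows = [
    {"question": "q1", "ground_truth": "a1"},
    {"question": "q2", "ground_truth": None},
]
hf_inputs, hf_outputs = rows_to_examples(sample_rows)
```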

Creating a dataset for LangSmith testing

Create a new dataset under Datasets & Testing.

You can also create a dataset directly from a CSV file in the LangSmith UI.

Please refer to the documents below for details.

LangSmith UI documents
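
For the CSV route, the question-answer pairs first have to be written out as a CSV file with a header row. A minimal sketch using only the standard library; `pairs_to_csv` is a hypothetical helper and the sample row is a placeholder.

```python
import csv
import io


def pairs_to_csv(pairs) -> str:
    """Render question/answer dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["question", "answer"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(pairs)
    return buffer.getvalue()


# Placeholder row; the comma shows that the field gets quoted.
csv_text = pairs_to_csv([{"question": "q1", "answer": "a1, with a comma"}])
```

With the DataFrame built earlier, `df.to_csv("rag_eval_dataset.csv", index=False)` would produce an equivalent file (the file name is an assumption).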

from langsmith import Client

client = Client()
dataset_name = "RAG_EVAL_DATASET"


# Return the existing dataset with this name, or create it if it does not exist
def create_dataset(client, dataset_name, description=None):
    for dataset in client.list_datasets():
        if dataset.name == dataset_name:
            return dataset

    dataset = client.create_dataset(
        dataset_name=dataset_name,
        description=description,
    )
    return dataset


# Create the dataset
dataset = create_dataset(client, dataset_name)

# Add examples to the created dataset
client.create_examples(
    inputs=[{"question": q} for q in df["question"].tolist()],
    outputs=[{"answer": a} for a in df["answer"].tolist()],
    dataset_id=dataset.id,
)
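
Note that `create_dataset` only guards against creating the dataset twice. Running `create_examples` again would most likely add the same examples a second time. A minimal sketch of filtering out questions that are already stored follows. `select_new_pairs` is a hypothetical helper, and the commented usage assumes `client.list_examples(dataset_id=...)` yields objects with an `inputs` dict.

```python
def select_new_pairs(existing_questions, questions, answers):
    """Return only the (question, answer) pairs whose question is not stored yet."""
    seen = set(existing_questions)
    new_pairs = []
    for q, a in zip(questions, answers):
        if q in seen:
            continue
        seen.add(q)  # also drop repeats within the new batch
        new_pairs.append((q, a))
    return new_pairs


# Placeholder data standing in for the dataset contents.
already_stored = ["q1"]
to_add = select_new_pairs(already_stored, ["q1", "q2", "q2"], ["a1", "a2", "a2-again"])

# Assumed usage with the client from this tutorial:
# existing = [ex.inputs["question"] for ex in client.list_examples(dataset_id=dataset.id)]
# new_pairs = select_new_pairs(existing, df["question"].tolist(), df["answer"].tolist())
```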

You can also add examples to the dataset later.

# New list of questions
new_questions = [
    "삼성전자가 만든 생성형 AI의 이름은 무엇인가요?",
    "구글이 테디노트에게 20억달러를 투자한 것이 사실입니까?",
]

# New list of answers
new_answers = [
    "삼성전자가 만든 생성형 AI의 이름은 테디노트 입니다.",
    "사실이 아닙니다. 구글은 앤스로픽에 최대 20억 달러를 투자하기로 합의했으며, 이 중 5억 달러를 우선 투자하고 향후 15억 달러를 추가로 투자하기로 했습니다.",
]

# Add the examples, then check the updated version in the UI
client.create_examples(
    inputs=[{"question": q} for q in new_questions],
    outputs=[{"answer": a} for a in new_answers],
    dataset_id=dataset.id,
)
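
The first new question above repeats one already in the dataset but pairs it with a different answer, so the dataset ends up holding two reference answers for the same question. If that is not intended, a check like the one below can flag such cases before uploading. `find_conflicts` is a hypothetical helper and the strings are placeholders.

```python
from collections import defaultdict


def find_conflicts(pairs):
    """Map each question that has more than one distinct answer to those answers."""
    answers_by_question = defaultdict(list)
    for question, answer in pairs:
        if answer not in answers_by_question[question]:
            answers_by_question[question].append(answer)
    return {q: a for q, a in answers_by_question.items() if len(a) > 1}


# Placeholder pairs: "q1" appears twice with different answers.
conflicts = find_conflicts([("q1", "a1"), ("q2", "a2"), ("q1", "a1-changed")])
```

The function accepts any iterable of (question, answer) tuples, for example the original and new lists zipped together.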

Congratulations! The dataset is now ready.

