04. CSV

CSV

A comma-separated values (CSV) file is a delimited text file in which values are separated by commas. Each line of the file is a data record.

Each record consists of one or more fields separated by commas.
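To make the record and field terminology concrete, the sketch below parses a small in-memory CSV string with Python's standard csv module. The sample text is made up for illustration (its values mirror rows shown later on this page) and is not read from the Titanic file.

```python
import csv
import io

# Illustrative sample: one header line followed by two data records.
sample = (
    "Name,Sex,Age\n"
    '"Braund, Mr. Owen Harris",male,22.0\n'
    '"Heikkinen, Miss. Laina",female,26.0\n'
)

# csv.reader splits each line (record) into a list of fields.
records = list(csv.reader(io.StringIO(sample)))
header, rows = records[0], records[1:]

for row in rows:
    # Pair each field with its column name for readability.
    print(dict(zip(header, row)))
```

Note that a quoted field such as "Braund, Mr. Owen Harris" stays a single field even though it contains a comma.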

CSVLoader

  • Loads CSV data with one row per document.

from langchain_community.document_loaders.csv_loader import CSVLoader

# Create a CSV loader
loader = CSVLoader(file_path="./data/titanic.csv")

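# Load the data; each CSV row becomes one document
docs = loader.load()

print(len(docs))         # number of documents (one per row)
print(docs[0].metadata)  # source and row index of the first document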

891
{'source': './data/titanic.csv', 'row': 0}

Customizing CSV parsing and loading

See the Python csv module documentation for more information on the supported csv_args.
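As a rough sketch of what such parsing options do, the example below applies a csv_args-style dictionary directly to csv.DictReader from the standard library. It assumes the loader forwards these options to csv.DictReader; the sample text and the field names are hypothetical.

```python
import csv
import io

# Parsing options in the shape accepted by the csv module.
csv_args = {
    "delimiter": ",",      # field separator
    "quotechar": '"',      # character that wraps fields containing the delimiter
    "fieldnames": ["Passenger ID", "Name", "Sex"],  # hypothetical column names
}

# Illustrative sample without a header row, since fieldnames are supplied.
raw = (
    '1,"Braund, Mr. Owen Harris",male\n'
    '3,"Heikkinen, Miss. Laina",female\n'
)

rows = list(csv.DictReader(io.StringIO(raw), **csv_args))
for row in rows:
    print(row)

# With LangChain, the same dictionary would be passed as
# CSVLoader(file_path="./data/titanic.csv", csv_args=csv_args).
```

When fieldnames are supplied, csv.DictReader treats the first line of the file as data, so a file that already has a header row yields that header as its first record.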

Use the source_column argument to specify the source of the document generated for each row. Otherwise, file_path is used as the source for all documents.

This is useful when documents loaded from a CSV file are passed to a question-answering chain that cites its sources.
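With the loader itself, this corresponds to a call such as CSVLoader(file_path="./data/titanic.csv", source_column="Name"). The standard-library sketch below only imitates the effect so it can run without the file: rows_to_records is a hypothetical helper, and the "key: value" text format is an assumption about how each row is rendered, not the library's implementation.

```python
import csv
import io


def rows_to_records(text, file_path, source_column=None):
    """Build one record per CSV row, with a per-row or per-file source."""
    records = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text))):
        # One "key: value" line per column forms the document text.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        # Use the chosen column as the source if given, else the file path.
        source = row[source_column] if source_column else file_path
        records.append(
            {"page_content": content, "metadata": {"source": source, "row": i}}
        )
    return records


# Illustrative sample with a header row.
sample = (
    "PassengerId,Name\n"
    '1,"Braund, Mr. Owen Harris"\n'
    '3,"Heikkinen, Miss. Laina"\n'
)

default = rows_to_records(sample, "./data/titanic.csv")
by_name = rows_to_records(sample, "./data/titanic.csv", source_column="Name")

print(default[0]["metadata"])  # source is the file path
print(by_name[0]["metadata"])  # source is the value of the Name column
```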

UnstructuredCSVLoader

You can also load tables using UnstructuredCSVLoader. One advantage of using UnstructuredCSVLoader is that when used in "elements" mode, the metadata provides an HTML representation of the table.
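A minimal sketch of this usage follows. It assumes the optional unstructured package is installed alongside langchain_community, and that the HTML representation is stored under the "text_as_html" metadata key; the two helper functions are hypothetical names introduced only for this example.

```python
def load_csv_elements(file_path):
    """Load a CSV file with UnstructuredCSVLoader in "elements" mode."""
    # Imported inside the function because the loader needs the optional
    # `unstructured` package in addition to langchain_community.
    from langchain_community.document_loaders.csv_loader import (
        UnstructuredCSVLoader,
    )

    loader = UnstructuredCSVLoader(file_path=file_path, mode="elements")
    return loader.load()


def first_table_html(docs):
    """Return the HTML table from the first document's metadata, if present."""
    # Assumption: the HTML is stored under the "text_as_html" key.
    return docs[0].metadata.get("text_as_html") if docs else None


# Usage (requires ./data/titanic.csv and the packages mentioned above):
# docs = load_csv_elements("./data/titanic.csv")
# print(first_table_html(docs)[:500])
```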

  • Output the HTML text metadata for the first document.

DataFrameLoader

Display the first five rows of the data.

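  | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
--|-------------|----------|--------|------|-----|-----|-------|-------|--------|------|-------|---------
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S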
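One way to turn such a DataFrame into documents is sketched below, assuming pandas and langchain_community are installed. The helper name load_dataframe_documents is hypothetical, and choosing "Name" as the content column in the usage comment is only an illustration.

```python
def load_dataframe_documents(csv_path, content_column):
    """Read a CSV into a DataFrame and load it with DataFrameLoader."""
    # Imported inside the function because both are third-party packages.
    import pandas as pd
    from langchain_community.document_loaders import DataFrameLoader

    df = pd.read_csv(csv_path)
    # The chosen column supplies the document text; the remaining
    # columns are kept as metadata.
    loader = DataFrameLoader(df, page_content_column=content_column)
    return loader.load()


# Usage (requires ./data/titanic.csv):
# docs = load_dataframe_documents("./data/titanic.csv", "Name")
# print(docs[0].page_content)
# print(docs[0].metadata)
```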
