04. CSV
A Comma-Separated Values (CSV) file is a delimited text file in which values are separated by commas. Each line of the file is a data record.
Each record consists of one or more fields, also separated by commas.
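As a small illustration of records and fields, the sketch below parses a few lines of CSV text with Python's standard csv module. The sample rows are borrowed from the Titanic data used on this page and trimmed to three columns for brevity.

```python
import csv
import io

# Three-column sample: a header line followed by two data records.
# The Name field contains a comma, so it is wrapped in double quotes.
sample = (
    "PassengerId,Name,Sex\n"
    '1,"Braund, Mr. Owen Harris",male\n'
    '3,"Heikkinen, Miss. Laina",female\n'
)

# csv.reader yields one list of fields per record (line).
records = list(csv.reader(io.StringIO(sample)))
header, rows = records[0], records[1:]

for row in rows:
    # Pair each field with its column name.
    print(dict(zip(header, row)))
```

Note that a comma inside a quoted field does not start a new field, which is why a plain split on "," is usually not enough for real CSV data.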
CSVLoader
CSVLoader loads CSV data with one row per document.
from langchain_community.document_loaders.csv_loader import CSVLoader
# Create a CSV loader
loader = CSVLoader(file_path="./data/titanic.csv")
# load data
docs = loader.load()
print(len(docs))
print(docs[0].metadata)
891
{'source': './data/titanic.csv', 'row': 0}
Customizing CSV parsing and loading
See the Python csv module documentation for more information on the options supported by csv_args.
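A minimal sketch of custom parsing options. As far as the current langchain_community implementation goes, CSVLoader forwards its csv_args dictionary to csv.DictReader from the standard library, so the options can be tried out with DictReader directly. The semicolon-delimited sample text and the renamed column names below are made up for illustration.

```python
import csv
import io

# Hypothetical semicolon-delimited input; the first line is a header.
raw = (
    "PassengerId;Name;Sex\n"
    "1;'Braund, Mr. Owen Harris';male\n"
)

# Options understood by csv.DictReader; CSVLoader takes the same
# dictionary through its csv_args parameter.
csv_args = {
    "delimiter": ";",  # field separator
    "quotechar": "'",  # character that wraps fields containing special characters
    "fieldnames": ["Passenger ID", "Name", "Sex"],  # override the column names
}

reader = csv.DictReader(io.StringIO(raw), **csv_args)
rows = list(reader)

# Because fieldnames is given explicitly, the header line is read as data,
# so rows[0] holds the original header and rows[1] the first passenger.
print(rows[1])
```

A dictionary of this shape is passed to the loader as CSVLoader(file_path=..., csv_args=csv_args), with the values adjusted to match the file. When fieldnames is supplied, expect the header line to come back as an ordinary row.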
Use the source_column argument to name the column whose value becomes the source of the document generated for each row. If it is not set, file_path is used as the source for all documents.
This is useful when documents loaded from a CSV file are passed to a chain that answers questions and cites its sources.
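The effect of source_column can be sketched without the library. The helper below, row_metadata, is a hypothetical stand-in that imitates the behaviour described above: for each row it takes the source from the named column when one is given and falls back to the file path otherwise. It illustrates the behaviour and is not CSVLoader's actual code.

```python
import csv
import io


def row_metadata(csv_text, file_path, source_column=None):
    """Hypothetical stand-in: build per-row metadata as described in the text."""
    metadata = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # Use the value in source_column if requested, else the file path.
        source = row[source_column] if source_column else file_path
        metadata.append({"source": source, "row": i})
    return metadata


sample = (
    "PassengerId,Name,Sex\n"
    '1,"Braund, Mr. Owen Harris",male\n'
    '3,"Heikkinen, Miss. Laina",female\n'
)

default = row_metadata(sample, "./data/titanic.csv")
by_id = row_metadata(sample, "./data/titanic.csv", source_column="PassengerId")

print(default[0])  # {'source': './data/titanic.csv', 'row': 0}
print(by_id[0])    # {'source': '1', 'row': 0}
```

With the real loader, the corresponding call would be CSVLoader(file_path="./data/titanic.csv", source_column="PassengerId").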
UnstructuredCSVLoader
You can also load tables using UnstructuredCSVLoader. One advantage of using UnstructuredCSVLoader is that when used in "elements" mode, the metadata provides an HTML representation of the table.
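A minimal sketch of "elements" mode, assuming the optional unstructured package is installed alongside langchain_community. The helper name first_table_html is made up for this example, and text_as_html is the metadata key where the HTML rendering is expected to appear.

```python
def first_table_html(file_path):
    """Return the HTML rendering of the first table element (hypothetical helper)."""
    # Imported lazily because the loader needs the optional `unstructured` package.
    from langchain_community.document_loaders.csv_loader import UnstructuredCSVLoader

    # mode="elements" keeps per-element metadata instead of merging
    # everything into a single document.
    loader = UnstructuredCSVLoader(file_path=file_path, mode="elements")
    docs = loader.load()

    # The HTML representation is stored in the metadata, not in page_content.
    return docs[0].metadata["text_as_html"]
```

Calling print(first_table_html("./data/titanic.csv")[:1000]) would then show the beginning of the table markup.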
Output the HTML text metadata for the first document.
DataFrameLoader
Query the first 5 rows.
|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
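Once the data is in a pandas DataFrame, DataFrameLoader can turn it into documents. The sketch below assumes pandas and langchain_community are installed. The helper name load_from_dataframe and the choice of Name as the content column are illustrative and not taken from the original page.

```python
def load_from_dataframe(csv_path, content_column="Name"):
    """Read a CSV with pandas and wrap each row in a Document (hypothetical helper)."""
    # Imported lazily so the definition itself has no hard dependencies.
    import pandas as pd
    from langchain_community.document_loaders import DataFrameLoader

    df = pd.read_csv(csv_path)

    # page_content_column selects the column used as page_content;
    # the remaining columns are placed in each document's metadata.
    loader = DataFrameLoader(df, page_content_column=content_column)
    return loader.load()
```

For example, docs = load_from_dataframe("./data/titanic.csv") should give one document per row, with the passenger name as page_content and the other columns in metadata.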