12. Upstage Layout Analysis Loader
UpstageLayoutAnalysisLoader is a document analysis tool provided by Upstage AI. It is a document loader that can be used together with the LangChain framework.
Main features:
- Performs layout analysis on various types of documents, including PDFs and images
- Automatically recognizes and extracts structural elements of documents (titles, paragraphs, tables, images, etc.)
- Supports OCR (optional)
UpstageLayoutAnalysisLoader goes beyond simple text extraction to understand the structure of documents and identify relationships between elements, enabling more accurate document analysis.
Installation
Install the langchain-upstage package before using the loader.
pip install -U langchain-upstage
API Key Settings
Set the UPSTAGE_API_KEY key in your .env file.
Environment Setup
# Use a .env configuration file to manage the API key as an environment variable
from dotenv import load_dotenv

# Load the API key information
load_dotenv()
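After loading the .env file, it can help to confirm that the key is actually present before calling the loader. The following is a minimal sketch using only the standard library; the helper name `get_upstage_api_key` is hypothetical and is not part of langchain-upstage or python-dotenv.

```python
import os
from typing import Mapping, Optional


def get_upstage_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return UPSTAGE_API_KEY from the environment, or raise if it is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ so the
    function can also be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as a missing key.
    key = env.get("UPSTAGE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "UPSTAGE_API_KEY is not set; add it to your .env file "
            "and call load_dotenv() first."
        )
    return key
```

Calling `get_upstage_api_key()` right after `load_dotenv()` surfaces a missing key early, rather than as an authentication error once a document is sent for analysis.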
UpstageLayoutAnalysisLoader
Main parameters
- file_path: path of the document to analyze
- output_type: output format, one of 'html' (default) or 'text'
- split: document splitting method, one of 'none', 'element', or 'page'
- use_ocr=True: enables OCR
- exclude=["header", "footer"]: excludes headers and footers from the output
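The parameters above can be combined as in the sketch below. It assumes that langchain-upstage is installed and that UPSTAGE_API_KEY is set. The helpers `build_loader_options` and `load_with_layout_analysis` are hypothetical wrappers written for this example, the helper defaults for `split` and `use_ocr` are assumptions rather than documented loader defaults, and the file path in the commented call is a placeholder.

```python
from typing import Any, Dict, Sequence

# Allowed values, taken from the parameter list above.
OUTPUT_TYPES = ("html", "text")
SPLIT_MODES = ("none", "element", "page")


def build_loader_options(
    output_type: str = "html",
    split: str = "none",      # assumed default for this helper
    use_ocr: bool = False,    # assumed default for this helper
    exclude: Sequence[str] = ("header", "footer"),
) -> Dict[str, Any]:
    """Validate the options and return them as keyword arguments (hypothetical helper)."""
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {OUTPUT_TYPES}, got {output_type!r}")
    if split not in SPLIT_MODES:
        raise ValueError(f"split must be one of {SPLIT_MODES}, got {split!r}")
    return {
        "output_type": output_type,
        "split": split,
        "use_ocr": use_ocr,
        "exclude": list(exclude),
    }


def load_with_layout_analysis(file_path: str, **options: Any):
    """Analyze file_path and return the resulting documents.

    Requires the langchain-upstage package and a valid UPSTAGE_API_KEY.
    """
    # Imported lazily so build_loader_options works without the package installed.
    from langchain_upstage import UpstageLayoutAnalysisLoader

    loader = UpstageLayoutAnalysisLoader(file_path, **build_loader_options(**options))
    return loader.load()


# Example call (placeholder path, not a file from this tutorial):
# docs = load_with_layout_analysis(
#     "./data/sample.pdf", output_type="text", split="page", use_ocr=True
# )
# for doc in docs[:3]:
#     print(doc.page_content[:200])
```

Validating `output_type` and `split` locally catches a misspelled option before any request is made. With `split="page"`, the loader is expected to return one document per page, while `split="none"` returns the whole file as a single document.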