12. Upstage Layout Analysis Loader

UpstageLayoutAnalysisLoader is a document analysis tool provided by Upstage AI. It is a document loader that can be used with the LangChain framework.

Main features:

  • Performs layout analysis on various document types, including PDFs and images

  • Automatically recognizes and extracts structural elements of documents (titles, paragraphs, tables, images, etc.)

  • Supports OCR (optional)

UpstageLayoutAnalysisLoader goes beyond simple text extraction to understand the structure of documents and identify relationships between elements, enabling more accurate document analysis.

Installation

Install the langchain-upstage package before use.

pip install -U langchain-upstage

API Key Settings

Set the UPSTAGE_API_KEY key in your .env file.

Environment setup

# Load API key information from the .env file into environment variables
from dotenv import load_dotenv

load_dotenv()
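
After loading the .env file, it can help to confirm that the key is actually present before calling the API. The following is a minimal sketch; `require_api_key` is a hypothetical helper, not part of langchain-upstage or python-dotenv, and it only checks that the variable is set and non-empty.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    name: str = "UPSTAGE_API_KEY",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Typical use after load_dotenv():
# api_key = require_api_key()
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the loader.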

UpstageLayoutAnalysisLoader

Main parameters

  • file_path : Path of the document to analyze

  • output_type : Output format ['html' (default), 'text']

  • split : Document splitting method ['none', 'element', 'page']

  • use_ocr=True : Enables OCR

  • exclude=["header", "footer"] : Excludes headers and footers from the output
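
The parameters above could be combined as in the following sketch. It assumes that the installed version of langchain-upstage exports `UpstageLayoutAnalysisLoader` and that the constructor accepts the keyword arguments listed above. `build_loader_kwargs` and `load_layout` are hypothetical helpers introduced here for illustration, the default `split="none"` is a choice made for this sketch, and `example.pdf` is a placeholder path.

```python
from typing import Any, Dict, Sequence

# Allowed values, taken from the parameter list above.
OUTPUT_TYPES = ("html", "text")
SPLIT_MODES = ("none", "element", "page")


def build_loader_kwargs(
    file_path: str,
    output_type: str = "html",
    split: str = "none",  # default chosen for this sketch
    use_ocr: bool = True,
    exclude: Sequence[str] = ("header", "footer"),
) -> Dict[str, Any]:
    """Validate the options and return them as keyword arguments (hypothetical helper)."""
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {OUTPUT_TYPES}, got {output_type!r}")
    if split not in SPLIT_MODES:
        raise ValueError(f"split must be one of {SPLIT_MODES}, got {split!r}")
    return {
        "file_path": file_path,
        "output_type": output_type,
        "split": split,
        "use_ocr": use_ocr,
        "exclude": list(exclude),
    }


def load_layout(file_path: str, **options: Any):
    """Run the loader and return the resulting documents (hypothetical helper)."""
    # Imported lazily so the validation helper works without the package installed.
    from langchain_upstage import UpstageLayoutAnalysisLoader

    loader = UpstageLayoutAnalysisLoader(**build_loader_kwargs(file_path, **options))
    return loader.load()


# Example call (requires UPSTAGE_API_KEY and a real file; the path is a placeholder):
# docs = load_layout("example.pdf", split="page")
# for doc in docs[:3]:
#     print(doc.metadata, doc.page_content[:200])
```

Validating `output_type` and `split` locally catches typos before any request is sent to the API.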
