Layout Parsing
Use the GLM-OCR model to parse the layout of documents and images and extract their text content. It supports OCR on images and PDF documents and returns detailed layout information and visualization results.
Authorizations
Body
Model code: glm-ocr
"glm-ocr"
Image or PDF document to be recognized, supplied as a URL or as base64-encoded data. Supported formats: PDF, JPG, PNG. Images must be ≤10 MB; PDFs must be ≤50 MB and at most 100 pages.
"https://cdn.bigmodel.cn/static/logo/introduction.png"
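When the file is sent as base64 rather than as a URL, the size and format limits above can be checked locally before the upload. The following is a minimal sketch, not part of the official API: the helper name `encode_file_for_upload` is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the API is assumed to accept plain base64 text without a data-URI prefix.

```python
import base64

# Limits taken from the field description above.
# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
MAX_PDF_BYTES = 50 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png"}


def encode_file_for_upload(filename: str, data: bytes) -> str:
    """Hypothetical helper: validate a file and return it as base64 text."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or filename}")

    # PDFs get the larger limit; every other supported type is an image.
    limit = MAX_PDF_BYTES if ext == ".pdf" else MAX_IMAGE_BYTES
    if len(data) > limit:
        raise ValueError(f"file is {len(data)} bytes, limit is {limit}")

    return base64.b64encode(data).decode("ascii")
```

The 100-page PDF limit is not checked here, since counting pages would require a PDF library.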
Whether to return screenshot information
Whether to return detailed layout image result information
Start page number for parsing when a PDF is provided. Must be ≥ 1.
End page number for parsing when a PDF is provided. Must be ≥ 1.
Unique request identifier supplied by the client, 6–64 characters, used to distinguish requests. If omitted, the platform generates one by default.
Unique ID for the end user, 6–128 characters. Avoid using sensitive information.
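The constraints on the optional body fields (page numbers of at least 1, a request identifier of 6–64 characters, an end-user ID of 6–128 characters) can be enforced client-side before a request is sent. The sketch below is illustrative only: this section does not list the actual parameter names, so `start_page`, `end_page`, `request_id`, and `user_id` are placeholder names chosen for the example.

```python
from typing import List, Optional


def validate_optional_fields(
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    request_id: Optional[str] = None,
    user_id: Optional[str] = None,
) -> List[str]:
    """Return a list of constraint violations (empty when all checks pass).

    Argument names are placeholders, not the API's real parameter names.
    """
    problems = []

    # Page numbers, when given, must be at least 1.
    for name, page in (("start_page", start_page), ("end_page", end_page)):
        if page is not None and page < 1:
            problems.append(f"{name} must be >= 1, got {page}")

    # Identifier lengths follow the documented character ranges.
    if request_id is not None and not 6 <= len(request_id) <= 64:
        problems.append("request_id must be 6 to 64 characters")
    if user_id is not None and not 6 <= len(user_id) <= 128:
        problems.append("user_id must be 6 to 128 characters")

    return problems
```

Fields left as `None` are skipped, which mirrors the fact that they are optional and that the platform generates a request identifier when none is supplied.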
Response
Business processing successful
Task ID
"task_123456789"
Request creation time, Unix timestamp in seconds
1727156815
Model name
"GLM-OCR"
Recognition result in Markdown format
"# Doc title\nThis is the document content..."
Detailed layout information
Recognition result image URLs
Document basic information
Token usage statistics returned when the model call ends.
Request ID
"req_123456789"
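As a rough illustration of consuming the response, the sketch below converts the Unix creation timestamp to a UTC datetime and pulls the first line out of the Markdown result. The sample values are the examples given above; the dictionary keys (`id`, `created`, `md_results`, and so on) are assumptions made for this example, since the section does not state the real field names.

```python
from datetime import datetime, timezone

# Sample values copied from the field examples above.
# The key names are assumptions; this section does not list the real ones.
sample_response = {
    "id": "task_123456789",
    "created": 1727156815,
    "model": "GLM-OCR",
    "md_results": "# Doc title\nThis is the document content...",
    "request_id": "req_123456789",
}


def summarize(response: dict) -> dict:
    """Hypothetical helper: extract a few human-readable facts from a response."""
    # The creation time is a Unix timestamp in seconds.
    created_at = datetime.fromtimestamp(response["created"], tz=timezone.utc)

    # Take the first Markdown line and strip any leading heading markers.
    lines = response["md_results"].splitlines()
    first_line = lines[0].lstrip("# ") if lines else ""

    return {
        "task_id": response["id"],
        "created_at": created_at.isoformat(),
        "first_line": first_line,
    }


print(summarize(sample_response))
```

For the sample timestamp `1727156815`, `created_at` comes out as `2024-09-24T05:46:55+00:00`.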