Tag: ocr
Batch Screenshot Text and UI Extraction
Extract text, UI elements, and data tables from up to 500 screenshots in a single API call. The service uses a flat-fee USDC payment model via the x402 protocol.
Batch Document Intelligence Processing API
This tool processes up to 500 document images in a single batch call, providing OCR, classification, table extraction, and summarisation. It accepts an array of image URLs, with the analysis types specified in a structured JSON payload.
Mistral MCP Integration for OpenClaw
Enables OpenClaw to access Mistral-specific capabilities via the Mistral MCP server, including OCR, Codestral FIM, and Voxtral audio processing. It also provides access to durable workflows, moderation, and batch API endpoints.
PDF Invoice Data Extractor
Extracts structured line-item data from digital and scanned PDF invoices using Mistral OCR and chat models. It facilitates accounting reconciliation by parsing vendor details, dates, and VAT amounts into structured formats.
Structured Contract Analysis and Risk Scoring
This skill processes PDF or scanned contracts by running OCR with Mistral's document AI, then extracting structured data, including key clauses and associated risk levels, via JSON schema enforcement. It provides a com…
Screenpipe Local API Interface
Query local screen recordings, audio transcriptions, and UI elements via a REST API. It provides programmatic access to user activity, application usage, and visual context through searchable metadata and frame retrieval.
Batch Book Translation and OCR Pipeline
An automated pipeline for processing historical book scans through image cropping, Gemini-powered OCR, and context-aware translation. It supports both real-time Lambda workers and cost-effective Gemini Batch API integration.
MinerU Document Extractor
MinerU is a CLI tool and agent skill for reliable document parsing, converting PDFs, scanned documents, images, and Word/PowerPoint/Excel files into Markdown, HTML, LaTeX, or DOCX. It offers both a fast, zero-setup mode and a precision mode…
Comprehensive PDF Processing and Manipulation Skill
This skill handles PDF documents, covering operations such as extracting text, tables, and images, or performing structural modifications like merging, splitting, rotating, and encrypting fi…