Tag: data-extraction

skill ★ 7,851

Analyze existing machine learning baseline implementation

This skill systematically analyzes a given experiment directory to extract comprehensive metadata about the existing ML pipeline. It captures details such as model parameters, preprocessing steps, dependencies, and data characteristics, log…

Upsonic/Upsonic analyze-ml baseline-analysis experiment-setup data-extraction
skill ★ 33,136

REST API Caller and JSON Data Parser

This skill enables the agent to fetch data from external REST endpoints, managing authentication and executing Python requests. It intelligently parses the resulting JSON payload to extract and summarise only the necessary fields for the us…

OpenBMB/ChatDev rest-api json-parsing http-requests data-extraction
skill

Automated Financial Report Analysis and Deep Reporting

This skill automates the comprehensive analysis of listed company financial reports by extracting core metrics, calculating key ratios, and generating visualizations. It synthesises a professional report, providing deep insights into profit…

csunny/DB-GPT financial-analysis report-generation data-extraction finance
tool ★ 2

ShopGraph E-commerce Product Data Extraction

An MCP server for extracting structured product information from e-commerce URLs or raw HTML using Schema.org and LLM-powered extraction. It provides per-field confidence scores and supports configurable confidence thresholds for high-relia…

laundromatic/shopgraph mcp ecommerce data-extraction web-scraping
tool ★ 1,164

Robust web extraction and data scraping engine

This comprehensive tool provides reliable web content extraction, featuring automatic anti-bot bypass for protected sites. It supports advanced functions including deep crawling, structured data extraction via JSON schema, and content change…

0xMassi/webclaw web-scraping data-extraction anti-bot crawling
tool ★ 1,164

Robust web extraction with anti-bot bypass

This robust web extraction tool handles complex anti-bot measures, reliably scraping content from protected sites. It supports advanced features like structured data extraction, full site crawling, and content change detection.

0xMassi/webclaw web-scraping anti-bot data-extraction crawling
tool

Screenshot to Structured Data Extraction

Extracts text, UI layouts, tables, and charts from screenshots into structured JSON. The service supports multiple extraction modes and accepts payment via x402 on the Base network.

ntriq-gh/ntriq-agentshop screenshot-processing data-extraction computer-vision structured-data
tool

Local AI Document Intelligence and Extraction Tool

This tool performs local AI vision analysis on document images, enabling developers to extract structured data, classify document types, or summarise content without requiring cloud uploads or API keys. It supports multiple analysis modes i…

ntriq-gh/ntriq-agentshop document-intel image-analysis data-extraction llm
tool

Blueprint Intelligence: Extracting Architectural Data

This tool uses local vision AI to parse architectural blueprints and floor plans, extracting structured data such as room dimensions, materials, and structural elements. It accepts image URLs or base64 inputs and returns detailed JSON analy…

ntriq-gh/ntriq-agentshop blueprint floor-plan vision-ai architecture
skill ★ 6

PDF Invoice Data Extractor

Extracts structured line-item data from digital and scanned PDF invoices using Mistral OCR and chat models. It facilitates accounting reconciliation by parsing vendor details, dates, and VAT amounts into structured formats.

Swih/mistral-mcp pdf-extraction ocr mistral-ai invoice-processing
tool ★ 87

Comprehensive X (Twitter) API Integration Tool

This tool provides comprehensive access to X (Twitter) data, enabling advanced read, write, and extraction operations via a dedicated API key. It supports complex workflows such as user lookups, bulk media downloading, monitoring, and posti…

Xquik-dev/x-twitter-scraper x-twitter api social-media data-extraction
skill ★ 23

Structured import of financial documents and transactions

This skill ingests diverse financial inputs—including CSV statements, receipt images, and invoice PDFs—and performs structured extraction of transactions, contacts, and line items. It ensures full data provenance by storing the raw source d…

markmhendrickson/neotoma finance data-extraction pdf-parsing structured-data
skill ★ 23

Import Emails and Extract Structured Entities

This skill connects to an email MCP to ingest emails into persistent memory. It systematically extracts structured entities—including contacts, tasks, events, and transactions—while maintaining full provenance via source quoting and unique …

markmhendrickson/neotoma email-import data-extraction memory-persistence structured-data
skill ★ 23

Import and structure chat history from various sources

This skill ingests conversation transcripts from diverse sources—including ChatGPT, Claude, and Slack exports—and structures them into persistent memory. It extracts key entities such as decisions, tasks, and contacts, reconstructing a trac…

markmhendrickson/neotoma chat-history conversation-import data-extraction llm
skill ★ 23

Persist Calendar Events and Commitments

This skill ingests scheduling commitments from various sources, including live calendar APIs and ICS files. It extracts and persists structured entities like events, contacts, and locations into durable memory, maintaining full provenance.

markmhendrickson/neotoma calendar-sync event-import data-extraction scheduling
skill

Extracting Actionable Research Insights

This skill parses research materials such as PDFs, notebooks, or text ideas to extract method summaries, implementation requirements, and compatibility analysis. It records findings as structured JSON entries within an experiment log to fac…

Upsonic/gpt-computer-assistant research-extraction information-extraction applied-science experiment-tracking
tool ★ 21,652

AI-Powered Browser Automation CLI

An agentic CLI tool for automating complex web workflows, including form filling, data extraction, and navigating dynamic content. It supports both deterministic Playwright actions and autonomous AI-driven exploration.

Skyvern-AI/skyvern browser-automation web-scraping ai-agents playwright
tool ★ 21,652

AI-powered browser automation and extraction

An AI-driven engine that uses LLMs and computer vision to automate web workflows, extract structured data, and manage browser sessions. It integrates via Python and TypeScript SDKs, a REST API, and an MCP server.

Skyvern-AI/skyvern browser-automation web-scraping mcp llm
skill ★ 4

Extract Financial Data from Annual Accounts

Extracts key financial metrics such as revenue, profit, and total assets from the most recent annual accounts of a specified company. It processes iXBRL, XBRL, and PDF filings to retrieve headline figures directly from the registry's archiv…

sophymarine/openregistry financial-extraction annual-accounts xbrl corporate-data