Tag: bioinformatics
Annotating Genetic Variants with ENCODE Data
Annotate genetic variants with ENCODE functional genomics data to identify causal regulatory elements and link them to target genes. The skill supports the full post-GWAS workflow, including tissue-specific mapping, enrichment testing, and …
Manage and track ENCODE experiments locally
This skill enables local management of ENCODE experiments by tracking metadata, publications, and data provenance in a SQLite database. It supports experiment comparison, citation export in multiple formats, and the logging of derived files…
Analyzing ENCODE single-cell genomics data
This skill guides the retrieval and analysis of single-cell genomics data (scRNA-seq and scATAC-seq) from the ENCODE project. It provides detailed protocols for data structure, quality assessment, and integrating single-cell profiles with b…
ENCODE Toolkit Setup and Configuration
Provides instructions for installing, configuring, and verifying the ENCODE Toolkit MCP server. It covers installation via uvx or pip, managing credentials, and testing connections using metadata and search queries.
ENCODE Hi-C Data Processing Pipeline
This Nextflow pipeline processes Hi-C FASTQ files to generate multi-resolution contact matrices and chromatin loop calls. It integrates BWA, pairtools, Juicer, and cooler for end-to-end chromatin conformation analysis.
ENCODE pipeline workflow generation and management
This skill facilitates the generation and execution of standardized ENCODE bioinformatics pipelines, supporting custom Nextflow or WDL workflows. It manages compute resource requirements and deployment across local, HPC, and major cloud pla…
ENCODE Peak Annotation and Functional Enrichment
Annotate ENCODE genomic peaks with regulatory features and nearby genes using ChIPseeker and GREAT. The workflow enables genomic feature distribution analysis and functional enrichment via clusterProfiler.
ENCODE Multi-Omics Data Integration
Integrates multiple ENCODE data types, including RNA-seq, ATAC-seq, and ChIP-seq, to construct a comprehensive regulatory landscape for specific tissues or cell types. It enables chromatin state annotation, enhancer-gene linkage, and the ch…
Aggregate DNA Methylation Data Across Studies
Construct comprehensive tissue-level DNA methylation landscapes by aggregating WGBS data from multiple ENCODE experiments. The process includes quality-gating, coverage filtering, and the identification of hypomethylated regions and partial…
Genomic coordinate liftover and assembly conversion guide
This skill provides a workflow for safely converting genomic coordinates between assembly versions (e.g., hg19 to hg38). It guides the use of established tools such as UCSC liftOver and CrossMap, ensuring full pro…
ENCODE Multi-omic Data Integration
Provides a framework for planning and executing integrative analyses across multiple ENCODE experiments, including multi-omic and cross-sample workflows. It assists with compatibility verification, integration strategy selection, and the re…
NCBI GEO and ENCODE Connector
Facilitates searching, querying, and cross-referencing NCBI GEO datasets with ENCODE experiments to identify complementary epigenomic data. It supports retrieving metadata, series matrices, and supplementary files via E-utilities and FTP.
Comprehensive Epigenomic Profiling with ENCODE
Assemble comprehensive epigenomic profiles for specific tissues or cell types by systematically gathering histone modifications, chromatin accessibility, and DNA methylation data from ENCODE. It enables the characterisation of chromatin sta…
Downloading ENCODE Genomics Data Files
Facilitates the automated retrieval and organisation of ENCODE genomics files, such as BED, FASTQ, and BAM, to a local filesystem. Supports batch processing with MD5 verification and configurable directory structures based on experiment or …
Cross-reference ENCODE with scientific databases
Link ENCODE genomic data to external scientific databases such as PubMed, ClinicalTrials.gov, and Open Targets. This skill facilitates building translational pipelines by connecting regulatory elements to clinical research and variant annot…
Comprehensive Genomics Provenance Tracking
This skill establishes a rigorous audit trail for genomic analyses, logging every operation (including tool versions, parameters, and environment details) to support reproducibility. It structures the data to auto-generate publication…
Annotating ENCODE Regulatory Variants with ClinVar
Cross-reference ENCODE functional genomic elements with ClinVar clinical variant classifications to identify pathogenic variants in regulatory regions. This skill enables both forward annotation of ENCODE peaks and reverse analysis of ClinV…
Set up reproducible bioinformatics environments for ENCODE
This skill provisions fully reproducible, version-pinned conda environments and associated scripts for comprehensive ENCODE data analysis. It manages dependencies across multiple modalities (RNA-seq, ChIP-seq, ATAC-seq) using tools like STA…
ENCODE experiment tracking and provenance management
Track and manage local collections of ENCODE experiments, including metadata, publications, and data provenance. It supports experiment comparison, citation management, and exporting datasets to CSV, TSV, or JSON formats.
Searching and Exploring ENCODE Genomics Data
Provides a structured strategy for searching and exploring ENCODE Project genomics data using facets and metadata. It facilitates the discovery of experiments, files, and specific biological parameters like assays, organs, and cell lines.
Characterise Regulatory Elements with ENCODE Data
Identify and characterise candidate cis-regulatory elements using ENCODE datasets and the cCRE catalog. The skill enables the discovery of active enhancers, promoter state mapping, and super-enhancer identification using ChromHMM and ROSE.
ENCODE Hi-C Processing Pipeline
Executes the ENCODE Hi-C pipeline using Nextflow to transform FASTQ files into multi-resolution contact matrices and loop calls. The pipeline supports local, SLURM, and cloud-based deployments via Docker.
ENCODE RNA-seq Quantification Pipeline
This tool executes a comprehensive, ENCODE-standard RNA-seq pipeline using Nextflow. It processes raw FASTQ data through STAR alignment, generating gene/transcript quantification (TPM/FPKM) and strand-specific bigWig signal tracks.
ENCODE Pipeline Workflow Generation and Management
This skill facilitates the generation and execution of complex ENCODE bioinformatics workflows, supporting custom Nextflow and WDL pipelines. It manages compute resource requirements and deployment across local, HPC, and major cloud platfor…