Monitoring AI Agent Skills
AI agent skills for application monitoring: logging, alerting, metrics collection, and observability workflows.
54 listings
Grafana Dashboards
Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.
Hugging Face Model Trainer
This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence.
Ntfy Me MCP
MCP Server: ntfy-me-mcp provides AI assistants with the ability to send real-time notifications to your devices through the ntfy service (either public or self-hosted with token support). Get notified when your AI completes tasks, encounters errors, or reaches important milestones - all without constant monitoring. The server includes intelligent features like automatic URL detection for creating view actions
Observability Monitoring Slo Implement
You are an SLO (Service Level Objective) expert specializing in implementing reliability standards and error budget-based practices. Design SLO frameworks, define SLIs, and build monitoring that balances reliability with delivery velocity.
Kpi Dashboard Design
Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.
Service Mesh Observability
Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.
Azure AI Anomalydetector Java
Build anomaly detection applications with Azure AI Anomaly Detector SDK for Java. Use when implementing univariate/multivariate anomaly detection, time-series analysis, or AI-powered monitoring.
Database Migrations Migration Observability
Migration monitoring, CDC, and observability infrastructure
Error Diagnostics Error Trace
You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging,
Datadog Automation
Automate Datadog tasks via Rube MCP (Composio): query metrics, search logs, manage monitors/dashboards, create events and downtimes. Always search tools first for current schemas.
Model Context Protocol Server for Home Assistant
MCP Server: The server uses the MCP protocol to share access to a local Home Assistant instance with an LLM application. A powerful bridge between your Home Assistant instance and Large Language Models (LLMs), enabling natural language control and monitoring of your smart home devices through the Model Context Protocol (MCP). This server provides a comprehensive API for managing your entire Home Assistant
ThinQ Connect MCP Server (Beta)
MCP Server: This is the official MCP (Model Context Protocol) server for LG ThinQ devices. This server provides integrated control capabilities including status monitoring, device control, and profile information for various LG ThinQ devices, built on the LG ThinQ API and Python Open SDK. The MCP connection method is stdio.
Azure Mgmt Applicationinsights Dotnet
Database Optimizer
Expert database optimizer specializing in modern performance
Docker MCP
MCP Server: A powerful Model Context Protocol (MCP) server for Docker operations, enabling seamless container and compose stack management through Claude AI. Features: container creation and instantiation, Docker Compose stack deployment, container logs retrieval, and container listing and status monitoring.
Devops Troubleshooter
Expert DevOps troubleshooter specializing in rapid incident
ML Engineer
Build production ML systems with PyTorch 2.x, TensorFlow, and
Gitlab Automation
Automate GitLab project management, issues, merge requests, pipelines, branches, and user operations via Rube MCP (Composio). Always search tools first for current schemas.
Inspektor Gadget MCP Server
MCP Server: AI-powered debugging and inspection for Kubernetes clusters using Inspektor Gadget. Features: AI-powered interface for Kubernetes troubleshooting and monitoring, one-click Inspektor Gadget deployment and removal, intelligent output summarization and analysis, and automatic gadget discovery from Artifact Hub. Setup requires Docker and a valid kubeconfig file, with the MCP server configured in VS Code.
LLM App Patterns
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, building agents, or setting up LLM observability.
Mlops Engineer
Build comprehensive ML pipelines, experiment tracking, and model
Performance Engineer
Expert performance engineer specializing in modern observability,
Prometheus Configuration
Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.
System Plugin
Plugin. Purpose: System configuration and health monitoring. Category: System. Version: 1.0.0. The System plugin provides essential commands for configuring, monitoring, and maintaining Claude Code installations. These are system-level operations that configure or monitor Claude Code itself. Purpose: Unified view of work, system, and memory state. Provides comprehensive status overview including: - Active wor