mdskills

Monitoring AI Agent Skills

AI agent skills for application monitoring: logging, alerting, metrics collection, and observability workflows.

54 listings

Grafana Dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

8.0 · 1 weekly · sickn33/antigravity-awesome-skills

Hugging Face Model Trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence.

9.0 · 1 weekly · huggingface/skills

Ntfy Me MCP

MCP Server

ntfy-me-mcp provides AI assistants with the ability to send real-time notifications to your devices through the ntfy service (either public or self-hosted with token support). Get notified when your AI completes tasks, encounters errors, or reaches important milestones, all without constant monitoring. The server includes intelligent features like automatic URL detection for creating view actions

7.0 · 1 weekly · gitmotion/ntfy-me-mcp

Observability Monitoring Slo Implement

You are an SLO (Service Level Objective) expert specializing in implementing reliability standards and error budget-based practices. Design SLO frameworks, define SLIs, and build monitoring that balances reliability with delivery velocity.

5.0 · sickn33/antigravity-awesome-skills

Kpi Dashboard Design

Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.

7.0 · sickn33/antigravity-awesome-skills

Service Mesh Observability

Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.

8.0 · sickn33/antigravity-awesome-skills

Azure AI Anomalydetector Java

Build anomaly detection applications with Azure AI Anomaly Detector SDK for Java. Use when implementing univariate/multivariate anomaly detection, time-series analysis, or AI-powered monitoring.

6.0 · sickn33/antigravity-awesome-skills

Database Migrations Migration Observability

Migration monitoring, CDC, and observability infrastructure

8.0 · sickn33/antigravity-awesome-skills

Error Diagnostics Error Trace

You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging,

6.0 · sickn33/antigravity-awesome-skills

Datadog Automation

Automate Datadog tasks via Rube MCP (Composio): query metrics, search logs, manage monitors/dashboards, create events and downtimes. Always search tools first for current schemas.

9.0 · sickn33/antigravity-awesome-skills

Model Context Protocol Server for Home Assistant

MCP Server

The server uses MCP to share access to a local Home Assistant instance with an LLM application. It is a bridge between your Home Assistant instance and Large Language Models (LLMs), enabling natural language control and monitoring of your smart home devices through the Model Context Protocol (MCP). This server provides a comprehensive API for managing your entire Home Assistant

8.0 · tevonsb/homeassistant-mcp

ThinQ Connect MCP Server (Beta)

MCP Server

This is the official MCP (Model Context Protocol) server for LG ThinQ devices. It provides integrated control capabilities, including status monitoring, device control, and profile information for various LG ThinQ devices, and is built on the LG ThinQ API and Python Open SDK. The MCP connection method is stdio.

8.0 · thinq-connect/thinqconnect-mcp

Azure Mgmt Applicationinsights Dotnet

6.0 · sickn33/antigravity-awesome-skills

Database Optimizer

Expert database optimizer specializing in modern performance

7.0 · sickn33/antigravity-awesome-skills

Docker MCP

MCP Server

A powerful Model Context Protocol (MCP) server for Docker operations, enabling seamless container and compose stack management through Claude AI. Features: container creation and instantiation, Docker Compose stack deployment, container logs retrieval, and container listing and status monitoring.

8.0 · QuantGeekDev/docker-mcp

Devops Troubleshooter

Expert DevOps troubleshooter specializing in rapid incident

7.0 · sickn33/antigravity-awesome-skills

ML Engineer

Build production ML systems with PyTorch 2.x, TensorFlow, and

7.0 · sickn33/antigravity-awesome-skills

Gitlab Automation

Automate GitLab project management, issues, merge requests, pipelines, branches, and user operations via Rube MCP (Composio). Always search tools first for current schemas.

8.0 · sickn33/antigravity-awesome-skills

Inspektor Gadget MCP Server

MCP Server

AI-powered debugging and inspection for Kubernetes clusters using Inspektor Gadget. Features: an AI-powered interface for Kubernetes troubleshooting and monitoring; one-click Inspektor Gadget deployment and removal; intelligent output summarization and analysis; automatic gadget discovery from Artifact Hub. To get started: 1. Ensure you have Docker and a valid kubeconfig file. 2. Configure the MCP server in VS Code (see IN

8.0 · inspektor-gadget/ig-mcp-server

LLM App Patterns

Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, building agents, or setting up LLM observability.

6.0 · sickn33/antigravity-awesome-skills

Mlops Engineer

Build comprehensive ML pipelines, experiment tracking, and model

6.0 · sickn33/antigravity-awesome-skills

Performance Engineer

Expert performance engineer specializing in modern observability,

8.0 · sickn33/antigravity-awesome-skills

Prometheus Configuration

Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.

8.0 · sickn33/antigravity-awesome-skills

System Plugin

Plugin

Purpose: System configuration and health monitoring. Category: System. Version: 1.0.0. The System plugin provides essential commands for configuring, monitoring, and maintaining Claude Code installations. These are system-level operations that configure or monitor Claude Code itself. Purpose: Unified view of work, system, and memory state. Provides comprehensive status overview including: - Active wor

7.0 · applied-artificial-intelligence/claude-code-toolkit