Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
Add this skill: `npx mdskills install sickn33/ml-pipeline-workflow`
---
name: ml-pipeline-workflow
description: Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
---

# ML Pipeline Workflow

Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.

## Do not use this skill when

- The task is unrelated to ML pipeline workflows
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Overview

This skill provides comprehensive guidance for building production ML pipelines that handle the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.

## Use this skill when

- Building new ML pipelines from scratch
- Designing workflow orchestration for ML systems
- Implementing data → model → deployment automation
- Setting up reproducible training workflows
- Creating DAG-based ML orchestration
- Integrating ML components into production systems

## What This Skill Provides

### Core Capabilities

1. **Pipeline Architecture**
   - End-to-end workflow design
   - DAG orchestration patterns (Airflow, Dagster, Kubeflow)
   - Component dependencies and data flow
   - Error handling and retry strategies

2. **Data Preparation**
   - Data validation and quality checks
   - Feature engineering pipelines
   - Data versioning and lineage
   - Train/validation/test splitting strategies

3. **Model Training**
   - Training job orchestration
   - Hyperparameter management
   - Experiment tracking integration
   - Distributed training patterns

4. **Model Validation**
   - Validation frameworks and metrics
   - A/B testing infrastructure
   - Performance regression detection
   - Model comparison workflows

5. **Deployment Automation**
   - Model serving patterns
   - Canary deployments
   - Blue-green deployment strategies
   - Rollback mechanisms

### Reference Documentation

See the `references/` directory for detailed guides:

- **data-preparation.md** - Data cleaning, validation, and feature engineering
- **model-training.md** - Training workflows and best practices
- **model-validation.md** - Validation strategies and metrics
- **model-deployment.md** - Deployment patterns and serving architectures

### Assets and Templates

The `assets/` directory contains:

- **pipeline-dag.yaml.template** - DAG template for workflow orchestration
- **training-config.yaml** - Training configuration template
- **validation-checklist.md** - Pre-deployment validation checklist

## Usage Patterns

### Basic Pipeline Setup

```python
# 1. Define pipeline stages
stages = [
    "data_ingestion",
    "data_validation",
    "feature_engineering",
    "model_training",
    "model_validation",
    "model_deployment"
]

# 2. Configure dependencies
# See assets/pipeline-dag.yaml.template for full example
```
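The stage list above can be wired into a minimal dependency-aware runner using the standard library's topological sorter. This is an illustrative sketch only: the dependency map and the `run_stage` placeholder are assumptions for the example, not part of the skill's assets.

```python
# Minimal dependency-aware pipeline runner (illustrative sketch).
# Stage names match the list above; run_stage is a placeholder for
# the real stage implementations.
from graphlib import TopologicalSorter  # Python 3.9+

dependencies = {
    "data_ingestion": [],
    "data_validation": ["data_ingestion"],
    "feature_engineering": ["data_validation"],
    "model_training": ["feature_engineering"],
    "model_validation": ["model_training"],
    "model_deployment": ["model_validation"],
}

def run_stage(name: str) -> None:
    # Placeholder: invoke the real stage implementation here.
    print(f"running {name}")

def run_pipeline(deps: dict[str, list[str]]) -> list[str]:
    """Execute stages in dependency order; returns the execution order."""
    order = list(TopologicalSorter(deps).static_order())
    for stage in order:
        run_stage(stage)
    return order

if __name__ == "__main__":
    run_pipeline(dependencies)
```

In a real deployment this ordering would come from the orchestrator (Airflow, Dagster, Kubeflow) rather than hand-rolled code, but the sketch shows the shape of the dependency graph that `assets/pipeline-dag.yaml.template` is meant to capture.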
### Production Workflow

1. **Data Preparation Phase**
   - Ingest raw data from sources
   - Run data quality checks
   - Apply feature transformations
   - Version processed datasets

2. **Training Phase**
   - Load versioned training data
   - Execute training jobs
   - Track experiments and metrics
   - Save trained models

3. **Validation Phase**
   - Run validation test suite
   - Compare against baseline
   - Generate performance reports
   - Approve for deployment

4. **Deployment Phase**
   - Package model artifacts
   - Deploy to serving infrastructure
   - Configure monitoring
   - Validate production traffic

## Best Practices

### Pipeline Design

- **Modularity**: Each stage should be independently testable
- **Idempotency**: Re-running stages should be safe
- **Observability**: Log metrics at every stage
- **Versioning**: Track data, code, and model versions
- **Failure Handling**: Implement retry logic and alerting

### Data Management

- Use data validation libraries (Great Expectations, TFX)
- Version datasets with DVC or similar tools
- Document feature engineering transformations
- Maintain data lineage tracking

### Model Operations

- Separate training and serving infrastructure
- Use model registries (MLflow, Weights & Biases)
- Implement gradual rollouts for new models
- Monitor model performance drift
- Maintain rollback capabilities

### Deployment Strategies

- Start with shadow deployments
- Use canary releases for validation
- Implement A/B testing infrastructure
- Set up automated rollback triggers
- Monitor latency and throughput

## Integration Points

### Orchestration Tools

- **Apache Airflow**: DAG-based workflow orchestration
- **Dagster**: Asset-based pipeline orchestration
- **Kubeflow Pipelines**: Kubernetes-native ML workflows
- **Prefect**: Modern dataflow automation

### Experiment Tracking

- MLflow for experiment tracking and model registry
- Weights & Biases for visualization and collaboration
- TensorBoard for training metrics

### Deployment Platforms

- AWS SageMaker for managed ML infrastructure
- Google Vertex AI for GCP deployments
- Azure ML for Azure cloud
- Kubernetes + KServe for cloud-agnostic serving

## Progressive Disclosure

Start with the basics and gradually add complexity:

1. **Level 1**: Simple linear pipeline (data → train → deploy)
2. **Level 2**: Add validation and monitoring stages
3. **Level 3**: Implement hyperparameter tuning
4. **Level 4**: Add A/B testing and gradual rollouts
5. **Level 5**: Multi-model pipelines with ensemble strategies
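Two of the pipeline-design practices above (idempotency and versioning) can be combined into a small stage wrapper that derives a deterministic output path from the stage's configuration and skips work already done. This is a sketch under assumed conventions: the artifact layout and hashing scheme are hypothetical, not something the skill's assets define.

```python
# Sketch of an idempotent, versioned stage wrapper (illustrative only).
import hashlib
import json
from pathlib import Path

def stage_output_path(root: Path, stage: str, config: dict) -> Path:
    """Derive a deterministic output path from the stage name and config,
    so re-runs with identical inputs map to the same artifact directory."""
    digest = hashlib.sha256(
        json.dumps(config, sort_keys=True).encode()
    ).hexdigest()[:12]
    return root / stage / digest

def run_idempotent(root: Path, stage: str, config: dict, fn) -> Path:
    """Skip the stage if its versioned output already exists (safe re-runs)."""
    out = stage_output_path(root, stage, config)
    if out.exists():
        return out  # already computed: re-running is a no-op
    out.mkdir(parents=True)
    fn(config, out)  # the real stage writes its artifacts under `out`
    return out
```

Because the path is a pure function of the stage name and its configuration, re-running the pipeline after a partial failure redoes only the stages whose outputs are missing.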
## Common Patterns

### Batch Training Pipeline

```yaml
# See assets/pipeline-dag.yaml.template
stages:
  - name: data_preparation
    dependencies: []
  - name: model_training
    dependencies: [data_preparation]
  - name: model_evaluation
    dependencies: [model_training]
  - name: model_deployment
    dependencies: [model_evaluation]
```

### Real-time Feature Pipeline

```python
# Stream processing for real-time features
# Combined with batch training
# See references/data-preparation.md
```

### Continuous Training

```python
# Automated retraining on schedule
# Triggered by data drift detection
# See references/model-training.md
```

## Troubleshooting

### Common Issues

- **Pipeline failures**: Check dependencies and data availability
- **Training instability**: Review hyperparameters and data quality
- **Deployment issues**: Validate model artifacts and serving config
- **Performance degradation**: Monitor data drift and model metrics

### Debugging Steps

1. Check pipeline logs for each stage
2. Validate input/output data at boundaries
3. Test components in isolation
4. Review experiment tracking metrics
5. Inspect model artifacts and metadata
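The drift trigger mentioned under Continuous Training and Performance degradation can start out very simple, such as comparing a live feature sample against the training reference distribution. The sketch below uses a standardized mean shift; the 0.2 threshold is an arbitrary assumption for illustration, and production systems typically use richer statistics (PSI, KS tests) per feature.

```python
# Naive data-drift check: compare a live feature sample against a
# training-time reference (illustrative; the threshold is an assumption).
import math

def standardized_mean_shift(reference: list[float], live: list[float]) -> float:
    """Absolute mean difference in units of the reference std deviation."""
    n = len(reference)
    mean_ref = sum(reference) / n
    var_ref = sum((x - mean_ref) ** 2 for x in reference) / n
    std_ref = math.sqrt(var_ref) or 1.0  # guard against zero variance
    mean_live = sum(live) / len(live)
    return abs(mean_live - mean_ref) / std_ref

def should_retrain(reference: list[float], live: list[float],
                   threshold: float = 0.2) -> bool:
    """True when drift exceeds the threshold and retraining should trigger."""
    return standardized_mean_shift(reference, live) > threshold
```

A scheduler (or the orchestrator itself) would evaluate `should_retrain` on recent traffic and kick off the training DAG when it returns `True`.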
## Next Steps

After setting up your pipeline:

1. Explore the **hyperparameter-tuning** skill for optimization
2. Learn **experiment-tracking-setup** for MLflow/W&B
3. Review **model-deployment-patterns** for serving strategies
4. Implement monitoring with observability tools

## Related Skills

- **experiment-tracking-setup**: MLflow and Weights & Biases integration
- **hyperparameter-tuning**: Automated hyperparameter optimization
- **model-deployment-patterns**: Advanced deployment strategies