Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
Add this skill: `npx mdskills install sickn33/ml-pipeline-workflow`
---
name: ml-pipeline-workflow
description: Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
---

# ML Pipeline Workflow

Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.

## Do not use this skill when

- The task is unrelated to ML pipeline workflows
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Overview

This skill provides comprehensive guidance for building production ML pipelines that handle the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.

## Use this skill when

- Building new ML pipelines from scratch
- Designing workflow orchestration for ML systems
- Implementing data → model → deployment automation
- Setting up reproducible training workflows
- Creating DAG-based ML orchestration
- Integrating ML components into production systems

## What This Skill Provides

### Core Capabilities

1. **Pipeline Architecture**
   - End-to-end workflow design
   - DAG orchestration patterns (Airflow, Dagster, Kubeflow)
   - Component dependencies and data flow
   - Error handling and retry strategies

2. **Data Preparation**
   - Data validation and quality checks
   - Feature engineering pipelines
   - Data versioning and lineage
   - Train/validation/test splitting strategies

3. **Model Training**
   - Training job orchestration
   - Hyperparameter management
   - Experiment tracking integration
   - Distributed training patterns

4. **Model Validation**
   - Validation frameworks and metrics
   - A/B testing infrastructure
   - Performance regression detection
   - Model comparison workflows

5. **Deployment Automation**
   - Model serving patterns
   - Canary deployments
   - Blue-green deployment strategies
   - Rollback mechanisms

### Reference Documentation

See the `references/` directory for detailed guides:

- **data-preparation.md** - Data cleaning, validation, and feature engineering
- **model-training.md** - Training workflows and best practices
- **model-validation.md** - Validation strategies and metrics
- **model-deployment.md** - Deployment patterns and serving architectures

### Assets and Templates

The `assets/` directory contains:

- **pipeline-dag.yaml.template** - DAG template for workflow orchestration
- **training-config.yaml** - Training configuration template
- **validation-checklist.md** - Pre-deployment validation checklist

## Usage Patterns

### Basic Pipeline Setup

```python
# 1. Define pipeline stages
stages = [
    "data_ingestion",
    "data_validation",
    "feature_engineering",
    "model_training",
    "model_validation",
    "model_deployment"
]

# 2. Configure dependencies
# See assets/pipeline-dag.yaml.template for full example
```
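The stage list above can be wired into a minimal dependency-aware runner using the standard library's topological sorter. This is an illustrative sketch only: the dependency map and the `run_stage` placeholder are assumptions for the example, not part of the skill's assets.

```python
# Minimal dependency-aware pipeline runner (illustrative sketch).
# Stage names match the list above; run_stage is a placeholder for
# the real stage implementations.
from graphlib import TopologicalSorter  # Python 3.9+

dependencies = {
    "data_ingestion": [],
    "data_validation": ["data_ingestion"],
    "feature_engineering": ["data_validation"],
    "model_training": ["feature_engineering"],
    "model_validation": ["model_training"],
    "model_deployment": ["model_validation"],
}

def run_stage(name: str) -> None:
    # Placeholder: invoke the real stage implementation here.
    print(f"running {name}")

def run_pipeline(deps: dict[str, list[str]]) -> list[str]:
    """Execute stages in dependency order; returns the execution order."""
    order = list(TopologicalSorter(deps).static_order())
    for stage in order:
        run_stage(stage)
    return order

if __name__ == "__main__":
    run_pipeline(dependencies)
```

In a real deployment this ordering would come from the orchestrator (Airflow, Dagster, Kubeflow) rather than hand-rolled code, but the sketch shows the shape of the dependency graph that `assets/pipeline-dag.yaml.template` is meant to capture.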
### Production Workflow

1. **Data Preparation Phase**
   - Ingest raw data from sources
   - Run data quality checks
   - Apply feature transformations
   - Version processed datasets

2. **Training Phase**
   - Load versioned training data
   - Execute training jobs
   - Track experiments and metrics
   - Save trained models

3. **Validation Phase**
   - Run validation test suite
   - Compare against baseline
   - Generate performance reports
   - Approve for deployment

4. **Deployment Phase**
   - Package model artifacts
   - Deploy to serving infrastructure
   - Configure monitoring
   - Validate production traffic

## Best Practices

### Pipeline Design

- **Modularity**: Each stage should be independently testable
- **Idempotency**: Re-running stages should be safe
- **Observability**: Log metrics at every stage
- **Versioning**: Track data, code, and model versions
- **Failure Handling**: Implement retry logic and alerting

### Data Management

- Use data validation libraries (Great Expectations, TFX)
- Version datasets with DVC or similar tools
- Document feature engineering transformations
- Maintain data lineage tracking

### Model Operations

- Separate training and serving infrastructure
- Use model registries (MLflow, Weights & Biases)
- Implement gradual rollouts for new models
- Monitor model performance drift
- Maintain rollback capabilities

### Deployment Strategies

- Start with shadow deployments
- Use canary releases for validation
- Implement A/B testing infrastructure
- Set up automated rollback triggers
- Monitor latency and throughput

## Integration Points

### Orchestration Tools

- **Apache Airflow**: DAG-based workflow orchestration
- **Dagster**: Asset-based pipeline orchestration
- **Kubeflow Pipelines**: Kubernetes-native ML workflows
- **Prefect**: Modern dataflow automation

### Experiment Tracking

- MLflow for experiment tracking and model registry
- Weights & Biases for visualization and collaboration
- TensorBoard for training metrics

### Deployment Platforms

- AWS SageMaker for managed ML infrastructure
- Google Vertex AI for GCP deployments
- Azure ML for Azure cloud
- Kubernetes + KServe for cloud-agnostic serving

## Progressive Disclosure

Start with the basics and gradually add complexity:

1. **Level 1**: Simple linear pipeline (data → train → deploy)
2. **Level 2**: Add validation and monitoring stages
3. **Level 3**: Implement hyperparameter tuning
4. **Level 4**: Add A/B testing and gradual rollouts
5. **Level 5**: Multi-model pipelines with ensemble strategies
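Two of the pipeline-design practices above (idempotency and versioning) can be combined into a small stage wrapper that derives a deterministic output path from the stage's configuration and skips work already done. This is a sketch under assumed conventions: the artifact layout and hashing scheme are hypothetical, not something the skill's assets define.

```python
# Sketch of an idempotent, versioned stage wrapper (illustrative only).
import hashlib
import json
from pathlib import Path

def stage_output_path(root: Path, stage: str, config: dict) -> Path:
    """Derive a deterministic output path from the stage name and config,
    so re-runs with identical inputs map to the same artifact directory."""
    digest = hashlib.sha256(
        json.dumps(config, sort_keys=True).encode()
    ).hexdigest()[:12]
    return root / stage / digest

def run_idempotent(root: Path, stage: str, config: dict, fn) -> Path:
    """Skip the stage if its versioned output already exists (safe re-runs)."""
    out = stage_output_path(root, stage, config)
    if out.exists():
        return out  # already computed: re-running is a no-op
    out.mkdir(parents=True)
    fn(config, out)  # the real stage writes its artifacts under `out`
    return out
```

Because the path is a pure function of the stage name and its configuration, re-running the pipeline after a partial failure redoes only the stages whose outputs are missing.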
## Common Patterns

### Batch Training Pipeline

```yaml
# See assets/pipeline-dag.yaml.template
stages:
  - name: data_preparation
    dependencies: []
  - name: model_training
    dependencies: [data_preparation]
  - name: model_evaluation
    dependencies: [model_training]
  - name: model_deployment
    dependencies: [model_evaluation]
```

### Real-time Feature Pipeline

```python
# Stream processing for real-time features
# Combined with batch training
# See references/data-preparation.md
```

### Continuous Training

```python
# Automated retraining on schedule
# Triggered by data drift detection
# See references/model-training.md
```

## Troubleshooting

### Common Issues

- **Pipeline failures**: Check dependencies and data availability
- **Training instability**: Review hyperparameters and data quality
- **Deployment issues**: Validate model artifacts and serving config
- **Performance degradation**: Monitor data drift and model metrics

### Debugging Steps

1. Check pipeline logs for each stage
2. Validate input/output data at boundaries
3. Test components in isolation
4. Review experiment tracking metrics
5. Inspect model artifacts and metadata
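The drift trigger mentioned under Continuous Training and Performance degradation can start out very simple, such as comparing a live feature sample against the training reference distribution. The sketch below uses a standardized mean shift; the 0.2 threshold is an arbitrary assumption for illustration, and production systems typically use richer statistics (PSI, KS tests) per feature.

```python
# Naive data-drift check: compare a live feature sample against a
# training-time reference (illustrative; the threshold is an assumption).
import math

def standardized_mean_shift(reference: list[float], live: list[float]) -> float:
    """Absolute mean difference in units of the reference std deviation."""
    n = len(reference)
    mean_ref = sum(reference) / n
    var_ref = sum((x - mean_ref) ** 2 for x in reference) / n
    std_ref = math.sqrt(var_ref) or 1.0  # guard against zero variance
    mean_live = sum(live) / len(live)
    return abs(mean_live - mean_ref) / std_ref

def should_retrain(reference: list[float], live: list[float],
                   threshold: float = 0.2) -> bool:
    """True when drift exceeds the threshold and retraining should trigger."""
    return standardized_mean_shift(reference, live) > threshold
```

A scheduler (or the orchestrator itself) would evaluate `should_retrain` on recent traffic and kick off the training DAG when it returns `True`.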
## Next Steps

After setting up your pipeline:

1. Explore the **hyperparameter-tuning** skill for optimization
2. Learn **experiment-tracking-setup** for MLflow/W&B
3. Review **model-deployment-patterns** for serving strategies
4. Implement monitoring with observability tools

## Related Skills

- **experiment-tracking-setup**: MLflow and Weights & Biases integration
- **hyperparameter-tuning**: Automated hyperparameter optimization
- **model-deployment-patterns**: Advanced deployment strategies