---
name: hugging-face-jobs
description: This skill should be used when users want to run any workload on Hugging Face Jobs infrastructure. Covers UV scripts, Docker-based jobs, hardware selection, cost estimation, authentication with tokens, secrets management, timeout configuration, and result persistence. Designed for general-purpose compute workloads including data processing, inference, experiments, batch jobs, and any Python-based tasks. Should be invoked for tasks involving cloud compute, GPU workloads, or when users mention running jobs on Hugging Face infrastructure without local setup.
license: Complete terms in LICENSE.txt
---

# Running Workloads on Hugging Face Jobs

## Overview

Run any workload on fully managed Hugging Face infrastructure. No local setup required—jobs run on cloud CPUs, GPUs, or TPUs and can persist results to the Hugging Face Hub.

**Common use cases:**
- **Data Processing** - Transform, filter, or analyze large datasets
- **Batch Inference** - Run inference on thousands of samples
- **Experiments & Benchmarks** - Reproducible ML experiments
- **Model Training** - Fine-tune models (see `model-trainer` skill for TRL-specific training)
- **Synthetic Data Generation** - Generate datasets using LLMs
- **Development & Testing** - Test code without local GPU setup
- **Scheduled Jobs** - Automate recurring tasks

**For model training specifically:** See the `model-trainer` skill for TRL-based training workflows.

## When to Use This Skill

Use this skill when users want to:
- Run Python workloads on cloud infrastructure
- Execute jobs without local GPU/TPU setup
- Process data at scale
- Run batch inference or experiments
- Schedule recurring tasks
- Use GPUs/TPUs for any workload
- Persist results to the Hugging Face Hub

## Key Directives

When assisting with jobs:

1. **ALWAYS use `hf_jobs()` MCP tool** - Submit jobs using `hf_jobs("uv", {...})` or `hf_jobs("run", {...})`. The `script` parameter accepts Python code directly. Do NOT save to local files unless the user explicitly requests it. Pass the script content as a string to `hf_jobs()`.

2. **Always handle authentication** - Jobs that interact with the Hub require `HF_TOKEN` via secrets. See Token Usage section below.

3. **Provide job details after submission** - After submitting, provide job ID, monitoring URL, estimated time, and note that the user can request status checks later.

4. **Set appropriate timeouts** - Default 30 min may be insufficient for long-running tasks.

## Prerequisites Checklist

Before starting any job, verify:

### ✅ **Account & Authentication**
- Hugging Face Account with [Pro](https://hf.co/pro), [Team](https://hf.co/enterprise), or [Enterprise](https://hf.co/enterprise) plan (Jobs require paid plan)
- Authenticated login: Check with `hf_whoami()`
- **HF_TOKEN for Hub Access** ⚠️ CRITICAL - Required for any Hub operations (push models/datasets, download private repos, etc.)
- Token must have appropriate permissions (read for downloads, write for uploads)

### ✅ **Token Usage** (See Token Usage section for details)

**When tokens are required:**
- Pushing models/datasets to Hub
- Accessing private repositories
- Using Hub APIs in scripts
- Any authenticated Hub operations

**How to provide tokens:**
```python
{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Recommended: automatic token
}
```

**⚠️ CRITICAL:** The `$HF_TOKEN` placeholder is automatically replaced with your logged-in token. Never hardcode tokens in scripts.

## Token Usage Guide

### Understanding Tokens

**What are HF Tokens?**
- Authentication credentials for Hugging Face Hub
- Required for authenticated operations (push, private repos, API access)
- Stored securely on your machine after `hf auth login`

**Token Types:**
- **Read Token** - Can download models/datasets, read private repos
- **Write Token** - Can push models/datasets, create repos, modify content
- **Organization Token** - Can act on behalf of an organization

### When Tokens Are Required

**Always Required:**
- Pushing models/datasets to Hub
- Accessing private repositories
- Creating new repositories
- Modifying existing repositories
- Using Hub APIs programmatically

**Not Required:**
- Downloading public models/datasets
- Running jobs that don't interact with Hub
- Reading public repository information

### How to Provide Tokens to Jobs

#### Method 1: Automatic Token (Recommended)

```python
hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Automatic replacement
})
```

**How it works:**
- `$HF_TOKEN` is a placeholder that gets replaced with your actual token
- Uses the token from your logged-in session (`hf auth login`)
- Most secure and convenient method
- Token is encrypted server-side when passed as a secret

**Benefits:**
- No token exposure in code
- Uses your current login session
- Automatically updated if you re-login
- Works seamlessly with MCP tools

#### Method 2: Explicit Token (Not Recommended)

```python
hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Hardcoded token
})
```

**When to use:**
- Only if automatic token doesn't work
- Testing with a specific token
- Organization tokens (use with caution)

**Security concerns:**
- Token visible in code/logs
- Must manually update if token rotates
- Risk of token exposure

#### Method 3: Environment Variable (Less Secure)

```python
hf_jobs("uv", {
    "script": "your_script.py",
    "env": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Less secure than secrets
})
```

**Difference from secrets:**
- `env` variables are visible in job logs
- `secrets` are encrypted server-side
- Always prefer `secrets` for tokens

### Using Tokens in Scripts

**In your Python script, tokens are available as environment variables:**

```python
# /// script
# dependencies = ["huggingface-hub"]
# ///

import os
from huggingface_hub import HfApi

# Token is automatically available if passed via secrets
token = os.environ.get("HF_TOKEN")

# Use with Hub API
api = HfApi(token=token)

# Or let huggingface_hub auto-detect
api = HfApi()  # Automatically uses HF_TOKEN env var
```

**Best practices:**
- Don't hardcode tokens in scripts
- Use `os.environ.get("HF_TOKEN")` to access
- Let `huggingface_hub` auto-detect when possible
- Verify token exists before Hub operations

### Token Verification

**Check if you're logged in:**
```python
from huggingface_hub import whoami
user_info = whoami()  # Returns your username if authenticated
```

**Verify token in job:**
```python
import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN not found!"
token = os.environ["HF_TOKEN"]
print(f"Token starts with: {token[:7]}...")  # Should start with "hf_"
```

### Common Token Issues

**Error: 401 Unauthorized**
- **Cause:** Token missing or invalid
- **Fix:** Add `secrets={"HF_TOKEN": "$HF_TOKEN"}` to job config
- **Verify:** Check `hf_whoami()` works locally

**Error: 403 Forbidden**
- **Cause:** Token lacks required permissions
- **Fix:** Ensure token has write permissions for push operations
- **Check:** Token type at https://huggingface.co/settings/tokens

**Error: Token not found in environment**
- **Cause:** `secrets` not passed or wrong key name
- **Fix:** Use `secrets={"HF_TOKEN": "$HF_TOKEN"}` (not `env`)
- **Verify:** Script checks `os.environ.get("HF_TOKEN")`

**Error: Repository access denied**
- **Cause:** Token doesn't have access to private repo
- **Fix:** Use token from account with access
- **Check:** Verify repo visibility and your permissions

### Token Security Best Practices

1. **Never commit tokens** - Use `$HF_TOKEN` placeholder or environment variables
2. **Use secrets, not env** - Secrets are encrypted server-side
3. **Rotate tokens regularly** - Generate new tokens periodically
4. **Use minimal permissions** - Create tokens with only needed permissions
5. **Don't share tokens** - Each user should use their own token
6. **Monitor token usage** - Check token activity in Hub settings

### Complete Token Example

```python
# Example: Push results to Hub
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["huggingface-hub", "datasets"]
# ///

import os
from huggingface_hub import HfApi
from datasets import Dataset

# Verify token is available
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Use token for Hub operations
api = HfApi(token=os.environ["HF_TOKEN"])

# Create and push dataset
data = {"text": ["Hello", "World"]}
dataset = Dataset.from_dict(data)
dataset.push_to_hub("username/my-dataset", token=os.environ["HF_TOKEN"])

print("✅ Dataset pushed successfully!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Token provided securely
})
```

## Quick Start: Two Approaches

### Approach 1: UV Scripts (Recommended)

UV scripts use PEP 723 inline dependencies for clean, self-contained workloads.

**MCP Tool:**
```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///

from transformers import pipeline
import torch

# Your workload here
classifier = pipeline("sentiment-analysis")
result = classifier("I love Hugging Face!")
print(result)
""",
    "flavor": "cpu-basic",
    "timeout": "30m"
})
```

**CLI Equivalent:**
```bash
hf jobs uv run my_script.py --flavor cpu-basic --timeout 30m
```

**Python API:**
```python
from huggingface_hub import run_uv_job
run_uv_job("my_script.py", flavor="cpu-basic", timeout="30m")
```

**Benefits:** Direct MCP tool usage, clean code, dependencies declared inline, no file saving required

**When to use:** Default choice for all workloads, custom logic, any scenario requiring `hf_jobs()`

#### Custom Docker Images for UV Scripts

By default, UV scripts use `ghcr.io/astral-sh/uv:python3.12-bookworm-slim`. For ML workloads with complex dependencies, use pre-built images:

```python
hf_jobs("uv", {
    "script": "inference.py",
    "image": "vllm/vllm-openai:latest",  # Pre-built image with vLLM
    "flavor": "a10g-large"
})
```

**CLI:**
```bash
hf jobs uv run --image vllm/vllm-openai:latest --flavor a10g-large inference.py
```

**Benefits:** Faster startup, pre-installed dependencies, optimized for specific frameworks

#### Python Version

By default, UV scripts use Python 3.12. Specify a different version:

```python
hf_jobs("uv", {
    "script": "my_script.py",
    "python": "3.11",  # Use Python 3.11
    "flavor": "cpu-basic"
})
```

**Python API:**
```python
from huggingface_hub import run_uv_job
run_uv_job("my_script.py", python="3.11")
```

#### Working with Scripts

⚠️ **Important:** There are *two* "script path" stories depending on how you run Jobs:

- **Using the `hf_jobs()` MCP tool (recommended in this repo)**: the `script` value must be **inline code** (a string) or a **URL**. A local filesystem path (like `"./scripts/foo.py"`) won't exist inside the remote container.
- **Using the `hf jobs uv run` CLI**: local file paths **do work** (the CLI uploads your script).

**Common mistake with `hf_jobs()` MCP tool:**

```python
# ❌ Will fail (remote container can't see your local path)
hf_jobs("uv", {"script": "./scripts/foo.py"})
```

**Correct patterns with `hf_jobs()` MCP tool:**

```python
# ✅ Inline: read the local script file and pass its *contents*
from pathlib import Path
script = Path("hf-jobs/scripts/foo.py").read_text()
hf_jobs("uv", {"script": script})

# ✅ URL: host the script somewhere reachable
hf_jobs("uv", {"script": "https://huggingface.co/datasets/uv-scripts/.../raw/main/foo.py"})

# ✅ URL from GitHub
hf_jobs("uv", {"script": "https://raw.githubusercontent.com/huggingface/trl/main/trl/scripts/sft.py"})
```

**CLI equivalent (local paths supported):**

```bash
hf jobs uv run ./scripts/foo.py -- --your --args
```

#### Adding Dependencies at Runtime

Add extra dependencies beyond what's in the PEP 723 header:

```python
hf_jobs("uv", {
    "script": "inference.py",
    "dependencies": ["transformers", "torch>=2.0"],  # Extra deps
    "flavor": "a10g-small"
})
```

**Python API:**
```python
from huggingface_hub import run_uv_job
run_uv_job("inference.py", dependencies=["transformers", "torch>=2.0"])
```

### Approach 2: Docker-Based Jobs

Run jobs with custom Docker images and commands.

**MCP Tool:**
```python
hf_jobs("run", {
    "image": "python:3.12",
    "command": ["python", "-c", "print('Hello from HF Jobs!')"],
    "flavor": "cpu-basic",
    "timeout": "30m"
})
```

**CLI Equivalent:**
```bash
hf jobs run python:3.12 python -c "print('Hello from HF Jobs!')"
```

**Python API:**
```python
from huggingface_hub import run_job
run_job(image="python:3.12", command=["python", "-c", "print('Hello!')"], flavor="cpu-basic")
```

**Benefits:** Full Docker control, use pre-built images, run any command
**When to use:** Need specific Docker images, non-Python workloads, complex environments

**Example with GPU:**
```python
hf_jobs("run", {
    "image": "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel",
    "command": ["python", "-c", "import torch; print(torch.cuda.get_device_name())"],
    "flavor": "a10g-small",
    "timeout": "1h"
})
```

**Using Hugging Face Spaces as Images:**

You can use Docker images from HF Spaces:
```python
hf_jobs("run", {
    "image": "hf.co/spaces/lhoestq/duckdb",  # Space as Docker image
    "command": ["duckdb", "-c", "SELECT 'Hello from DuckDB!'"],
    "flavor": "cpu-basic"
})
```

**CLI:**
```bash
hf jobs run hf.co/spaces/lhoestq/duckdb duckdb -c "SELECT 'Hello!'"
```

### Finding More UV Scripts on Hub

The `uv-scripts` organization provides ready-to-use UV scripts stored as datasets on Hugging Face Hub:

```python
# Discover available UV script collections
dataset_search({"author": "uv-scripts", "sort": "downloads", "limit": 20})

# Explore a specific collection
hub_repo_details(["uv-scripts/classification"], repo_type="dataset", include_readme=True)
```

**Popular collections:** OCR, classification, synthetic-data, vLLM, dataset-creation

## Hardware Selection

> **Reference:** [HF Jobs Hardware Docs](https://huggingface.co/docs/hub/en/spaces-config-reference) (updated 07/2025)

| Workload Type | Recommended Hardware | Use Case |
|---------------|---------------------|----------|
| Data processing, testing | `cpu-basic`, `cpu-upgrade` | Lightweight tasks |
| Small models, demos | `t4-small` | <1B models, quick tests |
| Medium models | `t4-medium`, `l4x1` | 1-7B models |
| Large models, production | `a10g-small`, `a10g-large` | 7-13B models |
| Very large models | `a100-large` | 13B+ models |
| Batch inference | `a10g-large`, `a100-large` | High-throughput |
| Multi-GPU workloads | `l4x4`, `a10g-largex2`, `a10g-largex4` | Parallel/large models |
| TPU workloads | `v5e-1x1`, `v5e-2x2`, `v5e-2x4` | JAX/Flax, TPU-optimized |

**All Available Flavors:**
- **CPU:** `cpu-basic`, `cpu-upgrade`
- **GPU:** `t4-small`, `t4-medium`, `l4x1`, `l4x4`, `a10g-small`, `a10g-large`, `a10g-largex2`, `a10g-largex4`, `a100-large`
- **TPU:** `v5e-1x1`, `v5e-2x2`, `v5e-2x4`

**Guidelines:**
- Start with smaller hardware for testing
- Scale up based on actual needs
- Use multi-GPU for parallel workloads or large models
- Use TPUs for JAX/Flax workloads
- See `references/hardware_guide.md` for detailed specifications

## Critical: Saving Results

**⚠️ EPHEMERAL ENVIRONMENT—MUST PERSIST RESULTS**

The Jobs environment is temporary. All files are deleted when the job ends. If results aren't persisted, **ALL WORK IS LOST**.

### Persistence Options

**1. Push to Hugging Face Hub (Recommended)**

```python
# Push models
model.push_to_hub("username/model-name", token=os.environ["HF_TOKEN"])

# Push datasets
dataset.push_to_hub("username/dataset-name", token=os.environ["HF_TOKEN"])

# Push artifacts
api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="results.json",
    repo_id="username/results",
    token=os.environ["HF_TOKEN"]
)
```

**2. Use External Storage**

```python
# Upload to S3, GCS, etc.
import boto3
s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')
```

**3. Send Results via API**

```python
# POST results to your API
import requests
requests.post("https://your-api.com/results", json=results)
```

### Required Configuration for Hub Push

**In job submission:**
```python
{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Enables authentication
}
```

**In script:**
```python
import os
from huggingface_hub import HfApi

# Token automatically available from secrets
api = HfApi(token=os.environ.get("HF_TOKEN"))

# Push your results
api.upload_file(...)
```

### Verification Checklist

Before submitting:
- [ ] Results persistence method chosen
- [ ] `secrets={"HF_TOKEN": "$HF_TOKEN"}` if using Hub
- [ ] Script handles missing token gracefully
- [ ] Test persistence path works

**See:** `references/hub_saving.md` for detailed Hub persistence guide

## Timeout Management

**⚠️ DEFAULT: 30 MINUTES**

Jobs automatically stop after the timeout. For long-running tasks like training, always set a custom timeout.

### Setting Timeouts

**MCP Tool:**
```python
{
    "timeout": "2h"  # 2 hours
}
```

**Supported formats:**
- Integer/float: seconds (e.g., `300` = 5 minutes)
- String with suffix: `"5m"` (minutes), `"2h"` (hours), `"1d"` (days)
- Examples: `"90m"`, `"2h"`, `"1.5h"`, `300`, `"1d"`

**Python API:**
```python
from huggingface_hub import run_job, run_uv_job

run_job(image="python:3.12", command=[...], timeout="2h")
run_uv_job("script.py", timeout=7200)  # 2 hours in seconds
```

### Timeout Guidelines

| Scenario | Recommended | Notes |
|----------|-------------|-------|
| Quick test | 10-30 min | Verify setup |
| Data processing | 1-2 hours | Depends on data size |
| Batch inference | 2-4 hours | Large batches |
| Experiments | 4-8 hours | Multiple runs |
| Long-running | 8-24 hours | Production workloads |

**Always add 20-30% buffer** for setup, network delays, and cleanup.

**On timeout:** Job killed immediately, all unsaved progress lost

## Cost Estimation

**General guidelines:**

```
Total Cost = (Hours of runtime) × (Cost per hour)
```

**Example calculations:**

**Quick test:**
- Hardware: cpu-basic ($0.10/hour)
- Time: 15 minutes (0.25 hours)
- Cost: $0.03

**Data processing:**
- Hardware: l4x1 ($2.50/hour)
- Time: 2 hours
- Cost: $5.00

**Batch inference:**
- Hardware: a10g-large ($5/hour)
- Time: 4 hours
- Cost: $20.00

**Cost optimization tips:**
1. Start small - Test on cpu-basic or t4-small
2. Monitor runtime - Set appropriate timeouts
3. Use checkpoints - Resume if job fails
4. Optimize code - Reduce unnecessary compute
5. Choose right hardware - Don't over-provision

## Monitoring and Tracking

### Check Job Status

**MCP Tool:**
```python
# List all jobs
hf_jobs("ps")

# Inspect specific job
hf_jobs("inspect", {"job_id": "your-job-id"})

# View logs
hf_jobs("logs", {"job_id": "your-job-id"})

# Cancel a job
hf_jobs("cancel", {"job_id": "your-job-id"})
```

**Python API:**
```python
from huggingface_hub import list_jobs, inspect_job, fetch_job_logs, cancel_job

# List your jobs
jobs = list_jobs()

# List running jobs only
running = [j for j in list_jobs() if j.status.stage == "RUNNING"]

# Inspect specific job
job_info = inspect_job(job_id="your-job-id")

# View logs
for log in fetch_job_logs(job_id="your-job-id"):
    print(log)

# Cancel a job
cancel_job(job_id="your-job-id")
```

**CLI:**
```bash
hf jobs ps              # List jobs
hf jobs logs <job-id>   # View logs
hf jobs cancel <job-id> # Cancel job
```

**Remember:** Wait for user to request status checks. Avoid polling repeatedly.

### Job URLs

After submission, jobs have monitoring URLs:
```
https://huggingface.co/jobs/username/job-id
```

View logs, status, and details in the browser.

### Wait for Multiple Jobs

```python
import time
from huggingface_hub import inspect_job, run_job

# Run multiple jobs
jobs = [run_job(image=img, command=cmd) for img, cmd in workloads]

# Wait for all to complete
for job in jobs:
    while inspect_job(job_id=job.id).status.stage not in ("COMPLETED", "ERROR"):
        time.sleep(10)
```

## Scheduled Jobs

Run jobs on a schedule using CRON expressions or predefined schedules.

**MCP Tool:**
```python
# Schedule a UV script that runs every hour
hf_jobs("scheduled uv", {
    "script": "your_script.py",
    "schedule": "@hourly",
    "flavor": "cpu-basic"
})

# Schedule with CRON syntax
hf_jobs("scheduled uv", {
    "script": "your_script.py",
    "schedule": "0 9 * * 1",  # 9 AM every Monday
    "flavor": "cpu-basic"
})

# Schedule a Docker-based job
hf_jobs("scheduled run", {
    "image": "python:3.12",
    "command": ["python", "-c", "print('Scheduled!')"],
    "schedule": "@daily",
    "flavor": "cpu-basic"
})
```

**Python API:**
```python
from huggingface_hub import create_scheduled_job, create_scheduled_uv_job

# Schedule a Docker job
create_scheduled_job(
    image="python:3.12",
    command=["python", "-c", "print('Running on schedule!')"],
    schedule="@hourly"
)

# Schedule a UV script
create_scheduled_uv_job("my_script.py", schedule="@daily", flavor="cpu-basic")

# Schedule with GPU
create_scheduled_uv_job(
    "ml_inference.py",
    schedule="0 */6 * * *",  # Every 6 hours
    flavor="a10g-small"
)
```

**Available schedules:**
- `@annually`, `@yearly` - Once per year
- `@monthly` - Once per month
- `@weekly` - Once per week
- `@daily` - Once per day
- `@hourly` - Once per hour
- CRON expression - Custom schedule (e.g., `"*/5 * * * *"` for every 5 minutes)

**Manage scheduled jobs:**
```python
# MCP Tool
hf_jobs("scheduled ps")                          # List scheduled jobs
hf_jobs("scheduled inspect", {"job_id": "..."})  # Inspect details
hf_jobs("scheduled suspend", {"job_id": "..."})  # Pause
hf_jobs("scheduled resume", {"job_id": "..."})   # Resume
hf_jobs("scheduled delete", {"job_id": "..."})   # Delete
```

**Python API for management:**
```python
from huggingface_hub import (
    list_scheduled_jobs,
    inspect_scheduled_job,
    suspend_scheduled_job,
    resume_scheduled_job,
    delete_scheduled_job
)

# List all scheduled jobs
scheduled = list_scheduled_jobs()

# Inspect a scheduled job
info = inspect_scheduled_job(scheduled_job_id)

# Suspend (pause) a scheduled job
suspend_scheduled_job(scheduled_job_id)

# Resume a scheduled job
resume_scheduled_job(scheduled_job_id)

# Delete a scheduled job
delete_scheduled_job(scheduled_job_id)
```

## Webhooks: Trigger Jobs on Events

Trigger jobs automatically when changes happen in Hugging Face repositories.

**Python API:**
```python
from huggingface_hub import create_webhook

# Create webhook that triggers a job when a repo changes
webhook = create_webhook(
    job_id=job.id,
    watched=[
        {"type": "user", "name": "your-username"},
        {"type": "org", "name": "your-org-name"}
    ],
    domains=["repo", "discussion"],
    secret="your-secret"
)
```

**How it works:**
1. Webhook listens for changes in watched repositories
2. When triggered, the job runs with the `WEBHOOK_PAYLOAD` environment variable
3. Your script can parse the payload to understand what changed

**Use cases:**
- Auto-process new datasets when uploaded
- Trigger inference when models are updated
- Run tests when code changes
- Generate reports on repository activity

**Access webhook payload in script:**
```python
import os
import json

payload = json.loads(os.environ.get("WEBHOOK_PAYLOAD", "{}"))
print(f"Event type: {payload.get('event', {}).get('action')}")
```

See [Webhooks Documentation](https://huggingface.co/docs/huggingface_hub/guides/webhooks) for more details.

## Common Workload Patterns

This repository ships ready-to-run UV scripts in `hf-jobs/scripts/`. Prefer using them instead of inventing new templates.

### Pattern 1: Dataset → Model Responses (vLLM) — `scripts/generate-responses.py`

**What it does:** loads a Hub dataset (chat `messages` or a `prompt` column), applies a model chat template, generates responses with vLLM, and **pushes** the output dataset + dataset card back to the Hub.

**Requires:** GPU + **write** token (it pushes a dataset).

```python
from pathlib import Path

script = Path("hf-jobs/scripts/generate-responses.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "username/input-dataset",
        "username/output-dataset",
        "--messages-column", "messages",
        "--model-id", "Qwen/Qwen3-30B-A3B-Instruct-2507",
        "--temperature", "0.7",
        "--top-p", "0.8",
        "--max-tokens", "2048",
    ],
    "flavor": "a10g-large",
    "timeout": "4h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```

### Pattern 2: CoT Self-Instruct Synthetic Data — `scripts/cot-self-instruct.py`

**What it does:** generates synthetic prompts/answers via CoT Self-Instruct, optionally filters outputs (answer-consistency / RIP), then **pushes** the generated dataset + dataset card to the Hub.

**Requires:** GPU + **write** token (it pushes a dataset).

```python
from pathlib import Path

script = Path("hf-jobs/scripts/cot-self-instruct.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "--seed-dataset", "davanstrien/s1k-reasoning",
        "--output-dataset", "username/synthetic-math",
        "--task-type", "reasoning",
        "--num-samples", "5000",
        "--filter-method", "answer-consistency",
    ],
    "flavor": "l4x4",
    "timeout": "8h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```

### Pattern 3: Streaming Dataset Stats (Polars + HF Hub) — `scripts/finepdfs-stats.py`

**What it does:** scans parquet directly from Hub (no 300GB download), computes temporal stats, and (optionally) uploads results to a Hub dataset repo.

**Requires:** CPU is often enough; token needed **only** if you pass `--output-repo` (upload).

```python
from pathlib import Path

script = Path("hf-jobs/scripts/finepdfs-stats.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "--limit", "10000",
        "--show-plan",
        "--output-repo", "username/finepdfs-temporal-stats",
    ],
    "flavor": "cpu-upgrade",
    "timeout": "2h",
    "env": {"HF_XET_HIGH_PERFORMANCE": "1"},
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```

## Common Failure Modes

### Out of Memory (OOM)

**Fix:**
1. Reduce batch size or data chunk size
2. Process data in smaller batches
3. Upgrade hardware: cpu → t4 → a10g → a100

### Job Timeout

**Fix:**
1. Check logs for actual runtime
2. Increase timeout with buffer: `"timeout": "3h"`
3. Optimize code for faster execution
4. Process data in chunks

### Hub Push Failures

**Fix:**
1. Add to job: `secrets={"HF_TOKEN": "$HF_TOKEN"}`
2. Verify token in script: `assert "HF_TOKEN" in os.environ`
3. Check token permissions
4. Verify repo exists or can be created

### Missing Dependencies

**Fix:**
Add to PEP 723 header:
```python
# /// script
# dependencies = ["package1", "package2>=1.0.0"]
# ///
```

### Authentication Errors

**Fix:**
1. Check `hf_whoami()` works locally
2. Verify `secrets={"HF_TOKEN": "$HF_TOKEN"}` in job config
3. Re-login: `hf auth login`
4. Check token has required permissions

## Troubleshooting

**Common issues:**
- Job times out → Increase timeout, optimize code
- Results not saved → Check persistence method, verify HF_TOKEN
- Out of Memory → Reduce batch size, upgrade hardware
- Import errors → Add dependencies to PEP 723 header
- Authentication errors → Check token, verify secrets parameter

**See:** `references/troubleshooting.md` for complete troubleshooting guide

## Resources

### References (In This Skill)
- `references/token_usage.md` - Complete token usage guide
- `references/hardware_guide.md` - Hardware specs and selection
- `references/hub_saving.md` - Hub persistence guide
- `references/troubleshooting.md` - Common issues and solutions

### Scripts (In This Skill)
- `scripts/generate-responses.py` - vLLM batch generation: dataset → responses → push to Hub
- `scripts/cot-self-instruct.py` - CoT Self-Instruct synthetic data generation + filtering → push to Hub
- `scripts/finepdfs-stats.py` - Polars streaming stats over `finepdfs-edu` parquet on Hub (optional push)

### External Links

**Official Documentation:**
- [HF Jobs Guide](https://huggingface.co/docs/huggingface_hub/guides/jobs) - Main documentation
- [HF Jobs CLI Reference](https://huggingface.co/docs/huggingface_hub/guides/cli#hf-jobs) - Command line interface
- [HF Jobs API Reference](https://huggingface.co/docs/huggingface_hub/package_reference/hf_api) - Python API details
- [Hardware Flavors Reference](https://huggingface.co/docs/hub/en/spaces-config-reference) - Available hardware

**Related Tools:**
- [UV Scripts Guide](https://docs.astral.sh/uv/guides/scripts/) - PEP 723 inline dependencies
- [UV Scripts Organization](https://huggingface.co/uv-scripts) - Community UV script collection
- [HF Hub Authentication](https://huggingface.co/docs/huggingface_hub/quick-start#authentication) - Token setup
- [Webhooks Documentation](https://huggingface.co/docs/huggingface_hub/guides/webhooks) - Event triggers

## Key Takeaways

1. **Submit scripts inline** - The `script` parameter accepts Python code directly; no file saving required unless user requests
2. **Jobs are asynchronous** - Don't wait/poll; let user check when ready
3. **Always set timeout** - Default 30 min may be insufficient; set appropriate timeout
4. **Always persist results** - Environment is ephemeral; without persistence, all work is lost
5. **Use tokens securely** - Always use `secrets={"HF_TOKEN": "$HF_TOKEN"}` for Hub operations
6. **Choose appropriate hardware** - Start small, scale up based on needs (see hardware guide)
7. **Use UV scripts** - Default to `hf_jobs("uv", {...})` with inline scripts for Python workloads
8. **Handle authentication** - Verify tokens are available before Hub operations
9. **Monitor jobs** - Provide job URLs and status check commands
10. **Optimize costs** - Choose right hardware, set appropriate timeouts

## Quick Reference: MCP Tool vs CLI vs Python API

| Operation | MCP Tool | CLI | Python API |
|-----------|----------|-----|------------|
| Run UV script | `hf_jobs("uv", {...})` | `hf jobs uv run script.py` | `run_uv_job("script.py")` |
| Run Docker job | `hf_jobs("run", {...})` | `hf jobs run image cmd` | `run_job(image, command)` |
| List jobs | `hf_jobs("ps")` | `hf jobs ps` | `list_jobs()` |
| View logs | `hf_jobs("logs", {...})` | `hf jobs logs <id>` | `fetch_job_logs(job_id)` |
| Cancel job | `hf_jobs("cancel", {...})` | `hf jobs cancel <id>` | `cancel_job(job_id)` |
| Schedule UV | `hf_jobs("scheduled uv", {...})` | - | `create_scheduled_uv_job()` |
| Schedule Docker | `hf_jobs("scheduled run", {...})` | - | `create_scheduled_job()` |
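
### Appendix: Quick Cost Sanity Check

The cost arithmetic from the Cost Estimation section (rate × hours, plus the recommended 20-30% buffer) can be wrapped in a small helper for budgeting before submission. This is a sketch: `estimate_cost` and the `RATES` table are illustrative, reusing the example prices above, not official Hugging Face pricing.

```python
def estimate_cost(rate_per_hour: float, est_hours: float, buffer: float = 0.25) -> float:
    """Estimated job cost in dollars: hourly rate × runtime, padded with a setup/cleanup buffer."""
    return round(rate_per_hour * est_hours * (1 + buffer), 2)

# Illustrative rates from the examples above (not official pricing)
RATES = {"cpu-basic": 0.10, "l4x1": 2.50, "a10g-large": 5.00}

# Budget for a 4-hour batch-inference run on a10g-large with a 25% buffer
print(f"${estimate_cost(RATES['a10g-large'], 4):.2f}")  # $25.00
```

The buffer mirrors the timeout advice: size the timeout (and the budget) for estimated runtime plus 20-30% headroom, since a job killed at the timeout boundary loses unsaved work but still accrues its runtime.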