Discover, compare, and run AI models using Replicate's API
Add this skill:

```shell
npx mdskills install replicate/replicate
```

Provides a clear workflow for running AI models via API, with practical guidelines and best practices.
---
name: replicate
description: Discover, compare, and run AI models using Replicate's API
---

## Docs

- Reference docs: https://replicate.com/docs/llms.txt
- HTTP API schema: https://api.replicate.com/openapi.json
- MCP server: https://mcp.replicate.com
- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.

## Workflow

Here's a common workflow for using Replicate's API to run a model:

1. **Choose the right model** - Search with the API or ask the user
2. **Get model metadata** - Fetch model input and output schema via API
3. **Create prediction** - POST to /v1/predictions
4. **Poll for results** - GET prediction until status is "succeeded"
5. **Return output** - Usually URLs to generated content

## Choosing models

- Use the search and collections APIs to find and compare the best models. Do not list all the models via API, as it's basically a firehose.
- Collections are curated by Replicate staff, so they're vetted.
- Official models are in the "official" collection.
- Use official models because they:
  - are always running
  - have stable API interfaces
  - have predictable output pricing
  - are maintained by Replicate staff
- If you must use a community model, be aware that it can take a long time to boot.
- You can create always-on deployments of community models, but you pay for model uptime.

## Running models

Models take time to run. There are three ways to run a model via API and get its output:

1. Create a prediction, store its id from the response, and poll until completion.
2. Set a `Prefer: wait` header when creating a prediction for a blocking synchronous response. Only recommended for very fast models.
3. Set an HTTPS webhook URL when creating a prediction, and Replicate will POST to that URL when the prediction completes.

Follow these guidelines when running models:

- Use the "POST /v1/predictions" endpoint, as it supports both official and community models.
- Every model has its own OpenAPI schema. Always fetch and check model schemas to make sure you're setting valid inputs. Even popular models change their schemas.
- Validate input parameters against schema constraints (minimum, maximum, enum values). Don't generate values that violate them.
- When unsure about a parameter value, use the model's default example or omit the optional parameter.
- Don't set optional inputs unless you have a reason to. Stick to the required inputs and let the model's defaults do the work.
- Use HTTPS URLs for file inputs whenever possible. You can also send base64-encoded files, but they should be avoided.
- Fire off multiple predictions concurrently. Don't wait for one to finish before starting the next.
- Output file URLs expire after 1 hour, so back them up if you need to keep them, using a service like Cloudflare R2.
- Webhooks are a good mechanism for receiving and storing prediction output.
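The create-and-poll workflow can be sketched in Python with only the standard library. The helper names, the placeholder `MODEL_VERSION_ID`, and the example `prompt` input are illustrative assumptions, not taken from any real model schema; fetch the model's actual OpenAPI schema before setting inputs.

```python
import json
import time
import urllib.request

API = "https://api.replicate.com/v1"


def build_prediction_payload(version, model_input):
    """Build the JSON body for POST /v1/predictions."""
    return {"version": version, "input": model_input}


def _request(method, url, token, payload=None):
    """Send an authenticated JSON request and decode the response body."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def create_prediction(token, version, model_input):
    """Create a prediction and return the API's prediction object."""
    return _request(
        "POST", f"{API}/predictions", token,
        build_prediction_payload(version, model_input),
    )


def poll_prediction(token, prediction_id, interval=2.0):
    """GET the prediction until it reaches a terminal status."""
    while True:
        pred = _request("GET", f"{API}/predictions/{prediction_id}", token)
        if pred["status"] in ("succeeded", "failed", "canceled"):
            return pred
        time.sleep(interval)


# Usage (needs a real model version id and an API token, e.g. from
# the REPLICATE_API_TOKEN environment variable):
#   pred = create_prediction(token, "MODEL_VERSION_ID", {"prompt": "a studio photo of a llama"})
#   done = poll_prediction(token, pred["id"])
#   print(done["status"], done.get("output"))
```

Polling with a fixed interval keeps the sketch simple; a production client might back off between polls or switch to webhooks for long-running models.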
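Because predictions are independent, the "fire off multiple predictions concurrently" guideline amounts to a small batching helper. In this sketch the `create` and `poll` callables are injected (and hypothetical) so the batching logic stays separate from HTTP details; in practice they would wrap POST /v1/predictions and GET /v1/predictions/{id}.

```python
import concurrent.futures


def run_many(token, version, inputs, create, poll, max_workers=8):
    """Create one prediction per input, then poll them all in parallel.

    `create(token, version, model_input)` returns a prediction object
    with an "id"; `poll(token, prediction_id)` blocks until that
    prediction reaches a terminal status and returns it.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every creation request before waiting on any result.
        preds = list(pool.map(lambda i: create(token, version, i), inputs))
        results = list(pool.map(lambda p: poll(token, p["id"]), preds))
    return results
```

A thread pool is enough here because the work is I/O-bound HTTP waiting, not CPU-bound computation.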