Design patterns for building autonomous coding agents. Covers tool integration, permission systems, browser automation, and human-in-the-loop workflows. Use when building AI agents, designing tool APIs, implementing permission systems, or creating autonomous coding assistants.
---
name: autonomous-agent-patterns
description: "Design patterns for building autonomous coding agents. Covers tool integration, permission systems, browser automation, and human-in-the-loop workflows. Use when building AI agents, designing tool APIs, implementing permission systems, or creating autonomous coding assistants."
---

# 🕹️ Autonomous Agent Patterns

> Design patterns for building autonomous coding agents, inspired by [Cline](https://github.com/cline/cline) and [OpenAI Codex](https://github.com/openai/codex).

## When to Use This Skill

Use this skill when:

- Building autonomous AI agents
- Designing tool/function calling APIs
- Implementing permission and approval systems
- Creating browser automation for agents
- Designing human-in-the-loop workflows

---

## 1. Core Agent Architecture

### 1.1 Agent Loop

```
┌─────────────────────────────────────────────────────────────┐
│                         AGENT LOOP                          │
│                                                             │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐               │
│  │  Think   │───▶│  Decide  │───▶│   Act    │               │
│  │ (Reason) │    │  (Plan)  │    │ (Execute)│               │
│  └──────────┘    └──────────┘    └──────────┘               │
│       ▲                               │                     │
│       │         ┌──────────┐          │                     │
│       └─────────│ Observe  │◀─────────┘                     │
│                 │ (Result) │                                │
│                 └──────────┘                                │
└─────────────────────────────────────────────────────────────┘
```

```python
import json
from typing import Any


class AgentLoop:
    def __init__(self, llm, tools, max_iterations=50):
        self.llm = llm
        self.tools = {t.name: t for t in tools}
        self.max_iterations = max_iterations
        self.history = []

    def run(self, task: str) -> str:
        self.history.append({"role": "user", "content": task})

        for _ in range(self.max_iterations):
            # Think: get the LLM response, offering the available tools
            response = self.llm.chat(
                messages=self.history,
                tools=self._format_tools(),
                tool_choice="auto"
            )

            # Decide: no tool calls means the task is complete
            if not response.tool_calls:
                return response.content

            # Record the assistant turn so every tool result has a matching call
            self.history.append({
                "role": "assistant",
                "content": response.content,
                "tool_calls": response.tool_calls
            })

            for tool_call in response.tool_calls:
                # Act: execute the tool
                result = self._execute_tool(tool_call)

                # Observe: add the result to history
                self.history.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })

        return "Max iterations reached"

    def _format_tools(self) -> list:
        """Expose each tool's JSON schema to the model"""
        return [tool.schema for tool in self.tools.values()]

    def _execute_tool(self, tool_call) -> Any:
        tool = self.tools.get(tool_call.name)
        if tool is None:
            # Report unknown tools to the model instead of crashing the loop
            return f"Error: unknown tool '{tool_call.name}'"
        args = json.loads(tool_call.arguments)
        return tool.execute(**args)
```

### 1.2 Multi-Model Architecture

```python
class MultiModelAgent:
    """
    Use different models for different purposes:
    - Fast model for planning
    - Powerful model for complex reasoning
    - Specialized model for code generation
    """

    def __init__(self):
        self.models = {
            "fast": "gpt-3.5-turbo",     # Quick decisions
            "smart": "gpt-4-turbo",      # Complex reasoning
            "code": "claude-3-sonnet",   # Code generation
        }

    def select_model(self, task_type: str) -> str:
        if task_type == "planning":
            return self.models["fast"]
        elif task_type == "analysis":
            return self.models["smart"]
        elif task_type == "code":
            return self.models["code"]
        # Unknown task types fall back to the most capable model
        return self.models["smart"]
```
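The loop in section 1.1 can be exercised without a real model. The sketch below is a simplified, self-contained variant of that loop, not the skill's own implementation: `ScriptedLLM`, `Response`, `ToolCall`, and `run_loop` are hypothetical names introduced here, and tools are plain callables rather than `Tool` objects. It shows the two exit conditions: a response with no tool calls ends the run, and `max_iterations` bounds it otherwise.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: str  # JSON-encoded arguments, as in section 1.1


@dataclass
class Response:
    content: str = ""
    tool_calls: list = field(default_factory=list)


class ScriptedLLM:
    """Hypothetical stand-in for a real model: replays canned responses in order."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def chat(self, messages, tools=None, tool_choice="auto"):
        return next(self._responses)


def run_loop(llm, tools, task, max_iterations=5):
    """Think/decide/act/observe until the model stops asking for tools."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_iterations):
        response = llm.chat(messages=history)
        if not response.tool_calls:
            return response.content, history
        for call in response.tool_calls:
            # Act on the requested tool, then record the observation
            result = tools[call.name](**json.loads(call.arguments))
            history.append(
                {"role": "tool", "tool_call_id": call.id, "content": str(result)}
            )
    return "Max iterations reached", history


llm = ScriptedLLM([
    Response(tool_calls=[ToolCall("call_1", "add", '{"a": 2, "b": 3}')]),
    Response(content="The sum is 5"),
])
answer, history = run_loop(llm, {"add": lambda a, b: a + b}, "What is 2 + 3?")
```

A scripted model like this makes the loop deterministic, which is convenient for unit-testing tool dispatch and history handling before wiring in a real LLM client.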
---

## 2. Tool Design Patterns

### 2.1 Tool Schema

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolResult:
    """Uniform result type returned by every tool"""
    success: bool
    output: str = ""
    error: Optional[str] = None
    metadata: dict = field(default_factory=dict)


class Tool:
    """Base class for agent tools"""

    @property
    def schema(self) -> dict:
        """JSON Schema for the tool"""
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {
                "type": "object",
                "properties": self._get_parameters(),
                "required": self._get_required()
            }
        }

    def execute(self, **kwargs) -> ToolResult:
        """Execute the tool and return result"""
        raise NotImplementedError


class ReadFileTool(Tool):
    name = "read_file"
    description = "Read the contents of a file from the filesystem"

    def _get_parameters(self):
        return {
            "path": {
                "type": "string",
                "description": "Absolute path to the file"
            },
            "start_line": {
                "type": "integer",
                "description": "Line to start reading from (1-indexed)"
            },
            "end_line": {
                "type": "integer",
                "description": "Line to stop reading at (inclusive)"
            }
        }

    def _get_required(self):
        return ["path"]

    def execute(self, path: str, start_line: Optional[int] = None,
                end_line: Optional[int] = None) -> ToolResult:
        try:
            with open(path, 'r', encoding='utf-8') as f:
                lines = f.readlines()
        except FileNotFoundError:
            return ToolResult(success=False, error=f"File not found: {path}")
        except OSError as e:
            return ToolResult(success=False, error=f"Cannot read {path}: {e}")

        # Either bound may be omitted; start is 1-indexed, end is inclusive
        start = max((start_line or 1) - 1, 0)
        return ToolResult(success=True, output="".join(lines[start:end_line]))
```

### 2.2 Essential Agent Tools

```python
CODING_AGENT_TOOLS = {
    # File operations
    "read_file": "Read file contents",
    "write_file": "Create or overwrite a file",
    "edit_file": "Make targeted edits to a file",
    "list_directory": "List files and folders",
    "search_files": "Search for files by pattern",

    # Code understanding
    "search_code": "Search for code patterns (grep)",
    "get_definition": "Find function/class definition",
    "get_references": "Find all references to a symbol",

    # Terminal
    "run_command": "Execute a shell command",
    "read_output": "Read command output",
    "send_input": "Send input to running command",

    # Browser (optional)
    "open_browser": "Open URL in browser",
    "click_element": "Click on page element",
    "type_text": "Type text into input",
    "screenshot": "Capture screenshot",

    # Context
    "ask_user": "Ask the user a question",
    "search_web": "Search the web for information"
}
```

### 2.3 Edit Tool Design

```python
class EditFileTool(Tool):
    """
    Precise file editing with conflict detection.
    Uses search/replace pattern for reliable edits.
    """

    name = "edit_file"
    description = "Edit a file by replacing specific content"

    def execute(
        self,
        path: str,
        search: str,
        replace: str,
        expected_occurrences: int = 1
    ) -> ToolResult:
        """
        Args:
            path: File to edit
            search: Exact text to find (must match exactly, including whitespace)
            replace: Text to replace with
            expected_occurrences: How many times search should appear (validation)
        """
        try:
            with open(path, 'r', encoding='utf-8') as f:
                content = f.read()
        except OSError as e:
            return ToolResult(success=False, error=f"Cannot read {path}: {e}")

        # Validate: check "not found" first so it gets its own error message
        actual_occurrences = content.count(search) if search else 0
        if actual_occurrences == 0:
            return ToolResult(
                success=False,
                error="Search text not found in file"
            )

        if actual_occurrences != expected_occurrences:
            return ToolResult(
                success=False,
                error=f"Expected {expected_occurrences} occurrences, found {actual_occurrences}"
            )

        # Apply edit
        new_content = content.replace(search, replace)

        with open(path, 'w', encoding='utf-8') as f:
            f.write(new_content)

        return ToolResult(
            success=True,
            output=f"Replaced {actual_occurrences} occurrence(s)"
        )
```
---

## 3. Permission & Safety Patterns

### 3.1 Permission Levels

```python
from enum import Enum


class PermissionLevel(Enum):
    # Fully automatic - no user approval needed
    AUTO = "auto"

    # Ask once per session
    ASK_ONCE = "ask_once"

    # Ask every time
    ASK_EACH = "ask_each"

    # Never allow
    NEVER = "never"


PERMISSION_CONFIG = {
    # Low risk - can auto-approve
    "read_file": PermissionLevel.AUTO,
    "list_directory": PermissionLevel.AUTO,
    "search_code": PermissionLevel.AUTO,

    # Medium risk - ask once
    "write_file": PermissionLevel.ASK_ONCE,
    "edit_file": PermissionLevel.ASK_ONCE,

    # High risk - ask each time
    "run_command": PermissionLevel.ASK_EACH,
    "delete_file": PermissionLevel.ASK_EACH,

    # Dangerous - never allow
    "sudo_command": PermissionLevel.NEVER,
    "format_disk": PermissionLevel.NEVER
}
```

### 3.2 Approval UI Pattern

```python
class ApprovalManager:
    def __init__(self, ui, config):
        self.ui = ui
        self.config = config
        self.session_approvals = {}

    def request_approval(self, tool_name: str, args: dict) -> bool:
        # Tools missing from the config default to the strictest interactive level
        level = self.config.get(tool_name, PermissionLevel.ASK_EACH)

        if level == PermissionLevel.AUTO:
            return True

        if level == PermissionLevel.NEVER:
            self.ui.show_error(f"Tool '{tool_name}' is not allowed")
            return False

        if level == PermissionLevel.ASK_ONCE:
            if tool_name in self.session_approvals:
                return self.session_approvals[tool_name]

        # Show approval dialog
        approved = self.ui.show_approval_dialog(
            tool=tool_name,
            args=args,
            risk_level=self._assess_risk(tool_name, args)
        )

        # Remember the answer (approval or denial) for the rest of the session
        if level == PermissionLevel.ASK_ONCE:
            self.session_approvals[tool_name] = approved

        return approved

    def _assess_risk(self, tool_name: str, args: dict) -> str:
        """Analyze specific call for risk level"""
        if tool_name == "run_command":
            cmd = args.get("command", "")
            if any(danger in cmd for danger in ["rm -rf", "sudo", "chmod"]):
                return "HIGH"
        return "MEDIUM"
```

### 3.3 Sandboxing

```python
import os
import shlex
import subprocess


class SandboxedExecution:
    """
    Execute code/commands in a restricted environment.
    An allowlist and a workspace check reduce risk but are not full isolation;
    run untrusted code in a container or VM.
    """

    def __init__(self, workspace_dir: str):
        self.workspace = os.path.realpath(workspace_dir)
        self.allowed_commands = ["npm", "python", "node", "git", "ls", "cat"]
        self.blocked_paths = ["/etc", "/usr", "/bin", os.path.expanduser("~")]

    def validate_path(self, path: str) -> bool:
        """Ensure path is within workspace"""
        real_path = os.path.realpath(path)
        # Compare whole path components: a plain startswith() would also
        # accept sibling directories such as "<workspace>-other"
        try:
            return os.path.commonpath([real_path, self.workspace]) == self.workspace
        except ValueError:
            return False

    def validate_command(self, command: str) -> bool:
        """Check if command is allowed"""
        try:
            cmd_parts = shlex.split(command)
        except ValueError:  # Unbalanced quotes
            return False
        if not cmd_parts:
            return False

        base_cmd = cmd_parts[0]
        return base_cmd in self.allowed_commands

    def execute_sandboxed(self, command: str) -> ToolResult:
        if not self.validate_command(command):
            return ToolResult(
                success=False,
                error=f"Command not allowed: {command}"
            )

        try:
            result = subprocess.run(
                # No shell: an argument list prevents chaining such as "ls; rm -rf /"
                shlex.split(command),
                cwd=self.workspace,
                capture_output=True,
                text=True,
                timeout=30,
                env={
                    **os.environ,
                    "HOME": self.workspace,  # Isolate home directory
                }
            )
        except subprocess.TimeoutExpired:
            return ToolResult(success=False, error="Command timed out after 30s")
        except OSError as e:
            return ToolResult(success=False, error=f"Failed to start command: {e}")

        return ToolResult(
            success=result.returncode == 0,
            output=result.stdout,
            error=result.stderr if result.returncode != 0 else None
        )
```
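The workspace check in section 3.3 is easy to get subtly wrong, so a focused sketch may help. It assumes the rule "a path is allowed only if it resolves to the workspace or something beneath it"; `is_within` is a hypothetical helper name, and resolving relative paths against the workspace (rather than the current directory) is a choice made for this example. Prefix matching on strings would accept `/tmp/ws-evil` for a workspace of `/tmp/ws`, whereas `os.path.commonpath` compares whole path components.

```python
import os


def is_within(workspace: str, path: str) -> bool:
    """Return True if `path` resolves to the workspace or a location inside it."""
    root = os.path.realpath(workspace)
    # Relative paths are resolved against the workspace; realpath also
    # follows symlinks, so a link pointing outside the workspace is rejected
    target = os.path.realpath(os.path.join(root, path))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised when the paths cannot be compared (e.g. different drives on Windows)
        return False


inside = is_within("/tmp/ws", "src/main.py")
traversal = is_within("/tmp/ws", "../ws-evil/secrets.txt")
sibling = is_within("/tmp/ws", "/tmp/ws-evil/secrets.txt")
```

Here `inside` is true while `traversal` and `sibling` are false. The check only covers paths the agent passes explicitly; a command that is allowed to run can still touch other locations, which is why stronger isolation is worth considering for untrusted code.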
---

## 4. Browser Automation

### 4.1 Browser Tool Pattern

```python
import base64

from playwright.async_api import async_playwright
from playwright.async_api import TimeoutError as PlaywrightTimeoutError


class BrowserTool:
    """
    Browser automation for agents using Playwright.
    Enables visual debugging and web testing.
    """

    def __init__(self, headless: bool = True):
        self._playwright = None
        self.browser = None
        self.page = None
        self.headless = headless

    async def open_url(self, url: str) -> ToolResult:
        """Navigate to URL and return page info"""
        if not self.browser:
            # Start the Playwright driver once, then reuse browser and page
            self._playwright = await async_playwright().start()
            self.browser = await self._playwright.chromium.launch(headless=self.headless)
            self.page = await self.browser.new_page()

        await self.page.goto(url)

        # Capture state
        title = await self.page.title()

        return ToolResult(
            success=True,
            output=f"Loaded: {title}",
            metadata={
                "screenshot": await self.screenshot(),
                "url": self.page.url
            }
        )

    async def screenshot(self) -> str:
        """Return a base64-encoded PNG of the current page"""
        image = await self.page.screenshot(type='png')
        return base64.b64encode(image).decode()

    async def click(self, selector: str) -> ToolResult:
        """Click on an element"""
        try:
            await self.page.click(selector, timeout=5000)
            await self.page.wait_for_load_state("networkidle")

            return ToolResult(
                success=True,
                output=f"Clicked: {selector}",
                metadata={"screenshot": await self.screenshot()}
            )
        except PlaywrightTimeoutError:
            # Playwright raises its own TimeoutError, not the builtin one
            return ToolResult(
                success=False,
                error=f"Element not found: {selector}"
            )

    async def type_text(self, selector: str, text: str) -> ToolResult:
        """Type text into an input"""
        await self.page.fill(selector, text)
        return ToolResult(success=True, output=f"Typed into {selector}")

    async def get_page_content(self) -> ToolResult:
        """Get accessible text content of the page"""
        content = await self.page.evaluate("""
            () => {
                // Walk every text node under <body>
                const walker = document.createTreeWalker(
                    document.body,
                    NodeFilter.SHOW_TEXT
                );

                let text = '';
                while (walker.nextNode()) {
                    const node = walker.currentNode;
                    if (node.textContent.trim()) {
                        text += node.textContent.trim() + '\\n';
                    }
                }
                return text;
            }
        """)
        return ToolResult(success=True, output=content)
```

### 4.2 Visual Agent Pattern

```python
import json


class VisualAgent:
    """
    Agent that uses screenshots to understand web pages.
    Can identify elements visually without selectors.
    """

    def __init__(self, llm, browser):
        self.llm = llm
        self.browser = browser

    async def describe_page(self) -> str:
        """Use vision model to describe current page"""
        screenshot = await self.browser.screenshot()

        response = self.llm.chat([
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this webpage. List all interactive elements you see."},
                    {"type": "image", "data": screenshot}
                ]
            }
        ])

        return response.content

    async def find_and_click(self, description: str) -> ToolResult:
        """Find element by visual description and click it"""
        screenshot = await self.browser.screenshot()

        # Ask vision model to find element
        response = self.llm.chat([
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": f"""
                        Find the element matching: "{description}"
                        Return the approximate coordinates as JSON: {{"x": number, "y": number}}
                        """
                    },
                    {"type": "image", "data": screenshot}
                ]
            }
        ])

        # The model may not return valid JSON; fail cleanly instead of raising
        try:
            coords = json.loads(response.content)
            x, y = coords["x"], coords["y"]
        except (ValueError, KeyError, TypeError):
            return ToolResult(success=False, error="Model did not return valid coordinates")

        await self.browser.page.mouse.click(x, y)

        return ToolResult(success=True, output=f"Clicked at ({x}, {y})")
```
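The visual agent in section 4.2 depends on the model returning clean JSON coordinates, which vision models do not always do: replies are often wrapped in a code fence or surrounded by prose. The following is a minimal sketch of a more tolerant parser, assuming the reply contains one flat JSON object with numeric `x` and `y`. `parse_click_target` and its `width`/`height` viewport parameters are hypothetical names introduced for this example.

```python
import json
import re


def parse_click_target(text: str, width: int, height: int) -> tuple[int, int]:
    """Extract {"x": ..., "y": ...} from a model reply and validate it."""
    # Take the first flat {...} object, ignoring fences or prose around it
    match = re.search(r"\{[^{}]*\}", text)
    if match is None:
        raise ValueError("no JSON object in model response")

    try:
        data = json.loads(match.group(0))
        x, y = data["x"], data["y"]
    except (json.JSONDecodeError, KeyError) as exc:
        raise ValueError(f"malformed coordinates: {exc}") from exc

    for value in (x, y):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("coordinates must be numbers")

    # Reject points outside the viewport rather than clicking blindly
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"coordinates ({x}, {y}) are outside the viewport")

    return round(x), round(y)


reply = 'Here is the button:\n```json\n{"x": 120, "y": 48.6}\n```'
target = parse_click_target(reply, width=1280, height=720)
```

Validating against the viewport size before calling `mouse.click` turns a hallucinated coordinate into an error the agent can report or retry, instead of a click on an unintended element.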
---

## 5. Context Management

### 5.1 Context Injection Patterns

````python
import json
import os

import requests


class ContextManager:
    """
    Manage context provided to the agent.
    Inspired by Cline's @-mention patterns.
    """

    def __init__(self, workspace: str):
        self.workspace = workspace
        self.context = []

    def add_file(self, path: str) -> None:
        """@file - Add file contents to context"""
        with open(path, 'r', encoding='utf-8', errors='replace') as f:
            content = f.read()

        self.context.append({
            "type": "file",
            "path": path,
            "content": content
        })

    def add_folder(self, path: str, max_files: int = 20) -> None:
        """@folder - Add files in folder, up to max_files in total"""
        added = 0
        for root, _dirs, files in os.walk(path):
            for file in sorted(files):
                if added >= max_files:
                    return
                self.add_file(os.path.join(root, file))
                added += 1

    def add_url(self, url: str) -> None:
        """@url - Fetch and add URL content"""
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        # html_to_markdown is not defined here; supply your own HTML-to-Markdown converter
        content = html_to_markdown(response.text)

        self.context.append({
            "type": "url",
            "url": url,
            "content": content
        })

    def add_problems(self, diagnostics: list) -> None:
        """@problems - Add IDE diagnostics"""
        self.context.append({
            "type": "diagnostics",
            "problems": diagnostics
        })

    def format_for_prompt(self) -> str:
        """Format all context for LLM prompt"""
        parts = []
        for item in self.context:
            if item["type"] == "file":
                parts.append(f"## File: {item['path']}\n```\n{item['content']}\n```")
            elif item["type"] == "url":
                parts.append(f"## URL: {item['url']}\n{item['content']}")
            elif item["type"] == "diagnostics":
                parts.append(f"## Problems:\n{json.dumps(item['problems'], indent=2)}")

        return "\n\n".join(parts)
````

### 5.2 Checkpoint/Resume

```python
import json
import os
import subprocess
from datetime import datetime


class CheckpointManager:
    """
    Save and restore agent state for long-running tasks.
    """

    def __init__(self, storage_dir: str):
        self.storage_dir = storage_dir
        os.makedirs(storage_dir, exist_ok=True)

    def save_checkpoint(self, session_id: str, state: dict) -> str:
        """Save current agent state"""
        checkpoint = {
            "timestamp": datetime.now().isoformat(),
            "session_id": session_id,
            "history": state["history"],
            "context": state["context"],
            "workspace_state": self._capture_workspace(state["workspace"]),
            "metadata": state.get("metadata", {})
        }

        path = os.path.join(self.storage_dir, f"{session_id}.json")
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(checkpoint, f, indent=2)

        return path

    def restore_checkpoint(self, checkpoint_path: str) -> dict:
        """Restore agent state from checkpoint"""
        with open(checkpoint_path, 'r', encoding='utf-8') as f:
            checkpoint = json.load(f)

        return {
            "history": checkpoint["history"],
            "context": checkpoint["context"],
            "workspace": self._restore_workspace(checkpoint["workspace_state"]),
            "metadata": checkpoint["metadata"]
        }

    def _capture_workspace(self, workspace: str) -> dict:
        """Capture relevant workspace state"""
        # Git status, file hashes, etc.
        return {
            "path": workspace,
            "git_ref": self._git(workspace, "rev-parse", "HEAD"),
            "git_dirty": self._git(workspace, "status", "--porcelain")
        }

    def _restore_workspace(self, workspace_state: dict) -> dict:
        """Placeholder: returns the saved state without touching the working tree.
        Checking out git_ref or re-applying changes is left to the caller."""
        return workspace_state

    @staticmethod
    def _git(workspace: str, *args: str) -> str:
        # Argument list plus cwd avoids interpolating the path into a shell string
        result = subprocess.run(["git", *args], cwd=workspace,
                                capture_output=True, text=True)
        return result.stdout.strip()
```

---

## 6. MCP (Model Context Protocol) Integration

### 6.1 MCP Server Pattern

```python
import os
import re


class MCPAgent:
    """
    Agent that can dynamically discover and use MCP tools.
    'Add a tool that...' pattern from Cline.

    server_factory is a placeholder for your MCP client wrapper: a callable
    that takes a config dict and returns a connected object exposing
    list_tools(). It is illustrative, not the official SDK's interface.
    """

    def __init__(self, llm, server_factory):
        self.llm = llm
        self.server_factory = server_factory
        self.mcp_servers = {}
        self.available_tools = {}

    def connect_server(self, name: str, config: dict) -> None:
        """Connect to an MCP server"""
        server = self.server_factory(config)
        self.mcp_servers[name] = server

        # Discover tools
        tools = server.list_tools()
        for tool in tools:
            self.available_tools[tool.name] = {
                "server": name,
                "schema": tool.schema
            }

    async def create_tool(self, description: str) -> str:
        """
        Create a new MCP server based on user description.
        'Add a tool that fetches Jira tickets'
        """
        # Generate MCP server code
        code = self.llm.generate(f"""
        Create a Python MCP server with a tool that does:
        {description}

        Use the FastMCP framework. Include proper error handling.
        Return only the Python code.
        """)

        # Generated code is untrusted: route it through the approval flow
        # from section 3 before loading it

        # Save and install
        server_name = self._extract_name(description)
        path = f"./mcp_servers/{server_name}/server.py"
        os.makedirs(os.path.dirname(path), exist_ok=True)

        with open(path, 'w', encoding='utf-8') as f:
            f.write(code)

        # Hot-reload
        self.connect_server(server_name, {"path": path})

        return f"Created tool: {server_name}"

    def _extract_name(self, description: str) -> str:
        """Derive a filesystem-safe server name from the description"""
        slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
        return slug[:40] or "custom_tool"
```

---

## Best Practices Checklist

### Agent Design

- [ ] Clear task decomposition
- [ ] Appropriate tool granularity
- [ ] Error handling at each step
- [ ] Progress visibility to user

### Safety

- [ ] Permission system implemented
- [ ] Dangerous operations blocked
- [ ] Sandbox for untrusted code
- [ ] Audit logging enabled

### UX

- [ ] Approval UI is clear
- [ ] Progress updates provided
- [ ] Undo/rollback available
- [ ] Explanation of actions

---

## Resources

- [Cline](https://github.com/cline/cline)
- [OpenAI Codex](https://github.com/openai/codex)
- [Model Context Protocol](https://modelcontextprotocol.io/)
- [Anthropic Tool Use](https://docs.anthropic.com/claude/docs/tool-use)
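The Safety checklist asks for audit logging but the skill does not show one. One simple way to approach it, sketched here under assumptions, is an append-only JSON Lines file with one record per tool call. `AuditLog`, its field names, and the file name `audit.jsonl` are hypothetical choices for this example rather than part of any library.

```python
import json
import os
import tempfile
import time


class AuditLog:
    """Append-only JSON Lines log of tool calls (illustrative sketch)."""

    def __init__(self, path: str):
        self.path = path

    def record(self, tool_name: str, args: dict, approved: bool, outcome: str) -> None:
        entry = {
            "ts": time.time(),
            "tool": tool_name,
            "args": args,
            "approved": approved,
            "outcome": outcome,
        }
        # Append mode keeps earlier entries intact; default=str stringifies
        # argument values that are not JSON-serializable
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry, default=str) + "\n")

    def entries(self) -> list:
        """Read back every recorded entry in order."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    log = AuditLog(os.path.join(tmp, "audit.jsonl"))
    log.record("read_file", {"path": "README.md"}, approved=True, outcome="ok")
    log.record("run_command", {"command": "rm -rf build"}, approved=False, outcome="denied")
    logged = log.entries()
```

Recording denied calls as well as approved ones is deliberate: the log then shows what the agent attempted, not only what it was allowed to do. A natural place to call `record` would be right after the approval decision in section 3.2.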