Practitioner-focused workflows for running, validating, and packaging reproducible open AI governance studies.
This repository contains execution pipelines, source artifacts, and validation gates for reproducible governance reports.
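Every headline finding is expected to map to:
- a versioned artifact
- a deterministic query in the claims ledger
- a reproducible execution path in pipelines/

Use this repo to run controlled experiments end-to-end and produce auditable report artifacts.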
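Whether a headline finding maps to an artifact, a ledger query, and a pipeline path can be checked mechanically. The following is a minimal sketch only: it assumes a hypothetical tab-separated ledger with the columns claim ID, artifact path, query, and pipeline path. The function name and the file layout are illustrative, and the real ledger format under claims/ may differ.

```shell
# Hypothetical ledger layout (an assumption, not the repo's actual schema):
#   claim_id <TAB> artifact <TAB> query <TAB> pipeline_path
# Prints the ID of every claim that is missing one of the three mappings
# and returns 1 if at least one such claim was found.
find_unmapped_claims() {
  awk -F '\t' '
    $2 == "" || $3 == "" || $4 == "" { print $1; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}
```

Called as `find_unmapped_claims ledger.tsv`, the function lists incomplete claims on standard output, so it could serve as a pre-publish gate under these assumptions.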
- pipelines/openclaw/bootstrap_tools.sh
- pipelines/openclaw/run.sh --run-id --dry-run
- env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY -u GEMINI_API_KEY pipelines/openclaw/run.sh --run-id --execution container --workload live --lane-duration-sec 86400 --scenario-set core5 --max-runtime-sec 172800 --max-run-disk-mb 65536
- pipelines/openclaw/validate.sh --run-id --strict
- pipelines/openclaw/promote_run_artifacts.sh --run-id
- pipelines/openclaw/publish_pack.sh --run-id --include-raw-archive

Reports are currently in progress (not yet published from this repo):
| Report | Status | Report Folder |
|---|---|---|
| 100% Post-Stop Execution Rate (Baseline Lane): A Governed Evaluation of OpenClaw Agent Behavior | Release candidate (artifact-complete) | reports/openclaw-2026/ |
| The State of AI Tool Sprawl, Q1 2026 | In progress | reports/ai-tool-sprawl-q1-2026/ |
The research workflow uses two open-source tools:
Deterministic contract:
See report-specific methodology:
Pre-registration controls:
- reports/openclaw-2026/preregistration.md
- reports/ai-tool-sprawl-q1-2026/preregistration.md
- internal/headline_rubric.md (headline selection and scoring contract)

- AGENTS.md: operating rules and quality bar for AI agents in this repository
- CITATION.cff: citation metadata for researchers and analysts
- docs/: GitHub Pages index and per-report pages
- reports/: report packages, definitions, protocols, data dictionaries
- runs/: immutable run outputs keyed by report and run ID
- pipelines/: run, validation, threshold, and packaging scripts
- claims/: claim ledgers mapping metrics to artifact/query pairs
- schemas/: schema contracts
- citations/: source logs for timeline and regulatory claims
- .runtime-cache/: ignored local cache for OpenClaw live runtime bootstrap artifacts

Run IDs are immutable by default:
- pipelines/*/run.sh --run-id --dry-run
- pipelines/*/run.sh --run-id
- pipelines/*/run.sh --run-id --resume

Execution behavior:
- pipelines/openclaw/run.sh executes dual lanes, derives summaries, writes claim-value + threshold-evaluation artifacts, and emits reproducibility metadata.
- pipelines/sprawl/run.sh executes campaign scans, builds aggregate/appendix artifacts, writes claim-value artifacts, and emits reproducibility metadata.
- pipelines/sprawl/calibrate_detectors.sh generates non-source_repo detector coverage artifacts and optional gold-label precision/recall evaluation.
- pipelines/sprawl/generate_targets.sh builds reproducible open-source owner/repo target sets (for example --total 101) and writes internal/repos.md plus internal/repos_candidates.csv.

Recommended sprawl order:
- pipelines/sprawl/calibrate_detectors.sh --run-id --strict
- sprawl_non_source_recall_exists_pct >= 60.0 (see pipelines/config/calibration-thresholds.json)

If a run ID already exists, run.sh fails fast unless --resume is explicitly provided.
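The fail-fast rule for existing run IDs can be pictured as a small guard that runs before any output is written. This is a sketch under assumptions, not the repository's code: `guard_run_dir` and its two arguments are hypothetical names, and the actual run.sh scripts may implement the check differently.

```shell
# Sketch of a fail-fast guard for immutable run IDs.
# guard_run_dir and its arguments are illustrative names, not the repo's actual code.
guard_run_dir() {
  run_dir=$1        # directory that would hold this run's outputs
  resume=${2:-0}    # 1 when the caller explicitly asked to resume

  # An existing directory means the run ID was already used.
  if [ -d "$run_dir" ] && [ "$resume" != "1" ]; then
    echo "run directory already exists: $run_dir (resume not requested)" >&2
    return 1
  fi
  mkdir -p "$run_dir"
}
```

Checking before creating the directory keeps a repeated invocation from silently overwriting the outputs of an earlier run.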
For OpenClaw live container runs, keep provider API key env vars unset unless running an explicit exception mode (ALLOW_EXTERNAL_SECRETS=1).
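A pre-flight check along these lines could enforce that rule before a live run starts. It is a minimal sketch: the function name is hypothetical, the three variable names are the ones passed to `env -u` in this README, and the way the real pipeline treats ALLOW_EXTERNAL_SECRETS is assumed rather than confirmed.

```shell
# Sketch: refuse to continue while provider API keys are present in the
# environment, unless ALLOW_EXTERNAL_SECRETS=1 is set.
# check_provider_keys_unset is a hypothetical name, not part of the repo.
check_provider_keys_unset() {
  if [ "${ALLOW_EXTERNAL_SECRETS:-0}" = "1" ]; then
    return 0
  fi
  for name in OPENAI_API_KEY ANTHROPIC_API_KEY GEMINI_API_KEY; do
    # eval reads the variable whose name is stored in $name
    # (POSIX sh has no ${!name} indirection).
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name is set; unset it or use the explicit exception mode" >&2
      return 1
    fi
  done
  return 0
}
```

Running the pipeline under `env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY -u GEMINI_API_KEY` reaches the same clean state from the caller's side; a check like this only catches the case where that step was forgotten.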
Readiness checks:
- pipelines/openclaw/validate.sh
- pipelines/sprawl/validate.sh

Strict publish readiness:
- pipelines/openclaw/validate.sh --run-id --strict
- pipelines/sprawl/validate.sh --run-id --strict

Common gates:
- pipelines/common/claim_gates.sh
- pipelines/common/citation_gates.sh
- pipelines/common/threshold_gate.sh
- pipelines/common/metric_coverage_gate.sh
- pipelines/common/derive_claim_values.sh
- pipelines/common/evaluate_claim_values.sh
- pipelines/common/hash_manifest.sh

In strict mode, unresolved TBD markers in citation logs fail validation.
For OpenClaw runs, strict validation also fails if promoted artifacts contain machine-specific absolute paths or if manuscript/press-pack example timestamps cannot be resolved in promoted anecdotes.json.
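Two of the strict-mode checks described here, unresolved TBD markers and machine-specific absolute paths, can be approximated with plain `grep`. The sketch below is illustrative only: the function names and the path patterns (`/Users/`, `/home/`) are assumptions, and the repository's gates may apply different rules.

```shell
# Sketch of two strict-mode checks. Function names and path patterns are
# assumptions for illustration; the repo's gates may use different rules.

# Exit 0 if any file under the directory still contains the whole word TBD.
has_tbd_markers() {
  grep -rqw 'TBD' "$1"
}

# Exit 0 if any file under the directory contains a typical home-directory path.
has_absolute_paths() {
  grep -rqE '(/Users/|/home/)' "$1"
}

# Runs both checks and returns 1 if either finds a problem.
strict_check() {
  status=0
  if has_tbd_markers "$1"; then
    echo "unresolved TBD marker under $1" >&2
    status=1
  fi
  if has_absolute_paths "$2"; then
    echo "machine-specific absolute path under $2" >&2
    status=1
  fi
  return "$status"
}
```

In this sketch, `strict_check citations/ <promoted-artifact-dir>` reports every failing check before returning, so one validation pass surfaces all problems at once.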
Each report is published in two formats:
- research-pack: full technical package (report source, methodology, claims, citations, run artifacts)
- press-pack: media-friendly package (media brief, methods-at-a-glance, stat-card copy)

pipelines/*/publish_pack.sh builds both under:
- runs///artifacts/publish-pack/research-pack/
- runs///artifacts/publish-pack/press-pack/

Run directories are intentionally ignored in git by default. After a clean run, promote only canonical reproducibility artifacts into a tracked path:
- pipelines/openclaw/promote_run_artifacts.sh --run-id

Systemic rule for sprawl:
- runs/tool-sprawl/sprawl-* is treated as local execution workspace only and must not be committed.
- reports//data/runs//) when explicitly promoting a release set.

Default destination:
- reports/openclaw-2026/data/runs//

Optional full raw archive (for release upload):
- pipelines/openclaw/promote_run_artifacts.sh --run-id --raw-archive-out runs/openclaw//artifacts/openclaw--full-run.tar.gz

Packaged research/press bundles can include a full raw archive:
- pipelines/openclaw/publish_pack.sh --run-id --include-raw-archive

Release CI:
- .github/workflows/openclaw-release-bundle.yml
- run_id after promoted artifacts are committed.

Canonical manuscript source lives under each report's manuscript/ directory.
Preferred source format is Markdown or LaTeX.
Recommended deterministic build commands:
- pipelines/common/build_report_pdf.sh --report-dir reports/
- pandoc reports//manuscript/report.md --pdf-engine=xelatex --include-in-header=reports//manuscript/pdf-header.tex -V geometry:margin=1in -V colorlinks=true -V linkcolor=blue -V urlcolor=blue -V citecolor=blue -V filecolor=blue -o reports//report.pdf
- latexmk -pdf -interaction=nonstopmode reports//manuscript/report.tex

Header policy:
- manuscript/pdf-header.tex for portable font/code-block styling.
- Helvetica Neue, Menlo).
- hidelinks regressions.

Build output paths:
- reports/openclaw-2026/report.pdf
- reports/ai-tool-sprawl-q1-2026/report.pdf

See CONTRIBUTING.md for contribution standards and validation expectations.
Split license model:
See LICENSE and LICENSES/ for details.
Install via CLI
Clyra AI Safety Initiative (CAISI) Research Repo is a free, open-source AI agent skill. Install it with a single command:
npx mdskills install Clyra-AI/safety
This downloads the skill files into your project, and your AI agent picks them up automatically.
Clyra AI Safety Initiative (CAISI) Research Repo works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Codex, Gemini CLI, Amp, Roo Code, Goose, Opencode, Trae, Qodo, and Command Code. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.