# codespelunker (cs)

### CLI code search tool that understands code structure and ranks results by relevance. No indexing required

Ever searched for `authenticate` and gotten 200 results from config files, comments, and test stubs before finding the actual implementation? `cs` fixes that.

It combines the speed of CLI tools with the relevance ranking usually reserved for heavy indexed search engines
like Sourcegraph or Zoekt, but without needing to maintain an index.

```shell
cs "authenticate" --gravity=brain          # Find the complex implementation, not the interface
cs "FIXME OR TODO OR HACK" --only-comments # Search only in comments, not code or strings
cs "error" --only-strings                  # Find where error messages are defined
cs "handleRequest" --only-declarations     # Jump straight to where it's defined
cs "handleRequest" --only-usages           # Find every call site, skip the definition
cs "error" --dedup                         # Collapse duplicated matches into one result
```

Licensed under MIT.

[](https://goreportcard.com/report/github.com/boyter/cs)
[](https://coveralls.io/github/boyter/cs?branch=master)
[](https://github.com/boyter/cs/)

[//]: # ([](https://asciinema.org/a/589640))

cs TUI demo

<https://github.com/user-attachments/assets/3b7f4bb2-d542-406d-9c53-29c0430dd60a>

### Pitch: Why use cs?

Most search tools treat code as plain text.
`cs` doesn't.

It parses every file on the fly to understand what is a comment, what is a string, and what is code,
then uses that structure to rank results by relevance, not just list them by occurrence.

```shell
cs "authenticate"                       # Results ranked by a BM25 variant, best match first
cs "authenticate" --gravity=brain       # Boost complex implementations over interfaces
cs "TODO" --only-comments               # Only matches inside comments
cs "error" --only-strings               # Only matches inside string literals
cs "config OR setup" lang:go            # Boolean queries with language filters
cs "handleRequest" --only-declarations  # Jump to definitions (func, class, def, etc.)
cs "config" --dedup                     # Collapse byte-identical matches
```

It comes with a TUI and an HTTP mode for interactive exploration.

```shell
cs      # enter TUI mode from this location
cs -d   # enter HTTP mode on port 8080 by default
```

#### What makes it different from ripgrep or grep?

`ripgrep` is a fast text matcher. It finds lines and prints them. It's the *best* at what it does.

`cs` is a search engine. It finds files, ranks by relevance, extracts the best snippet,
and shows the most relevant results. Think Sourcegraph-quality ranked search as a CLI tool, no index required.

They solve different problems. You'll probably want both.

#### Key capabilities

- Structural Awareness: A match in code ranks higher than the same word in a comment (1.0 vs 0.2), and the weights are configurable. Or filter strictly: `--only-code`, `--only-comments`, `--only-strings`.
- Complexity Gravity: Uses [cyclomatic complexity](https://en.wikipedia.org/wiki/Cyclomatic_complexity) as a ranking signal. Searching for `Authenticate`? The complex implementation file ranks above the interface definition.
  (`--gravity=brain`)
- Smart Ranking: BM25 relevance scoring, file-location boosting, noise penalty for data blobs, and automatic test-file dampening, all on the fly with no pre-built index.
- Multiple interfaces: Console output, a built-in TUI, an HTTP server with syntax highlighting, or an MCP server for LLM tooling.

### Key Features

#### Structural Filtering

Stop grepping through false positives.

```shell
cs "database" --only-code               # Ignore matches in comments/docs
cs "FIXME" --only-comments              # Ignore matches in code/strings
cs "error" --only-strings               # Find where error messages are defined
cs "handleRequest" --only-declarations  # Jump to where it's defined (func, class, def, etc.)
cs "handleRequest" --only-usages        # Every call site, skipping the definition
```

`--only-declarations` and `--only-usages` are mutually exclusive with `--only-code`, `--only-comments`, and `--only-strings`.

The structural ranker also uses declaration detection to boost matches that appear on declaration lines
(e.g. `func`, `class`, `def`) over plain usages.
This currently works for the following languages:

Go, Python, JavaScript, TypeScript, TSX, Rust, Java, C, C++, C#, Ruby, PHP, Kotlin, Swift,
Shell, Lua, Scala, Elixir, Haskell, Perl, Zig, Dart, Julia, Clojure, Erlang, Groovy, OCaml,
MATLAB, PowerShell, Nim, Crystal, V

For unsupported languages, all matches are treated as usages and ranked by text relevance only.
Structural filtering (`--only-code`, `--only-comments`, `--only-strings`) still works for any language recognised
by [scc](https://github.com/boyter/scc).

#### Complexity Gravity

Find where the work happens.

```shell
cs "login" --gravity=brain   # Boosts complex files (the implementation)
cs "login" --gravity=low     # Boosts simple files (configs/interfaces)
```

#### Deduplication

Collapse byte-identical matches into a single result.

```shell
cs "Copyright" --dedup   # One result per unique copyright notice
cs "error" --dedup       # Skip vendored/copied duplicates
```

**Smart Ranking**
Results are sorted by BM25 (relevance), dampened by file length, and boosted by code structure. Test files are
also dampened when you are not looking for them.

**Non-Smart Ranking**
You can switch the ranking algorithm to pure BM25, TFIDF, or simple most-matches ranking on the fly.

### Install

If you want to create a package to install, please make it. Let me know, and I will ensure I add it here.

#### Go Get

If you have Go >= 1.25.2 installed:

`go install github.com/boyter/cs/v3@latest`

#### NixOS

`nix-shell -p codespelunker`

<https://github.com/NixOS/nixpkgs/pull/236073>

#### Manual

Binaries for Windows, GNU/Linux, and macOS are available from the [releases](https://github.com/boyter/cs/releases) page.

### FAQ

#### Is this as fast as

No.

#### You didn't let me finish, I was going to ask if it's as fast as

The answer is probably no. It's not directly comparable.
No other tool I know of works like this outside of full
indexing tools such as hound, searchcode, sourcegraph etc... None work on the fly like this does.

As far as I know what `cs` does is unique for a command line tool.

`cs` runs a full lexical analysis and complexity calculation from [scc](https://github.com/boyter/scc) on every matching file.
This is expensive compared to the raw byte-scanning of `ripgrep`, but probably not as slow as you may think.

On a modern machine (such as Apple Silicon M1), it can search and rank the entire Linux kernel source in ~3 seconds.
Using a 9950X3D it can search the kernel in ~400 milliseconds.

#### Does it work on normal documents?

So long as they are text. I wrote it to search code, but it works just as well on full text documents. The snippet
extraction, for example, was tested on Pride and Prejudice, a text I know more about than I probably should considering I'm male.

#### Where is the index?

There is none. Everything is brute force calculated on the fly. There is some caching to speed things up, but it should in
practice never affect the results.

#### How does the ranking work?

`cs` uses a weighted BM25 algorithm.

Standard BM25 weights matches based on "fields" (so title, body, category).
`cs` generates fields dynamically
by parsing the code syntax.

- A match in code gets full weight (1.0).
- A match in a string gets partial weight (0.5).
- A match in a comment gets lower weight (0.2).

This means a file where your search term appears in the logic will rank higher than a file where the term only appears
in the documentation, even if the word count is the same.

You can tweak the values as needed via the CLI, or change on the fly which fields `cs` searches.

#### What is complexity gravity?

Complexity gravity is a ranking boost that uses each file's cyclomatic complexity to influence result ordering.

In code search, the best result is usually where the logic is implemented. These files usually have higher
algorithmic density (branches, loops, conditions). `cs` uses this so implementation files generally outrank
data/config/interface files, all things being equal.

The `--gravity` flag accepts a named intent:

| Intent    | Strength | Purpose                                  |
|-----------|----------|------------------------------------------|
| `brain`   | 2.5      | Aggressively surface complex core logic  |
| `logic`   | 1.5      | Standard boost toward complex code       |
| `default` | 1.0      | Balanced (applied when flag not set)     |
| `low`     | 0.2      | Flatten gravity, find simple boilerplate |
| `off`     | 0.0      | Pure text relevance, no complexity boost |

```shell
cs --gravity=brain "search term"   # find complex implementations
cs --gravity=off "search term"     # pure text relevance
```

#### How do you get the snippets?

It's not fun...
see <https://github.com/boyter/cs/blob/master/pkg/snippet/snippet.go> and <https://github.com/boyter/cs/blob/master/pkg/snippet/snippet_lines.go>

It works by passing the document content to extract the snippet from and all the match locations for each term.
It then looks through each location for each word, and checks on either side looking for terms close to it.
It then ranks on the term frequency for the term we are checking around and rewards rarer terms.
It also rewards more matches, closer matches, exact case matches, and matches that are whole words.

For more info read the "Snippet Extraction AKA I am PHP developer" section of this blog post <https://boyter.org/posts/abusing-aws-to-make-a-search-engine/>

#### What does HTTP mode look like?

It's a little brutalist.

<img alt="cs http" src="https://github.com/boyter/cs/raw/master/cs_http.png">

You can change its look and feel using `--template-style` for built-in themes (`dark`, `light`, `bare`), or provide
custom templates with `--template-display` and `--template-search`. See <https://github.com/boyter/cs/tree/master/asset/templates>
for example templates you can use to modify the look and feel.

```shell
cs -d --template-style light
cs -d --template-display ./asset/templates/display.tmpl --template-search ./asset/templates/search.tmpl
```

### Usage

Command line usage of `cs` is designed to be as simple as possible.
Full details can be found in `cs --help` or `cs -h`. Note that the output below reflects the state of master, not a release,
so features listed here may be missing from your installation.

```
$ cs -h
code spelunker (cs) code search.
Version 3.1.0
Ben Boyter <ben@boyter.org>

cs recursively searches the current directory using some boolean logic
optionally combined with regular expressions.

Works via command line where passed in arguments are the search terms
or in a TUI mode with no arguments. Can also run in HTTP mode with
the -d or --http-server flag.

Searches by default use AND boolean syntax for all terms
 - exact match using quotes "find this"
 - fuzzy match within 1 or 2 distance fuzzy~1 fuzzy~2
 - negate using NOT such as pride NOT prejudice
 - OR syntax such as catch OR throw
 - group with parentheses (cat OR dog) NOT fish
 - note: NOT binds to next term, use () with OR
 - regex with toothpick syntax /pr[e-i]de/

Searches can filter which files are searched by adding
the following syntax
 - file:test (substring match on filename)
 - filename:.go (substring match on filename)
 - path:pkg/search (substring match on full file path)

Example search that uses all current functionality
 - darcy NOT collins wickham~1 "ten thousand a year" /pr[e-i]de/ file:test path:pkg

The default input field in tui mode supports some nano commands
- CTRL+a move to the beginning of the input
- CTRL+e move to the end of the input
- CTRL+k to clear from the cursor location forward

- F1 cycle ranker (simple/tfidf/bm25/structural)
- F2 cycle code filter (default/only-code/only-comments/only-strings/only-declarations/only-usages)
- F3 cycle gravity (off/low/default/logic/brain)
- F4 cycle noise (silence/quiet/default/loud/raw)

Usage:
  cs [flags]

Flags:
      --address string            address and port to listen on (default ":8080")
  -A, --after-context int         lines of context after each match (grep mode)
  -B, --before-context int        lines of context before each match (grep mode)
      --binary                    set to disable binary file detection and search binary files
  -c, --case-sensitive            make the search case sensitive
      --color string              color output mode [auto, always, never] (default "auto")
  -C, --context int               lines of context before and after each match (grep mode)
      --cpu-profile string        write CPU profile to file (for use with go tool pprof or PGO)
      --dedup                     collapse byte-identical search matches, keeping the highest-scored representative
      --dir string                directory to search, if not set defaults to current working directory
      --exclude-dir strings       directories to exclude (default [.git,.hg,.svn])
  -x, --exclude-pattern strings   file and directory locations matching case sensitive patterns will be ignored [comma separated list: e.g. vendor,_test.go]
  -r, --find-root                 attempts to find the root of this repository by traversing in reverse looking for .git or .hg
  -f, --format string             set output format [text, json, vimgrep] (default "text")
      --gravity string            complexity gravity intent: brain (2.5), logic (1.5), default (1.0), low (0.2), off (0.0) (default "default")
  -h, --help                      help for cs
      --hidden                    include hidden files
  -d, --http-server               start the HTTP server
  -i, --include-ext strings       limit to file extensions (N.B. case sensitive) [comma separated list: e.g. go,java,js,C,cpp]
      --line-limit int            max matching lines per file in grep mode (-1 = unlimited) (default -1)
      --max-read-size-bytes int   number of bytes to read into a file with the remaining content ignored (default 1000000)
      --mcp                       start as an MCP (Model Context Protocol) server over stdio
      --min                       include minified files
      --min-line-length int       number of bytes per average line for file to be considered minified (default 255)
      --no-gitignore              disables .gitignore file logic
      --no-ignore                 disables .ignore file logic
      --no-syntax                 disable syntax highlighting in output
      --noise string              noise penalty intent: silence (0.1), quiet (0.5), default (1.0), loud (2.0), raw (off) (default "default")
      --only-code                 only rank matches in code (auto-selects structural ranker)
      --only-comments             only rank matches in comments (auto-selects structural ranker)
      --only-declarations         only show matches on declaration lines (func, type, var, const, class, def, etc.)
      --only-strings              only rank matches in string literals (auto-selects structural ranker)
      --only-usages               only show matches on usage lines (excludes declarations)
  -o, --output string             output filename (default stdout)
      --ranker string             set ranking algorithm [simple, tfidf, bm25, structural] (default "structural")
      --result-limit int          maximum number of results to return (-1 for unlimited) (default -1)
      --reverse                   reverse the result order
  -s, --snippet-count int         number of snippets to display (default 1)
  -n, --snippet-length int        size of the snippet to display (default 300)
      --snippet-mode string       snippet extraction mode: auto, snippet, lines, or grep (default "auto")
      --template-display string   path to a custom display template
      --template-search string    path to a custom search template
      --template-style string     built-in theme for the HTTP server UI [dark, light, bare] (default "dark")
      --test-penalty float        score multiplier for test files when query has no test intent (0.0-1.0, 1.0=disabled) (default 0.4)
  -t, --type strings              limit to language types [comma separated list: e.g. Go,Java,Python]
  -v, --version                   version for cs
      --weight-code float         structural ranker: weight for matches in code (default 1.0) (default 1)
      --weight-comment float      structural ranker: weight for matches in comments (default 0.2) (default 0.2)
      --weight-string float       structural ranker: weight for matches in strings (default 0.5) (default 0.5)
```

Searches work on single or multiple words with a logical AND applied between them.
You can negate with NOT before a term.
You can combine terms with OR and use parentheses to control grouping.
You can do an exact match with quotes and do regular expressions using toothpicks.

Example searches (queries containing shell metacharacters such as `(`, `)` or `>` are quoted so the shell passes them through untouched):

```shell
cs t NOT something test~1 "ten thousand a year" "/pr[e-i]de/" file:test
cs "(cat OR dog) AND NOT bird"
cs path:vendor main                 # search only under vendor/
cs "func main" path:cmd             # find main functions under cmd/
cs handler lang:go                  # search only Go files
cs TODO lang:go,python              # search Go and Python files
cs NOT lang:go test                 # search all languages except Go
cs handler "complexity:>=50"        # find complex files containing "handler"
cs "json" --only-code               # find "json" in code, ignoring string literals
cs "hack" --only-comments           # find "hack" in comments only
cs "func main" --only-declarations  # find main function declarations
cs "logger" --only-usages           # find where logger is called, not defined
cs "Copyright" --dedup              # collapse identical copyright headers
```

You can use it in a similar manner to `fzf` in TUI mode if you like, since `cs` will return the matching document path
if you hit the enter key once you have highlighted a result.

```shell
cat `cs` # cat out the matching file
vi `cs`  # edit the selected file
```

### MCP Server Mode

`cs` can run as an [MCP (Model Context Protocol)](https://modelcontextprotocol.io/) server over stdio, allowing LLM
tools like Claude Desktop, Claude Code, Cursor, and others to use it as a code search tool.

```shell
cs --mcp --dir /path/to/codebase
```

#### Claude Desktop Configuration

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "codespelunker": {
      "command": "/path/to/cs",
      "args": ["--mcp", "--dir", "/path/to/codebase"]
    }
  }
}
```

#### Claude Code Configuration

Add to your `.mcp.json`:

```json
{
  "mcpServers": {
    "codespelunker": {
      "command": "/path/to/cs",
      "args": ["--mcp", "--dir", "/path/to/codebase"]
    }
  }
}
```

#### Exposed Tools

The MCP server exposes two tools:

**`search`**: Search code files recursively with relevance ranking.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | string | yes | Search query (supports boolean logic, quotes, regex, fuzzy) |
| `max_results` | number | no | Maximum results to return (default 20) |
| `snippet_length` | number | no | Snippet size in characters |
| `case_sensitive` | boolean | no | Case sensitive search |
| `include_ext` | string | no | Comma-separated file extensions (e.g. `go,js,py`) |
| `language` | string | no | Comma-separated language types (e.g. `Go,Python`) |
| `gravity` | string | no | Complexity gravity intent: `brain`, `logic`, `default`, `low`, `off` |

Results are returned as JSON with the same fields as `--format json`: filename, location, score, snippet content, match locations, language, and code statistics.

**`get_file`**: Read the contents of a file within the project directory.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `path` | string | yes | File path relative to the project directory, or absolute path within the project |
| `start_line` | number | no | 1-based start line number (reads from beginning if omitted) |
| `end_line` | number | no | 1-based end line number, inclusive (reads to end if omitted) |

Returns JSON with line-numbered file content and, for recognised source files, language, lines, code, comment, blank, and complexity fields.

### Support

Using `cs` commercially? If you want priority support for `cs` you can purchase a year's worth <https://boyter.gumroad.com/l/vvmyi> which entitles you to priority direct email support from the developer.

If not, raise a bug report... or don't. I'm not the boss of you.
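
### Ranking sketch

To make the field weighting from "How does the ranking work?" concrete, here is a minimal sketch of one way a weighted BM25 term score could be computed. It is an illustration only, written in Python for brevity, and is not taken from the `cs` source. The function names, the `k1` and `b` constants, and the choice to fold the weights into a single term frequency are all assumptions. Only the three default weights (1.0, 0.5, 0.2) come from the text above.

```python
import math

# Default field weights quoted in the FAQ (code / string / comment).
FIELD_WEIGHTS = {"code": 1.0, "string": 0.5, "comment": 0.2}


def weighted_tf(field_counts, weights=FIELD_WEIGHTS):
    """Collapse per-field match counts into one weighted term frequency.

    field_counts maps a field name ("code", "string", "comment") to the
    number of matches found in that field. Unknown fields get weight 0.
    """
    return sum(weights.get(field, 0.0) * count
               for field, count in field_counts.items())


def bm25_term_score(field_counts, doc_len, avg_doc_len,
                    n_docs, docs_with_term, k1=1.2, b=0.75):
    """Score one query term for one file with a textbook BM25 formula.

    Hypothetical helper: k1 and b are common textbook values, assumed here
    rather than taken from cs.
    """
    tf = weighted_tf(field_counts)
    if tf == 0:
        return 0.0
    # Inverse document frequency: rarer terms contribute more.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalisation: long files need more matches for the same score.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)


if __name__ == "__main__":
    # Three files of equal length, each containing the term three times,
    # but in different syntactic positions.
    shared = dict(doc_len=200, avg_doc_len=200, n_docs=100, docs_with_term=10)
    for label, counts in [("code", {"code": 3}),
                          ("string", {"string": 3}),
                          ("comment", {"comment": 3})]:
        print(label, round(bm25_term_score(counts, **shared), 3))
```

With equal word counts, the file that matches in code scores highest, then strings, then comments, which is the behaviour the FAQ describes. Changing `FIELD_WEIGHTS` plays the role that `--weight-code`, `--weight-string`, and `--weight-comment` play on the command line.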