# Sloc Cloc and Code (scc)

A tool similar to cloc, sloccount and tokei. For counting the lines of code, blank lines, comment lines, and physical lines of source code in many programming languages.

The goal is to be the fastest code counter possible, but also to perform COCOMO calculations like sloccount, LOCOMO estimation for LLM-based development costs, estimate code complexity similar to cyclomatic complexity calculators, and produce unique lines of code or DRYness metrics. In short, one tool to rule them all.

It also has a very short name which is easy to type: `scc`.

If you don't like Sloc Cloc and Code, feel free to use the name `Succinct Code Counter`.

<https://github.com/boyter/scc/actions/workflows/go.yml>
<https://goreportcard.com/report/github.com/boyter/scc>
<https://coveralls.io/github/boyter/scc?branch=master>
<https://github.com/boyter/scc/>

<https://github.com/avelino/awesome-go>

Licensed under the MIT licence.

## Table of Contents

- [Install](#install)
- [Background](#background)
- [Pitch](#pitch)
- [Usage](#usage)
- [Complexity Estimates](#complexity-estimates)
- [Unique Lines of Code (ULOC)](#unique-lines-of-code-uloc)
- [COCOMO](#cocomo)
- [LOCOMO](#locomo)
- [Output Formats](#output-formats)
- [Performance](#performance)
- [Development](#development)
- [Adding/Modifying Languages](#addingmodifying-languages)
- [Issues](#issues)
- [Badges](#badges)
- [Language Support](LANGUAGES.md)
- [Citation](#citation)

### scc for Teams & Enterprise

While scc will always be a free tool for individual developers, companies and businesses, we are exploring an enhanced version designed for teams and businesses. scc Enterprise will build on the core scc engine to provide historical analysis, team-level dashboards, and policy enforcement to help engineering leaders track code health, manage technical debt, and forecast project costs.

We are currently gathering interest for a private beta.
If you want to visualize your codebase's evolution, integrate quality gates into your CI/CD pipeline, and get a big-picture view across all your projects, sign up for the early access list [here](https://docs.google.com/forms/d/e/1FAIpQLScIBKy3y2m0rKu89L67qwe26Xyn9Scu0gW-HQX9lC0qEAx9nQ/viewform).

### Install

#### Go Install

You can install `scc` using the standard Go toolchain.

To install the latest stable version of scc:

`go install github.com/boyter/scc/v3@latest`

To install a development version:

`go install github.com/boyter/scc/v3@master`

Note that `scc` needs Go version >= 1.25.

#### Snap

A [snap install](https://snapcraft.io/scc) exists thanks to [Ricardo](https://feliciano.tech/).

`$ sudo snap install scc`

*NB* Snap-installed applications cannot run outside of `/home` <https://askubuntu.com/questions/930437/permission-denied-error-when-running-apps-installed-as-snap-packages-ubuntu-17> so you may encounter issues if you use snap and attempt to run outside this directory.

#### Homebrew

Or if you have [Homebrew](https://brew.sh/) installed

`$ brew install scc`

#### Fedora

Fedora Linux users can use a [COPR repository](https://copr.fedorainfracloud.org/coprs/lihaohong/scc/):

`$ sudo dnf copr enable lihaohong/scc && sudo dnf install scc`

#### MacPorts

On macOS, you can also install via [MacPorts](https://www.macports.org)

`$ sudo port install scc`

#### Scoop

Or if you are using [Scoop](https://scoop.sh/) on Windows

`$ scoop install scc`

#### Chocolatey

Or if you are using [Chocolatey](https://chocolatey.org/) on Windows

`$ choco install scc`

#### WinGet

Or if you are using [WinGet](https://github.com/microsoft/winget-cli) on Windows

`winget install --id benboyter.scc --source winget`

#### FreeBSD

On FreeBSD, scc is available as a package

`$ pkg install scc`

Or, if you prefer to build from source, you can use the ports tree

`$ cd /usr/ports/devel/scc && make install clean`

### Run in Docker

Go to the directory you want to run scc from.

Run the command below to run the latest release of scc on your current working directory:

```bash
docker run --rm -it -v "$PWD:/pwd" ghcr.io/boyter/scc:master scc /pwd
```

#### Manual

Binaries for Windows, GNU/Linux and macOS for both i386 and x86_64 machines are available from the [releases](https://github.com/boyter/scc/releases) page.

#### GitLab

<https://about.gitlab.com/blog/2023/02/15/code-counting-in-gitlab/>

#### Other

If you would like to assist with getting `scc` added into apt/chocolatey/etc. please submit a PR or at least raise an issue with instructions.

### Background

Read all about how it came to be along with performance benchmarks,

- <https://boyter.org/posts/sloc-cloc-code/>
- <https://boyter.org/posts/why-count-lines-of-code/>
- <https://boyter.org/posts/sloc-cloc-code-revisited/>
- <https://boyter.org/posts/sloc-cloc-code-performance/>
- <https://boyter.org/posts/sloc-cloc-code-performance-update/>

Some reviews of `scc`

- <https://nickmchardy.com/2018/10/counting-lines-of-code-in-koi-cms.html>
- <https://www.feliciano.tech/blog/determine-source-code-size-and-complexity-with-scc/>
- <https://metaredux.com/posts/2019/12/13/counting-lines.html>

Setting up `scc` in GitLab

- <https://about.gitlab.com/blog/2023/02/15/code-counting-in-gitlab/>

A talk given at the first GopherCon AU about `scc` (press S to see speaker notes)

- <https://boyter.org/static/gophercon-syd-presentation/>
- <https://www.youtube.com/watch?v=jd-sjoy3GZo>

For performance see the [Performance](https://github.com/boyter/scc#performance) section

Other similar projects,

- [SLOCCount](https://www.dwheeler.com/sloccount/) the original sloc counter
- [cloc](https://github.com/AlDanial/cloc), inspired by SLOCCount; implemented in Perl for portability
- [gocloc](https://github.com/hhatto/gocloc) a sloc counter in Go inspired by tokei
- [loc](https://github.com/cgag/loc) rust implementation similar to tokei but often faster
- [loccount](https://gitlab.com/esr/loccount) Go implementation written and maintained by ESR
- [polyglot](https://github.com/vmchale/polyglot) ATS sloc counter
- [tokei](https://github.com/XAMPPRocky/tokei) fast, accurate and written in rust
- [sloc](https://github.com/flosse/sloc) coffeescript code counter
- [stto](https://github.com/mainak55512/stto) new Go code counter with a focus on performance

Interesting reading about other code counting projects tokei, loc, polyglot and loccount

- <https://www.reddit.com/r/rust/comments/59bm3t/a_fast_cloc_replacement_in_rust/>
- <https://www.reddit.com/r/rust/comments/82k9iy/loc_count_lines_of_code_quickly/>
- <http://blog.vmchale.com/article/polyglot-comparisons>
- <http://esr.ibiblio.org/?p=8270>

Further reading about processing files on the disk performance

- <https://blog.burntsushi.net/ripgrep/>

Using `scc` to process 40 TB of files from GitHub/Bitbucket/GitLab

- <https://boyter.org/posts/an-informal-survey-of-10-million-github-bitbucket-gitlab-projects/>

### Pitch

Why use `scc`?

- It is very fast and gets faster the more CPU you throw at it
- Accurate
- Works very well across multiple platforms without slowdown (Windows, Linux, macOS)
- Large language support
- Can ignore duplicate files
- Has complexity estimations
- You need to tell the difference between Coq and Verilog in the same directory
- cloc yaml output support so potentially a drop-in replacement for some users
- Can identify or ignore minified files
- Able to identify many #! files ADVANCED! <https://github.com/boyter/scc/issues/115>
- Can ignore large files by lines or bytes
- Can calculate the ULOC or unique lines of code by file, language or project
- Supports multiple output formats for integration: CSV, SQL, JSON, HTML and more

Why not use `scc`?

- You don't like Go for some reason
- It cannot count D source with different nested multi-line comments correctly <https://github.com/boyter/scc/issues/27>

### Differences

There are some important differences between `scc` and other tools that are out there. Here are a few important ones for you to consider.

Blank lines inside comments are counted as comments. While the line is technically blank, the decision was made that once inside a comment everything should be considered a comment until that comment ends. As such the following,

```c
/* blank lines follow


*/
```

would be counted as 4 lines of comments. This is noticeable when comparing scc's output to other tools on large repositories.

`scc` is able to count verbatim strings correctly. For example in C# the following,

```C#
private const string BasePath = @"a:\";
// The below is returned to the user as a version
private const string Version = "1.0.0";
```

Because of the prefixed @, the first string ends at the trailing " (the escape character \ is ignored), and as such this should be counted as 2 code lines and 1 comment. Some tools are unable to deal with this and instead treat everything up to "1.0.0" as a string, which causes the middle comment to be counted as code rather than a comment.

`scc` will also tell you the number of bytes it has processed (for most output formats), allowing you to estimate the cost of running some static analysis tools.

### Usage

Command line usage of `scc` is designed to be as simple as possible.
Full details can be found in `scc --help` or `scc -h`.
Note that the below reflects the state of master not a release, as such features listed below may be missing from your installation.

```text
Sloc, Cloc and Code. Count lines of code in a directory with complexity estimation.
Version 3.5.0 (beta)
Ben Boyter <ben@boyter.org> + Contributors

Usage:
  scc [flags] [files or directories]

Flags:
      --avg-wage int                      average wage value used for basic COCOMO calculation (default 56286)
      --binary                            disable binary file detection
      --by-file                           display output for every file
  -m, --character                         calculate max and mean characters per line
      --ci                                enable CI output settings where stdout is ASCII
      --cocomo-project-type string        change COCOMO model type [organic, semi-detached, embedded, "custom,1,1,1,1"] (default "organic")
      --count-as string                   count extension as language [e.g. jsp:htm,chead:"C Header" maps extension jsp to html and chead to C Header]
      --count-ignore                      set to allow .gitignore and .ignore files to be counted
      --currency-symbol string            set currency symbol (default "$")
      --debug                             enable debug output
      --directory-walker-job-workers int  controls the maximum number of workers which will walk the directory tree (default 8)
  -a, --dryness                           calculate the DRYness of the project (implies --uloc)
      --eaf float                         the effort adjustment factor derived from the cost drivers (1.0 if rated nominal) (default 1)
      --exclude-dir strings               directories to exclude (default [.git,.hg,.svn])
  -x, --exclude-ext strings               ignore file extensions (overrides include-ext) [comma separated list: e.g. go,java,js]
  -n, --exclude-file strings              ignore files with matching names (default [package-lock.json,Cargo.lock,yarn.lock,pubspec.lock,Podfile.lock,pnpm-lock.yaml])
      --file-gc-count int                 number of files to parse before turning the GC on (default 10000)
      --file-list-queue-size int          the size of the queue of files found and ready to be read into memory (default 8)
      --file-process-job-workers int      number of goroutine workers that process files collecting stats (default 8)
      --file-summary-job-queue-size int   the size of the queue used to hold processed file statistics before formatting (default 8)
  -f, --format string                     set output format [tabular, wide, json, json2, csv, csv-stream, cloc-yaml, html, html-table, sql, sql-insert, openmetrics] (default "tabular")
      --format-multi string               have multiple format output overriding --format [e.g. tabular:stdout,csv:file.csv,json:file.json]
      --gen                               identify generated files
      --generated-markers strings         string markers in head of generated files (default [do not edit,<auto-generated />])
  -h, --help                              help for scc
  -i, --include-ext strings               limit to file extensions [comma separated list: e.g. go,java,js]
      --include-symlinks                  if set will count symlink files
  -l, --languages                         print supported languages and extensions
      --large-byte-count int              number of bytes a file can contain before being removed from output (default 1000000)
      --large-line-count int              number of lines a file can contain before being removed from output (default 40000)
      --locomo                            enable LOCOMO (LLM Output COst MOdel) cost estimation
      --locomo-config string              LOCOMO power-user config "tokensPerLine,inputPerLine,complexityWeight,iterations,iterationWeight"
      --locomo-cycles float               override estimated LLM iteration cycles (default: calculated from complexity)
      --locomo-input-price float          LOCOMO cost per 1M input tokens in dollars (overrides preset)
      --locomo-output-price float         LOCOMO cost per 1M output tokens in dollars (overrides preset)
      --locomo-preset string              LOCOMO model preset [large, medium, small, local] (default "medium")
      --locomo-review float               human review minutes per line of code for LOCOMO estimate (default 0.01)
      --locomo-tps float                  LOCOMO output tokens per second (overrides preset)
      --cost-comparison                   show both COCOMO and LOCOMO estimates side by side
      --min                               identify minified files
  -z, --min-gen                           identify minified or generated files
      --min-gen-line-length int           number of bytes per average line for file to be considered minified or generated (default 255)
      --no-cocomo                         remove COCOMO calculation output
  -c, --no-complexity                     skip calculation of code complexity
  -d, --no-duplicates                     remove duplicate files from stats and output
      --no-gen                            ignore generated files in output (implies --gen)
      --no-gitignore                      disables .gitignore file logic
      --no-gitmodule                      disables .gitmodules file logic
      --no-hborder                        remove horizontal borders between sections
      --no-ignore                         disables .ignore file logic
      --no-large                          ignore files over certain byte and line size set by large-line-count and large-byte-count
      --no-min                            ignore minified files in output (implies --min)
      --no-min-gen                        ignore minified or generated files in output (implies --min-gen)
      --no-scc-ignore                     disables .sccignore file logic
      --no-size                           remove size calculation output
  -M, --not-match stringArray             ignore files and directories matching regular expression
  -o, --output string                     output filename (default stdout)
      --overhead float                    set the overhead multiplier for corporate overhead (facilities, equipment, accounting, etc.) (default 2.4)
  -p, --percent                           include percentage values in output
      --remap-all string                  inspect every file and remap by checking for a string and remapping the language [e.g. "-*- C++ -*-":"C Header"]
      --remap-unknown string              inspect files of unknown type and remap by checking for a string and remapping the language [e.g. "-*- C++ -*-":"C Header"]
      --size-unit string                  set size unit [si, binary, mixed, xkcd-kb, xkcd-kelly, xkcd-imaginary, xkcd-intel, xkcd-drive, xkcd-bakers] (default "si")
      --sloccount-format                  print a more SLOCCount like COCOMO calculation
  -s, --sort string                       column to sort by [files, name, lines, blanks, code, comments, complexity] (default "files")
      --sql-project string                use supplied name as the project identifier for the current run. Only valid with the --format sql or sql-insert option
  -t, --trace                             enable trace output (not recommended when processing multiple files)
  -u, --uloc                              calculate the number of unique lines of code (ULOC) for the project
  -v, --verbose                           verbose output
      --version                           version for scc
  -w, --wide                              wider output with additional statistics (implies --complexity)
```

Output should look something like the below for the redis project

```text
$ scc redis
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments      Code Complexity
───────────────────────────────────────────────────────────────────────────────
C                          437   267,353   31,103    45,998   190,252     48,269
JSON                       406    25,392        4         0    25,388          0
C Header                   288    48,831    5,648    11,302    31,881      3,097
TCL                        215    66,943    7,330     4,651    54,962      3,816
Shell                       75     1,626      239       343     1,044        185
Python                      34     4,802      694       498     3,610        621
Markdown                    26     4,647    1,226         0     3,421          0
Autoconf                    22    11,732    1,124     1,420     9,188      1,016
Lua                         20       525       69        71       385         89
Makefile                    20     1,956      368       170     1,418         85
YAML                       20     2,696      147        53     2,496          0
MSBuild                     11     1,995        2         0     1,993        160
Plain Text                 10     1,773      313         0     1,460          0
Ruby                        9       817       73       105       639        123
C++                         8       546       85        43       418         43
HTML                        5     9,658    2,928        12     6,718          0
License                     3        90       17         0        73          0
CMake                       2       298       49         5       244         12
CSS                         2       107       16         0        91          0
Systemd                     2        80        6         0        74          0
BASH                        1       143       16         5       122         38
Batch                       1        28        2         0        26          3
C++ Header                  1         9        1         3         5          0
Extensible Styleshe…        1        10        0         0        10          0
JavaScript                  1        31        1         0        30          5
Module-Definition           1    11,375    2,116         0     9,259        167
SVG                         1         1        0         0         1          0
Smarty Template             1        44        1         0        43          5
m4                          1       951      218        64       669          0
───────────────────────────────────────────────────────────────────────────────
Total                    1,624   464,459   53,796    64,743   345,920     57,734
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $12,517,562
Estimated Schedule Effort (organic) 35.93 months
Estimated People Required (organic) 30.95
───────────────────────────────────────────────────────────────────────────────
Processed 16601962 bytes, 16.602 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
```

Note that you don't have to specify the directory you want to run against. Running `scc` will assume you want to run against the current directory.

You can also run against multiple files or directories `scc directory1 directory2 file1 file2` with the results aggregated in the output.

Since `scc` writes to standard output, there are many ways to easily share the results. For example, using [netcat](https://manpages.org/nc) and [one of many pastebins](https://paste.c-net.org/) gives a public URL:

```bash
$ scc | nc paste.c-net.org 9999
https://paste.c-net.org/Example
```

### Ignore Files

`scc` mostly supports .ignore files inside directories that it scans. This is similar to how ripgrep, ag and tokei work. .ignore files are 100% the same as .gitignore files with the same syntax, and as such `scc` will ignore files and directories listed in them. You can add .ignore files to ignore things like vendored dependency checked in files and such.
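For example, a minimal `.ignore` dropped in the repository root might look like the following (the paths are hypothetical; anything that is valid .gitignore syntax works):

```
vendor/
third_party/
generated.min.js
```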
The idea is to allow you to add a file or folder to git while having it ignored in the count.

It also supports its own ignore file `.sccignore` if you want `scc` to ignore things while ripgrep, ag, tokei and others still process them.

### Interesting Use Cases

Used inside Intel Nemu Hypervisor to track code changes between revisions <https://github.com/intel/nemu/blob/topic/virt-x86/tools/cloc-change.sh#L9>. It also appears to be used inside <http://codescoop.com/>, <https://pinpoint.com/> and <https://github.com/chaoss/grimoirelab-graal>.

It is also used to count code and guess language types in <https://searchcode.com/>, which makes it one of the most frequently run code counters in the world.

You can also hook scc into your GitLab pipeline <https://gitlab.com/guided-explorations/ci-cd-plugin-extensions/ci-cd-plugin-extension-scc>

Used by the following products and services,

- [GitHub CodeQL](https://github.com/boyter/scc/pull/317) - The CodeQL engine uses `scc` for line counting
- [JetBrains Qodana](https://github.com/JetBrains/qodana-cli) - The Qodana CLI leverages `scc` as a command-line helper for code analysis
- [Scaleway](https://twitter.com/Scaleway/status/1488087029476995074?s=20&t=N2-z6O-ISDdDzULg4o4uVQ) - Cloud provider using `scc`
- [Linux Foundation LFX Insights](https://docs.linuxfoundation.org/lfx/insights/v3-beta-version-current/getting-started/landing-page/cocomo-cost-estimation-simplified) - COCOMO cost estimation
- [OpenEMS](https://openems.io/)

### Features

`scc` uses a small state machine to determine what state the code is in when it reaches a newline `\n`. As such it is aware of and able to count

- Single Line Comments
- Multi Line Comments
- Strings
- Multi Line Strings
- Blank lines

Because of this it is able to accurately determine if a comment is inside a string or is actually a comment.

It also attempts to count the complexity of code.
This is done by checking for branching operations in the code. For example, each of the following `for if switch while else || && != ==`, if encountered in Java, would increment that file's complexity by one.

### Complexity Estimates

Let's take a minute to discuss the complexity estimate itself.

The complexity estimate is really just a number that is only comparable to files in the same language. It should not be used to compare languages directly without weighting them. The reason for this is that it is calculated by looking for branch and loop statements in the code and incrementing a counter for that file.

Because some languages don't have loops and instead use recursion, they can have a lower complexity count. Does this mean they are less complex? Probably not, but the tool cannot see this because it does not build an AST of the code; it only scans through it.

Generally though, the complexity estimate is there to help compare projects written in the same language, or to find the most complex file in a project `scc --by-file -s complexity`, which can be useful when you are estimating how hard something is to maintain, or when looking for files that should probably be refactored.

As for how it works.

It's my own definition, but it tries to be an approximation of cyclomatic complexity <https://en.wikipedia.org/wiki/Cyclomatic_complexity>, although done only at a file level.

The reason it's an approximation is that it's calculated almost for free from a CPU point of view (it's a cheap lookup while counting), whereas a real cyclomatic complexity count would need to parse the code. It gives a reasonable guess in practice though, even if it fails to identify recursive methods.
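As a sketch, the approach amounts to scanning lines already identified as code for branch-like tokens and bumping a per-file counter. This is a deliberate simplification with hypothetical names; the real implementation works byte-by-byte inside its state machine and reads the per-language conditions from languages.json:

```go
package main

import (
	"fmt"
	"strings"
)

// branchTokens is an illustrative subset of the Java conditions scc checks for.
var branchTokens = []string{"for ", "if ", "switch ", "while ", "else ", "|| ", "&& ", "!= ", "== "}

// estimateComplexity counts branch-like tokens in lines already classified as code.
func estimateComplexity(codeLines []string) int {
	complexity := 0
	for _, line := range codeLines {
		for _, tok := range branchTokens {
			complexity += strings.Count(line, tok)
		}
	}
	return complexity
}

func main() {
	code := []string{
		"for (int i = 0; i < n; i++) {",
		"    if (a[i] == target && !seen[i]) {",
		"        count++;",
		"    }",
		"}",
	}
	fmt.Println(estimateComplexity(code)) // prints 4
}
```

Note how cheap this is: no parsing, just substring counting, which is why the estimate comes almost for free during the normal line count.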
The goal was never for it to be exact.

In short, when scc is looking through what it has identified as code, if it notices what are usually branch conditions it will increment a counter.

The conditions it looks for are compiled into the code and you can get an idea of them by looking at the JSON inside the repository. See <https://github.com/boyter/scc/blob/master/languages.json#L3869> for an example of what it looks for in a Java file.

The increment happens for each of the matching conditions and produces the number you see.

### Unique Lines of Code (ULOC)

ULOC stands for Unique Lines of Code and represents the unique lines across languages, files and the project itself. This idea was taken from <https://cmcenroe.me/2018/12/14/uloc.html> where the calculation is presented using standard Unix tools `sort -u *.h *.c | wc -l`. This metric is there to assist with the estimation of complexity within the project. Quoting the source,

> In my opinion, the number this produces should be a better estimate of the complexity of a project. Compared to SLOC, not only are blank lines discounted, but so are close-brace lines and other repetitive code such as common includes. On the other hand, ULOC counts comments, which require just as much maintenance as the code around them does, while avoiding inflating the result with license headers which appear in every file, for example.

You can obtain the ULOC by supplying the `-u` or `--uloc` argument to `scc`.

It has a corresponding metric `DRYness %`, which is the ratio of ULOC to SLOC, `DRYness = ULOC / SLOC`. The higher the number, the more DRY (don't repeat yourself) the project can be considered. In general a higher value here is better, as it indicates less duplicated code.
The DRYness metric was taken from a comment by minimax <https://lobste.rs/s/has9r7/uloc_unique_lines_code>

To obtain the DRYness metric you can use the `-a` or `--dryness` argument to `scc`, which will implicitly set `--uloc`.

Note that there is a performance penalty when calculating the ULOC metrics which can double the runtime.

Running the ULOC and DRYness calculations against the C code of a clone of redis produces the following output.

```bash
$ scc -a -i c redis
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments      Code Complexity
───────────────────────────────────────────────────────────────────────────────
C                          437   267,353   31,103    45,998   190,252     48,269
(ULOC)                            149892
───────────────────────────────────────────────────────────────────────────────
Total                      437   267,353   31,103    45,998   190,252     48,269
───────────────────────────────────────────────────────────────────────────────
Unique Lines of Code (ULOC)       149892
DRYness %                           0.56
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $6,681,762
Estimated Schedule Effort (organic) 28.31 months
Estimated People Required (organic) 20.97
───────────────────────────────────────────────────────────────────────────────
Processed 9390815 bytes, 9.391 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
```

Further reading about the ULOC calculation can be found at <https://boyter.org/posts/sloc-cloc-code-new-metic-uloc/>

Interpreting DRYness,

- 75%+ (High Density): Very terse, expressive code. Every line counts. (Example: Clojure, Haskell)
- 60% - 70% (Standard): A healthy balance of logic and structural ceremony. (Example: Java, Python)
- < 55% (High Boilerplate): High repetition. Likely due to mandatory error handling, auto-generated code, or verbose configuration. (Example: C#, CSS)

See <https://boyter.org/posts/boilerplate-tax-ranking-popular-languages-by-density/> for more details.

### COCOMO

The COCOMO statistics displayed at the bottom of any command line run can be configured as needed.

```text
Estimated Cost to Develop (organic) $664,081
Estimated Schedule Effort (organic) 11.772217 months
Estimated People Required (organic) 5.011633
```

To change the COCOMO parameters, you can use one of the default COCOMO models.

```text
scc --cocomo-project-type organic
scc --cocomo-project-type semi-detached
scc --cocomo-project-type embedded
```

You can also supply your own parameters if you are familiar with COCOMO, as follows,

```text
scc --cocomo-project-type "custom,1,1,1,1"
```

See below for details about the model choices and the parameters they use.

Organic – A software project is said to be organic if the required team size is adequately small, the problem is well understood and has been solved in the past, and the team members have nominal experience with the problem.

`scc --cocomo-project-type "organic,2.4,1.05,2.5,0.38"`

Semi-detached – A software project is said to be semi-detached if vital characteristics such as team size, experience, and knowledge of the various programming environments lie in between organic and embedded. Projects classified as semi-detached are comparatively less familiar and more difficult to develop than organic ones, and require more experience, better guidance and creativity. E.g. compilers or different embedded systems can be considered semi-detached.

`scc --cocomo-project-type "semi-detached,3.0,1.12,2.5,0.35"`

Embedded – A software project requiring the highest level of complexity, creativity, and experience falls under this category.
Such software requires a larger team size than the other two models, and the developers need to be sufficiently experienced and creative to develop such complex models.

`scc --cocomo-project-type "embedded,3.6,1.20,2.5,0.32"`

### LOCOMO

LOCOMO (LLM Output COst MOdel) estimates the cost to regenerate a codebase using a large language model. It is the LLM-era counterpart to COCOMO: a rough ballpark estimator, not a project planning tool.

Note: LOCOMO was developed as part of `scc` and is not an industry-standard model. Unlike COCOMO, which is based on decades of empirical research by Barry Boehm, LOCOMO is an experimental heuristic designed to give a useful order-of-magnitude estimate for LLM-assisted development costs. Treat its output as a conversation starter, not a definitive answer.

**Important distinction:** LOCOMO estimates the cost to **regenerate** known code, essentially "given this exact codebase, how much would it cost to have an LLM produce it?" This is fundamentally different from the cost to **create** something from scratch, which involves exploration, architectural decisions, dead ends, debugging, and iteration that can cost orders of magnitude more. COCOMO estimates the human *creation* cost; LOCOMO estimates the LLM *regeneration* cost. They answer different questions.

LOCOMO is opt-in. Enable it with `--locomo`, or use `--cost-comparison` to display both COCOMO and LOCOMO side by side.

```text
$ scc --locomo .
...
LOCOMO LLM Cost Estimate (medium)
  Tokens Required (in/out)  3.0M / 0.7M
  Cost to Generate          $20
  Estimated Cycles          2.1
  Generation Time (serial)  3.9 hours
  Human Review Time         5.9 hours
  Disclaimer: rough ballpark for regenerating code using a LLM.
  Does not account for context reuse, test generation, or heavy debugging.
```

#### How it works

LOCOMO uses the SLOC and complexity data that `scc` already computes. The model works per file and aggregates:

1. **Output tokens**: each line of code maps to ~10 LLM output tokens (configurable).
2. **Input tokens**: estimated prompting cost, scaled by code complexity. More complex code (higher branch density) requires more detailed prompts. The scaling is sublinear to prevent runaway estimates.
3. **Iteration factor**: LLMs rarely produce correct code on the first try. A retry multiplier that also scales sublinearly with complexity.
4. **Dollar cost**: input and output tokens multiplied by per-token pricing.
5. **Generation time**: total serial output tokens divided by tokens-per-second throughput.
6. **Human review time**: estimated per-line overhead for planning, review, testing, and integration.

#### Model presets

Presets are tier-based rather than tied to specific models, so they don't go stale as models are retired or renamed. Use `--locomo-preset` to select a tier:

| Preset | Represents | Input $/1M | Output $/1M | TPS |
|--------|-----------|------------|-------------|-----|
| `large` | Frontier models (Opus, GPT-5.3, Gemini 3.1 Pro, etc.) | 10.00 | 30.00 | 30 |
| `medium` (default) | Balanced models (Sonnet, Gemini Flash, etc.) | 3.00 | 15.00 | 50 |
| `small` | Fast/cheap models (Haiku, GPT-4o-mini, etc.) | 0.50 | 2.00 | 100 |
| `local` | Self-hosted models (Llama, Mistral, Qwen, etc.) | 0.00 | 0.00 | 15 |

For `local`, cost is $0 but generation time is still reported to capture the compute/time investment.
Preset pricing reflects approximate tier rates as of early 2026 and can be overridden with explicit flags.

```bash
scc --locomo --locomo-preset large .
scc --locomo --locomo-preset local .
```

#### Overriding preset values

You can override individual preset values for pricing or throughput:

```bash
scc --locomo --locomo-input-price 1.0 --locomo-output-price 5.0 .
scc --locomo --locomo-tps 100 .
```

#### Human review time

The `--locomo-review` flag controls the estimated human review minutes per line of code (default: 0.01, i.e. 0.6 seconds per line). This is intentionally optimistic and assumes light oversight.

For mission-critical, security-sensitive, or complex algorithmic code you should increase this:

```bash
scc --locomo --locomo-review 0.05 .
scc --locomo --locomo-review 0.1 .
```

#### Power-user configuration

The five internal model parameters can be overridden with a single comma-separated config string:

```bash
scc --locomo --locomo-config "tokensPerLine,inputPerLine,complexityWeight,iterations,iterationWeight"
```

The defaults are `"10,20,5,1.5,2"`. Here is what each parameter controls:

| Position | Name | Default | Description |
|----------|------|---------|-------------|
| 1 | tokensPerLine | 10 | Average LLM output tokens per line of code |
| 2 | inputPerLine | 20 | Base LLM input (prompt) tokens per output line |
| 3 | complexityWeight | 5 | How much complexity density scales input tokens: `inputFactor = 1 + sqrt(density) * weight` |
| 4 | iterations | 1.5 | Base iteration/retry cycles before complexity adjustment |
| 5 | iterationWeight | 2 | How much complexity density adds extra cycles: `cycles = iterations + sqrt(density) * weight` |

The iteration factor (cycles) scales both input and output tokens — it represents how many generation attempts the LLM needs.
Simple code (~0.05 complexity density) produces ~1.9 cycles; complex code (~0.3 density) produces ~2.6 cycles. Use `--locomo-cycles` to override this with a fixed value.

For example, to model a cheaper/faster LLM that needs fewer tokens but more retries:

```bash
scc --locomo --locomo-config "8,15,3,2.0,1.5"
```

#### Comparing COCOMO and LOCOMO

Use `--cost-comparison` to show both estimates side by side. This enables COCOMO (if it was disabled) and LOCOMO together:

```bash
scc --cost-comparison .
```

#### What LOCOMO does not account for

LOCOMO is a rough estimator with known limitations:

- **No context reuse.** Real LLM-assisted development shares context across files. The per-file model overestimates input tokens for large projects with shared patterns.
- **Boilerplate vs algorithmic code.** A 500-line CRUD controller and a 500-line compression algorithm have very different real costs, but the model only differentiates them via complexity density.
- **Code that LLMs can't write well.** Complex concurrency, platform-specific edge cases, and security-critical crypto need human authoring, not just review.
- **No test generation cost.** The model estimates source code generation only, not test suites.
- **Pricing changes.** LLM pricing drops rapidly.
  Preset defaults will become stale — use explicit price flags for current estimates.

#### All LOCOMO flags

| Flag | Default | Description |
|------|---------|-------------|
| `--locomo` | false | Enable LOCOMO output |
| `--cost-comparison` | false | Show COCOMO + LOCOMO side by side |
| `--locomo-preset` | medium | Model tier preset for pricing and throughput |
| `--locomo-input-price` | (preset) | Override: cost per 1M input tokens ($) |
| `--locomo-output-price` | (preset) | Override: cost per 1M output tokens ($) |
| `--locomo-tps` | (preset) | Override: output tokens per second |
| `--locomo-review` | 0.01 | Human review minutes per line of code |
| `--locomo-cycles` | (calculated) | Override estimated LLM iteration cycles |
| `--locomo-config` | 10,20,5,1.5,2 | Power-user config: tokensPerLine, inputPerLine, complexityWeight, iterations, iterationWeight |

### Large File Detection

You can have `scc` exclude large files from the output.

The option to do so is `--no-large`, which by default will exclude files over 1,000,000 bytes or 40,000 lines.

You can control either threshold using `--large-byte-count` or `--large-line-count`.

For example, to exclude files over 1,000 lines and 50kb you could use the following,

`scc --no-large --large-byte-count 50000 --large-line-count 1000`

### Minified/Generated File Detection

You can have `scc` identify, and optionally remove from the output, files identified as being minified or generated.

You can do so by enabling the `-z` flag like so `scc -z`, which will identify any file with an average line byte size >= 255 (by default) as being minified.

Minified files appear like so in the output:

```text
$ scc --no-cocomo -z ./examples/minified/jquery-3.1.1.min.js
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
JavaScript (min)             1         4        0         1        3         17
───────────────────────────────────────────────────────────────────────────────
Total                        1         4        0         1        3         17
───────────────────────────────────────────────────────────────────────────────
Processed 86709 bytes, 0.087 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
```

Minified files are indicated with the text `(min)` after the language name.

Generated files are indicated with the text `(gen)` after the language name.

You can control the average line byte size using `--min-gen-line-length`, such as `scc -z --min-gen-line-length 1`. Please note you still need `-z`, as modifying this value does not enable minified detection on its own.

You can exclude minified files from the count entirely using the flag `--no-min-gen`. Files which match the minified check will then be excluded from the output.

### Remapping

Some files may not have an extension. They will be checked to see if they are a `#!` file. If they are, the language will be remapped to the correct language. Otherwise, the file will not be processed.

However, you may have a situation where you want to remap such files based on a string inside them. To do so you can use `--remap-unknown`:

```bash
scc --remap-unknown "-*- C++ -*-":"C Header"
```

The above will inspect any file with no extension looking for the string `-*- C++ -*-` and, if found, remap the file to be counted using the C Header rules. You can have multiple remap rules if required:

```bash
scc --remap-unknown "-*- C++ -*-":"C Header","other":"Java"
```

There is also the `--remap-all` parameter, which will remap all files.

Note that in all cases, if the remap rule does not apply, normal `#!` rules will apply.

### Output Formats

By default `scc` will output to the console.
However, you can produce output in other formats if you require.

The different options are `tabular, wide, json, csv, csv-stream, cloc-yaml, html, html-table, sql, sql-insert, openmetrics`.

Note that you can write `scc` output to disk using the `-o, --output` option. This allows you to specify a file to write your output to. For example, `scc -f html -o output.html` will run `scc` against the current directory and output the results in HTML to the file `output.html`.

You can also write to multiple output files, or multiple types to stdout, using the `--format-multi` option. This is most useful when working in CI/CD systems where you want HTML reports as an artifact while also displaying the counts in stdout.

```bash
scc --format-multi "tabular:stdout,html:output.html,csv:output.csv"
```

The above will run against the current directory, writing the default output to standard output, as well as writing output.html and output.csv in the appropriate formats.

#### Tabular

This is the default output format when `scc` is run.

#### Wide

Wide produces some additional information, namely the complexity/lines metric. This can be useful when trying to identify the most complex file inside a project based on the complexity estimate.

#### JSON

JSON produces JSON output, mostly designed to allow `scc` to feed into other programs.

Note that this format will give you the byte size of every file `scc` reads, allowing you to get a breakdown of the number of bytes processed.

#### CSV

CSV as an option is good for importing into a spreadsheet for analysis.

Note that this format will give you the byte size of every file `scc` reads, allowing you to get a breakdown of the number of bytes processed.
Also note that CSV respects `--by-file` and as such will return a summary by default.

#### CSV-Stream

csv-stream is an option useful for processing very large repositories where you are likely to run into memory issues. Its output format is 100% the same as CSV.

Note that you should not use this with the `--format-multi` option, as it will always print to standard output, and because of how it works doing so would negate the memory savings that this option provides. Note that there is no sort applied with this option.

#### cloc-yaml

A drop-in replacement for cloc using its yaml output option. This is quite often used for passing into other build systems and can help with replacing cloc if required.

```text
$ scc -f cloc-yml processor
# https://github.com/boyter/scc/
header:
  url: https://github.com/boyter/scc/
  version: 2.11.0
  elapsed_seconds: 0.008
  n_files: 21
  n_lines: 6562
  files_per_second: 2625
  lines_per_second: 820250
Go:
  name: Go
  code: 5186
  comment: 273
  blank: 1103
  nFiles: 21
SUM:
  code: 5186
  comment: 273
  blank: 1103
  nFiles: 21

$ cloc --yaml processor
      21 text files.
      21 unique files.
       0 files ignored.

---
# http://cloc.sourceforge.net
header :
  cloc_url           : http://cloc.sourceforge.net
  cloc_version       : 1.60
  elapsed_seconds    : 0.196972846984863
  n_files            : 21
  n_lines            : 6562
  files_per_second   : 106.613679608407
  lines_per_second   : 33314.2364566841
Go:
  nFiles: 21
  blank: 1137
  comment: 606
  code: 4819
SUM:
  blank: 1137
  code: 4819
  comment: 606
  nFiles: 21
```

#### HTML and HTML-TABLE

The HTML output options produce a minimal HTML report using a table, either standalone (`html`) or as just a table (`html-table`) which can be injected into your own HTML pages.
The only difference between the two is that the `html` option includes html head and body tags with minimal styling.

The markup is designed to allow your own custom styles to be applied. An example report [is here to view](SCC-OUTPUT-REPORT.html).

Note that the HTML options follow the command line options, so you can use `scc --by-file -f html` to produce a report with every file and not just the summary.

Note that with the `--by-file` option this format will give you the byte size of every file `scc` reads, allowing you to get a breakdown of the number of bytes processed.

#### SQL and SQL-Insert

The SQL output format is "mostly" compatible with cloc's SQL output format <https://github.com/AlDanial/cloc#sql->

While all queries on the cloc documentation should work as expected, you will not be able to append output from `scc` and `cloc` into the same database. This is because the table format is slightly different, to account for scc including complexity counts and bytes.

The difference between `sql` and `sql-insert` is that `sql` will include table creation while the latter will only have the insert commands.

Usage is 100% the same as any other `scc` command, but sql output will always contain per-file details. You can compute totals yourself using SQL; COCOMO calculations will appear against the metadata table as the columns `estimated_cost`, `estimated_schedule_months` and `estimated_people`.

The below will run scc against the current directory, name the output as the project scc, and then pipe the output to sqlite to put into the database code.db:

```bash
scc --format sql --sql-project scc . | sqlite3 code.db
```

Assuming you then wanted to append another project:

```bash
scc --format sql-insert --sql-project redis . | sqlite3 code.db
```

You could then run SQL against the database:

```bash
sqlite3 code.db 'select project,file,max(nCode) as nL from t
                 group by project order by nL desc;'
```

See the cloc documentation for more examples.

#### OpenMetrics

[OpenMetrics](https://openmetrics.io/) is a metric reporting format specification extending the Prometheus exposition text format.

The produced output is natively supported by [Prometheus](https://prometheus.io/) and [GitLab CI](https://docs.gitlab.com/ee/ci/testing/metrics_reports.html)

Note that OpenMetrics respects `--by-file` and as such will return a summary by default.

The output includes a metadata header containing definitions of the returned metrics:

```text
# TYPE scc_files count
# HELP scc_files Number of sourcecode files.
# TYPE scc_lines count
# UNIT scc_lines lines
# HELP scc_lines Number of lines.
# TYPE scc_code count
# HELP scc_code Number of lines of actual code.
# TYPE scc_comments count
# HELP scc_comments Number of comments.
# TYPE scc_blanks count
# HELP scc_blanks Number of blank lines.
# TYPE scc_complexity count
# HELP scc_complexity Code complexity.
# TYPE scc_bytes count
# UNIT scc_bytes bytes
# HELP scc_bytes Size in bytes.
```

The header is followed by the metric data in either language summary form:

```text
scc_files{language="Go"} 1
scc_lines{language="Go"} 1000
scc_code{language="Go"} 1000
scc_comments{language="Go"} 1000
scc_blanks{language="Go"} 1000
scc_complexity{language="Go"} 1000
scc_bytes{language="Go"} 1000
```

or, if `--by-file` is present, in per file form:

```text
scc_lines{language="Go",file="./bbbb.go"} 1000
scc_code{language="Go",file="./bbbb.go"} 1000
scc_comments{language="Go",file="./bbbb.go"} 1000
scc_blanks{language="Go",file="./bbbb.go"} 1000
scc_complexity{language="Go",file="./bbbb.go"} 1000
scc_bytes{language="Go",file="./bbbb.go"} 1000
```

### Performance

Generally `scc` will be the fastest code counter of any I am aware of and have compared against. The below comparisons are taken against the fastest alternative counters. See `Other similar projects` above for all of the other code counters compared against. It is designed to scale to as many CPU cores as you can provide.

However, if you want greater performance and you have RAM to spare, you can disable the garbage collector like the following on Linux: `GOGC=-1 scc .`, which should speed things up considerably. For some repositories turning off the code complexity calculation via `-c` can reduce runtime as well.

Benchmarks are run on a fresh 32 Core CPU Optimised Vultr Ocean Virtual Machine 2026/03/05, all done using [hyperfine](https://github.com/sharkdp/hyperfine).

See <https://github.com/boyter/scc/blob/master/benchmark.sh> to see how the benchmarks are run.

#### Valkey <https://github.com/valkey-io/valkey>

```shell
Benchmark 1: scc valkey
  Time (mean ± σ):      27.7 ms ±   2.1 ms    [User: 175.7 ms, System: 87.0 ms]
  Range (min … max):    23.1 ms …  32.1 ms    96 runs

Benchmark 2: scc -c valkey
  Time (mean ± σ):      23.0 ms ±   1.5 ms    [User: 131.7 ms, System: 84.0 ms]
  Range (min … max):    19.5 ms …  31.4 ms    130 runs

Benchmark 3: tokei valkey
  Time (mean ± σ):      74.0 ms ±  13.0 ms    [User: 394.2 ms, System: 245.1 ms]
  Range (min … max):    49.1 ms …  92.5 ms    37 runs

Benchmark 4: polyglot valkey
  Time (mean ± σ):      41.1 ms ±   1.2 ms    [User: 54.2 ms, System: 103.3 ms]
  Range (min … max):    37.5 ms …  47.0 ms    69 runs

Summary
  scc -c valkey ran
    1.20 ± 0.12 times faster than scc valkey
    1.78 ± 0.13 times faster than polyglot valkey
    3.21 ± 0.61 times faster than tokei valkey
```

#### CPython <https://github.com/python/cpython>

```shell
Benchmark 1: scc cpython
  Time (mean ± σ):      80.8 ms ±   2.6 ms    [User: 751.1 ms, System: 265.6 ms]
  Range (min … max):    75.7 ms …  87.4 ms    36 runs

Benchmark 2: scc -c cpython
  Time (mean ± σ):      70.5 ms ±   2.4 ms    [User: 592.6 ms, System: 254.7 ms]
  Range (min … max):    66.2 ms …  77.6 ms    40 runs

Benchmark 3: tokei cpython
  Time (mean ± σ):     450.2 ms ±  36.1 ms    [User: 1822.0 ms, System: 1246.9 ms]
  Range (min … max):   378.6 ms … 491.2 ms    10 runs

Benchmark 4: polyglot cpython
  Time (mean ± σ):     149.9 ms ±   5.8 ms    [User: 199.2 ms, System: 326.2 ms]
  Range (min … max):   138.3 ms … 164.1 ms    19 runs

Summary
  scc -c cpython ran
    1.15 ± 0.05 times faster than scc cpython
    2.13 ± 0.11 times faster than polyglot cpython
    6.39 ± 0.56 times faster than tokei cpython
```

#### Linux Kernel <https://github.com/torvalds/linux>

```shell
Benchmark 1: scc linux
  Time (mean ± σ):     907.2 ms ±  17.1 ms    [User: 13764.7 ms, System: 2957.0 ms]
  Range (min … max):   878.2 ms … 925.0 ms    10 runs

Benchmark 2: scc -c linux
  Time (mean ± σ):     842.5 ms ±  17.2 ms    [User: 9363.3 ms, System: 2977.0 ms]
  Range (min … max):   819.4 ms … 874.0 ms    10 runs

Benchmark 3: tokei linux
  Time (mean ± σ):      1.422 s ±  0.089 s    [User: 13.292 s, System: 9.582 s]
  Range (min … max):    1.176 s …  1.471 s    10 runs

Benchmark 4: polyglot linux
  Time (mean ± σ):      1.862 s ±  0.046 s    [User: 3.802 s, System: 3.543 s]
  Range (min … max):    1.800 s …  1.935 s    10 runs

Summary
  scc -c linux ran
    1.08 ± 0.03 times faster than scc linux
    1.69 ± 0.11 times faster than tokei linux
    2.21 ± 0.07 times faster than polyglot linux
```

#### Sourcegraph <https://github.com/SINTEF/sourcegraph.git>

Sourcegraph has gone dark since I last ran these benchmarks, hence a clone taken before this occurred is used.
The reason for keeping it is to track what appears to be a performance regression in tokei.

```shell
Benchmark 1: scc sourcegraph
  Time (mean ± σ):     108.2 ms ±   3.5 ms    [User: 559.4 ms, System: 323.6 ms]
  Range (min … max):   100.5 ms … 115.9 ms    26 runs

Benchmark 2: scc -c sourcegraph
  Time (mean ± σ):      99.7 ms ±   4.2 ms    [User: 503.1 ms, System: 316.8 ms]
  Range (min … max):    91.4 ms … 109.4 ms    29 runs

Benchmark 3: tokei sourcegraph
  Time (mean ± σ):     21.359 s ±  1.025 s    [User: 57.252 s, System: 411.480 s]
  Range (min … max):   19.371 s … 22.741 s    10 runs

Benchmark 4: polyglot sourcegraph
  Time (mean ± σ):     135.1 ms ±   5.0 ms    [User: 198.6 ms, System: 543.7 ms]
  Range (min … max):   126.0 ms … 144.8 ms    21 runs

Summary
  scc -c sourcegraph ran
    1.08 ± 0.06 times faster than scc sourcegraph
    1.36 ± 0.08 times faster than polyglot sourcegraph
    214.26 ± 13.64 times faster than tokei sourcegraph
```

If you enable duplicate detection, expect performance to fall by about 20% in `scc`.

Performance is tracked for some releases and presented below.

The decrease in performance from the 3.3.0 release was due to accurate .gitignore, .ignore and .gitmodule support.
Current work is focussed on resolving this.

### CI/CD Support

Some CI/CD systems which will remain nameless do not work very well with the box-lines used by `scc`.
To support those systems better there is an option `--ci` which will change the default output to ASCII only.

```text
$ scc --ci main.go
-------------------------------------------------------------------------------
Language                 Files     Lines   Blanks  Comments     Code Complexity
-------------------------------------------------------------------------------
Go                           1       272        7         6      259          4
-------------------------------------------------------------------------------
Total                        1       272        7         6      259          4
-------------------------------------------------------------------------------
Estimated Cost to Develop $6,539
Estimated Schedule Effort 2.268839 months
Estimated People Required 0.341437
-------------------------------------------------------------------------------
Processed 5674 bytes, 0.006 megabytes (SI)
-------------------------------------------------------------------------------
```

The `--format-multi` option is especially useful in CI/CD where you want to get multiple output formats useful for storage or reporting.

### Development

If you want to hack away feel free! PRs are accepted. Some things to keep in mind: if you want to change a language definition you need to update `languages.json` and then run `go generate`, which will convert it into the `processor/constants.go` file.

For all other changes ensure you run all tests before submitting. You can do so using `go test ./...`. However, for maximum coverage please run `test-all.sh`, which will run `gofmt`, unit tests, the race detector, and then all of the integration tests. All of those must pass to ensure a stable release.

### API Support

The core part of `scc`, which is the counting engine, is exposed publicly to be integrated into other Go applications.
See <https://github.com/pinpt/ripsrc> for an example of how to do this.

It also powers all of the code calculations displayed in <https://searchcode.com/> such as <https://searchcode.com/file/169350674/main.go/>, making it one of the more used code counters in the world.

However, as a quick start consider the following.

Note that you must pass in the number of bytes in the content in order to ensure it is counted!

```go
package main

import (
	"fmt"
	"os"

	"github.com/boyter/scc/v3/processor"
)

type statsProcessor struct{}

func (p *statsProcessor) ProcessLine(job *processor.FileJob, currentLine int64, lineType processor.LineType) bool {
	switch lineType {
	case processor.LINE_BLANK:
		fmt.Println(currentLine, "lineType", "BLANK")
	case processor.LINE_CODE:
		fmt.Println(currentLine, "lineType", "CODE")
	case processor.LINE_COMMENT:
		fmt.Println(currentLine, "lineType", "COMMENT")
	}
	return true
}

func main() {
	bts, _ := os.ReadFile("somefile.go")
	t := &statsProcessor{}
	filejob := &processor.FileJob{
		Filename: "test.go",
		Language: "Go",
		Content:  bts,
		Callback: t,
		Bytes:    int64(len(bts)),
	}
	processor.ProcessConstants() // Required to load the language information; need only be done once
	processor.CountStats(filejob)
}
```

#### Per-Byte Content Classification

For library consumers who need finer granularity than per-line classification, `scc` supports opt-in per-byte content classification. When enabled, `CountStats` populates a byte slice classifying every byte in the file as code, comment, string, or blank. This is useful for stripping comments from source files, extracting only comments, or building syntax-aware tools without reimplementing language parsing.

To enable it, set `ClassifyContent: true` on the `FileJob` before calling `CountStats`.
When disabled (the default), there is zero performance impact.

```go
package main

import (
	"fmt"
	"os"

	"github.com/boyter/scc/v3/processor"
)

func main() {
	processor.ProcessConstants()

	bts, _ := os.ReadFile("main.go")
	filejob := &processor.FileJob{
		Filename:        "main.go",
		Language:        "Go",
		Content:         bts,
		Bytes:           int64(len(bts)),
		ClassifyContent: true, // Enable per-byte classification
	}
	processor.CountStats(filejob)

	// ContentByteType has one entry per byte with values:
	//   processor.ByteTypeBlank   (0) - blank lines / leading whitespace
	//   processor.ByteTypeCode    (1) - code
	//   processor.ByteTypeComment (2) - comments (including docstrings)
	//   processor.ByteTypeString  (3) - string literals

	// Example: extract only code, replacing everything else with spaces
	codeOnly := filejob.FilterContentByType(processor.ByteTypeCode)
	fmt.Println(string(codeOnly))

	// Example: extract only comments
	commentsOnly := filejob.FilterContentByType(processor.ByteTypeComment)
	fmt.Println(string(commentsOnly))

	// Example: keep both code and strings, strip comments
	noComments := filejob.FilterContentByType(processor.ByteTypeCode, processor.ByteTypeString)
	fmt.Println(string(noComments))
}
```

`FilterContentByType` returns a copy of the content with non-matching bytes replaced by spaces. Newlines are always preserved regardless of type, so the output maintains the same line structure as the original file. It returns `nil` if classification was not enabled.

Note that at syntax marker boundaries (e.g., `//`, `/*`, `"`), the first byte of the marker may be classified as the preceding state.
This is a 1-byte approximation that is acceptable for content filtering use cases.

### Adding/Modifying Languages

To add or modify a language you will need to edit the `languages.json` file in the root of the project, and then run `go generate` to build it into the application. You can then `go install` or `go build` as normal to produce the binary with your modifications.

### Issues

It's possible that you may see the counts vary between runs. This usually means one of two things: either something is changing or locking the files under `scc`, or you are hitting ulimit restrictions. To change the ulimit see the following links.

- <https://superuser.com/questions/261023/how-to-change-default-ulimit-values-in-mac-os-x-10-6#306555>
- <https://unix.stackexchange.com/questions/108174/how-to-persistently-control-maximum-system-resource-consumption-on-mac/221988#221988>
- <https://access.redhat.com/solutions/61334>
- <https://serverfault.com/questions/356962/where-are-the-default-ulimit-values-set-linux-centos>
- <https://www.tecmint.com/increase-set-open-file-limits-in-linux/>

To help identify this issue run `scc` like so `scc -v .` and look for the message `too many open files` in the output.
If it is there you can rectify it by setting your ulimit to a higher value.

### Low Memory

If you are running `scc` in a low memory environment (< 512 MB of RAM) you may need to set `--file-gc-count` to a lower value such as `0` to force the garbage collector to be on at all times.

A sign that this is required will be `scc` crashing with panic errors.

### Tests

scc is pretty well tested, with many unit and integration tests and benchmarks to ensure that it is fast and complete.

### Package

Packaging as of version v3.1.0 is done through <https://goreleaser.com/>

### Containers

Note if you plan to run `scc` in Alpine containers you will need to build with CGO_ENABLED=0.

See the below Dockerfile as an example of how to achieve this, based on this issue <https://github.com/boyter/scc/issues/208>

```Dockerfile
FROM golang as scc-get

ENV GOOS=linux \
    GOARCH=amd64 \
    CGO_ENABLED=0

ARG VERSION
RUN git clone --branch $VERSION --depth 1 https://github.com/boyter/scc
WORKDIR /go/scc
RUN go build -ldflags="-s -w"

FROM alpine
COPY --from=scc-get /go/scc/scc /bin/
ENTRYPOINT ["scc"]
```

### Badges

You can use `scc` to provide badges on your github/bitbucket/gitlab/sr.ht open repositories. For example, [](https://github.com/boyter/scc/)
The format to do so is,

<https://sloc.xyz/PROVIDER/USER/REPO>

An example of the badge for `scc` is included below, and is used on this page.

```Markdown
[](https://github.com/boyter/scc/)
```

By default the badge will show the repo's lines count.
You can also specify for it to show a different category using the `?category=` query string.

Valid values include `code, blanks, lines, comments, cocomo, effort` and examples of the appearance are included below.

[](https://github.com/boyter/scc/)
[](https://github.com/boyter/scc/)
[](https://github.com/boyter/scc/)
[](https://github.com/boyter/scc/)
[](https://github.com/boyter/scc/)
[](https://github.com/boyter/scc/)

For `cocomo` you can also set the `avg-wage` value similar to `scc` itself. For example,

<https://sloc.xyz/github/boyter/scc/?category=cocomo&avg-wage=1>
<https://sloc.xyz/github/boyter/scc/?category=cocomo&avg-wage=100000>

Note that the avg-wage value must be a positive integer, otherwise it will revert back to the default value of 56286.

You can also configure the look and feel of the badge using the following parameters,

- ?lower=true will lowercase the title text, so "Total lines" would become "total lines"

The below control the colours of shadows, fonts and badges. Colors can be specified as either hex codes or named colors (similar to shields.io):

- ?font-color=fff
- ?font-shadow-color=010101
- ?top-shadow-accent-color=bbb
- ?title-bg-color=555
- ?badge-bg-color=4c1

##### Named Colors

For convenience, you can use named colors instead of hex codes.
The following named colors are supported:

**Shields.io colors:** `brightgreen`, `green`, `yellowgreen`, `yellow`, `orange`, `red`, `blue`, `lightgrey`, `blueviolet`

**Semantic aliases:** `success`, `important`, `critical`, `informational`, `inactive`

**CSS colors:** `white`, `black`, `silver`, `gray`, `maroon`, `purple`, `fuchsia`, `lime`, `olive`, `navy`, `teal`, `aqua`, `cyan`, `magenta`, `pink`, `coral`, `salmon`, `gold`, `khaki`, `violet`, `indigo`, `crimson`, `turquoise`, `tan`, `brown`, and many more standard CSS color names.

For example, instead of `?badge-bg-color=007ec6` you can use `?badge-bg-color=blue`.

An example of using some of these parameters to produce an admittedly ugly result,

[](https://github.com/boyter/scc/)

An example using named colors for a slightly nicer result,

[](https://github.com/boyter/scc/)

*NB* it may not work for VERY large repositories (it has been tested on Apache hadoop/spark without issue).

You can find the source code for the badges in the repository at <https://github.com/boyter/scc/blob/master/cmd/badges/main.go>

#### An example for each supported provider

- Github - <https://sloc.xyz/github/boyter/scc/>
- sr.ht - <https://sloc.xyz/sr.ht/~nektro/magnolia-desktop/>
- Bitbucket - <https://sloc.xyz/bitbucket/boyter/decodingcaptchas>
- Gitlab - <https://sloc.xyz/gitlab/esr/loccount>

### Languages

A list of supported languages. The master version of `scc` supports 322 languages at last count. Note that this assumes you built from master, and the count may trail behind what is actually supported.
To see what your version of `scc` supports run `scc --languages`.

[Click here to view all languages supported by master](LANGUAGES.md)

### Citation

Please use the following bibtex entry to cite scc in a publication:

<pre>
@software{scc,
  author = {Ben Boyter},
  title = {scc: v3.5.0},
  month = ...,
  year = ...,
  publisher = {...},
  version = {v3.5.0},
  doi = {...},
  url = {...}
}
</pre>

You may need to check the release page <https://github.com/boyter/scc/releases> to find the correct year and month for the release you are using.

### Release Checklist

- Update version
- Push code with release number
- Tag off
- Release via goreleaser
- Update dockerfile