comm — Three-column set comparison of two sorted files across all 5 shells
Equivalents in every shell
- bash: `comm file1.txt file2.txt`. Requires BOTH inputs to be sorted (the algorithm is a single-pass merge). Output is three tab-separated columns: col1 = only in file1, col2 = only in file2, col3 = in both. Suppress columns with `-1`, `-2`, `-3` (e.g. `comm -12 a b` = lines in both).
- zsh: `comm file1.txt file2.txt`. Same external `/usr/bin/comm` binary as bash. Zsh process substitution helps: `comm <(sort a.txt) <(sort b.txt)` saves the explicit pre-sort step.
- fish: `comm file1.txt file2.txt`. Same binary. Fish has no `<(...)` process-substitution syntax; pipe into the `psub` command instead: `comm (sort a.txt | psub) (sort b.txt | psub)`.
- pwsh: `Compare-Object (Get-Content file1.txt) (Get-Content file2.txt)`. `Compare-Object` is the conceptual analog but DIFFERENT: it does an object diff with a `SideIndicator` column (`<=` only-in-reference, `=>` only-in-difference) and does NOT require sorted input. For three-column comm-style output, use `-IncludeEqual` and group on `SideIndicator`. The pure pwsh-native equivalent for "lines in both" is `(Get-Content a) | Where-Object { $_ -in (Get-Content b) }`, which is O(n²) but readable.
- cmd: `fc /b file1.txt file2.txt`. No native cmd `comm`. `fc /b` does a BINARY compare (byte-level diff with offsets); `fc /L` does a line compare but with diff-style output (no column suppression). For true comm-style set comparison, shell out to pwsh: `powershell -Command "Compare-Object (gc a) (gc b)"`.
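To make the three-column layout and the suppression flags concrete, here is a minimal sketch in POSIX sh. The sample file names and contents are made up for illustration, and temporary files are used instead of process substitution so the snippet also runs in shells without `<(...)`.

```shell
#!/bin/sh
# Illustrative sample data; file names and contents are made up for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/a.txt"   # already in byte order
printf '%s\n' banana cherry date  > "$tmpdir/b.txt"   # already in byte order

# Run comm in the C locale so its idea of "sorted" matches the byte-ordered input.
three_col=$(LC_ALL=C comm "$tmpdir/a.txt" "$tmpdir/b.txt")    # all three columns
both=$(LC_ALL=C comm -12 "$tmpdir/a.txt" "$tmpdir/b.txt")     # hide cols 1+2: intersection
only_a=$(LC_ALL=C comm -23 "$tmpdir/a.txt" "$tmpdir/b.txt")   # hide cols 2+3: a minus b
# -3 keeps columns 1 and 2; column 2 is tab-indented, so drop the tabs to flatten.
# (Only safe when the input lines themselves contain no tabs.)
sym_diff=$(LC_ALL=C comm -3 "$tmpdir/a.txt" "$tmpdir/b.txt" | tr -d '\t')

printf '%s\n' "$three_col"
rm -rf "$tmpdir"
```

With this sample data, the unsuppressed output shows `apple` with no indent (column 1), `date` behind one tab (column 2), and `banana` and `cherry` behind two tabs (column 3).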
Worked examples
List lines present in BOTH files (intersection)
- bash: `comm -12 <(sort a.txt) <(sort b.txt)`
- zsh: `comm -12 <(sort a.txt) <(sort b.txt)`
- fish: `comm -12 (sort a.txt | psub) (sort b.txt | psub)`
- pwsh: `Compare-Object (Get-Content a.txt) (Get-Content b.txt) -IncludeEqual | Where-Object SideIndicator -eq "==" | Select-Object -Expand InputObject`
List lines only in file1 (difference a − b)
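- bash/zsh: `comm -23 <(sort a.txt) <(sort b.txt)`
- fish: `comm -23 (sort a.txt | psub) (sort b.txt | psub)`
- pwsh: `Compare-Object (gc a.txt) (gc b.txt) | Where-Object SideIndicator -eq "<=" | Select-Object -Expand InputObject`
- cmd: `powershell -Command "Compare-Object (gc a.txt) (gc b.txt) | ? SideIndicator -eq '<=' | ForEach-Object { $_.InputObject }"` (single quotes around `<=` keep the whole command inside cmd's double quotes, so cmd does not treat `<` as a redirection; `ForEach-Object` is spelled out to avoid cmd's `%` handling)
Symmetric difference (lines in either but not both)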
- bash/zsh: `comm -3 <(sort a.txt) <(sort b.txt)`
- fish: `comm -3 (sort a.txt | psub) (sort b.txt | psub)`
- pwsh: `Compare-Object (gc a.txt) (gc b.txt) | Select-Object -Expand InputObject`
Gotchas
- BOTH inputs MUST be sorted with the same collation. `comm` walks them in lockstep and breaks the moment ordering diverges — GNU comm 8.30+ at least PRINTS `comm: file 1 is not in sorted order` on stderr, older versions silently produce wrong output. Always pre-sort with `LC_ALL=C sort` for byte-stable ordering.
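- Sort COLLATION matters: under a typical UTF-8 locale such as `en_US.UTF-8`, `sort` places `A` and `a` next to each other, so mixed-case files sort roughly case-insensitively, whereas `LC_ALL=C` puts every uppercase ASCII letter before every lowercase one. Pre-sort BOTH files with the SAME collation: `LC_ALL=C sort` for byte order (recommended for code/IDs), and run `comm` under that same locale. GNU `comm` also accepts `--check-order` to catch divergence early.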
- Column suppression flags are NEGATIVES — `-1` HIDES col 1 (only-in-file1). To show ONLY lines in both, you hide cols 1 and 2: `comm -12`. Easy to flip backwards. Mnemonic: "`-12` shows column 3 (both)", "`-3` shows columns 1+2 (only-in-X)".
- `Compare-Object` is NOT comm-equivalent in two key ways: (1) it does NOT require sorted input, because it holds both collections in memory instead of streaming them, so it needs O(n+m) space where comm's sequential merge needs O(1); large inputs can exhaust memory in pwsh; (2) output is OBJECTS with `InputObject` + `SideIndicator`, not text columns, so downstream consumers must unwrap `.InputObject`.
- `comm` output is TAB-separated (col 1, then tab, col 2, then tab, col 3). When the inputs contain tabs themselves, the output becomes ambiguous — there's no quoting. Workaround: use `--output-delimiter=,` (GNU coreutils 8.6+) to switch to commas, or pre-process inputs to strip tabs.
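The sorting and collation gotchas above can be sidestepped by always sorting and comparing under one locale. The sketch below is one way to do that in POSIX sh; `comm_sorted` is a hypothetical helper name introduced here, not a standard tool, and the sample data is made up.

```shell
#!/bin/sh
# Hypothetical helper: sort both inputs in byte order, then compare in the same locale.
# Usage: comm_sorted FLAGS FILE1 FILE2   (FLAGS is e.g. -12 or -23)
comm_sorted() {
    _a=$(mktemp) && _b=$(mktemp) || return 1
    LC_ALL=C sort "$2" > "$_a"
    LC_ALL=C sort "$3" > "$_b"
    LC_ALL=C comm "$1" "$_a" "$_b"
    _status=$?
    rm -f "$_a" "$_b"
    return "$_status"
}

# Illustrative mixed-case, unsorted inputs.
tmpdir=$(mktemp -d)
printf '%s\n' banana Apple cherry > "$tmpdir/a.txt"
printf '%s\n' apple Banana cherry > "$tmpdir/b.txt"

common=$(comm_sorted -12 "$tmpdir/a.txt" "$tmpdir/b.txt")      # lines in both
only_in_a=$(comm_sorted -23 "$tmpdir/a.txt" "$tmpdir/b.txt")   # lines only in a.txt
printf 'common: %s\n' "$common"
rm -rf "$tmpdir"
```

Because the comparison is byte-wise, `Apple` and `apple` count as different lines, so only `cherry` is reported as common here. If case-insensitive matching is wanted, the inputs would need to be normalized (for example lowercased) before sorting.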