split — Split a file into smaller chunks by lines, bytes, or chunk count across all 5 shells
Equivalents in every shell
- bash: `split -l 1000 huge.txt chunk_`
  Splits `huge.txt` into 1000-line files named `chunk_aa`, `chunk_ab`, …. Other modes: `-b 10M` (byte size), `-n 10` (10 equal-size chunks). The default suffix is 2 lowercase letters, giving 26² = 676 names; `-a 4` raises that to 4 characters, or 26⁴ = 456,976 names.
- zsh: `split -l 1000 huge.txt chunk_`
  Same external binary. Zsh process substitution can feed `split` from a transformed stream: `split -l 1000 <(zcat huge.gz) chunk_`.
- fish: `split -l 1000 huge.txt chunk_`
  Same external binary. `string split` is a fish builtin, but it splits strings by a separator into multiple values. That is a different operation, meant for in-shell list manipulation.
- pwsh: `$lines = Get-Content huge.txt; for ($i=0; $i -lt $lines.Count; $i += 1000) { $lines[$i..($i+999)] | Set-Content "chunk_$([Math]::Floor($i/1000)).txt" }`
  No native pwsh `split`. The idiom slices the line array in a `for` loop. For byte-size chunks, switch to `[System.IO.File]::ReadAllBytes()` plus chunked writes. `Get-Content huge.txt` materialises the whole file in memory; for files larger than available RAM, read line by line with .NET's `StreamReader`.
- cmd: `powershell -Command "$l=gc huge.txt;for($i=0;$i -lt $l.Count;$i+=1000){$l[$i..($i+999)]|sc \"chunk_$([Math]::Floor($i/1000)).txt\"}"`
  No native cmd `split`. Shell out to PowerShell. Older Windows machines without PowerShell need a `for /F` loop with a counter, which is much slower and more error-prone.
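The zsh note uses process substitution to feed `split` from a transformed stream. In shells without that syntax, a plain pipe can do the same job, because `split` reads standard input when the file operand is `-`. The following is a minimal sketch under assumptions: the sample data comes from `seq`, and the file names (`huge.gz`, `chunk_`) are illustrative only.

```shell
#!/bin/sh
# Sketch: split a decompressed stream without process substitution.
set -eu

# Work in a throwaway directory so the demo leaves no files behind in $PWD.
workdir=$(mktemp -d)

(
  cd "$workdir"

  # Sample input (assumption): 2500 numbered lines, gzip-compressed.
  seq 1 2500 | gzip -c > huge.gz

  # "-" tells split to read standard input instead of a named file.
  gzip -dc huge.gz | split -l 1000 - chunk_

  # Show the line count of each chunk (chunk_aa, chunk_ab, chunk_ac).
  wc -l chunk_*
)
```

With 2500 input lines and `-l 1000`, this should leave three chunks, the last one holding the remaining 500 lines.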
Worked examples
Split a CSV by line count (1M lines each)
- bash / zsh / fish: `split -l 1000000 huge.csv part_ --additional-suffix=.csv`
- pwsh: `$lines = Get-Content huge.csv; for ($i=0; $i -lt $lines.Count; $i += 1000000) { $lines[$i..($i+999999)] | Set-Content "part_$([Math]::Floor($i/1000000)).csv" }`
Split a binary by size (100MB chunks)
- bash / zsh / fish: `split -b 100M backup.bin chunk_`
- pwsh: `$bytes = [System.IO.File]::ReadAllBytes("backup.bin"); $chunk = 100MB; for ($i=0; $i -lt $bytes.Count; $i += $chunk) { [System.IO.File]::WriteAllBytes("chunk_$([Math]::Floor($i/$chunk))", $bytes[$i..($i+$chunk-1)]) }`
Split into exactly N equal chunks
- bash / zsh / fish: `split -n 10 huge.txt chunk_`
Gotchas
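- The default suffix is 2 lowercase letters (`aa`–`zz`), which allows 676 files. Splitting a 10GB file into 10MB chunks needs about 1,000 files, so a `split` that cannot widen the suffix stops with an error such as `split: output file suffixes exhausted`. Recent GNU `split` widens the suffix automatically when `-a` is not given, but other or older implementations may not. Set `-a 4` (4-character suffix, 456,976 files) up front for large workloads, or use `-d` for numeric suffixes (`-d -a 6` gives 6 digits, 1,000,000 files).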
- macOS ships BSD `split`, which predates GNU coreutils. `split -n` (chunk count) began as a GNU extension, and older macOS releases reject it. The `-b` size flag works on both, but GNU accepts uppercase `K`/`M`/`G` suffixes, while older BSD builds expect lowercase (`k`, `m`) and may not accept a gigabyte suffix at all. For cross-OS scripts, install GNU coreutils with `brew install coreutils` and call `gsplit`.
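- `-n l/N` (lowercase L, slash, N) splits into N chunks without breaking any line; plain `-n N` cuts at byte offsets and can split a line in two. Easy to misread: `split -n 10 file` yields 10 chunks of equal byte size, while `split -n l/10 file` yields 10 chunks of roughly equal byte size that each end on a line boundary. Neither mode gives equal line counts per chunk. Both produce N output files, but with different boundary semantics.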
- pwsh `Get-Content huge.txt` loads the whole file into memory before the loop runs. For files larger than available RAM, use `[System.IO.File]::OpenText(...)` and call `ReadLine()` in a `while` loop. The .NET streaming approach scales to TB-class files; the array-load approach runs out of memory.
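- Reassembly requires concatenating the chunks in their original order. `cat chunk_*` works because the glob expands in lexicographic order, which matches creation order as long as every suffix has the same width. With mixed widths, `chunk_99` and `chunk_100` sort as `100, 99`; the unpadded `chunk_0`, `chunk_1`, … names produced by the PowerShell loops above hit this once there are more than 10 chunks. Use a fixed-width suffix (`-a 4`, with or without `-d`) for guaranteed reassembly.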
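The reassembly gotcha can be checked with a quick round trip: split with a fixed-width suffix, concatenate the chunks in glob order, and compare the result with the original. This is a sketch rather than a definitive recipe; the `seq`-generated input, the 300-line chunk size, and the file names are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: split with a fixed-width suffix, reassemble, and verify.
set -eu

# Work in a throwaway directory so the demo leaves no files behind in $PWD.
workdir=$(mktemp -d)

(
  cd "$workdir"

  # Sample input (assumption): 5000 numbered lines.
  seq 1 5000 > huge.txt

  # -a 4 fixes the suffix width, so sorted glob order equals creation order.
  split -l 300 -a 4 huge.txt chunk_

  # The shell expands chunk_* in sorted order; concatenate in that order.
  cat chunk_* > rebuilt.txt

  # cmp exits non-zero if the files differ, which aborts the script under set -e.
  cmp huge.txt rebuilt.txt
  echo "round trip OK"
)
```

For real data, comparing checksums of the original and the rebuilt file (for example with `sha256sum`, where available) serves the same purpose without keeping both copies side by side.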