shellmap

split
Split a file into smaller chunks by lines, bytes, or chunk count across all 5 shells

Equivalents in every shell

Bash (unix)
split -l 1000 huge.txt chunk_

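Splits `huge.txt` into 1000-line files named `chunk_aa`, `chunk_ab`, …. Other modes: `-b 10M` (byte size) and, on GNU, `-n 10` (10 equal-size chunks). The default suffix is 2 lowercase letters, which gives 26² = 676 files; `-a 4` raises that to 26⁴ = 456,976 files. Recent GNU coreutils widen the suffix automatically when `-a` is not given, but POSIX only guarantees the fixed-width behaviour, so portable scripts should set `-a` explicitly.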
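The suffix options are easier to compare side by side. The following is a minimal sketch, assuming a GNU or recent BSD `split` (for `-d`) and a throwaway sample file generated with `seq`; the prefixes `wide_` and `num_` are made up for illustration.

```shell
# Sketch: build a 2500-line sample file and split it three ways.
set -eu
work=$(mktemp -d)          # scratch directory so nothing real is touched
cd "$work"
seq 1 2500 > huge.txt

split -l 1000 huge.txt chunk_          # default 2-letter suffix
split -l 1000 -a 4 huge.txt wide_      # 4-letter suffix
split -l 1000 -d -a 3 huge.txt num_    # 3-digit numeric suffix (-d is not POSIX)

ls                                     # shows the three naming schemes
wc -l chunk_*                          # last chunk holds the remaining 500 lines
```

Each run produces three files (two of 1000 lines and one of 500); only the suffix style differs.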

Zsh (unix)
split -l 1000 huge.txt chunk_

Same external binary. Zsh process substitution can pipe to `split` from a transformed stream: `split -l 1000 <(zcat huge.gz) chunk_`.
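Where process substitution is unavailable, the same result can be had by piping into `split` and passing `-` as the input file. This is a small sketch under the assumption that `gzip` is installed; it uses `gzip -dc` rather than `zcat` because some systems' `zcat` expects `.Z` files, and the sample archive is generated on the spot.

```shell
# Sketch: split a decompressed stream without writing the full file to disk.
set -eu
work=$(mktemp -d)
cd "$work"
seq 1 2500 | gzip > huge.gz            # sample compressed input

# "-" tells split to read standard input instead of a named file
gzip -dc huge.gz | split -l 1000 - chunk_

ls chunk_*
```

The pipe form works the same way in Bash, Zsh, and fish, since only the external `split` binary interprets the `-`.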

Fish (unix)
split -l 1000 huge.txt chunk_

Same external binary. `string split` is a fish builtin, but it splits strings on a separator into multiple values. That is in-shell list manipulation, a different operation from chunking a file.

PowerShell (windows)
$lines = Get-Content huge.txt; for ($i=0; $i -lt $lines.Count; $i += 1000) { $lines[$i..($i+999)] | Set-Content "chunk_$([Math]::Floor($i/1000)).txt" }

No native pwsh `split`. The idiom uses array slicing in a `for` loop. For byte-size chunks, switch to `[System.IO.File]::ReadAllBytes()` + chunked-write. Memory-wise, `Get-Content huge.txt` materialises the whole file — for files > available RAM, use `.NET`'s `StreamReader` line-by-line.

cmd.exe (windows)
powershell -Command "$l=gc huge.txt;for($i=0;$i -lt $l.Count;$i+=1000){$l[$i..($i+999)]|sc \"chunk_$([Math]::Floor($i/1000)).txt\"}"

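No native cmd `split`. Shell out to PowerShell: `powershell` here is Windows PowerShell, where the `sc` alias for `Set-Content` exists; PowerShell 7 (`pwsh`) dropped that alias, so spell out `Set-Content` there. Older Windows machines without PowerShell need a `for /F` loop with a counter, which is much slower and more error-prone.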

Worked examples

Split a CSV by line count (1M lines each)

Bash
split -l 1000000 huge.csv part_ --additional-suffix=.csv
Fish
split -l 1000000 huge.csv part_ --additional-suffix=.csv
PowerShell
$lines = Get-Content huge.csv; for ($i=0; $i -lt $lines.Count; $i += 1000000) { $lines[$i..($i+999999)] | Set-Content "part_$([Math]::Floor($i/1000000)).csv" }

Split a binary by size (100MB chunks)

Bash
split -b 100M backup.bin chunk_
Fish
split -b 100M backup.bin chunk_
PowerShell
$bytes = [System.IO.File]::ReadAllBytes("backup.bin"); $chunk = 100MB; for ($i=0; $i -lt $bytes.Count; $i += $chunk) { [System.IO.File]::WriteAllBytes("chunk_$([Math]::Floor($i/$chunk))", $bytes[$i..($i+$chunk-1)]) }

Split into exactly N equal chunks

Bash
split -n 10 huge.txt chunk_
Fish
split -n 10 huge.txt chunk_
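The two `-n` forms differ in where they are allowed to cut. The sketch below assumes GNU `split` (the `-n` option is not in POSIX) and uses a generated sample file; the prefixes `bytes_` and `lines_` are illustrative only.

```shell
# Sketch: compare byte-based and line-preserving N-way splits (GNU split).
set -eu
work=$(mktemp -d)
cd "$work"
seq 1 1000 > huge.txt                  # 1000 short lines

split -n 4 huge.txt bytes_             # 4 chunks cut at byte offsets, may end mid-line
split -n l/4 huge.txt lines_           # 4 chunks, each ending on a line boundary

wc -c bytes_* lines_*                  # compare chunk sizes in bytes
tail -c 20 bytes_aa                    # inspect where the byte-based cut landed
```

Both variants concatenate back to the original file; the difference only matters when each chunk is processed on its own, for example by a line-oriented parser.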

Gotchas

  • Default suffix is 2 lowercase letters (`aa`–`zz`) → maximum 676 files. Splitting a 10GB file into 10MB chunks produces 1000+ chunks, which older GNU versions reject with `split: output file suffixes exhausted`. Recent GNU coreutils auto-extend the suffix when `-a` is omitted, but other implementations may not. Use `-a 4` (4-char suffix, 456,976 max) up-front for large workloads, or `-d` for numeric suffixes (`-d -a 6` for 6-digit numeric, 1M max).
  • macOS ships BSD `split`, not GNU coreutils. `split -n` (chunk count) is a GNU extension; older macOS will reject it. The `-b` size flag works on both, but size suffixes differ: GNU accepts uppercase `K`/`M`/`G`, while older BSD versions expect lowercase `k`/`m` and may lack a gigabyte suffix. For cross-OS scripts: install GNU coreutils via `brew install coreutils` and call `gsplit`.
  • `-n l/N` (lower-case L slash N) splits into N chunks without breaking lines; it does not equalise line counts. Easy to misread: `split -n 10 file` cuts at exact byte offsets and may end a chunk mid-line; `split -n l/10 file` produces chunks of roughly equal byte size that always end on a line boundary. Both produce N output files but with different boundary semantics.
  • pwsh `Get-Content huge.txt` loads the WHOLE FILE into memory before the loop runs. For files > available RAM, use `[System.IO.File]::OpenText(...)` + `ReadLine()` in a `while` loop. The .NET streaming approach scales to TB-class files; the array-load approach OOMs.
  • After split, reassembly requires matching the chunk order. `cat chunk_*` works because `split` writes fixed-width suffixes, so lexicographic order matches creation order. That breaks when suffix widths differ, as with hand-numbered names like the PowerShell loop's `chunk_0.txt`, `chunk_1.txt`, … (`chunk_100` sorts before `chunk_99`). Use a fixed-width suffix (`-a 4`, optionally with `-d`), or zero-pad hand-rolled names, for guaranteed reassembly.
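The reassembly point is easy to check with a round trip. This is a minimal sketch, assuming GNU `split` (for `-d` and the uppercase `K` size suffix) and a sample binary created from `/dev/urandom`; `restored.bin` is a hypothetical output name.

```shell
# Sketch: split a binary with fixed-width numeric suffixes, then verify reassembly.
set -eu
work=$(mktemp -d)
cd "$work"
head -c 1048576 /dev/urandom > backup.bin      # 1 MiB of sample data

split -b 100K -d -a 4 backup.bin chunk_        # chunk_0000, chunk_0001, ...
cat chunk_* > restored.bin                     # glob expands in sorted order

cmp backup.bin restored.bin && echo "round trip OK"
```

Because every suffix has the same width, the shell's sorted glob expansion returns the chunks in creation order, so `cmp` finds no difference.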

WSL & PowerShell Core notes

pwsh
No native pwsh `split`. The array-slicing pattern works cross-OS. For TB-class files prefer `.NET`'s `StreamReader` (text) or `FileStream` (binary) over the `Get-Content`/`ReadAllBytes` shortcuts that materialise everything. `Get-Content` also supports `-ReadCount` for chunked reading, in Windows PowerShell as well as pwsh 7+: `Get-Content huge.txt -ReadCount 1000` sends 1000-line arrays down the pipeline.
WSL
WSL `split` is GNU coreutils, with the full flag set including `-n`, `-d`, `-a`, `--additional-suffix`. Useful from Windows scripts that need GNU semantics: `wsl split -n l/10 /mnt/c/Users/.../huge.csv /mnt/c/Users/.../part_` (uses WSL paths but writes to Windows-visible locations).
