
Split a large file into pieces

Split a single large file into smaller chunks by byte count or line count.

How to split a large file into pieces in each shell

Bash (unix)
split -b 100M big.iso big.iso.part_

`-b SIZE` sets bytes per chunk (`100M`, `2G`, `512K`; these are powers of 1024, so `100M` is 100 MiB). `-l N` puts N lines in each chunk (for text). `-n N` splits into exactly N pieces; `-n l/4` makes 4 pieces without breaking lines, and `-n r/4` deals lines round-robin into 4 pieces. Output: `big.iso.part_aa`, `big.iso.part_ab`, ... Reassemble with `cat big.iso.part_* > big.iso`. The glob expands in sorted order (`aa` before `ab` before `ba`), which is what puts the pieces back in sequence.
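The split-then-`cat` round trip described above can be checked mechanically. The following is a minimal sketch, not part of the original page: `split_and_verify` is a hypothetical helper name, and it assumes a POSIX-style `split`, `cat`, and `cmp`, with the chunk size given as a plain byte count so that GNU and BSD `split` behave the same.

```shell
# Hypothetical helper: split FILE into CHUNK_BYTES pieces named FILE.part_aa,
# FILE.part_ab, ... then confirm that the concatenated pieces equal the original.
split_and_verify() {
  local src=$1 chunk=$2
  local prefix="$src.part_"
  split -b "$chunk" "$src" "$prefix" || return 1
  # The glob expands in sorted order; cmp reads the reassembled stream on stdin.
  cat "$prefix"* | cmp -s - "$src"
}

# Demo on a throwaway 2500-byte file (three pieces: 1000 + 1000 + 500 bytes).
demo=$(mktemp)
head -c 2500 /dev/urandom > "$demo"
if split_and_verify "$demo" 1000; then
  echo "round trip OK"   # printed when the pieces reassemble to the original
fi
rm -f "$demo" "$demo".part_*
```

Comparing with `cmp` avoids writing a second full copy of the file just to verify it; for real transfers a checksum of the original (for example from `sha256sum` or `shasum`) serves the same purpose on the receiving side.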

Zsh (unix)
split -b 100m big.iso big.iso.part_

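Same external `split`. Size suffixes differ by implementation: GNU `split` accepts `100M` or `100m`, while the BSD `split` on older macOS releases accepts only lowercase `k`/`m` (newer releases also take `g` and uppercase), so lowercase `100m` (100 MiB) is the safe spelling on macOS. The `-n l/4` and `-n r/4` forms are GNU extensions, and older BSD `split` has no `-n` at all. For roughly equal pieces without `-n` on macOS: `split -b $(( $(stat -f%z big.iso) / 4 + 1 )) big.iso part_` (at most 4 pieces).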
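The chunk-size arithmetic above can be wrapped in a small function that avoids `stat`, whose flags differ between GNU and BSD. This is a sketch under assumptions: `chunk_size_for` is a hypothetical name, and it uses `wc -c` to get the size and ceiling division instead of the `/ 4 + 1` form. The result yields at most N pieces, and exactly N whenever the file is larger than (N-1)² bytes.

```shell
# Hypothetical helper: print the byte count per chunk so that FILE splits
# into at most PIECES pieces (ceiling division of the file size).
chunk_size_for() {
  local file=$1 pieces=$2 size
  size=$(wc -c < "$file") || return 1
  size=$(( size ))   # arithmetic evaluation drops the padding BSD wc adds
  if [ "$pieces" -lt 1 ] || [ "$size" -eq 0 ]; then
    echo "chunk_size_for: need a non-empty file and pieces >= 1" >&2
    return 1
  fi
  echo $(( (size + pieces - 1) / pieces ))
}

# Demo: a 1001-byte file split four ways gives 251 + 251 + 251 + 248 bytes.
demo=$(mktemp)
head -c 1001 /dev/zero > "$demo"
split -b "$(chunk_size_for "$demo" 4)" "$demo" "$demo.part_"
ls "$demo".part_*
rm -f "$demo" "$demo".part_*
```

Passing a plain byte count to `-b` sidesteps the suffix-case differences between implementations entirely.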

Fish (unix)
split -b 100M big.iso big.iso.part_

Same external `split`. To branch on its exit status in fish: `if split -b 100M big.iso big.iso.part_; echo done; end`.

PowerShell (windows)
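$chunk=100MB; $reader=[IO.File]::OpenRead((Resolve-Path big.iso).Path); $buf=New-Object byte[] $chunk; $i=0; while(($n=$reader.Read($buf,0,$chunk)) -gt 0){ $out=[IO.File]::Create((Join-Path $PWD "big.iso.part_$($i.ToString('D3'))")); $out.Write($buf,0,$n); $out.Dispose(); $i++ }; $reader.Dispose()

PowerShell has no native `split` cmdlet. The line above is the .NET FileStream pattern for 100 MiB binary chunks: it reuses one buffer and writes only the `$n` bytes actually read, rather than slicing the buffer with `$buf[0..($n-1)]`, which copies every byte through a PowerShell array and is very slow at this size. Paths are resolved against `$PWD` because .NET methods use the process working directory, which can differ from PowerShell's current location. For text line-split: `Get-Content big.txt -ReadCount 1000 | ForEach-Object -Begin {$i=0} -Process { $_ | Set-Content "part_$($i.ToString('D3')).txt"; $i++ }` writes 1000-line chunks. Or shell out to `split` via WSL / Git Bash if available.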

cmd.exe (windows)
powershell -Command "$bytes=[IO.File]::ReadAllBytes('big.iso'); $chunkSize=104857600; for($i=0;$i*$chunkSize -lt $bytes.Length;$i++){ $out=[IO.File]::Create('big.iso.part_'+$i.ToString('D3')); $out.Write($bytes,$i*$chunkSize,[Math]::Min($chunkSize,$bytes.Length-$i*$chunkSize)); $out.Dispose() }"

cmd has no native split. The `powershell` shell-out above reads the whole file into memory (`ReadAllBytes`), which is fine for files up to about 1 GB on a 16 GB machine but cannot handle very large ones: `ReadAllBytes` rejects files over 2 GB outright. For huge files use the streaming pattern from the PowerShell row. Part names are zero-padded (`_000`, `_001`, ...) so that they sort in order past ten parts. A `split` port from UnxUtils / GnuWin32 is the cleanest option if you can install it.

Equivalents listed for Bash, Zsh, Fish, PowerShell, cmd.exe.

Gotchas & notes

  • **Byte split + reassembly is lossless only with binary-safe concatenation**: `split -b 100M file part_` followed by `cat part_* > file` is byte-perfect on Linux/macOS. On Windows, `copy /b part1+part2+part3 file` reassembles binary parts correctly. `type part1 part2 > file` (cmd) does not: `type` works in text mode and treats Ctrl-Z (0x1A) as end of file, so binary data gets cut off mid-stream. Always use `copy /b` on Windows for binary parts.
  • **Suffix length & lexical order**: `split` starts with 2-letter suffixes (`aa, ab, ..., az, ba, bb, ...`), so a fixed 2-letter width covers 26² = 676 chunks; `split -a 4` gives 4-letter suffixes (26⁴ = 456,976 chunks). `cat part_*` works because the names sort lexically (`aa < ab < az < ba`). If the chunk count exceeds what a fixed suffix width allows, `split` stops with an error such as GNU's `output file suffixes exhausted`. Recent GNU `split` widens the suffix automatically when `-a` is not given; with an explicit `-a`, or an implementation that does not auto-widen, the limit is hard. Match `-a` to the expected chunk count.
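  • **Line split: `csplit` for regex boundaries**: `split -l 1000` cuts every 1000 lines blindly, so it can break a record in the middle. `csplit big.log '/^---/' '{*}'` splits before every line matching `^---`, which preserves logical records (chapters, JSON blob delimiters, log boundaries). `csplit` exists on Linux, macOS, and the BSDs, though the `{*}` repeat count is a GNU extension; elsewhere give a number such as `{99}`, with `-k` to keep the pieces if the pattern runs out early. It is not available natively on Windows (use `switch -Regex -File` in PowerShell with an accumulator).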
  • **pwsh `Get-Content -ReadCount` controls batching**: by default `Get-Content` sends one line at a time down the pipeline, which is fine for a 1 KB file but can take minutes on a 1 GB log. `Get-Content -ReadCount 1000` sends 1000 lines per pipeline object as an array, which is often 10–100× faster for line iteration. `Get-Content -Raw` reads the whole file into one string: fast for small files, but it can exhaust memory on multi-GB input.
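The suffix arithmetic from the list above (26 names per suffix letter, so 26^len names for `-a len`) can be turned into a small calculation. This is an illustrative sketch: `suffix_len_for` is a hypothetical helper, the 50 GiB input size is made up for the example, and it assumes the default lowercase alphabetic suffixes rather than numeric ones.

```shell
# Hypothetical helper: print the smallest -a value (never below split's
# default of 2) whose 26^len suffix names cover the given number of chunks.
suffix_len_for() {
  local chunks=$1 len=2 capacity=676
  while [ "$capacity" -lt "$chunks" ]; do
    capacity=$(( capacity * 26 ))
    len=$(( len + 1 ))
  done
  echo "$len"
}

# Example: a made-up 50 GiB input cut into 10 MiB chunks.
size=$(( 50 * 1024 * 1024 * 1024 ))
chunk=$(( 10 * 1024 * 1024 ))
chunks=$(( (size + chunk - 1) / chunk ))   # ceiling division
echo "$chunks chunks need -a $(suffix_len_for "$chunks")"
# prints: 5120 chunks need -a 3
```

The result can be passed straight through, for example `split -a "$(suffix_len_for "$chunks")" -b "$chunk" big.iso big.iso.part_`, so the suffix width is fixed up front and every part name has the same length.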
