iconv — Convert text files between encodings (UTF-8, UTF-16, Latin-1, etc) across all 5 shells
Equivalents in every shell
Bash
iconv -f UTF-8 -t UTF-16LE input.txt > output.txt
`-f FROM -t TO`. Append `//IGNORE` to the target encoding to drop unmappable characters (`-t ASCII//IGNORE`), or `//TRANSLIT` for a best-effort lossy conversion (`é` → `e`). Support for both suffixes varies by iconv implementation. `iconv -l` lists every supported encoding.

Zsh
iconv -f UTF-8 -t UTF-16LE input.txt > output.txt
Same external binary. To prepend a BOM, send it through the same pipe as the file: `{ printf '\xef\xbb\xbf'; cat utf8-file.txt; } | iconv -f UTF-8 -t UTF-16LE`. The braces group both commands, so iconv receives the UTF-8 BOM followed by the file and converts the BOM together with the text. Without the grouping (`echo ...; cat utf8-file.txt | iconv ...`), only `cat` feeds the pipe and the BOM reaches the output unconverted.

Fish
iconv -f UTF-8 -t UTF-16LE input.txt > output.txt
Same external binary. Fish process substitution uses `psub`: `iconv -f UTF-8 -t UTF-16LE (some_cmd | psub) > output.txt`.

PowerShell
$content = Get-Content input.txt -Raw -Encoding UTF8; [System.IO.File]::WriteAllText("output.txt", $content, [System.Text.Encoding]::Unicode)
`[System.Text.Encoding]::Unicode` is UTF-16LE with a BOM. Values accepted by the cmdlet `-Encoding` parameter include `Unicode`, `UTF8` (no BOM in pwsh 6+, BOM in Windows PowerShell 5.1), `UTF8NoBOM` (pwsh 6+ only), `ASCII`, and `Default` (the current ANSI code page in 5.1, BOM-free UTF-8 in pwsh 6+). For the full encoding list: `[System.Text.Encoding]::GetEncodings()`.

cmd
powershell -Command "[IO.File]::WriteAllText('output.txt', [IO.File]::ReadAllText('input.txt', [Text.Encoding]::UTF8), [Text.Encoding]::Unicode)"
cmd has no native `iconv`. Shell out to PowerShell as above. `chcp 65001` switches the console code page to UTF-8, which can change how redirected console output is encoded, but it does not convert existing files; file conversion still needs PowerShell.
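Since every Unix shell above calls the same `iconv` binary, one quick sanity check is a round trip: convert to the target encoding, convert back, and compare with the original. The sketch below assumes a POSIX shell with `iconv` and `cmp` on the PATH; `roundtrip_ok` is a hypothetical helper name, not part of iconv.

```shell
# roundtrip_ok FILE FROM TO
# Hypothetical helper: succeeds (exit 0) only if FILE survives
# FROM -> TO -> FROM byte for byte, i.e. the conversion is lossless.
roundtrip_ok() {
  file=$1 from=$2 to=$3
  # If the first iconv stops on an unmappable character, its output is
  # incomplete or altered, so the final comparison fails as well.
  iconv -f "$from" -t "$to" "$file" 2>/dev/null \
    | iconv -f "$to" -t "$from" 2>/dev/null \
    | cmp -s - "$file"
}
```

A call such as `roundtrip_ok input.txt UTF-8 UTF-16LE && echo lossless` would confirm that the target encoding can hold everything in the file before you overwrite anything.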
Worked examples
Convert UTF-16 (e.g. a file saved from Windows Notepad as "Unicode") to UTF-8
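Bash / Zsh / Fish
iconv -f UTF-16LE -t UTF-8 input.txt > output.txt
PowerShell
$c = Get-Content input.txt -Raw -Encoding Unicode; [IO.File]::WriteAllText("output.txt", $c, (New-Object System.Text.UTF8Encoding $false))
With an explicit `-f UTF-16LE`, a BOM at the start of the input is converted like any other character and ends up as a UTF-8 BOM in the output. With glibc, `-f UTF-16` detects and consumes the BOM instead.

Convert GBK (Chinese Windows codepage 936) to UTF-8
Bash / Zsh / Fish
iconv -f GBK -t UTF-8 chinese.txt > utf8.txt
PowerShell
$c = Get-Content chinese.txt -Raw -Encoding ([System.Text.Encoding]::GetEncoding(936)); [IO.File]::WriteAllText("utf8.txt", $c, [System.Text.UTF8Encoding]::new($false))
Passing an encoding object to `-Encoding` works in pwsh 7+ but not in Windows PowerShell 5.1. There, read the file with `[IO.File]::ReadAllText((Resolve-Path chinese.txt).Path, [System.Text.Encoding]::GetEncoding(936))` instead.

Strip non-ASCII characters from a UTF-8 file (lossy)
Bash / Zsh / Fish
iconv -f UTF-8 -t ASCII//IGNORE input.txt > ascii.txt
`//IGNORE` drops every character that has no ASCII equivalent. Use `ASCII//TRANSLIT` instead to approximate characters where possible (`é` → `e`) rather than remove them.
PowerShell
$c = (Get-Content input.txt -Raw -Encoding UTF8) -replace "[^\x00-\x7F]", ""; Set-Content ascii.txt -Value $c -Encoding ASCII

Gotchas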
- Alpine `iconv` (musl libc) supports a MUCH smaller encoding set than glibc — many CJK and legacy codepages are missing. `apk add libc6-compat` does NOT add them. Use Debian/Ubuntu base images for full encoding coverage, or pre-convert on the host.
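- BOM handling is silent. `iconv -t UTF-8` does not add a BOM. With glibc and GNU libiconv, an explicit byte order (`-t UTF-16LE` or `-t UTF-16BE`) also writes no BOM, while plain `-t UTF-16` writes one. Check the first bytes with `head -c 2 output.txt | od -An -tx1` (`ff fe` is a UTF-16LE BOM). If a UTF-16 file carries a BOM that a tool rejects, pipe it through `tail -c +3` to strip the leading 2 BOM bytes.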
- Windows PowerShell 5.1 (`powershell.exe`, not `pwsh` 7+) writes a BOM whenever you pass `-Encoding UTF8`, and it has no `UTF8NoBOM` value. Many Unix tools choke on the BOM. To force BOM-free UTF-8 in 5.1: `[IO.File]::WriteAllText($path, $content, (New-Object System.Text.UTF8Encoding $false))`, where `$false` means "no BOM".
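- `//IGNORE` drops unmappable characters without printing a warning, and the exit status is not a reliable signal across iconv implementations. Compare byte counts before and after to see how much was lost: `wc -c input.txt output.txt` (for a UTF-8 to ASCII conversion, the difference is the number of bytes dropped). To find the first problem character, run the strict conversion without `//IGNORE` and capture stderr: `iconv -f UTF-8 -t ASCII input.txt > /dev/null 2> log.txt`.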
- Windows `chcp 65001` (switch the console to the UTF-8 code page) applies to the current console session only and is not a reliable way to produce UTF-8 files. cmd built-ins such as `echo foo > out.txt` encode redirected output in the active console code page, which is a legacy code page unless `chcp` was run first, and external programs pick their own output encoding and may ignore `chcp` entirely. Always use PowerShell for encoding-sensitive write paths.
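Several of the gotchas above come down to whether a file starts with a BOM. A minimal sketch of how to check and strip one in a POSIX shell follows; `first_bytes`, `detect_bom`, and `strip_bom` are hypothetical helper names, and the sketch assumes `head -c`, `tail -c`, and `od` are available.

```shell
# first_bytes FILE N: print the first N bytes of FILE as lowercase hex.
first_bytes() {
  head -c "$2" "$1" | od -An -tx1 | tr -d ' \n'
}

# detect_bom FILE: print utf-8, utf-16le, utf-16be, or none.
# Caveat: a UTF-32LE BOM (ff fe 00 00) also starts with ff fe, so this
# simple check would report such a file as utf-16le.
detect_bom() {
  case "$(first_bytes "$1" 3)" in
    efbbbf) echo utf-8 ;;
    fffe*)  echo utf-16le ;;
    feff*)  echo utf-16be ;;
    *)      echo none ;;
  esac
}

# strip_bom FILE: write FILE to stdout without its BOM, if it has one.
strip_bom() {
  case "$(detect_bom "$1")" in
    utf-8)             tail -c +4 "$1" ;;  # skip the 3 BOM bytes
    utf-16le|utf-16be) tail -c +3 "$1" ;;  # skip the 2 BOM bytes
    *)                 cat "$1" ;;
  esac
}
```

Running `detect_bom output.txt` right after a conversion shows what the local iconv or PowerShell version actually wrote, which is safer than relying on documented defaults.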