Detect and fix corrupted UTF-16/BOM files by converting them to clean UTF-8 in-place. Runs as an MCP server (stdio) or standalone CLI tool.
- Detects UTF-8, UTF-8 BOM, UTF-16 LE/BE (with and without BOM), null-laced ASCII, and mixed encoding
- Converts all detected encodings to clean UTF-8
- Reports unpaired surrogates (replaced with U+FFFD)
- Optional `.bak` backup before in-place modification
- Preserves original file permissions
- Structured JSON output for automation
Requires Go 1.25+.
    go build -o utf8 .
    cp utf8 /usr/local/bin/utf8

Or with just:

    just install

CLI usage:

    # Fix a file in-place (creates .bak backup)
    utf8 --cli --file /path/to/file.txt
    # Fix without backup
    utf8 --cli --file /path/to/file.txt --no-backup
    # Output converted content to stdout, JSON result to stderr
    utf8 --cli --file /path/to/file.txt --stdout

MCP server usage:

    # Start as MCP stdio server (default)
    utf8

Add to your MCP config:
    {
      "mcpServers": {
        "utf8": {
          "command": "/usr/local/bin/utf8",
          "args": []
        }
      }
    }

The server exposes a single tool `utf8_fix` with parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `file` | string | yes | Absolute path to the file to fix |
| `backup` | boolean | no | Create a `.bak` backup (default: true) |

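
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request. It assumes the standard MCP request shape (`method: "tools/call"`, with `name` and `arguments` under `params`). The struct names, the `buildFixRequest` helper, and the request `id` are hypothetical, and a real session would first need the MCP `initialize` handshake.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fixArgs mirrors the two documented utf8_fix parameters.
// Backup is a pointer so it can be omitted, leaving the server default (true).
type fixArgs struct {
	File   string `json:"file"`
	Backup *bool  `json:"backup,omitempty"`
}

// callParams and request follow the assumed MCP tools/call envelope.
type callParams struct {
	Name      string  `json:"name"`
	Arguments fixArgs `json:"arguments"`
}

type request struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// buildFixRequest (hypothetical helper) serializes one utf8_fix call.
func buildFixRequest(id int, file string, backup *bool) ([]byte, error) {
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: callParams{
			Name:      "utf8_fix",
			Arguments: fixArgs{File: file, Backup: backup},
		},
	})
}

func main() {
	noBackup := false
	msg, err := buildFixRequest(1, "/path/to/file.txt", &noBackup)
	if err != nil {
		panic(err)
	}
	// Over stdio, the message would be written to the server's stdin.
	fmt.Println(string(msg))
}
```
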
| Code | Meaning |
|---|---|
| 0 | Success (file was converted) |
| 1 | Error |
| 2 | No changes needed (already valid UTF-8) |
- macOS
- Linux
Windows is not supported.
    go test -v ./...

Or with just:

    just test

To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Commit your changes
- Push to the branch (`git push origin feature/your-feature`)
- Open a Pull Request

MIT