CLI reference
The zhhz binary reads text from stdin or files, writes converted text to stdout, and errors to stderr. Same input → byte-identical output every time. No TUI, no progress bars, no network calls.
Synopsis
zhhz [OPTIONS] [FILE...]
zhhz detect [OPTIONS] [FILE...]
zhhz --list
zhhz --version
zhhz --help
Conversion (zhhz convert, default)
echo '汉字' | zhhz # default s2t: 漢字
echo '漢字' | zhhz -c t2s # t2s: 汉字
echo '信息' | zhhz -c s2twp # s2twp: 資訊
zhhz -c s2t input.txt # convert a file
zhhz -c s2t -i input.txt # rewrite in place
| Flag | Description |
|---|---|
| -c, --config <NAME> | One of the 16 OpenCC configs (default s2t). See zhhz --list. |
| -i, --in-place | Rewrite the input file(s) instead of writing to stdout. |
| -f, --from <REGION> | Semantic source region (cn-s, cn-t, cn-tw, cn-hk, jp-t, jp-n). Alternative to -c. |
| -t, --to <REGION> | Semantic target region. Alternative to -c. |
| --dict <PATH> | Custom dictionary file (TSV: key<TAB>value). Repeatable; entries override built-in tables at the highest priority. |
| --no-detect | Skip automatic script-variant detection when -f/-t are not given. |
A filename of - means stdin. Multiple input files are processed sequentially.
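Because --dict expects key<TAB>value lines, a dictionary typed by hand can easily end up with spaces where tabs belong. The sketch below is one way to generate the file content from a mapping; the helper name format_dict_tsv is hypothetical, and UTF-8 encoding is an assumption that this reference does not state.

```python
def format_dict_tsv(entries):
    """Render a mapping as key<TAB>value lines, one entry per line."""
    lines = []
    for key, value in entries.items():
        for field in (key, value):
            # A tab or line break inside a field would break the two-column layout.
            if any(ch in field for ch in "\t\r\n"):
                raise ValueError(f"field contains a tab or line break: {field!r}")
        lines.append(f"{key}\t{value}\n")
    return "".join(lines)


# The two entries are the ones used in the custom terminology example.
text = format_dict_tsv({"软件": "軟體", "独家": "獨家"})
```

Writing text to a file (for instance with open(path, "w", encoding="utf-8", newline="")) gives a file that can be passed to --dict.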
Detection (zhhz detect)
Mirrors chardet’s CLI: input from files or stdin, --files-from, -0/--null, and recursive directory walks.
echo '汉字计算机软件' | zhhz detect # cn-s 57 -
echo '漢字計算機軟體' | zhhz detect # cn-t 66 -
echo 'こんにちは世界' | zhhz detect # jp-n 50 -
zhhz detect corpus.txt # cn-s ... corpus.txt
Output is tab-separated: <region>\t<confidence>\t<path>. confidence ranges from 0 to 100 (the share of signature characters in the input). path is - for stdin. region is one of the six codes listed under --from (cn-s, cn-t, cn-tw, cn-hk, jp-t, jp-n), or unknown when the input contains no CJK characters or kana.
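The three-column format is easy to consume from a script. The following is a minimal parsing sketch; the Detection type and parse_detect_line are hypothetical names, and treating confidence as an integer is an assumption based on the sample outputs above.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    region: str
    confidence: int
    path: str


def parse_detect_line(line: str) -> Detection:
    # Split on the first two tabs only, so a path that itself contains a tab stays intact.
    region, confidence, path = line.rstrip("\n").split("\t", 2)
    return Detection(region, int(confidence), path)


# Lines copied from the stdin examples above (path is "-").
sample = "cn-s\t57\t-\ncn-t\t66\t-\njp-n\t50\t-\n"
results = [parse_detect_line(line) for line in sample.splitlines()]
```

Limiting the split to two tabs keeps the parser tolerant of unusual filenames, since the path is always the last column.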
| Flag | Description |
|---|---|
| --files-from <PATH\|-> | Read newline-separated list of paths from a file (or stdin with -). |
| -0, --null | Use NUL-separated file lists (with --files-from). |
| -r, -R, --recursive | Recursively walk directories for CJK files. |
| -q, --quiet | Suppress non-output lines (e.g. “skipping binary file”). |
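NUL separation matters when paths may contain newlines. As a rough sketch, a script could build the list itself and feed it to zhhz detect --files-from - -0 on stdin. The function names are hypothetical, and terminating every entry with NUL (the layout find -print0 produces) is an assumption about what zhhz accepts.

```python
import os
import subprocess


def nul_separated(paths):
    """Encode each path as bytes and terminate it with NUL."""
    return b"".join(os.fsencode(p) + b"\0" for p in paths)


def detect_many(paths):
    """Run detection over many paths; requires zhhz on PATH (not invoked here)."""
    proc = subprocess.run(
        ["zhhz", "detect", "--files-from", "-", "-0"],
        input=nul_separated(paths),
        capture_output=True,
    )
    return proc.returncode, proc.stdout.decode("utf-8", "replace")


# A filename containing a newline survives, because NUL is the only separator.
payload = nul_separated(["corpus/a.txt", "corpus/name with\nnewline.txt"])
```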
Configs
16 OpenCC configs, listed via zhhz --list:
| Config | Direction |
|---|---|
| s2t / t2s | Simplified ↔ Traditional (OpenCC standard) |
| s2tw / tw2s | Simplified ↔ Traditional (Taiwan) |
| s2twp / tw2sp | …with Taiwan phrases |
| s2hk / hk2s | Simplified ↔ Traditional (Hong Kong) |
| s2hkp / hk2sp | …with Hong Kong phrases |
| t2tw / tw2t | Traditional (standard) ↔ Taiwan |
| t2hk / hk2t | Traditional (standard) ↔ Hong Kong |
| t2jp / jp2t | Japanese Kyūjitai ↔ Shinjitai |
Semantic region flags (--from / --to) are aliases for the config names: --from cn-s --to cn-tw is equivalent to -c s2twp. When both a plain and a phrase-aware config exist for a region pair, the phrase-aware one is preferred.
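The "prefer the phrase-aware variant" rule can be illustrated with a small lookup. This is a sketch of the rule as described here, not zhhz's actual implementation: the region abbreviations are inferred from the config names, and the Japanese regions are left out because their mapping is not spelled out in this reference.

```python
CONFIGS = {
    "s2t", "t2s", "s2tw", "tw2s", "s2twp", "tw2sp", "s2hk", "hk2s",
    "s2hkp", "hk2sp", "t2tw", "tw2t", "t2hk", "hk2t", "t2jp", "jp2t",
}

# Assumed abbreviations, inferred from the config names (not confirmed).
ABBREV = {"cn-s": "s", "cn-t": "t", "cn-tw": "tw", "cn-hk": "hk"}


def resolve_config(src: str, dst: str) -> str:
    try:
        base = f"{ABBREV[src]}2{ABBREV[dst]}"
    except KeyError as exc:
        raise ValueError(f"region not covered by this sketch: {exc.args[0]}") from None
    phrase = base + "p"
    if phrase in CONFIGS:  # the phrase-aware variant wins when both exist
        return phrase
    if base in CONFIGS:
        return base
    raise ValueError(f"no config for {src} -> {dst}")
```

With these assumptions, resolve_config("cn-s", "cn-tw") returns "s2twp", matching the equivalence stated above, while resolve_config("cn-s", "cn-t") falls back to "s2t" because no phrase-aware variant is listed for that pair.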
Examples
# Taiwan phrase conversion: 鼠标 → 滑鼠
echo '鼠标' | zhhz --from cn-s --to cn-tw
# 滑鼠
# Custom terminology override
# Fields must be separated by a real tab; printf writes one for each \t
printf '软件\t軟體\n独家\t獨家\n' > /tmp/mywords.tsv
echo '买软件吃独家' | zhhz -c s2t --dict /tmp/mywords.tsv
# 買軟體喫獨家
# Batch convert a directory tree
find corpus/ -name '*.txt' -print0 | xargs -0 zhhz -c s2twp -i
# Pipe through other tools
echo '汉字' | zhhz -c s2t | tr ' ' '\n' | sort | uniq -c | sort -rn | head
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | Conversion error (bad config name, malformed custom-dict line, etc.). |
| 2 | I/O error (file not found, permission denied, etc.). |
Errors are written to stderr with a short message. The CLI never fails silently: if there is no output and no error message, the converted result really is empty.
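A calling script can turn the exit codes into exceptions. The wrapper below is a minimal sketch: ZhhzError, describe_exit, and convert are hypothetical names, and UTF-8 for input and output is an assumption. Only the codes 0, 1, and 2 come from the table above.

```python
import subprocess

EXIT_MEANINGS = {0: "success", 1: "conversion error", 2: "I/O error"}


def describe_exit(code):
    return EXIT_MEANINGS.get(code, "undocumented exit code")


class ZhhzError(RuntimeError):
    def __init__(self, code, message):
        super().__init__(f"{describe_exit(code)} (exit {code}): {message}")
        self.code = code


def convert(text, config="s2t"):
    """Pipe text through zhhz; requires zhhz on PATH (not invoked here)."""
    proc = subprocess.run(
        ["zhhz", "-c", config],
        input=text.encode("utf-8"),
        capture_output=True,
    )
    if proc.returncode != 0:
        # stderr carries the short message described above.
        raise ZhhzError(proc.returncode, proc.stderr.decode("utf-8", "replace").strip())
    return proc.stdout.decode("utf-8")
```

Keeping the code on the exception lets callers distinguish a bad config name (1) from a missing file (2) without parsing the message.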