macvision

Turn any image into agent-friendly JSON — local macOS OCR and image understanding.

What is macvision?

macvision wraps Apple’s Vision framework in a tiny Swift binary. Point it at a screenshot, photo, or scan and get back recognized text, scene labels, and detections (faces, barcodes, documents) as compact JSON, processed entirely on your Mac. There is no large model to download, and nothing is uploaded.

Use it wherever you’d otherwise pay for an LLM vision call: OCR the image locally for free, then send only the text to your model.
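
As a rough illustration of that OCR-first workflow, here is a minimal Python sketch. It makes several assumptions that this README does not confirm: that the JSON result is printed to stdout, and that the recognized text lives under a key named `text` (hypothetical, check the real output before relying on it). The helper names `ocr_image` and `build_text_prompt` are illustrative only.

```python
import json
import subprocess


def ocr_image(image_path, runner=subprocess.run):
    """Run `macvision ocr <image>` and return the decoded JSON result.

    Assumption: the JSON result is written to stdout.
    `runner` is injectable so the function can be exercised without the binary.
    """
    proc = runner(["macvision", "ocr", image_path], capture_output=True, text=True)
    return json.loads(proc.stdout)


def build_text_prompt(result, question, text_field="text"):
    """Turn an OCR result into a text-only prompt for a language model.

    Assumption: `text_field` ("text") is a hypothetical key name for the
    recognized text; only `ok` and `error` are documented in this README.
    """
    if not result.get("ok"):
        raise RuntimeError(result.get("error", "macvision failed"))
    text = result.get(text_field, "")
    return f"{question}\n\nText extracted from the image:\n{text}"
```

The prompt returned by `build_text_prompt` can then be sent to any text-only model, so the image itself never leaves the machine.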

At a glance

macvision ocr ./screenshot.png                       # extract text
macvision ocr ./screenshot.png --lang zh-Hans,en-US   # Chinese + English
macvision classify ./photo.jpg --top 5                # scene/object labels
macvision detect ./photo.jpg --barcodes               # barcodes / QR
macvision feature ./a.jpg --compare ./b.jpg           # image distance

Output schema: {"ok": true, ...} on success, {"ok": false, "error": "..."} on failure.
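
A caller can treat the `ok` flag as the single success check. The sketch below shows one way to do that in Python; it relies only on the `ok` and `error` keys described above, and the names `MacvisionError` and `parse_envelope` are hypothetical helpers, not part of macvision.

```python
import json


class MacvisionError(Exception):
    """Raised when the output is not a successful macvision result."""


def parse_envelope(raw):
    """Decode one macvision JSON result and enforce the `ok` contract.

    Returns the decoded object when `ok` is true; raises MacvisionError
    when `ok` is false or the output is not the expected JSON object.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MacvisionError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict) or "ok" not in payload:
        raise MacvisionError("output has no 'ok' field")
    if payload["ok"] is not True:
        # On failure the schema carries a human-readable message in `error`.
        raise MacvisionError(payload.get("error", "unknown error"))
    return payload
```

Any fields beyond `ok` and `error` depend on the subcommand, so the function returns the whole object and leaves those to the caller.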

For AI agents

Paste this one-line prompt into the system prompt of Claude Code, Cursor, or any other agent:

Use `macvision` to read images on macOS (OCR, classify, detect). Install if missing: `brew install ljh-sh/cli/macvision`. JSON output, check `ok`. Run `macvision --help` for subcommands.

Where to go next