If you've used a Mac in the last few years, you've probably seen Live Text in action, even if you didn't know its name. Hover over a photo of a sign or a screenshot stuck inside Preview and the cursor turns into a text caret. You highlight, you copy, you move on. It's the kind of feature that quietly disappears into the workflow, which is the highest compliment a system feature can earn.

So why write a comparison piece? Because once you start needing OCR for work — pulling a paragraph out of a Slack screenshot, grabbing a Zoom slide, copying a code snippet from a YouTube tutorial paused in Chrome — you bump into the edges of Live Text faster than you'd expect. The edges aren't bugs. They're the natural consequences of how Live Text was integrated into macOS. Understanding where they are tells you whether you can stop at the built-in tool or whether you need a second one in your toolbox.

This piece is the team's honest take on both tools. We make Cheese! OCR, so the bias is real. We've also tried to call out where Live Text is genuinely the better choice, because pretending otherwise would waste your time.

What macOS Live Text is, and what it does well

Apple introduced Live Text in macOS 12 Monterey (2021) and refined it in subsequent releases. The core idea is that any image rendered by an Apple-controlled view — say, an image in Photos, a page in Preview, a frame in QuickTime, an image inside Safari — can be analyzed in place. The text becomes selectable as if it were native text. You don't trigger anything; the recognition runs on demand when you start interacting with the image.

The places where it shines are predictable and pleasant:

For the casual user, this is plenty. If your daily OCR need is "occasionally pull a phone number off a photo," you don't need anything more.

The four scenarios where Live Text falls short

Once you start working with text-heavy material across multiple apps, four gaps tend to show up.

1. Third-party apps that render their own image viewers

Live Text is plumbed into AppKit's image rendering, but third-party apps with custom image components don't always pick it up. Slack's image viewer, Discord's lightbox, Notion's embedded images, Telegram's media preview, and many enterprise apps render images in ways that bypass the system text-selection layer. You can see the text on screen and you cannot select it.

The macOS workaround, opening the image in Preview, is fine for one image but tedious as a habit. You have to right-click, save, find the file, open it, and only then select the text. By that point you could have screenshotted and OCR'd twice.

2. Video conferencing

Zoom, Microsoft Teams, Google Meet (in Chrome), and Webex frequently mark their meeting windows with screen-protection or DRM-style flags. Apple's screen capture and Live Text both respect those flags in many configurations. The result: you see a slide, you try to drag-select the bullet points, nothing selects. Sometimes a screenshot of the meeting saves as a black rectangle.

This isn't malice on the conferencing apps' part — it's a privacy-by-default choice meant for sensitive presentations. But for the perfectly mundane case of "I want to copy a URL from this slide," it's a wall. A dedicated screenshot OCR tool that uses the system's screen-recording APIs can usually capture and recognize the visible pixels, depending on how the source app draws its frames.

3. Protected PDFs and certain corporate documents

Native PDFs with text layers are easy: any tool can copy from them, Live Text included. Scanned PDFs are harder, but Preview's Live Text usually handles them. The trouble starts with the in-between cases:

If you're allowed to read the document but not to select its text, you have a legitimate use case for OCR: you're not bypassing access control, you're working around UI-level copy restrictions on a document already in your hands. Cheese! OCR captures the rendered pixels and returns the text, the same way you might transcribe a paragraph by hand.

4. Video frames outside Safari

Live Text in paused video works great in Safari. It does not work in Chrome, Firefox, or Edge, because those browsers use their own rendering engines and don't expose frames to AppKit's Live Text hooks. It also doesn't reach IINA, VLC, or most media players. If you're watching a coding tutorial in Chrome and the instructor types out a command, you're left transcribing it.

Same problem, same solution: a screenshot OCR tool reads the pixels regardless of which app put them there.

How Cheese! OCR fills the gaps

Cheese! OCR is a menu bar app that does one thing. You press a global hotkey (default ⇧⌘E, configurable). Your screen dims, the cursor becomes a crosshair, and you drag-select a rectangle. The recognized text lands on your clipboard. Paste it wherever you need it. The whole interaction takes about two seconds once muscle memory kicks in.

A few details that matter for the comparison:

None of these are revolutionary on their own. Together they cover the four gaps above without forcing a context switch.

When Live Text is actually enough

It's worth being explicit about the cases where you don't need a second tool:

If any of these describe you, stop reading and use the tool you already have. We mean it.

Side-by-side comparison

Aspect | macOS Live Text | Cheese! OCR
Price | Free, built into macOS 12+ | $5.99 one-time, no subscription
Coverage | Apple-native apps and WebKit views | Anything visible on screen
Trigger | Hover and select text inside an image | Global hotkey + drag-select
Languages | Apple Vision (multiple, on-demand) | EN / ZH-Hans / JA / KO recognized automatically
History | None | Local, searchable
Engine | Apple Vision, on-device | Apple Vision, on-device
Network | None | Zero entitlements, verifiable
Best for | Casual, in-app text grabs | Cross-app, repeatable, high-volume

A simple decision framework

Three rules cover most situations.

  1. If you can hover and the cursor turns into a text caret, use Live Text. It's already loaded.
  2. If the source is a third-party app, a video conference, a non-Safari browser, or a protected document, reach for the hotkey.
  3. If you do this more than a few times a week, configure the hotkey and let muscle memory take over. The choice between tools should be subconscious within a day or two.
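
If it helps to see the three rules as logic, here is a minimal Python sketch. It is only an illustration of the framework above, not anything that ships in either tool. The `Source` class, the `pick_tool` and `should_set_up_hotkey` functions, and the threshold of three uses per week are all our own assumptions for the example ("more than a few times a week" is deliberately fuzzy in the rule itself).

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical description of where the text you want to copy lives."""
    caret_on_hover: bool              # rule 1: the cursor turns into a text caret
    third_party_app: bool = False     # e.g. a chat app's own image viewer
    video_conference: bool = False    # e.g. a shared slide in a meeting window
    non_safari_browser: bool = False  # e.g. a paused video in Chrome
    protected_document: bool = False  # readable, but text selection is blocked


def pick_tool(source: Source) -> str:
    """Apply rules 1 and 2 of the decision framework."""
    # Rule 1: if the text is already selectable, Live Text is the shortest path.
    if source.caret_on_hover:
        return "Live Text"
    # Rule 2: any of the four gaps means reaching for the hotkey.
    if (source.third_party_app or source.video_conference
            or source.non_safari_browser or source.protected_document):
        return "Cheese! OCR"
    # Assumption: the rules don't name this case. We fall back to screenshot
    # OCR, since it reads the pixels regardless of which app drew them.
    return "Cheese! OCR"


def should_set_up_hotkey(uses_per_week: int, threshold: int = 3) -> bool:
    """Rule 3: frequent use justifies configuring the hotkey.

    The default threshold is an assumed stand-in for "a few times a week".
    """
    return uses_per_week > threshold
```

For example, `pick_tool(Source(caret_on_hover=False, video_conference=True))` returns "Cheese! OCR", while any source where the caret appears returns "Live Text" no matter which other flags are set. Checking the caret first is the point of rule 1: the built-in tool wins whenever it works.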

In practice we keep both tools enabled all the time. Live Text handles the polite, native cases automatically; Cheese! OCR handles the awkward ones. If you'd like to try the second half of that setup, the app is on the Mac App Store. Below the FAQ we've also linked to a few related articles you might want to read after this one — particularly the PDF guide and the screenshot guide, which build on the same ideas in more depth.