If you've used a Mac in the last few years you've probably seen Live Text in action even if you didn't know its name. Hover over a photo of a sign, or a screenshot open in Preview, and the cursor turns into a text caret. You highlight, you copy, you move on. It's the kind of feature that quietly disappears into the workflow, which is the highest compliment a system feature can earn.
So why write a comparison piece? Because once you start needing OCR for work — pulling a paragraph out of a Slack screenshot, grabbing a Zoom slide, copying a code snippet from a YouTube tutorial paused in Chrome — you bump into the edges of Live Text faster than you'd expect. The edges aren't bugs. They're the natural consequences of how Live Text was integrated into macOS. Understanding where they are tells you whether you can stop at the built-in tool or whether you need a second one in your toolbox.
This piece is the team's honest take on both tools. We make Cheese! OCR, so the bias is real. We've also tried to call out where Live Text is genuinely the better choice, because pretending otherwise would waste your time.
What macOS Live Text is, and what it does well
Apple introduced Live Text in macOS 12 Monterey (2021) and refined it in subsequent releases. The core idea is that any image rendered by an Apple-controlled view — say, an image in Photos, a page in Preview, a frame in QuickTime, an image inside Safari — can be analyzed in place. The text becomes selectable as if it were native text. You don't trigger anything; the recognition runs on demand when you start interacting with the image.
The places where it shines are predictable and pleasant:
- Photos.app: any photo you've ever taken with text in it — a receipt, a whiteboard, a business card, a street sign — becomes searchable and copyable.
- Preview.app: open an image (PNG, JPEG, HEIC) and you can select text directly. For PDFs, Live Text handles native PDFs trivially because the text is already there; it also recognizes scanned PDFs in many cases.
- Safari: paused video frames, images inside articles, and even text inside SVG-rendered content are usually selectable.
- Notes and Mail: drop in an image, and the text inside it is selectable.
- Quick Look: the floating preview from the Finder spacebar shortcut also exposes Live Text.
For the casual user, this is plenty. If your daily OCR need is "occasionally pull a phone number off a photo," you don't need anything more.
The four scenarios where Live Text falls short
Once you start working with text-heavy material across multiple apps, four gaps tend to show up.
1. Third-party apps that render their own image viewers
Live Text is wired into Apple's own image views, but third-party apps that render images with custom components don't always pick it up. Slack's image viewer, Discord's lightbox, Notion's embedded images, Telegram's media preview, and many enterprise apps render images in ways that bypass the system text-selection layer. You can see the text on screen and you cannot select it.
The workaround inside macOS — open the image in Preview — is fine for one image but tedious as a habit. You have to right-click, save, find the file, open it, then OCR. By that point you could have screenshotted and OCR'd twice.
2. Video conferencing
Zoom, Microsoft Teams, Google Meet (in Chrome), and Webex frequently mark their meeting windows with screen-protection or DRM-style flags. Apple's screen capture and Live Text both respect those flags in many configurations. The result: you see a slide, you try to drag-select the bullet points, nothing selects. Sometimes a screenshot of the meeting saves as a black rectangle.
This isn't malice on the conferencing apps' part — it's a privacy-by-default choice meant for sensitive presentations. But for the perfectly mundane case of "I want to copy a URL from this slide," it's a wall. A dedicated screenshot OCR tool that uses the system's screen-recording APIs can usually capture and recognize the visible pixels, depending on how the source app draws its frames.
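To make the "screen-recording APIs" point concrete, here is a minimal sketch of how a capture-then-OCR tool might grab the visible pixels. `CGDisplayCreateImage` is the classic CoreGraphics call (newer apps use ScreenCaptureKit); the key behavior for this section is that windows flagged as capture-protected simply don't appear in the returned image. This is an illustrative sketch, not Cheese! OCR's actual implementation.

```swift
import CoreGraphics

// Capture the main display's current pixels as a CGImage.
// Requires the Screen Recording permission in System Settings;
// capture-protected windows (some conferencing apps) are omitted
// from the result rather than causing an error.
guard let capture = CGDisplayCreateImage(CGMainDisplayID()) else {
    fatalError("Capture failed — check Screen Recording permission")
}
print("Captured \(capture.width) x \(capture.height) pixels")
```

A real tool would crop this image to the user's drag-selected rectangle before handing it to the recognition step.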
3. Protected PDFs and certain corporate documents
Native PDFs with text layers are easy: any tool can copy from them, Live Text included. Scanned PDFs are harder, and Preview's Live Text usually handles them. The trouble starts with the in-between cases:
- Password-protected PDFs that allow viewing but disable copy and Live Text.
- DRM-protected PDFs distributed by some publishers and corporate document systems.
- PDFs rendered inside specific reader apps that disable text selection at the app level.
If you're allowed to read the document but not to select its text, you have a legitimate use case for OCR: you're not bypassing access control, you're working around UI-level copy restrictions on a document already in your hands. Cheese! OCR captures the rendered pixels and returns the text, the same way you might transcribe a paragraph by hand.
4. Video frames outside Safari
Live Text in paused video works great in Safari. It does not work in Chrome, Firefox, or Edge, because those browsers use their own rendering engines and don't expose frames to the system's Live Text machinery. It also doesn't reach IINA, VLC, or most media players. If you're watching a coding tutorial in Chrome and the instructor types out a command, you're left transcribing it.
Same problem, same solution: a screenshot OCR tool reads the pixels regardless of which app put them there.
How Cheese! OCR fills the gaps
Cheese! OCR is a menu bar app that does one thing. You press a global hotkey (default ⇧⌘E, configurable). Your screen dims, the cursor becomes a crosshair, and you drag-select a rectangle. The recognized text lands on your clipboard. Paste it wherever you need it. The whole interaction takes about two seconds once muscle memory kicks in.
A few details that matter for the comparison:
- Same Apple Vision engine. Cheese! OCR is built on the Vision framework — the exact pipeline that powers Live Text. Accuracy on printed text is comparable. We are not claiming a custom model that beats Apple's; we are using Apple's, then wrapping it in a different workflow.
- Cross-app by definition. Because the input is "whatever pixels are on your screen," it doesn't matter whether the source is Slack, Zoom, IINA, a Citrix session, or a remote desktop window. If you can see it, you can OCR it.
- 100% on-device. The app's sandbox declares no network entitlements at all. You can verify this in the Mac App Store privacy report. No screenshot ever leaves the machine.
- Searchable history. Every recognition is timestamped and stored locally. If you pasted a paragraph into a chat an hour ago and lost it, the history pane finds it.
- Multi-language by default. English, Simplified Chinese, Japanese, and Korean are recognized out of the box; you don't toggle a language for each capture.
None of these are revolutionary on their own. Together they cover the four gaps above without forcing a context switch.
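For the curious, the shared engine the bullets above describe is a public API: Apple's Vision framework exposes text recognition directly. The sketch below shows the core call, assuming an image file on disk; the language list mirrors the four languages mentioned above but the exact values here are illustrative, not a claim about Cheese! OCR's internals.

```swift
import Foundation
import Vision

// Minimal sketch: recognize text in an image using the same Vision
// pipeline that powers Live Text.
func recognizeText(in url: URL) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate          // favor accuracy over speed
    request.recognitionLanguages = ["en-US", "zh-Hans", "ja-JP", "ko-KR"]

    let handler = VNImageRequestHandler(url: url, options: [:])
    try handler.perform([request])

    // Each observation carries ranked candidate strings; keep the best one.
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

A wrapper app's job is everything around this call: the hotkey, the crosshair selection, the clipboard write, the history store. The recognition itself is Apple's.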
When Live Text is actually enough
It's worth being explicit about the cases where you don't need a second tool:
- You only OCR occasionally and only inside Apple apps. Photos, Preview, Safari, Notes. If that describes your week, Live Text is free, fast, and right there.
- You want word-level selection inside an image. Live Text lets you drag through the text to highlight specific words or phrases; Cheese! OCR captures whatever rectangle you box-select and returns the entire result.
- You don't want any third-party app at all. A reasonable preference. Live Text is built in, signed by Apple, and updated through the OS.
- You need accessibility-level live recognition. Live Text integrates with VoiceOver and other accessibility services; a screenshot OCR tool is a different category of workflow.
If any of these describe you, stop reading and use the tool you already have. We mean it.
Side-by-side comparison
| Aspect | macOS Live Text | Cheese! OCR |
|---|---|---|
| Price | Free, built into macOS 12+ | $5.99 one-time, no subscription |
| Coverage | Apple-native apps and WebKit views | Anything visible on screen |
| Trigger | Hover and select text inside an image | Global hotkey + drag-select |
| Languages | Apple Vision (multiple, on-demand) | EN / ZH-Hans / JA / KO recognized automatically |
| History | None | Local, searchable |
| Engine | Apple Vision, on-device | Apple Vision, on-device |
| Network | None | Zero entitlements, verifiable |
| Best for | Casual, in-app text grabs | Cross-app, repeatable, high-volume |
A simple decision framework
Three rules cover most situations.
- If you can hover and the cursor turns into a text caret, use Live Text. It's already loaded.
- If the source is a third-party app, a video conference, a non-Safari browser, or a protected document, reach for the hotkey.
- If you do this more than a few times a week, configure the hotkey and let muscle memory take over. The choice between tools should be subconscious within a day or two.
In practice we keep both tools enabled all the time. Live Text handles the polite, native cases automatically; Cheese! OCR handles the awkward ones. If you'd like to try the second half of that setup, the app is on the Mac App Store. Below the FAQ we've also linked to a few related articles you might want to read after this one — particularly the PDF guide and the screenshot guide, which build on the same ideas in more depth.