How to Compress a PDF Without Losing Quality

May 30, 2026 · 9 min read

Every PDF compression tool promises "smaller files, same quality." Most of them lie — or, more charitably, hide the trade-off. There is no magic. If you understand why a PDF is large, you can pick a compression strategy that actually keeps the parts you care about and sheds the parts you don't. This guide walks through that, with concrete numbers and the mistakes I see people make.

Why your PDF got so big in the first place

A PDF is a container. Inside it you'll typically find a mix of:

Text streams — the actual characters, usually with embedded fonts. Text is cheap. A 50-page text-only PDF weighs maybe 200 KB.
Embedded images — every photo, logo, chart, or scanned page. Images are by far the heaviest part of most PDFs. A single high-resolution photo can be 5 MB on its own.
Fonts — sometimes embedded in full, sometimes subsetted. A fully embedded font can add 200–500 KB per family.
Form fields, annotations, metadata, attachments — usually small, but they add up.

When someone says "my PDF is 25 MB," the answer almost always boils down to images. Either there are a lot of them, or they're at higher resolution than they need to be, or — the worst case — the document is a stack of scanned pages where each page is itself a giant raster image. Knowing which of these you've got determines what compression will actually work.

The three real compression strategies

Most online PDF compressors don't tell you which strategy they use. They give you a slider labelled "low / medium / high" and hope for the best. There are really only three things any compressor can do, and they have very different consequences.

1. Re-encode embedded images

This is the one you want most of the time. The compressor walks through every image stream inside the PDF, decodes it, re-encodes it at a lower quality (typically JPEG at 60–80%) or a more efficient codec, and writes it back. The text, fonts, and structure stay untouched.

For a typical report with a few embedded photos, this gets you 30–60% smaller files with quality loss that's invisible at normal viewing zoom. Text remains selectable and searchable. Form fields keep working.

Catch: it requires a real PDF parser that can find and rewrite image XObjects. Not every tool does this — some skip it entirely and only do strategy 3 below, which is why so many "compressors" save you 2%.
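A little arithmetic shows why this strategy dominates. The sketch below is illustrative only: the function name, the 9 MB image share, and the 0.4 re-encode ratio are assumptions I picked for the example, not measurements from any tool.

```python
def estimate_after_image_reencode(total_bytes: int, image_bytes: int, image_ratio: float) -> int:
    """Estimate the file size when only the image streams shrink.

    image_ratio is the assumed size of the re-encoded images relative to
    the originals (0.4 means images end up at 40% of their former size).
    """
    if not 0 <= image_bytes <= total_bytes:
        raise ValueError("image_bytes must be between 0 and total_bytes")
    if not 0 < image_ratio <= 1:
        raise ValueError("image_ratio must be in (0, 1]")
    other_bytes = total_bytes - image_bytes  # text, fonts, structure stay untouched
    return other_bytes + round(image_bytes * image_ratio)


MB = 1_000_000
# Hypothetical 10 MB report in which 9 MB is image data.
new_size = estimate_after_image_reencode(10 * MB, 9 * MB, 0.4)
saving = 1 - new_size / (10 * MB)
print(f"{new_size / MB:.1f} MB, {saving:.0%} smaller")  # 4.6 MB, 54% smaller
```

The same function explains the 2% results: if a tool never touches the image streams, the ratio is effectively 1.0 and the 9 MB stays 9 MB.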

2. Rasterize every page

Render each page as a single image (usually JPEG), throw away the original text and structure, and rebuild the PDF as a flipbook of pictures. For scanned documents this is fantastic: the page already is an image, and you can crush it from 2 MB to 200 KB at 150 DPI without anyone noticing.

For text documents this is a disaster. You lose selectable text. You lose searchability. Accessibility tools (screen readers) can't read the file any more. And — counter-intuitively — the file often gets larger, because rasterizing crisp vector text into pixels is deeply inefficient. A 400 KB text PDF can balloon to 4 MB after page-rasterization.

Use rasterization when you have scans. Avoid it when you have real text.
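Part of the story here is plain geometry. As a rough sketch (the helper names are mine, and US Letter is just an example page size), this is how many pixels a rasterized page has to carry before JPEG even gets involved:

```python
def raster_page_pixels(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_rgb_bytes(width_px: int, height_px: int) -> int:
    """Uncompressed size at 3 bytes per pixel (8-bit RGB), before any JPEG step."""
    return width_px * height_px * 3


# US Letter page (8.5 x 11 inches) at two resolutions.
for dpi in (300, 150):
    w, h = raster_page_pixels(8.5, 11, dpi)
    print(dpi, w, h, raw_rgb_bytes(w, h))
# 300 2550 3300 25245000
# 150 1275 1650 6311250
```

Halving the DPI quarters the pixel count, which is where much of the saving on scans comes from. It also hints at why text pages balloon: a page that was a few kilobytes of characters now has to be described as about two million pixels.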

3. Re-save the PDF stream

The most boring and safest option. The compressor re-parses the PDF, deduplicates identical objects, runs zlib over uncompressed streams, and writes a clean new file. No content is altered and nothing is lost; the same objects are just packed more efficiently.
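Here is a toy model of that idea, using only Python's standard library. It is not a PDF parser: the "streams" are plain byte strings I made up, and a real tool would also have to rewrite object references and the cross-reference table. It only shows the two moves, deduplicate and deflate, and that both are reversible.

```python
import hashlib
import zlib


def repack_streams(streams: list[bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Deduplicate identical streams and deflate each unique one.

    Returns (store, refs): store maps a SHA-256 digest to compressed bytes,
    refs lists the digest for each input stream, in order.
    """
    store: dict[str, bytes] = {}
    refs: list[str] = []
    for data in streams:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in store:          # identical stream already stored once
            store[digest] = zlib.compress(data, 9)
        refs.append(digest)
    return store, refs


def unpack_streams(store: dict[str, bytes], refs: list[str]) -> list[bytes]:
    """Rebuild the original streams exactly."""
    return [zlib.decompress(store[d]) for d in refs]


# Hypothetical document: the same logo stream on three pages, plus page text.
logo = b"LOGO-PIXELS " * 500
text = b"BT /F1 12 Tf (Hello) Tj ET\n" * 200
streams = [logo, text, logo, logo]

store, refs = repack_streams(streams)
original = sum(len(s) for s in streams)
packed = sum(len(v) for v in store.values())
assert unpack_streams(store, refs) == streams   # round trip is exact
print(f"{original} bytes -> {packed} bytes in {len(store)} unique streams")
```

The sample data is absurdly repetitive, so the ratio it prints is far better than anything a real file will give you.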

Realistic savings: 2–10% for most files. On a file that was already produced by a modern tool (Word, InDesign, LaTeX), often less than 5%. On something exported from an older scanner or a misbehaving plugin, sometimes as much as 30%.

This is the right strategy when you cannot lose any information — legal documents, signed contracts, forms that need to stay fillable.

So which one should you actually use?

Use this decision tree. It's the same one a good compressor should be running internally.

If your PDF is mostly text (reports, essays, books)

Try stream re-save first. If the file is already under 1 MB, you're done — there's nothing left to compress. If it's bigger, the size is almost certainly coming from a handful of embedded images. Use image re-encoding at 80% quality. Do not rasterize whole pages — you will make your text unsearchable for no reason.

If your PDF is a scanned document

Page rasterization at 150 DPI and 70% JPEG quality is usually the sweet spot. Each scanned page becomes a JPEG, and the savings can be enormous (10× is normal for 300 DPI scans). If you need the text to be searchable afterwards, run OCR before compressing, not after, and embed the OCR text layer.

If your PDF has mixed content (text + a few large images)

Image re-encoding at 75% quality. Avoid page rasterization unless you've checked that there's no important text. A good compressor will detect the mix and recommend the right mode for you. Our own PDF compressor, for example, samples each page's whiteness and warns you before you flatten text into pixels.

If your PDF is a signed contract or a fillable form

Stream re-save only. Anything else may invalidate signatures or break form fields. If that gets you 5% savings, accept it. Some files genuinely can't be compressed further without losing what makes them legally usable.
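The whole tree fits in a few lines of code. This is a minimal sketch of the rules above, not the logic of any particular compressor; the `Plan` class, the `choose_plan` name, and the four kind labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    strategy: str                       # "resave", "reencode_images" or "rasterize"
    jpeg_quality: Optional[int] = None  # percent, only for the lossy strategies
    dpi: Optional[int] = None           # only when rasterizing pages


def choose_plan(kind: str, size_mb: float) -> Plan:
    """Pick a plan from the document kind and its size after a stream re-save.

    kind is one of "text", "scan", "mixed", "signed_or_form" (made-up labels).
    """
    if kind == "signed_or_form":
        return Plan("resave")                      # anything else risks signatures and fields
    if kind == "scan":
        return Plan("rasterize", jpeg_quality=70, dpi=150)
    if kind == "mixed":
        return Plan("reencode_images", jpeg_quality=75)
    if kind == "text":
        if size_mb < 1:
            return Plan("resave")                  # already small, stop here
        return Plan("reencode_images", jpeg_quality=80)
    raise ValueError(f"unknown document kind: {kind!r}")


print(choose_plan("text", 0.6))
print(choose_plan("scan", 25))
```

Notice that nothing in the tree ever rasterizes a text document. The hard part in practice is deciding which `kind` you have, which is what the first step below is for.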

Step by step: compressing without losing what matters

1. Identify what's in your PDF. Open it. Try selecting text on a page with the cursor. If the cursor selects characters, you have a text PDF. If it draws a rectangle over the whole page, you have a scan.
2. Check the actual size. A 1 MB PDF doesn't need compression. A 12 MB PDF does. Don't waste effort on files that are already small.
3. Pick the strategy using the decision tree above.
4. Compress in your browser, not on a server. Uploading a contract to a random website to "make it smaller" is a worse trade-off than the extra megabyte. Tools like our PDF compressor do everything client-side, so the file never leaves your device.
5. Verify the result. Open the compressed file. Check that text still selects, form fields still fill in, signatures still validate. If anything broke, the strategy was wrong.
6. Keep the original. Compression is destructive in most modes. The compressed file is the deliverable; the original is the archive.
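If you script any of this, the "keep the original" rule is the easiest one to enforce in code. A small sketch, assuming you want the compressed copy next to the source file; the function name and the ".compressed" suffix are my own choices:

```python
from pathlib import Path


def compressed_path(original: Path, suffix: str = ".compressed") -> Path:
    """Return a sibling path for the compressed copy, never the original itself."""
    candidate = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    counter = 1
    while candidate.exists():  # do not clobber an earlier compressed copy either
        candidate = original.with_name(
            f"{original.stem}{suffix}-{counter}{original.suffix}"
        )
        counter += 1
    return candidate


print(compressed_path(Path("contract.pdf")))
```

Writing to a fresh path every time also means you can compare the two files side by side during the verification step.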

Mistakes that make people give up on PDF compression

Compressing a text PDF in "image mode"

Single most common mistake. You upload a 600 KB Word-exported PDF to a compressor, pick the "maximum" setting because more is better, and end up with a 4 MB file of fuzzy, unsearchable text. Always check the document type first.

Stacking compressors

Running a compressed PDF through another compressor rarely helps and often hurts. JPEG artifacts compound. Stream re-save can't shrink an already-packed stream. If the first pass got you 40% savings, the second pass will get you maybe 1%, and possibly degrade image quality further.
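The lossless half of this is easy to see for yourself with zlib. The sample data below is a made-up stand-in for an unpacked text stream, generated with a fixed seed so the run is repeatable. The example says nothing about JPEG artifacts, which need an image library to demonstrate.

```python
import random
import zlib

# Deterministic, text-like sample data (hypothetical stream contents).
rng = random.Random(0)
words = [b"invoice", b"total", b"page", b"figure",
         b"contract", b"signature", b"amount", b"date"]
data = b" ".join(rng.choice(words) for _ in range(20_000))

first_pass = zlib.compress(data, 9)         # packs the raw stream
second_pass = zlib.compress(first_pass, 9)  # tries to pack the packed stream

print(len(data), len(first_pass), len(second_pass))
```

The first pass shrinks the data dramatically. The second barely moves it, and can even come out a few bytes larger, because deflate output looks close to random and there is no redundancy left to remove.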

Lowering DPI below 96

96 DPI is roughly screen resolution. Below that, text becomes visibly mushy and images pixelate at any reasonable zoom. For screen-only viewing 96–120 DPI is fine. For anything that might be printed, stay at 150 DPI or higher.
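For an embedded image, what matters is its effective DPI: pixel width divided by the width it occupies on the page. A quick sketch of the arithmetic, with helper names and a sample photo that are purely illustrative:

```python
def effective_dpi(pixels_wide: int, placed_width_in: float) -> float:
    """Resolution the image actually has at its size on the page."""
    return pixels_wide / placed_width_in


def downsample_width(pixels_wide: int, placed_width_in: float, target_dpi: int) -> int:
    """New pixel width for the target DPI. Never upsamples."""
    target = round(placed_width_in * target_dpi)
    return min(pixels_wide, target)


# Hypothetical photo: 3000 px wide, placed 6 inches wide on the page.
print(effective_dpi(3000, 6))          # 500.0
print(downsample_width(3000, 6, 150))  # 900
print(downsample_width(3000, 6, 96))   # 576
```

A 500 DPI photo in a report is carrying more than ten times the pixels it needs for 150 DPI print, which is why downsampling to a sensible target is safe while going below 96 is not.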

Trying to compress an already-optimized PDF

PDFs exported by recent versions of Word, Google Docs, or LaTeX are already compressed about as well as a re-save will get you. If stream re-save returns less than 5% savings, the file is at the floor. Stop. The only way to go smaller is to remove content (re-encode images, drop pages, downsample).
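The stopping rule is simple enough to write down. A minimal sketch, assuming you have the byte sizes before and after a stream re-save; the function names are mine and the 5% threshold is the one from the paragraph above:

```python
def savings_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Percentage saved relative to the original size."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100 * (original_bytes - compressed_bytes) / original_bytes


def at_floor(original_bytes: int, resaved_bytes: int, threshold_percent: float = 5.0) -> bool:
    """True when a re-save saved so little that further lossless passes are pointless."""
    return savings_percent(original_bytes, resaved_bytes) < threshold_percent


print(at_floor(800_000, 780_000))  # True: only 2.5% saved
print(at_floor(800_000, 560_000))  # False: 30% saved
```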

Removing fonts thinking it'll help

Some compressors offer "remove embedded fonts." Don't, unless you're absolutely sure every recipient has the exact same fonts installed. Otherwise the document re-flows on their machine and looks broken. The 200 KB you save is rarely worth the cosmetic damage.

Privacy: why browser-based compression is the only safe option

Most online PDF compressors upload your file to their server, run a Ghostscript pipeline, and send the result back. For a tutorial PDF about cats this is fine. For a contract, a tax return, a medical record, a passport scan — it's not.

Server-side compressors keep your file in their pipeline for at least a few minutes, often longer for cache reasons. Their privacy policy might claim immediate deletion. You have no way to verify that.

Client-side compressors (the kind that runs in your browser tab) do the entire job locally. The PDF is read into memory, processed with WebAssembly or a JavaScript PDF library, and written back to a download — no network round-trip, no server involved. You can confirm this by opening DevTools and watching the network panel while you compress. There should be no upload.

If you're compressing anything sensitive, the only sane choice is a browser-based tool. The performance is fine — modern laptops handle 50 MB PDFs in a few seconds.

What "lossless" actually means here

"Lossless" is one of those words that gets thrown around carelessly. Be precise about what you mean.

Byte-for-byte lossless — impossible if you change the file in any way. Even a stream re-save changes bytes.
Visually lossless — the result looks identical on screen. JPEG at 90% quality is visually lossless for most photographs. JPEG at 75% is visually lossless at normal viewing zoom but starts showing artifacts when you zoom in.
Structurally lossless — text, fonts, form fields, and metadata are preserved. Only image streams might be re-encoded. This is what you usually want.

When a tool claims "lossless compression" it almost always means the third one. Read the fine print.
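The gap between the first definition and the third is easy to demonstrate with zlib alone. In this sketch the "content stream" is invented sample data. It covers only the byte-level and content-level cases, not the visual one, which needs real images.

```python
import hashlib
import zlib

# Hypothetical page content stream, repeated to give zlib something to work on.
content = b"q 1 0 0 1 72 720 cm BT /F1 12 Tf (Signed on page 3) Tj ET Q\n" * 100

packed_fast = zlib.compress(content, 1)  # what the original tool might have written
packed_best = zlib.compress(content, 9)  # what a re-save might write instead

same_bytes = hashlib.sha256(packed_fast).digest() == hashlib.sha256(packed_best).digest()
same_content = zlib.decompress(packed_fast) == zlib.decompress(packed_best) == content

print(same_bytes)    # False
print(same_content)  # True
```

The two packed versions hash differently, so a byte-for-byte comparison fails, yet both decode to exactly the same content. That is the sense in which a stream re-save is lossless even though the file on disk has changed.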

The honest bottom line

A 10 MB PDF rarely needs to be 10 MB. With the right strategy you can usually get it to 2–4 MB without anyone being able to tell the difference. But there's no slider that does this safely without you knowing what kind of document you have. Spend ten seconds figuring out whether your PDF is text or scan, pick the right strategy, and verify the result. That's the whole game.

If you want to try it without uploading anything, our browser-based PDF compressor detects whether your document is text-heavy or scan-heavy and recommends the right mode automatically. Everything runs on your device — the file never leaves your browser.