Why Scanned PDFs Are So Big — and How to Compress Them
A scanned PDF is huge because it's a fundamentally different thing from a text document: every scanned page is really a high-resolution color image, and a few of them quickly add up to tens of megabytes. Shrinking it comes down to down-sampling and re-compressing those images — which is exactly what compress cat is built for. Set a target size in Compress PDF and click Start, and it re-samples the scanned pages locally in your browser, often taking a file from tens of MB down to a few hundred KB while keeping it readable. Below: why scans are so big and how to compress them well.
Why is a scanned PDF tens of megabytes?
Because a scan contains no 'text' — only images. A scanner or phone camera produces bitmaps, and the PDF just wraps those images one per page. An image's size depends on its resolution (DPI) and color: scanners often default to 300 or even 600 DPI in full color, so a single A4 page can be several megabytes, and a dozen pages add up fast.
A text PDF exported straight from Word or Google Docs is the opposite: the text is stored as characters and fonts rather than pixels, so it adds very little size, and the same number of pages is often just a few hundred KB. When the same ten pages are far larger as a scan, the whole difference is that one file holds images and the other holds text.
- High DPI: scans default to 300–600 DPI; more resolution means more pixels and a bigger file.
- Color: a color scan is far larger than grayscale or black-and-white, and many text documents don't need color at all.
- No text layer: a scanned page is pure pixels, so all of its size comes from the image, not compact text.
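The effect of DPI and color on size can be checked with simple arithmetic. The sketch below estimates the uncompressed pixel data for one A4 page, assuming 8 bits per channel; `raw_page_bytes` is a hypothetical helper name. The scanner then applies image compression, which is why a stored page is typically a few megabytes rather than the raw figure, and the exact ratio depends on the content and encoder settings.

```python
# A4 is 210 x 297 mm, which is roughly 8.27 x 11.69 inches.
A4_INCHES = (8.27, 11.69)

def raw_page_bytes(dpi: int, channels: int) -> int:
    """Uncompressed size of one A4 page at 8 bits per channel.

    channels=1 is grayscale, channels=3 is RGB color.
    """
    width_px = round(A4_INCHES[0] * dpi)
    height_px = round(A4_INCHES[1] * dpi)
    return width_px * height_px * channels

for dpi in (200, 300, 600):
    for name, channels in (("grayscale", 1), ("color", 3)):
        mb = raw_page_bytes(dpi, channels) / 1_000_000
        print(f"{dpi} DPI {name}: {mb:.1f} MB before compression")
```

Doubling the DPI quadruples the pixel count, and color carries three times the data of grayscale at the same resolution, so both settings multiply together.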
Scans compress best with compress cat — because it's built for image PDFs
compress cat compresses by rasterizing: it renders each page, down-samples it, and re-encodes it as JPEG at a higher compression ratio. For a scan that's exactly right. The page was already an image, so compressing it breaks no 'text structure' (there was no selectable text to begin with), yet it can strip 80–95% of the size. That's why it shrinks scans far more aggressively than text documents.
- Open Compress PDF and drag the scan in (several at once is fine).
- Set a target, or click a 200KB / 500KB / 1MB preset to match the upload limit you need to get under.
- Click Start; compress cat re-samples each page and binary-searches toward your target locally.
- Compare the before/after, zoom in to confirm the text is still crisp, and download.
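The binary search in the steps above can be sketched in a few lines. This is an illustration, not compress cat's actual code: it assumes the search runs over a single JPEG quality setting and that output size never shrinks as quality rises. `find_quality` and `toy_size` are hypothetical names, and `toy_size` is a stand-in for a real encoder.

```python
from typing import Callable, Optional

def find_quality(encoded_size: Callable[[int], int], target_bytes: int,
                 lo: int = 1, hi: int = 95) -> Optional[int]:
    """Return the highest quality in [lo, hi] whose output fits target_bytes.

    Assumes encoded_size(q) does not decrease as q increases.
    Returns None if even the lowest quality is still too large.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if encoded_size(mid) <= target_bytes:
            best = mid       # fits: remember it, then try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1     # too big: try a lower quality
    return best

# Toy stand-in for a real encoder: size grows linearly with quality.
def toy_size(quality: int) -> int:
    return 20_000 + 8_000 * quality

print(find_quality(toy_size, 500_000))  # prints 60
```

Searching this way needs only about seven trial encodes to cover qualities 1 to 95, instead of trying every setting in turn. A `None` result means the target cannot be reached by quality alone, which is the case where lowering resolution or trimming pages is needed.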
How do you shrink it without losing legibility?
The danger with scans is compressing until they're unreadable. A few habits keep the file under the limit and still legible: cut color at the source, don't crush it in one step, and trim pages when there are many.
- Save size at the source: for black-and-white text, scan in 'black & white' or grayscale. The file starts much smaller and compresses more predictably.
- Tighten in steps: hit a looser target first, confirm it's acceptable, then push smaller — don't aim for 50KB straight away.
- Trim first when there are many pages: a dozen scanned pages rarely stay legible at a few hundred KB, so remove blanks with Delete Pages or keep only what you need with Split PDF.
- Zoom in page by page afterwards to confirm text, stamps and signatures are clear before uploading or sending.
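Whether to trim pages before compressing comes down to dividing the target by the page count. The sketch below is a rough planning aid under stated assumptions: `per_page_budget_kb` and `pages_to_trim` are hypothetical helpers, and the 60 KB per-page floor in the example is an illustrative guess at where a text scan starts to blur, not a figure from compress cat.

```python
def per_page_budget_kb(target_kb: float, page_count: int) -> float:
    """Average size each page may take if the file must fit target_kb."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    return target_kb / page_count

def pages_to_trim(target_kb: float, page_count: int,
                  min_kb_per_page: float) -> int:
    """How many pages to remove so each remaining page gets at least
    min_kb_per_page (an assumed legibility floor)."""
    affordable = int(target_kb // min_kb_per_page)
    return max(0, page_count - affordable)

# Example: 12 scanned pages, 500 KB upload limit, assumed 60 KB floor.
print(round(per_page_budget_kb(500, 12), 1))  # prints 41.7
print(pages_to_trim(500, 12, 60))             # prints 4
```

If the per-page budget falls well below what stays readable for your documents, removing blank or unneeded pages first does more good than pushing compression harder.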
Frequently asked questions
Why is my scanned PDF tens of megabytes when it's only a few pages?
Because each scanned page is a high-resolution color image, not text. A 300–600 DPI color scan can be several MB per page, so a few pages reach tens of MB. Re-sampling them with Compress PDF often strips 80–90% of the size.
Can I still select and search the text after compressing a scan?
No, but that's no loss for a scan, since the page was always an image with no selectable text. If you need searchable, copyable text, add a text layer with OCR as a separate step. Because compression rasterizes every page, run OCR on the compressed file so the text layer isn't discarded.
How should I scan so the file is smaller to begin with?
Scan in grayscale or black-and-white (color is unnecessary for plain text) and set the DPI to 200–300 (sharp without being bloated). Size you save at the source costs less clarity than crushing it afterwards.
Is compressing a scan safe — is it uploaded?
Safe, and not uploaded. Compression runs locally in your browser via WebAssembly, so scanned contracts, IDs and the like stay on your device and never touch a server.
Updated · compress cat team