OCR PDF

Extract text from scanned PDFs or create a searchable PDF with a hidden text layer

OCR is ready: Tesseract Python bindings detected. Make sure the language pack you select is installed in your Tesseract tessdata directory; if it isn't, you'll get a clear error.
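If you want to check for a language pack yourself before running OCR, the sketch below looks for the `<code>.traineddata` files Tesseract expects in a tessdata directory. The function name `missing_language_packs` is hypothetical, and the fallback to the `TESSDATA_PREFIX` environment variable assumes it points at the tessdata directory itself; this is not necessarily how this tool performs its own check.

```python
import os
from pathlib import Path
from typing import List, Optional


def missing_language_packs(lang: str, tessdata_dir: Optional[str] = None) -> List[str]:
    """Return the language codes in `lang` that have no .traineddata file.

    `lang` uses Tesseract's convention of joining codes with "+", e.g. "eng+deu".
    """
    # Assumption: when no directory is given, TESSDATA_PREFIX names the
    # tessdata directory; otherwise fall back to the current directory.
    directory = Path(tessdata_dir or os.environ.get("TESSDATA_PREFIX", "."))
    codes = [code for code in lang.split("+") if code]
    return [
        code for code in codes
        if not (directory / f"{code}.traineddata").is_file()
    ]
```

An empty list means every requested language has a matching file; anything else names the packs still to be installed.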

Two output modes:

  • Searchable PDF: keeps the original page images and adds an invisible text layer, so you can search and copy-paste. The PDF looks identical to the scan.
  • Extracted text: only the recognised text, as plain text.
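The two modes can be sketched with the pytesseract bindings as follows. This is a minimal illustration, assuming pytesseract is the binding in use (the page does not say which one is) and that each page has already been rendered to an image. The function name `ocr_page` and the mode strings are hypothetical.

```python
def ocr_page(image, mode: str = "text", lang: str = "eng"):
    """OCR one page image.

    Returns a str for mode "text", or the bytes of a one-page PDF for
    mode "searchable_pdf".
    """
    if mode not in ("text", "searchable_pdf"):
        raise ValueError(f"unknown output mode: {mode!r}")

    # Imported lazily so the mode check works without pytesseract installed.
    import pytesseract

    if mode == "text":
        # Plain recognised text, no layout or image data.
        return pytesseract.image_to_string(image, lang=lang)

    # extension="pdf" asks Tesseract for a PDF holding the page image
    # plus an invisible text layer.
    return pytesseract.image_to_pdf_or_hocr(image, extension="pdf", lang=lang)
```

For a multi-page document, the per-page PDF bytes would still need to be merged into one file with a PDF library of your choice.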

Higher DPI improves OCR accuracy but slows processing. 200 DPI is a good default for most scans; raise it to 300 or more for small fonts or low-quality scans.
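To see why higher DPI costs time, it helps to work out how many pixels a rendered page contains. The sketch below assumes a US Letter page (8.5 x 11 inches) and assumes that OCR time grows roughly with pixel count; the helper `raster_size` is hypothetical.

```python
from typing import Tuple


def raster_size(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed page size: US Letter, 8.5 x 11 inches.
w200, h200 = raster_size(8.5, 11, 200)   # 1700 x 2200 pixels
w300, h300 = raster_size(8.5, 11, 300)   # 2550 x 3300 pixels

# Pixel count scales with the square of the DPI ratio: (300 / 200) ** 2 = 2.25
ratio = (w300 * h300) / (w200 * h200)
print(f"200 DPI: {w200}x{h200}, 300 DPI: {w300}x{h300}, {ratio:.2f}x the pixels")
```

Going from 200 to 300 DPI therefore gives the OCR engine 2.25 times as many pixels per page to process.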
