pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (no text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for running OCR on scanned books or journals. It can recognize the page layout even for multicolumn text.

WWW: http://www.tobias-elze.de/pdfsandwich/
