PDF Hacks

RubyPDF Releases pdf2htmlEX Windows Version

pdf2htmlEX is an open source tool that converts PDF to HTML without losing text or formatting. The source code was released a long time ago, but there was still no Windows port. Now rubypdf.com gives us a chance to use this tool under Windows: a win32 static build consisting of a single exe and a few necessary resource files.

For details, please visit:

pdf2htmlEX Windows Version

pdf2htmlEX v0.9 Windows Version Release

By the way, rubypdf.com has also released a Windows version of mktemp, a small tool for safely creating temporary files from shell scripts.

August 20, 2013 Posted by rubypdf | PDF News, Software | fontforge, jquery, pdf2html, pdf2html5, pdf2htmlex, pdftohtml, poppler

Pdfgrep: Freely Search PDF Files with grep-like Software

pdfgrep searches PDF files for a regular expression; it works much like grep.

pdfgrep is an open source project developed by Hans-Peter Deifel. Before the RubyPDF blog released the pdfgrep Windows version, only Linux and Mac versions were available.

pdfgrep works much like grep, with one distinction: It operates on pages and not on lines.

OPTIONS

-i, --ignore-case

Ignore case distinctions in both the PATTERN and the input files.

-H, --with-filename

Print the file name for each match. This is the default setting when there is more than one file to search.

-h, --no-filename

Suppress the prefixing of file name on output. This is the default setting when there is only one file to search.

-n, --page-number

Prefix each match with the number of the page where it was found.

-c, --count

Suppress normal output. Instead print the number of matches for each input file. Note that unlike grep, multiple matches on the same page will be counted individually.

-C, --context NUM

Print at most NUM characters of context around each match. The exact number will vary, because pdfgrep tries to respect word boundaries. If NUM is “line”, the whole line will be printed. If this option is not set, pdfgrep tries to print lines that are not longer than the terminal width.

--color WHEN

Surround file names, page numbers and matched text with escape sequences to display them in color on the terminal. (The default setting is auto).

WHEN can be:

always: Always use colors, even when stdout is not a terminal.
never: Do not use colors.
auto: Use colors only when stdout is a terminal.

-R, -r, --recursive

Recursively search all files (restricted by --include and --exclude) under each directory.

--exclude=GLOB

Skip files whose base name matches GLOB. See glob(7) for wildcards you can use. You can use this option multiple times to exclude more patterns. It takes precedence over --include. Note that includes and excludes apply only to files found via --recursive and not to the argument list.

--include=GLOB

Only search files whose base name matches GLOB. See --exclude for details. The default is *.pdf.

--unac

Remove accents and ligatures from both the search pattern and the PDF documents. This is useful if you want to search for a word containing ‘ae’, but the PDF uses the single character ‘æ’ instead. See unac(3) and unaccent(1) for details.

[This option is experimental and only available if pdfgrep is compiled with unac support.]

-q, --quiet

Suppress all normal output to stdout. Errors will be printed and the exit codes will be returned (see below).

--help

Print a short summary of the options.

-V, --version

Show version information.
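
To make the page-based behaviour concrete, here is a minimal Python sketch of the semantics described above for -i, -n and -c. It is not pdfgrep's implementation: it assumes the text of each page has already been extracted into a list of strings (the hypothetical `pages` list), and the function names are made up for illustration.

```python
import re

def grep_pages(pattern, pages, ignore_case=False):
    """Yield (page_number, matched_text) for every match, page by page."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # Pages are numbered from 1, as a page-number prefix would show them.
    for page_number, text in enumerate(pages, start=1):
        for match in regex.finditer(text):
            yield page_number, match.group(0)

def count_matches(pattern, pages, ignore_case=False):
    """Count every match individually, even several on the same page."""
    return sum(1 for _ in grep_pages(pattern, pages, ignore_case))

# Hypothetical page texts standing in for an extracted PDF.
pages = ["PDF tools: pdfgrep and grep", "no hits here", "Grep again"]
hits = list(grep_pages("grep", pages, ignore_case=True))
```

With these sample pages, the first page contributes two hits, which is the difference from grep's per-line counting that the -c description points out.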

References:

http://pdfgrep.sourceforge.net/

http://blog.rubypdf.com/2012/09/03/pdfgrep-windows-version-releases/

http://soft.rubypdf.com/software/pdfgrep-windows-version

September 4, 2012 Posted by rubypdf | Uncategorized | grep, PDF Search, Pdfgrep, Poppler-CPP

DiffPDF: Free Cross-Platform Software to Compare PDF Files

DiffPDF can compare two PDF files. It offers two comparison modes: Text and Appearance.

By default the comparison is of the text on each pair of pages, but comparing the appearance of pages is also supported (for example, if a diagram is changed or if a paragraph is reformatted). It is also possible to compare particular pages or page ranges. For example, if there are two versions of a PDF file, one with pages 1-12 and the other with pages 1-13 because of an extra page having been added as page 4, they can be compared by specifying two page ranges, 1-12 for the first and 1-3, 5-13 for the second. This will make DiffPDF compare pages in the pairs (1, 1), (2, 2), (3, 3), (4, 5), (5, 6), and so on, to (12, 13).
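
The page pairing in that example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not DiffPDF's own code; the function names are hypothetical, and the check that both ranges select the same number of pages is an assumption of this sketch.

```python
def expand_ranges(spec):
    """Expand a page-range string such as '1-3, 5-13' into a list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages

def pair_pages(spec_a, spec_b):
    """Pair the selected pages of two documents position by position."""
    pages_a = expand_ranges(spec_a)
    pages_b = expand_ranges(spec_b)
    # Assumption of this sketch: both selections must have the same length.
    if len(pages_a) != len(pages_b):
        raise ValueError("both ranges must select the same number of pages")
    return list(zip(pages_a, pages_b))

pairs = pair_pages("1-12", "1-3, 5-13")
```

Here `pairs` starts with (1, 1), (2, 2), (3, 3), (4, 5) and ends with (12, 13), matching the pairs listed above.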

References:

diffpdf-free software to compare two PDF files textually or visually

Free software to Compare the appearance difference of two PDF

diffpdf windows 32 version download address

November 19, 2010 Posted by rubypdf | Open Source, Software, Windows | compare pdf, pdfdiff, Qt4

Google Docs Supports OCR for PDF and Images

OCR in Google Docs only works for the following languages: English, French, Italian, German and Spanish. The Google Docs Blog says: “For the technically curious: we’re using Optical Character Recognition (OCR) that our friends from Google Books helped us set up. OCR works best with high-resolution images, and not all formatting may be preserved.”

For details, please visit Google Docs add OCR support to PDF and Images.

July 16, 2010 Posted by rubypdf | PDF News | adobe pdf, OCR Google Docs

Freely Rotate PDF Page Online-Google App Engine Application

Rotate PDF Page Online(PdfRotate)

RubyPDF has released its third Google App Engine application, PDFRotate Online. With it, you can freely rotate PDF pages online; the supported rotation angles are 90, 180 and 270 degrees.

If you want an offline version, please check pdfrotate.

January 12, 2010 Posted by rubypdf | Uncategorized | Adobe Acrobat, Adobe Reader, GAE, GAE/J, Google App Engine Application, iText#, java, PDF Converter Online, pdfrotate

Free Divide PDF Page Online-Another Google App Engine Application

Today, RubyPDF released another Google App Engine application, Freely Divide PDF Page Online, which is also based on iText.

Its main feature is splitting a PDF page into two half-size PDF pages, for example splitting an A3 page into two A4 pages.
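
The geometry behind such a split can be sketched as follows. This is not the code of the RubyPDF tool: it only computes the two crop rectangles in PDF points, assuming the page is halved across its longer side, and the function name is hypothetical. Applying the rectangles to a real file would still need a PDF library such as iText.

```python
def split_page_box(width, height):
    """Return two (x0, y0, x1, y1) boxes that halve a page across its longer side."""
    if height >= width:
        mid = height / 2
        # PDF coordinates start at the bottom-left corner, so the upper half
        # is listed first to keep reading order.
        return [(0, mid, width, height), (0, 0, width, mid)]
    mid = width / 2
    return [(0, 0, mid, height), (mid, 0, width, height)]

# A3 portrait in points, rounded (297 mm x 420 mm).
A3_PORTRAIT = (842, 1191)
halves = split_page_box(*A3_PORTRAIT)
```

Each half is 842 x 595.5 points, which is roughly an A4 page in landscape orientation.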

By the way, RubyPDF has also released a desktop version.

January 6, 2010 Posted by rubypdf | Uncategorized | Acrobat, Adobe Acrobat, adobe pdf, GAE, Google App Engine, Google Application Engine, iText in Action, java

How to Download Big Files through the Google App Engine UrlFetch API Call

I offer a UrlFetch function in my PDF Password Remover Online application, but I did not want to limit it to PDF files of 1 MB or less. After some study, I found a solution: let the UrlFetch API download no more than 1 MB of data per call, and repeat the call until all the data has been downloaded. Of course, there is still a limit: the 30-second request limit.
For details, please visit:

How to Use Google App Engine UrlFetch API to download the files over 1M
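
The idea of repeating small fetches can be sketched in Python as follows. This is not the code from the linked tutorial: it assumes the remote server honours HTTP Range requests, and `fetch_range` is a hypothetical callable standing in for one UrlFetch call.

```python
CHUNK_SIZE = 1024 * 1024  # at most 1 MB per fetch

def byte_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Yield inclusive (start, end) byte offsets covering total_size bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1

def download_in_chunks(fetch_range, total_size, chunk_size=CHUNK_SIZE):
    """Assemble a file from repeated ranged fetches.

    fetch_range(start, end) is a hypothetical callable that returns the bytes
    of one range, for example by sending a 'Range: bytes=start-end' header.
    """
    parts = [fetch_range(start, end)
             for start, end in byte_ranges(total_size, chunk_size)]
    return b"".join(parts)

# Stand-in for a remote file so the sketch runs without network access.
remote = bytes(range(256)) * 10000  # 2,560,000 bytes
data = download_in_chunks(lambda s, e: remote[s:e + 1], len(remote))
```

In this example the file is fetched in three pieces. All of the pieces still have to arrive within the 30-second request limit mentioned above.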

December 25, 2009 Posted by rubypdf | Tutorials | GAE, GAE/J, Google App Engine, UrlFetch API Call

First PDF Password Remover application hosted on Google App Engine

RubyPDF Software has released the first PDF Password Remover application hosted on Google App Engine. It is based on iText (version 2.1.7, but with many modifications). With it, you can easily remove the user password or owner password online, and it is free.

Remove restrictions on any secured PDF document (you should have the right to do so, for example, if you forgot the password). Any Acrobat version up to 9 is supported, even with 128-bit AES or 128-bit RC4 encryption. Restriction removal is an instant process. The unlocked file can be opened in any PDF viewer without any restrictions, so you may edit, copy or print it.
Remove the PDF open password. Decryption of files protected with an open password is guaranteed for any Acrobat version up to 9, even with 128-bit AES or 128-bit RC4 encryption, but you must know the password first.

For details, please visit RubyPDF PDF Password Remover Online.

December 23, 2009 Posted by rubypdf | PDF News, Software | cloud computing, GAE, Google App Engine, iText#, pdfdecrypt, pdfunlock

How to Optimize and Reduce PDF File Size with the Help of Adobe Acrobat

I noticed that How to use Adobe Acrobat to Optimize and Reduce PDF File Size lists two tutorials in PDF format:
PDF tutorial on the Adobe Acrobat 6 solution to optimize and reduce file size,
http://www.adobe.com/designcenter/acrobat/articles/acr6optimize/acr6optimize.pdf
PDF tutorial on the Adobe Acrobat 7 solution to optimize and reduce file size,
http://www.adobe.com/designcenter/acrobat/articles/acr7optimize/acr7optimize.pdf
I just wonder why they have not released tutorials for Adobe Acrobat 8 and Adobe Acrobat 9.

October 30, 2009 Posted by rubypdf | Hacks, Tutorials | Adobe Acrobat, adobe pdf, PDF optimizer, Reduce PDF File Size

Using pdfsizeopt to Optimize & Reduce PDF File Size

pdfsizeopt is an open source project hosted on Google Code; its main feature is optimizing PDF file size. It is based on the following tools:

pdfsizeopt.py
Python
Ghostscript
Java
sam2p
jbig2
png22pnm
pngtopnm
Multivalent.jar
PNGOUT

pdfsizeopt is a collection of best practices and scripts for Unix to optimize the size of PDF files, with focus on PDFs created from TeX and LaTeX documents. pdfsizeopt is developed on a Linux system, and it depends on existing tools such as Python 2.4, Ghostscript 8.50, jbig2enc (optional), sam2p, pngtopnm, pngout (optional), and the Multivalent PDF compressor (optional) written in Java.

For details, please visit pdfsizeopt-a Free and Open Source PDF Manipulation Tool to Reduce PDF File Size

References:

pdfsizeopt home page
Convert JBIG2 to PDF with free and open source software agl’s jbig2enc
Windows version JBIG2 Encoder-Jbig2.exe

October 30, 2009 Posted by rubypdf | Hacks, Linux, Open Source, Software, Tutorials, Windows
