Extract pages from PDF or Merge multiple PDF files into Single PDF with Latex


Lately I have often found myself looking for tools to perform a few simple operations on PDF files, such as extracting one or several selected pages from a scanned PDF file, or merging multiple scanned PDF documents into a single PDF.

However, the solution did not come directly; it occurred to me while I was working on something else entirely.

I was preparing my Healthcare Analytics white paper, building its title page from scratch, when the realization dawned on me: if you are a LaTeX user like me, you already have all the tools you need to perform these tricks on PDF files, once you become aware of a small package called pdfpages. In just a line or two of your own LaTeX code, you can extract pages from your scanned PDF documents or merge them into a single file.

The \includepdf command is going to be your solution. For example, the code below extracts the 3rd page of your scanned PDF file; compiling it with pdflatex produces a new PDF that contains only that page:

\documentclass{article} 
\usepackage{pdfpages}

\begin{document}
\includepdf[pages=3]{scanned_Doc.pdf}
\end{document}
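
The pages key is not limited to a single number. As a small sketch of the pdfpages page-selection syntax, the variations below pick a range, a mixed list with a blank page, and an open-ended range. The file name scanned_Doc.pdf is carried over from the example above, and the page numbers are placeholders that assume the file has at least 7 pages.

```latex
\documentclass{article}
\usepackage{pdfpages}

\begin{document}
% a contiguous range: pages 2 through 5
\includepdf[pages=2-5]{scanned_Doc.pdf}
% a mixed list: page 1, an empty page ({}), then page 7
\includepdf[pages={1,{},7}]{scanned_Doc.pdf}
% an open-ended range: page 5 through the end of the file
\includepdf[pages=5-]{scanned_Doc.pdf}
\end{document}
```

In practice you would keep only the line you need; the three commands are shown together just to compare the syntax.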

Now, you may find online tools that merge a few PDF documents into a single document, but they may not let you pick specific pages from each of those documents before merging. This kind of merge is pretty straightforward with \includepdf.

Here is how you can combine the first page of your cover letter PDF with selected pages from a scanned certificate document:

\documentclass{article} 
\usepackage{pdfpages}

\begin{document}
\includepdf[pages=1]{CoverLetter.pdf}
\includepdf[pages={3,4}]{cert.pdf}
\end{document}
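
If you want to merge the documents whole rather than page by page, a variation along these lines should work. It is only a sketch, reusing the file names from the example above, and it relies on the pdfpages convention that a bare hyphen as the page range means "from the first page to the last".

```latex
\documentclass{article}
\usepackage{pdfpages}

\begin{document}
% pages=- selects every page of the file
\includepdf[pages=-]{CoverLetter.pdf}
% all of cert.pdf follows, in its original page order
\includepdf[pages=-]{cert.pdf}
\end{document}
```

The inserted files appear in the output in the order of the \includepdf commands, so reordering the merge is just a matter of reordering the lines.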

The package also lets you place multiple extracted pages on a single page with the nup option, as below (where pages 1 to 4 are arranged in a grid of 3 columns x 2 rows on a single page):

\includepdf[nup=3x2, pages=1-4]{cert.pdf}
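
For completeness, here is how that line might sit in a full document, together with a few other pdfpages layout keys that tend to be handy with nup. Treat it as a sketch: the 4mm gap is an arbitrary placeholder, and whether landscape suits your pages depends on the source document.

```latex
\documentclass{article}
\usepackage{pdfpages}

\begin{document}
% nup=3x2   : 3 columns by 2 rows per sheet
% landscape : rotate the sheet of paper, not the inserted pages
% frame     : draw a border around each inserted page
% delta     : horizontal and vertical gap between inserted pages
\includepdf[nup=3x2, pages=1-4, landscape, frame, delta=4mm 4mm]{cert.pdf}
\end{document}
```

With only four pages selected, the last two cells of the 3 x 2 grid simply stay empty.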

You can shift the inserted PDF to the correct location on your page using the regular \setlength commands. For a full-page PDF insertion, you can try something like the following:

\documentclass{article} 
\usepackage{pdfpages}
\pagestyle{empty} % no page numbers (the plain style would print them in the footer)
\begin{document}
\setlength\voffset{-0.1in} % adjust the vertical offset
\setlength\hoffset{-0.4in} % adjust the horizontal offset
\includepdf[pages={2,2,2}]{Scanned_Doc.pdf} % insert 2nd page from Scanned_Doc three times
\end{document}
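
As an alternative to the global lengths, pdfpages has its own offset and scale keys that apply to a single \includepdf call. The sketch below assumes the same Scanned_Doc.pdf; the values are placeholders to be tuned by eye. As I read the package documentation, in one-sided documents a positive offset moves the page to the right and up, so the values here roughly mirror the \hoffset and \voffset settings above.

```latex
\documentclass{article}
\usepackage{pdfpages}

\begin{document}
% offset takes two lengths: horizontal shift, then vertical shift
\includepdf[pages=2, offset=-0.4in 0.1in]{Scanned_Doc.pdf}
% scale resizes the inserted page; values below 1 leave a margin
\includepdf[pages=2, scale=0.9]{Scanned_Doc.pdf}
\end{document}
```

Because the keys are local, different inserted pages in the same document can each get their own correction.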

For further details on how to add overlays or annotations to the PDF pages, visit my article: PDF Page Manipulation.

  • Hello Mr. Gopalakrishna Palem, the article you shared is really very informative. But I feel that for a layman, it's not that easy to work with code. I found an article by a user on Ezine (ezinearticles.com) that describes how we can merge or split multiple PDF files into a single file.

    There are other options as well, but I found this useful and easy to use.

  • Nice article. But it seems a bit complicated for a beginner in programming. In fact, I also found a sample project on MSDN which demonstrates a fine way to extract contents from PDF using C#; refer to code.msdn.microsoft.com/Extracting-text-and-image-d47ac957. This solution is based on the C PDF component (www.e-iceblue.com/.../free-pdf-component.html). Hope it helps.

    Regards
