Use Page Ranges in PyPDF2 PdfFileMerger

PdfFileMerger is a handy Python class provided by the PyPDF2 package. If you want to concatenate only selected pages from one or more PDF files, you can select them with a page range expression.

To specify page ranges, you need to import the PageRange class:

from PyPDF2 import PdfFileMerger, PageRange

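Then you can specify the range as a PageRange() object. Remember that page indices start at zero and that, as in a Python slice, the stop index is excluded. For example, the range from index 13 (the 14th page) through the last page can be written:

merger.append(inpdf, pages=PageRange('13:'))

Writing '13:-1' instead would stop just before the last page.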

More page range expression examples follow, if you want to experiment with them:

  :     all pages.                   -1    last page.
  22    just the 23rd page.          :-1   all but the last page.
  0:3   the first three pages.       -2    second-to-last page.
  :3    the first three pages.       -2:   last two pages.
  5:    from the sixth page onward.  -3:-1 third & second to last.

The third, “stride” or “step” number is also recognized.

  ::2       0 2 4 ... to the end.    3:0:-1    3 2 1 but not 0.
  1:10:2    1 3 5 7 9                2::-1     2 1 0.
  ::-1      all pages in reverse order.
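The expressions in the two tables above behave like Python slices over the list of page indices. The sketch below reproduces that behaviour in plain Python so you can check an expression before handing it to PageRange. The function selected_pages is a hypothetical helper written for this note, not part of PyPDF2, and it assumes that PageRange follows ordinary slice semantics.

```python
def selected_pages(expr, page_count):
    """Return the zero-based page indices that a range expression selects.

    Hypothetical helper for illustration only: it mimics slice-like
    page ranges with plain Python and does not call PyPDF2.
    """
    pages = list(range(page_count))
    parts = expr.strip().split(":")
    if len(parts) > 3:
        raise ValueError("too many ':' in " + repr(expr))
    if len(parts) == 1:
        # A bare number such as "22" or "-2" selects exactly one page.
        index = int(parts[0])
        stop = None if index == -1 else index + 1
        return pages[index:stop]
    # Empty fields become None, just as in a Python slice literal.
    fields = [int(p) if p.strip() else None for p in parts]
    return pages[slice(*fields)]


# Try a few expressions against an imaginary 10-page document.
for expr in (":", "-1", ":-1", "0:3", "-2:", "::2", "3:0:-1", "::-1"):
    print(expr.ljust(8), selected_pages(expr, 10))
```

Note that a stop value of -1 excludes the final page, so "5:-1" and "5:" select different pages.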

See below for a complete working Python script with PyPDF2 and page ranges. Just set the input and output filenames and the start and end of the page range, then run the script:

from PyPDF2 import PdfFileMerger, PdfFileReader, PageRange

infn = "infilename.pdf"
outfn = "outfilename.pdf"
startpage = 5  # zero-based index of the first page to keep (5 means start from page 6)
endpage = ""   # exclusive stop index; leave empty to include everything through the last page

srcfile = PdfFileReader(infn)  # takes a file path; the second argument is "strict", not a file mode
merger = PdfFileMerger()
page_range = str(startpage) + ':' + str(endpage)  # here "5:"
merger.append(srcfile, pages=PageRange(page_range))
merger.write(outfn)
merger.close()
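Range expressions use zero-based indices with an exclusive stop, so converting from the page numbers shown in a PDF viewer is easy to get wrong by one. A minimal sketch of that conversion follows. The function page_range_expr is a hypothetical helper invented for this note, not a PyPDF2 function; it only builds the string that you would pass to PageRange().

```python
def page_range_expr(first_page, last_page=None):
    """Build a range expression from 1-based, inclusive page numbers.

    Hypothetical helper, not part of PyPDF2. Passing last_page=None
    means "through the end of the document".
    """
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    if last_page is not None and last_page < first_page:
        raise ValueError("last_page must not be smaller than first_page")
    start = first_page - 1  # 1-based page number to zero-based index
    # An inclusive 1-based last page equals the exclusive zero-based stop.
    stop = "" if last_page is None else str(last_page)
    return str(start) + ":" + stop


print(page_range_expr(6))      # page 6 through the last page
print(page_range_expr(6, 10))  # pages 6 through 10 inclusive
```

The returned string can be used in place of page_range in the script, for example pages=PageRange(page_range_expr(6)).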

If you are using Ubuntu, remember to install the PyPDF2 package first:

sudo apt install python3-pypdf2 
notes/use_page_ranges_in_pypdf2_pdffilemerger.txt · Last modified: 2018/09/08 07:16 by admin