    import PyPDF4 as p2
    pdffile = open("XXXX.pdf", "rb")
    pdfread = p2.PdfFileReader(pdffile)
    print(pdfread.getNumPages())
    pageinfo = pdfread.getPage(0)
    print(pageinfo.extractText())

When I run the above, the 4th line of code successfully returns the correct value, i.e. the number of pages in the PDF. However, the 6th line (text extraction) prints about a page's worth of blank output. I've tried both PyPDF2 and PyPDF4, and ran the code in both the Python terminal and Sublime Text; in every case I received a blank page instead of the actual text.
The PDF is a tax return and appears to be entirely text, with no images whatsoever. What am I doing wrong?
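
To narrow this down, it may help to check whether the blank result is limited to page 0 or affects every page. The following is only a minimal sketch: `blank_pages` is a hypothetical helper name (not part of PyPDF2 or PyPDF4), and it assumes the reader object exposes `getNumPages()` and `getPage(i).extractText()` exactly as used in the code above.

```python
def blank_pages(reader):
    """Return the indices of pages whose extracted text is empty or whitespace-only.

    `reader` is assumed to behave like the PdfFileReader object used above:
    it must provide getNumPages() and getPage(i), and each page must
    provide extractText().
    """
    blank = []
    for index in range(reader.getNumPages()):
        text = reader.getPage(index).extractText()
        # Treat None, "" and whitespace-only strings as blank.
        if not text or not text.strip():
            blank.append(index)
    return blank
```

With the reader from the code above, `print(blank_pages(pdfread))` would list the affected page indices. If every index is listed, the problem is probably not specific to the first page.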