Situation
PDF (Portable Document Format) is a file format that captures all the elements of a printed document as an electronic image that you can view, navigate, print, or forward to someone else. PDF files can be created with Adobe Acrobat or similar tools.
Solution
Steps to follow
Suppose a PDF file contains the following table:
User_ID | Name | Occupation |
1 | David | Product Manager |
2 | Leo | IT Administrator |
3 | John | Lawyer |
We want to read this table into our Python program.
Method 1: Using tabula-py
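tabula-py is a simple Python wrapper around tabula-java, a tool that reads tables from PDF files. Because it wraps a Java library, a Java runtime must be installed on the machine. You can install tabula-py, together with the tabulate library used for printing, using the following commands.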
pip install tabula-py
pip install tabulate
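Since tabula-py calls tabula-java under the hood, it can help to confirm that Java is reachable before running the example. The following is a minimal sketch using only the standard library; the helper name java_available is hypothetical, and it only checks that an executable called java is on the PATH, not which version it is.

```python
import shutil


def java_available() -> bool:
    """Return True if an executable named `java` is found on the PATH.

    Hypothetical helper for illustration; it does not check the Java version.
    """
    return shutil.which("java") is not None


if java_available():
    print("Java found on PATH")
else:
    print("Java not found on PATH; tabula-py will likely fail to run")
```

If the check reports that Java is missing, install a Java runtime and make sure it is on the PATH before calling read_pdf().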
The functions used in the example are:
read_pdf(): reads the tables from the PDF file at the given path
tabulate(): arranges the data in a printable table format
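from tabula import read_pdf
from tabulate import tabulate

# Read the tables from the PDF file; "abc.pdf" is the path to the PDF file.
# With pages="all", read_pdf returns a list of DataFrames, one per table found.
tables = read_pdf("abc.pdf", pages="all")

# Print each table in a readable layout
for df in tables:
    print(tabulate(df))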
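Building on the read_pdf() call above, here is a small sketch of a wrapper that fails early with a clear error when the file path is wrong. The function name extract_tables is hypothetical, and the sketch assumes a recent tabula-py version in which read_pdf() returns a list of pandas DataFrames by default.

```python
from pathlib import Path


def extract_tables(pdf_path, pages="all"):
    """Return the tables found in a PDF as a list of pandas DataFrames.

    Hypothetical helper for illustration. Raises FileNotFoundError before
    tabula-py is involved if the path does not point to a file.
    """
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"PDF not found: {path}")

    # Imported here so that a bad path is reported even on machines
    # where tabula-py (or Java) is not installed yet.
    from tabula import read_pdf

    # Assumption: recent tabula-py versions return one DataFrame per table.
    return read_pdf(str(path), pages=pages)


# Example usage (requires tabula-py, Java, and a real file named abc.pdf):
#     tables = extract_tables("abc.pdf")
#     print(len(tables), "table(s) found")
#     print(tables[0].columns.tolist())
```

For the sample table, the first DataFrame would be expected to carry the User_ID, Name, and Occupation columns, though the exact result depends on how tabula-java detects the table layout in the PDF.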