REGULAR EXPRESSIONS Beginner

Does anyone know how to extract the images from a PDF text file and remove them, so that only their names are left?

Hey there,

can you provide an example?

I am pretty confused.
Where do you want to extract images from?
What PDF text files?
What are you using to do this?
Do you have any example code?

Images are stored in a PDF as raw binary data rather than as embedded files, so while removing the images is possible, substituting them with the original file names is not*

*There are multiple PDF standards, so it is possible that some documents actually do store information about the source image file.
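
To illustrate the point about raw binary data, here is a rough sketch that scans the bytes of a PDF for image objects with a regular expression. It relies on the fact that an image object's dictionary contains `/Subtype /Image`; the function name `count_image_objects` is made up for this example. It is only a heuristic: objects packed into compressed object streams will not be seen by a plain byte scan.

```python
import re

# An image inside a PDF is an object whose dictionary contains "/Subtype /Image".
# The picture itself follows as a binary stream; no file name is part of this marker.
IMAGE_XOBJECT = re.compile(rb"/Subtype\s*/Image\b")


def count_image_objects(pdf_bytes: bytes) -> int:
    """Count image dictionaries visible in the raw (uncompressed) PDF bytes."""
    return len(IMAGE_XOBJECT.findall(pdf_bytes))


# A hand-written fragment that mimics one image object, for demonstration only.
sample = (
    b"5 0 obj\n"
    b"<< /Type /XObject /Subtype /Image /Width 2 /Height 2 >>\n"
    b"stream\n\x00\x01\x02\x03\nendstream\nendobj\n"
)
print(count_image_objects(sample))
```

For a real file you would read it with `open(path, "rb").read()` and pass the bytes in.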

I use the Tika module to get text out of PDFs.
Here's a Stack Overflow post I used to get started

Nope :( It's a task we were given as a test. We have to do it with a regular expression :pensive:

So basically:
You have PDF files with pictures that have names; for example, a photo of a cat is named cat.
1: Remove the image from the PDF.
2: The file name, cat, is the only thing that should be left, i.e. the name of the image file. On the web this would be called metadata.
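
If the task really has to be done with a regular expression, a minimal sketch could look like the one below. Big assumption: the text version of the PDF marks each image with a Markdown-style reference such as `![a cat](images/cat.png)`. The thread never says what the markers look like, so the pattern would have to be adapted to the real file. The function name `keep_image_names` is invented for this example.

```python
import re
from pathlib import PurePosixPath

# ASSUMPTION: images appear in the text as Markdown-style references,
# e.g. ![a cat](images/cat.png). Adjust the pattern to the real marker format.
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def keep_image_names(text: str) -> str:
    """Replace each image reference with the bare file name (no folder, no extension)."""
    return IMAGE_REF.sub(lambda m: PurePosixPath(m.group(1)).stem, text)


print(keep_image_names("Look: ![a cat](images/cat.png) and ![](dog.jpg)"))
```

The capture group grabs the path inside the parentheses, and `PurePosixPath(...).stem` strips the folder and the extension, so only the name is left in the text.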

Okay, so here is what you could do.
1: Open a new Python file on your computer.
2: Import a module such as PyPDF2, or another PDF package that you see fit.
3: Think about the steps and check out the documentation of the imported module. A quick Google search will do.
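
The steps above could be sketched roughly like this, assuming PyPDF2 3.x is installed (`pip install PyPDF2`). The helper names `list_image_names` and `remove_images` are made up for this example. As far as I know, the names the library reports are the PDF's internal resource names, not the original file names, so this will not give you "cat" unless the document happens to store it.

```python
def list_image_names(pages):
    """Collect the names the PDF library reports for the images on each page."""
    return [image.name for page in pages for image in page.images]


def remove_images(src_path, dst_path):
    """Write a copy of src_path without images; return the image names found."""
    # Imported here so list_image_names can be used without the package installed.
    from PyPDF2 import PdfReader, PdfWriter  # assumes PyPDF2 3.x

    reader = PdfReader(src_path)
    names = list_image_names(reader.pages)

    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    writer.remove_images()  # strips the images from the copied pages

    with open(dst_path, "wb") as out:
        writer.write(out)
    return names
```

You would call it as `remove_images("input.pdf", "output.pdf")`, where both file names are placeholders.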