Dear and @academicchatter folks:

Ajay Iyer@mastodon.social · edit-2 7 months ago

René Seindal@mastodon.social · 7 months ago

@ajayiyer @linux @academicchatter https://en.m.wikipedia.org/wiki/Pdftotext

Responsabilidade@lemmy.eco.br · edit-2 7 months ago

The first tool I can think of is LibreOffice Draw

Maybe there are other tools, but I think LibreOffice Draw does the job pretty well

Edit: If the PDF has written text, you may want to use an OCR tool, but I don’t have any to suggest

impure9435@kbin.run · 7 months ago

OCRmyPDF is exactly what you are looking for
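
For anyone who wants to script this, here is a rough sketch of calling the `ocrmypdf` command from Python. It assumes the input is a scan with no existing text layer and that `ocrmypdf` is installed and on your PATH. The file names and the helper functions `ocr_command` and `run_ocr` are made up for illustration; `--sidecar` is the OCRmyPDF option that also writes the recognized text to a plain text file.

```python
import shutil
import subprocess
from pathlib import Path


def ocr_command(scan_pdf, out_pdf, text_file):
    """Build the ocrmypdf command line (hypothetical helper, runs nothing)."""
    return [
        "ocrmypdf",
        "--sidecar", str(Path(text_file)),  # also save recognized text as .txt
        str(Path(scan_pdf)),                # input: scanned PDF
        str(Path(out_pdf)),                 # output: searchable PDF
    ]


def run_ocr(scan_pdf, out_pdf, text_file):
    """Run OCR on one file; raises if ocrmypdf is missing or fails."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH")
    subprocess.run(ocr_command(scan_pdf, out_pdf, text_file), check=True)
    return Path(text_file)


if __name__ == "__main__":
    # Placeholder file names, only shown here and not executed.
    print(" ".join(ocr_command("scan.pdf", "scan_ocr.pdf", "scan.txt")))
```

Calling `run_ocr("scan.pdf", "scan_ocr.pdf", "scan.txt")` should leave you with a searchable PDF plus a text file, provided the scan quality is good enough for OCR.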

Carunga@feddit.de · 7 months ago

Try Zotero. It is a complete literature database, but its PDF reader is very good at extracting images and text. It works on all OSes, on the web, and on mobile. The native Linux client has been very smooth for me. It doesn’t work from the terminal, though, so if you want to extract a large amount in an automated way, it’s probably not the right tool.

CCRhode@lemmy.ml · edit-2 7 months ago

I’m mystified that poppler-utils is not a viable option. Of course the *.pdf file would have to include the text itself, but many do.
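
In case it helps, a minimal sketch of driving `pdftotext` (part of poppler-utils) from Python for batch extraction. It assumes the PDFs already contain a text layer and that `pdftotext` is installed. The helper names `pdftotext_command` and `extract_text` and the `papers` folder are hypothetical; `-layout` is the pdftotext option that tries to preserve the original page layout.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path=None, keep_layout=False):
    """Build the pdftotext command line (hypothetical helper, runs nothing)."""
    pdf = Path(pdf_path)
    # Default output: same name as the PDF, with a .txt extension.
    out = Path(txt_path) if txt_path else pdf.with_suffix(".txt")
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # keep the physical layout of the page
    cmd += [str(pdf), str(out)]
    return cmd


def extract_text(pdf_path, txt_path=None, keep_layout=False):
    """Extract the text layer of one PDF and return the path of the .txt file."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils")
    cmd = pdftotext_command(pdf_path, txt_path, keep_layout)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


if __name__ == "__main__":
    # Convert every PDF in a folder named "papers" (a placeholder name).
    if shutil.which("pdftotext") is not None:
        for pdf in sorted(Path("papers").glob("*.pdf")):
            print("wrote", extract_text(pdf))
```

If the resulting text files come out empty, the PDF is probably a scan without a text layer, and an OCR step would be needed first.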