Extracting images from PDFs with Inkscape

Published on Aug. 30, 2016

My work as a scientist, and also as an occasional outreach communicator, often puts me in a situation where I need to extract graphical material from publications and other documents. The two alternatives I normally found were:

Go to the journal's web page and pray for a high-resolution image that can be used. This is rarely the case: bandwidth considerations lead many journals to post mediocre-sized pictures that are sometimes fine for presentations, unless you use a really good projector or a big screen. In most cases they are not suitable for printed reproduction.
Use the PDF of a publication and capture a screenshot. You would typically open the document in Acrobat, zoom in until the region you are interested in fills the screen, and then use the "screenshot" or "camera" icon to capture that region. This normally improves the quality of the resulting image, but it is still limited by the resolution of your screen!
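
To put a number on the screen-resolution limit, here is a small back-of-the-envelope sketch. The function name is made up for illustration, and the 300 dpi target is an assumption (a commonly quoted requirement for print, not a figure from any particular journal).

```python
def max_print_width_cm(pixels_wide: int, dpi: int = 300) -> float:
    """Largest width, in cm, at which an image `pixels_wide` pixels
    across can be printed while keeping `dpi` dots per inch."""
    inches = pixels_wide / dpi      # pixels divided by pixels-per-inch
    return inches * 2.54            # 1 inch = 2.54 cm


if __name__ == "__main__":
    # Assumed example: a screenshot that fills a 1920-pixel-wide screen.
    print(f"{max_print_width_cm(1920):.1f} cm")
```

Under these assumptions, even a full-screen capture on a 1920-pixel display only covers roughly the width of a single page, and a figure that occupies part of the screen covers proportionally less.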

I have rediscovered a third method that was probably invented by one of my PhD students. The idea is to open the PDF as if it were a drawing, using a vector graphics program. The one I recommend and describe here is Inkscape, a drawing program that is available for Windows, Mac and Linux and that is very powerful, perhaps too powerful for many simple designs.

Recent versions of Inkscape can open a wide variety of formats, including PDFs. The workflow I suggest is therefore the following:

Open Inkscape and press Ctrl-O to open the PDF you want to work with. In the import window, select "import via Poppler". Also make sure you select the page that contains the figure you wish to copy.
Once the page is imported, you can begin editing. Start by right-clicking on the imported page and selecting "Ungroup". After this, all elements will be editable and you will be able to strip the image down to the minimum you need. In my example I deleted several portions of text.
Finally, save the image in your desired format. I normally use SVG for archival and PDF to embed in other publications.
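
The import and save steps can also be scripted for batch use. What follows is only a minimal sketch, assuming an Inkscape 1.2 or later executable on your PATH. The option names (--pdf-poppler, --pages, --export-type, --export-filename) are the ones I believe that command line accepts; older 1.0/1.1 releases used --pdf-page instead of --pages, so check `inkscape --help` for your version. The function names and the file names are hypothetical. Ungrouping and deleting unwanted elements still has to be done by hand in the editor.

```python
import shutil
import subprocess
from pathlib import Path


def build_inkscape_command(pdf_path, page, svg_path, inkscape="inkscape"):
    """Return the argument list that asks Inkscape to convert one PDF page to SVG.

    Assumption: option names follow the Inkscape 1.2+ command line.
    """
    return [
        inkscape,
        "--pdf-poppler",                  # import via Poppler, as in the dialog
        f"--pages={page}",                # page holding the figure (1-based)
        "--export-type=svg",              # SVG keeps the drawing editable
        f"--export-filename={svg_path}",
        str(pdf_path),
    ]


def extract_page(pdf_path, page, svg_path):
    """Run Inkscape without the GUI and return the path of the SVG it was asked to write."""
    if shutil.which("inkscape") is None:
        raise RuntimeError("inkscape executable not found on PATH")
    subprocess.run(build_inkscape_command(pdf_path, page, svg_path), check=True)
    return Path(svg_path)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_inkscape_command("paper.pdf", 3, "figure.svg")))
```

Building the command separately from running it makes it easy to print and inspect the exact call before letting it loose on a folder of papers.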

Two final remarks. First, make sure you have the right to use the image under fair use. I assume that in talks giving credit is enough, while for publications you would have to contact the journal. Second, beware of how you build the images for your own papers! This way of proceeding can reveal hacks in your images, text hidden underneath your original plots, and so on. Just a warning: curious PhD students may otherwise have a lot of fun with your work.

Juan José García Ripoll

Senior scientist

IFF-CSIC

