Programmatically Capturing Web Pages as Images

Eric Pugh, May 12, 2009

I don’t normally post blog articles that are reposts of other content, but this email thread answered a question that I’ve struggled with: how do you render a web page and save it as an image? I do this on HighTechCville, and did it for our Fish4Brains RailsRumble entry a couple of years ago, via thumbshots.org, but I’ve never been happy with that service:

At Sun, 3 May 2009 11:19:17 -0400, Eric Pugh wrote:

Cool!

It’s one of those things that seems like everybody wants it, but no one has quite figured it out. And the various “services” like thumbshots all feel kinda “seedy”; I am always expecting to see advertisements for Viagra stamped on top of the screenshots, along with other questionable business practices.

It seems like you should be able to have the pages rendered inside a library such as WebKit, but I guess rendering is very intertwined with monitor displays, resolutions, etc.

I have a research project that aggregates info about people, events, and organizations, and I’d love a better solution for linking in screenshots of the organizations’ and individuals’ sites. Here is an example using the thumbshots service for now: http://www.hightechcville.com/organizations/318-worrell-water-technologies

Here is the text (thanks to Mark Phillips for this):

Khtml2png – http://khtml2png.sourceforge.net/ “Khtml2png is a command line program to create screenshots of webpages. It uses libkhtml (the library that is used in the KDE web browser Konqueror). In khtml2png 2.0.5 to 2.5.0, “convert” from the ImageMagick graphic conversion toolkit is used to create the output files in various image file formats. 2.6.0 and future development will use the built-in conversion of the Qt library.” – from the Khtml2png website

Pearl Crescent Page Saver – http://pearlcrescent.com/products/pagesaver/ “Pearl Crescent Page Saver is an extension for Mozilla Firefox that lets you capture images of web pages. These images can be saved in PNG format or (with Firefox 2) in JPEG format. The entire page or just the visible portion may be captured. Options let you control whether images are captured at full size (which is the default) or scaled down to a smaller size. Page Saver uses the canvas feature that was introduced in Firefox 1.5.” – from the Pearl Crescent Page Saver website

Webkit2png – http://www.paulhammond.org/webkit2png/ “Webkit2png is a command line tool that creates PNG screenshots of webpages. … webkit2png makes use of webkit, the rendering engine used in Safari.” – from the Webkit2png website. This utility is only available for Mac OS X because it depends on the WebKit engine used by Safari.

Webshot – http://www.websitescreenshots.com/ “WebShot is a program that allows you to take screenshots and thumbnails of web pages or whole websites. It comes with a command line interface for advanced users. The following image formats are supported: JPG, GIF, PNG, BMP.” – from the WebShot website. WebShot uses Internet Explorer as the engine for creating thumbnails of HTML files.

best, Erik Hetzner
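To make the list above a bit more concrete, here is a rough Python sketch of how one of these command-line tools might be driven from a script. It uses webkit2png as the example. The function names (`output_name`, `build_command`, `capture`) are made up for illustration, and the `-F` (full-size image), `-D` (output directory), and `-o` (output base name) flags are my recollection of webkit2png's options, not something taken from the thread, so check them against `webkit2png --help` before relying on this.

```python
import shutil
import subprocess
from urllib.parse import urlparse


def output_name(url):
    """Derive a filesystem-friendly base name from a URL's host and path."""
    parts = urlparse(url)
    raw = parts.netloc + parts.path
    # Replace anything that is not a letter or digit with a hyphen.
    cleaned = "".join(c if c.isalnum() else "-" for c in raw)
    return cleaned.strip("-") or "screenshot"


def build_command(url, out_dir=".", tool="webkit2png"):
    """Build the argument list for the screenshot tool without running it.

    ASSUMPTION: the flags -F, -D and -o are believed to be valid webkit2png
    options; verify them against the version you have installed.
    """
    return [tool, "-F", "-D", out_dir, "-o", output_name(url), url]


def capture(url, out_dir=".", tool="webkit2png"):
    """Run the tool if it is on the PATH; return the command, or None if absent."""
    if shutil.which(tool) is None:
        return None
    cmd = build_command(url, out_dir, tool)
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    result = capture("http://www.hightechcville.com/")
    print("tool not installed" if result is None else "ran: " + " ".join(result))
```

Keeping `build_command` separate from `capture` means the command can be inspected or logged without the tool being installed, and swapping in one of the other tools should only require changing the argument list.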



