Programmatically Creating Website Thumbnails

I have recently been working on a project in which I capture links to web pages and need to display a thumbnail-size screenshot of each linked page.  For example, if I have a link to the Ruby on Rails web site, the thumbnail I need would look like this:

[Image: rubyonrails_thumb]

This particular one is 202×152 pixels, just the size I need.

I was really surprised by how difficult it was to find such a solution; searching the web turned up very little.  It appears Amazon once offered one under the Alexa umbrella, called Alexa Site Thumbnail, but shut the service down a while ago.  There was also a very nice Ruby interface to the Alexa service.

I ended up coming across a couple of workable solutions: Websnapr and Bluga.net.

Websnapr

Websnapr is a web service that is pretty simple to implement; in its simplest form it can just be referenced from an img tag on a web page.  My implementation uses Ruby on Rails, but Rails isn't needed to see a very simple example.  Taking the URL from above that I want a screen image of:

<img src="http://images.websnapr.com/?size=S&key=mywebsnaprkey&url=rubyonrails.org" />

The web service allows snapshots to be made without an API key but limits such use to 80 requests per hour, which is reasonable.  The service also seems to cache snapshots previously requested by other users.  Normally, a temporary image is returned while the request is in the queue, but if a snapshot is already available it is returned immediately.

In my tests, the resulting image was returned in less than a minute.
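For anyone generating these tags from Ruby rather than typing them by hand, here is a minimal sketch.  The endpoint and the size, key and url parameters come straight from the img tag above; the helper names websnapr_url and websnapr_img_tag are my own and not part of any Websnapr library.

```ruby
require 'cgi'
require 'uri'

# Endpoint taken from the img tag shown above.
WEBSNAPR_ENDPOINT = 'http://images.websnapr.com/'

# Hypothetical helper: builds the snapshot URL for a page.
# The key is optional because the service also answers keyless requests.
def websnapr_url(page_url, key: nil, size: 'S')
  params = { size: size }
  params[:key] = key if key
  params[:url] = page_url
  "#{WEBSNAPR_ENDPOINT}?#{URI.encode_www_form(params)}"
end

# Hypothetical helper: wraps the snapshot URL in an img tag,
# HTML-escaping the attribute values.
def websnapr_img_tag(page_url, key: nil, size: 'S')
  src = CGI.escapeHTML(websnapr_url(page_url, key: key, size: size))
  alt = CGI.escapeHTML("Thumbnail of #{page_url}")
  %(<img src="#{src}" alt="#{alt}" />)
end

# Example:
#   websnapr_img_tag('rubyonrails.org', key: 'mywebsnaprkey')
```

Building the query string with URI.encode_www_form means a full URL with a scheme and path is percent-encoded properly instead of being pasted into the tag as-is.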

Bluga.net WebThumb

The other thumbnail service under consideration uses a different approach to the submission and retrieval of images, both thumbnail and full-size website snapshots.  Their approach is much more flexible but can be a bit more complex.

WebThumb offers more features and quicker response times than any other service.

Bluga offers a gem, hosted on GitHub, that lets Ruby developers interface with their web service by wrapping the API calls in some nice objects.

One problem I saw with simply using the gem is that a requested snapshot might not be ready right away, in which case the gem throws an exception that we need to handle.  It is not a huge problem but something to be aware of.  Peter Cooper posted some code at DZone which shows his implementation using the Bluga service:

require 'net/http'
require 'rubygems'
require 'xmlsimple'

# Minimal client for the Bluga WebThumb API: submits a snapshot job,
# polls its status, and fetches the finished images.
class Nailer
  @@api_baseurl = 'http://webthumb.bluga.net/api.php'
  @@api_key = 'PUT YOUR API KEY HERE'

  attr_accessor :collection_time, :job_id, :ok

  # Submits a snapshot request for url at the given browser size.
  def initialize(url, width = 1024, height = 768)
    api_request = %Q{<webthumb><apikey>#{@@api_key}</apikey><request><url>#{url}</url><width>#{width}</width><height>#{height}</height></request></webthumb>}
    result = do_request(api_request)
    if result.is_a?(Net::HTTPOK)
      result_data = XmlSimple.xml_in(result.body)
      @job_id = result_data['jobs'].first['job'].first['content']
      # Earliest time (in epoch seconds) at which the job should be finished.
      @collection_time = Time.now.to_i + result_data['jobs'].first['job'].first['estimate'].to_i
      @ok = true
    else
      @ok = false
    end
  end

  # Returns the raw image data for the job at the given size.
  def retrieve(size = :small)
    api_request = %Q{<webthumb><apikey>#{@@api_key}</apikey><fetch><job>#{@job_id}</job><size>#{size.to_s}</size></fetch></webthumb>}
    result = do_request(api_request)
    result.body
  end

  # Saves the image to filename. Binary mode keeps the image data intact
  # and the block form closes the file once it has been written.
  def retrieve_to_file(filename, size = :small)
    File.open(filename, 'wb') { |f| f.write(retrieve(size)) }
  end

  # Checks the job status once the estimated collection time has passed.
  # Returns false while the job is still pending. Also returns true when
  # the status request itself fails, so check ok? afterwards.
  def ready?
    return false unless Time.now.to_i >= @collection_time
    api_request = %Q{<webthumb><apikey>#{@@api_key}</apikey><status><job>#{@job_id}</job></status></webthumb>}
    result = do_request(api_request)
    if result.is_a?(Net::HTTPOK)
      @ok = true
      result_data = XmlSimple.xml_in(result.body)
      begin
        @result_url = result_data['jobStatus'].first['status'].first['pickup']
        @completion_time = result_data['jobStatus'].first['status'].first['completionTime']
      rescue
        # The status could not be read yet; try again in 60 seconds.
        @collection_time += 60
        return false
      end
    else
      @ok = false
    end
    true
  end

  def ok?
    @ok == true
  end

  # Blocks until ready? reports that the job is finished.
  def wait_until_ready
    sleep 1 until ready?
  end

  private

  # POSTs the XML request body to the API and returns the HTTP response.
  def do_request(body)
    api_url = URI.parse(@@api_baseurl)
    request = Net::HTTP::Post.new(api_url.path)
    request.body = body
    Net::HTTP.new(api_url.host, api_url.port).start { |h| h.request(request) }
  end
end

url = 'http://www.rubyinside.com/'
t = Nailer.new(url)
if t.ok?
  t.wait_until_ready
  t.retrieve_to_file('out1.jpg', :small)
  t.retrieve_to_file('out2.jpg', :medium)
  t.retrieve_to_file('out3.jpg', :medium2)
  t.retrieve_to_file('out4.jpg', :large)
  puts "Thumbnails saved"
else
  puts "Error"
end
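
Note that wait_until_ready above will wait forever if a job never finishes.  One way to guard against that is a generic polling helper with a timeout.  This is only a sketch: wait_for is a name I made up, and the timeout and interval defaults are arbitrary.

```ruby
# Hypothetical helper: calls the block repeatedly until it returns a truthy
# value (returns true) or until timeout seconds have elapsed (returns false).
def wait_for(timeout: 120, interval: 1)
  # A monotonic clock is unaffected by changes to the system time.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Possible use with the Nailer class above:
#   t = Nailer.new(url)
#   if t.ok? && wait_for(timeout: 180) { t.ready? }
#     t.retrieve_to_file('out1.jpg', :small)
#   end
```

The block is always tried at least once, so even a timeout of zero performs a single check before giving up.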

 

You can see from the example that there are some things to take into consideration, which might make it take longer to implement.  Bluga does offer an Easythumb API, which looks very similar to Websnapr but gives the option of cached and non-cached snapshots, which may well be important.

As I indicated, Bluga has good support for developers, including a well-documented API, code samples in multiple languages, and article references so you can see how others are using the service to solve their problems.  Bluga is also hosted on EC2, so the uptime should be really good.

Finally

The solution I am currently using for this project works well but relies on a third-party service, which I would like to avoid.  If someone has a solution, preferably in Ruby natively or Ruby interfacing to some Linux libraries or tools, I would really like to hear about it.

For now the Websnapr solution works very well and, although not instantaneous, is pretty fast.  A good part of this approach is that I don't have to store images; the drawback is also that I am not storing images.  If Websnapr goes down, no images are displayed.
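
If that dependency becomes a concern, one option is to keep a local copy of each thumbnail after the first download.  The following is a rough sketch and not something I am running: ThumbnailCache, its fetcher argument and the .jpg extension are all assumptions of mine.  It also does nothing to detect the temporary image Websnapr returns while a request is queued, so a real version would need to handle that case.

```ruby
require 'digest'
require 'fileutils'
require 'net/http'
require 'uri'

# Hypothetical on-disk cache: each thumbnail is stored under the SHA-1
# of the page URL, so a page is only fetched from the service once.
class ThumbnailCache
  # fetcher is any callable that takes a URL and returns the image bytes;
  # by default it performs a plain HTTP GET.
  def initialize(dir, fetcher: ->(url) { Net::HTTP.get(URI.parse(url)) })
    @dir = dir
    @fetcher = fetcher
    FileUtils.mkdir_p(@dir)
  end

  # Returns the local path of the thumbnail for page_url, downloading it
  # from thumbnail_url only when no cached copy exists yet.
  def path_for(page_url, thumbnail_url)
    path = File.join(@dir, "#{Digest::SHA1.hexdigest(page_url)}.jpg")
    File.binwrite(path, @fetcher.call(thumbnail_url)) unless File.exist?(path)
    path
  end
end
```

Passing the fetcher in makes it easy to swap thumbnail services later, or to test the cache without touching the network.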

After finding these two solutions I came across some others that might do the job but that I haven't had a reason to investigate: Shrink the Web, Girafa, PageGlimpse and scURLr.

Most of the services have a paid component beyond an initial allotment of free hits.  In the long term Bluga seems to be a really good solution because of its flexibility and because it is hosted on Amazon EC2, but for a short-term solution it is hard to beat Websnapr.

  • http://realurl.org/twitted.php?id=2870754395 Twitted by bkirsten

    RE: Programmatically Creating Website Thumbnails

    Pingback from Twitted by bkirsten

  • Scott

    We use the Bluga.net solution and are quite happy with it. Plus you can get a free account and get 100 thumbs a month for free for testing purposes.

  • http://nuttnet.net/ Michael Nutt

    I was investigating this problem a few years ago and didn’t find a satisfactory solution. The reason it’ll probably never be possible in pure ruby is that you need a rendering engine to actually render the web page.
    One option is to use vnc+firefox (http://codesnippets.joyent.com/posts/show/1784) but there is quite a bit of overhead running X on your server. People keep on talking about a null widget set to run firefox without an X server, but I’ve never gotten anything working.
    You could also set up a couple of mac minis and script webkit to create thumbnails.

  • http://matthijslangenberg.nl/ Matthijs Langenberg

    Depending on your setup, webkit2png (http://www.paulhammond.org/webkit2png) might be useful. For linux there is khtml2png.

  • grimen

    My implementation of PageGlimpse API in a neat little Ruby wrapper. I’ll not get more trivial than this:
    github.com/…/page_glimpse
    Example:
    snapper = PageGlimse.new('ec0ccd….26df')
    snapper.save!('http://techcrunch.com', :size => :large)

  • https://github.com/Valafar Petri Kivikangas

    The "Ruby natively or Ruby interfacing to some Linux libraries or tools" -solution I used, is ImageMagick + RMagick (Ruby interface) + UploadColumn. I know, using ImageMagick only for thumbnail generation is a bit overkill, but it does the job.
    By using UploadColumn, I can write following in my picture model:
    image_column :jpg,
      :versions => { :thumb => 'c100x100' },
      :extensions => %w(jpg jpeg),
      :store_dir => 'album'
    validates_presence_of :title, :jpg
    validates_size_of :jpg, :maximum => 5.megabyte, :message => "is too big, must be smaller than 200kB!"
    validates_integrity_of :jpg
    It creates thumbnail by cropping the original image into 100×100 and performs few validations.
    http://rmagick.rubyforge.org/ http://www.imagemagick.org/script/index.php uploadcolumn.rubyforge.org

  • http://afreshcup.com/2009/07/28/double-shot-505/ Double Shot #505 « A Fresh Cup

    RE: Programmatically Creating Website Thumbnails

    Pingback from Double Shot #505 « A Fresh Cup

  • http://www.accidentaltechnologist.com/ Rob Bazinet

    @scott, great to know. I think Bluga.net is a great solution and likely what I will use over the medium term before creating my own solution.

  • http://www.accidentaltechnologist.com/ Rob Bazinet

    @grimen – Thanks for the link, I wasn’t aware of your wrapper. It looks easy enough to use. One complaint I have with Github is that it seems the search could be improved, I miss a lot of projects when I am searching around.

  • http://www.accidentaltechnologist.com/ Rob Bazinet

    @petri – interesting solution. I am going to dig more into what you are using and see if I can create my own service with it. It makes me wonder what other services are using for their solutions. Thanks.

  • http://neilmiddleton.com/2009/07/28/links-for-2009-07-28/ links for 2009-07-28 | :neil_middleton

    RE: Programmatically Creating Website Thumbnails

    Pingback from links for 2009-07-28 | :neil_middleton

  • http://jasper22.wordpress.com/2009/07/29/programmatically-creating-website-thumbnails/ Programmatically Creating Website Thumbnails « Jasper Blog

    RE: Programmatically Creating Website Thumbnails

    Pingback from Programmatically Creating Website Thumbnails « Jasper Blog

  • http://www.bencurtis.com/ Benjamin Curtis

    Here’s a library I put together that does it all locally, if you have the luxury of running on OSX:
    http://www.bencurtis.com/…/taking-snapshot

  • Can be done with selenium for firefox (tested on linux):
    require 'rubygems'
    require "selenium"
    selenium = Selenium::SeleniumDriver.new("localhost", 4444, "*chrome", "http://www.google.com/", 10000);
    selenium.start
    selenium.open("http://www.google.com")
    selenium.capture_entire_page_screenshot("/tmp/test.jpg","*")

  • http://www.accidentaltechnologist.com/ Rob Bazinet

    @Benjamin – awesome, thank you for the link to the article. I think I remember reading that on your blog when you first put it out. It sounds like a good way to get my own service going.

  • http://blog.robincurry.com/2009/08/04/chain-links-013/ Proc#curry | Chain Links #013

    RE: Programmatically Creating Website Thumbnails

    Pingback from Proc#curry | Chain Links #013

  • http://www.briankirsten.com/2009/07/27/links-for-2009-07-27/ links for 2009-07-27

    RE: Programmatically Creating Website Thumbnails

    Pingback from links for 2009-07-27

  • Dave

    The wkhtmltopdf project has a tool called wkhtmltoimage which makes this easy to do locally, without a web service: http://code.google.com/p/wkhtmltopdf/
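
A rough sketch of how the wkhtmltoimage suggestion above might be driven from Ruby, assuming the wkhtmltoimage binary is installed and on the PATH.  The helper names are made up for illustration, and the --width and --height options should be checked against the installed version before relying on them.

```ruby
# Hypothetical helper: builds the argument list for wkhtmltoimage.
# width and height describe the virtual browser window, not the final
# thumbnail; scaling down to thumbnail size would be a separate step.
def wkhtmltoimage_command(url, output_path, width: 1024, height: 768)
  ['wkhtmltoimage', '--width', width.to_s, '--height', height.to_s, url, output_path]
end

# Hypothetical helper: runs the command. Passing the arguments as a list
# avoids shell interpolation of the URL. Returns true on success, false
# on a non-zero exit status, and nil if the binary could not be run.
def capture_page(url, output_path, **options)
  system(*wkhtmltoimage_command(url, output_path, **options))
end

# Example:
#   capture_page('http://rubyonrails.org', '/tmp/rubyonrails.png')
```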