Asynchronous HTTP downloads with em-http-request

Wednesday, January 18th, 2012

I heard that Ruby has an asynchronous HTTP client called em-http-request, so I tried it out.

Install em-http-request with gem:

$ gem install em-http-request

download.rb


[ruby]
#!/usr/bin/env ruby
# -*- coding: utf-8 -*-

require 'eventmachine'
require 'em-http'

class Downloader
  def download(fetch_list_file)
    # Read the URL list up front; blank lines are skipped so they are not counted.
    urls = File.readlines(fetch_list_file).map(&:strip).reject(&:empty?)
    abort 'Stop Downloader, because fetch list is empty.' if urls.empty?

    download_dir = File.join(ENV['HOME'], 'Downloads')
    Dir.mkdir(download_dir) unless File.exist?(download_dir)

    pending = urls.size # number of requests that have not completed yet

    EM.run do
      urls.each do |url|
        http = EM::HttpRequest.new(url).get

        http.callback {
          puts "==> Fetched `#{url}'"
          path = File.join(download_dir, File.basename(url))
          if http.response_header.status == 200
            begin
              # Binary mode and write (not puts), so the body is saved unmodified.
              File.open(path, 'wb') { |f| f.write(http.response) }
              puts "==> Overwriting `#{path}'"
            rescue => ex
              puts "Failed to open file : file=[#{path}] : reason=[#{ex.message}]"
            end
          else
            puts "Failed request : url=[#{url}] : status=[#{http.response_header.status}]"
          end
          # Stop the reactor once every request has either succeeded or failed.
          pending -= 1
          EM.stop if pending < 1
        }

        http.errback {
          puts "#{url} : #{http.error}"
          pending -= 1
          EM.stop if pending < 1
        }
      end
    end
  end
end

def main(fetch_list_file)
  if fetch_list_file.nil? || !File.exist?(fetch_list_file)
    abort "set first arg as `fetch list file!'"
  end
  Downloader.new.download(fetch_list_file)
end

if __FILE__ == $0
  main(ARGV[0])
end
[/ruby]

I implemented it so that it reads a file containing a list of URLs and fetches each one.
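The script knows when to call EM.stop by counting outstanding requests: the counter is decremented in both the success callback and the error callback, and the reactor is stopped when it reaches zero. As a rough sketch, that bookkeeping can be isolated from EventMachine like this. `PendingCounter` is a hypothetical name made up for illustration; it is not part of eventmachine or em-http-request.

```ruby
# Hypothetical helper: counts outstanding jobs and runs a block once
# the last one has finished, whether it succeeded or failed.
class PendingCounter
  attr_reader :pending

  def initialize(total, &on_finish)
    @pending = total
    @on_finish = on_finish
  end

  # Call exactly once per job, from both the success and the error path.
  def done
    @pending -= 1
    @on_finish.call if @pending.zero?
  end
end

# In download.rb the block would be { EM.stop }; here it only prints.
counter = PendingCounter.new(3) { puts 'all requests finished' }

counter.done # first request succeeded
counter.done # second request failed; it still counts as finished
counter.done # last request completed, so the block runs now
```

If either path forgot to call `done`, the counter would never reach zero and the event loop would keep running, which is why the script decrements in the errback as well.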

$ cat fetch_list.txt
http://logsoku.com/thread/ikura.2ch.net/news/1322798773/
http://guide.jp.real.com/moviecollection/synopsis_27528.htm
http://guide.jp.real.com/moviecollection/synopsis_27526.htm
http://guide.jp.real.com/moviecollection/synopsis_27520.htm
http://guide.jp.real.com/moviecollection/synopsis_27502.htm

$ ruby ./download.rb fetch_list.txt
==> Fetched `http://guide.jp.real.com/moviecollection/synopsis_27528.htm'
==> Overwriting `/Users/hmatsuda/Downloads/synopsis_27528.htm'
==> Fetched `http://guide.jp.real.com/moviecollection/synopsis_27526.htm'
==> Overwriting `/Users/hmatsuda/Downloads/synopsis_27526.htm'
==> Fetched `http://guide.jp.real.com/moviecollection/synopsis_27520.htm'
==> Overwriting `/Users/hmatsuda/Downloads/synopsis_27520.htm'
==> Fetched `http://guide.jp.real.com/moviecollection/synopsis_27502.htm'
==> Overwriting `/Users/hmatsuda/Downloads/synopsis_27502.htm'
==> Fetched `http://logsoku.com/thread/ikura.2ch.net/news/1322798773/'
==> Overwriting `/Users/hmatsuda/Downloads/1322798773'
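
The last file is named 1322798773 because File.basename ignores the trailing slash of the URL. Below is a small sketch of how a file name could be derived a little more defensively. `filename_for` and the `index.html` fallback are assumptions of mine for illustration; they are not in the script above, which simply calls File.basename on the whole URL.

```ruby
require 'uri'

# Hypothetical helper: pick a local file name for a URL.
# Only the path component is used, so a query string is not included.
def filename_for(url)
  name = File.basename(URI.parse(url).path)
  # File.basename returns "/" for a bare root path and "" for an empty path,
  # so fall back to an assumed default name in those cases.
  (name.empty? || name == '/') ? 'index.html' : name
end

puts filename_for('http://guide.jp.real.com/moviecollection/synopsis_27528.htm')
puts filename_for('http://logsoku.com/thread/ikura.2ch.net/news/1322798773/')
puts filename_for('http://example.com/')
```

Two URLs that end in the same file name would still overwrite each other in the download directory; this sketch does not try to solve that.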