In and Out Filters for Hacked mod_ruby
I’ve been playing with an experimental MouseHole which uses an Apache output filter on content pulled through mod_proxy. On my Linux machine, mod_proxy is actually much slower than WEBrick::HTTPProxy, so there’s been no gain, except for the newly hacked input and output filters for mod_ruby.
May still have a pile of bugs, but download here: mod_ruby-filtered-12.27.2005.tar.gz.
To write your own filtered proxy:
    RubySafeLevel 0
    RubyTimeOut 10
    RubyAddPath /home/why/lib
    RubyRequire proxyTest
    <IfModule mod_proxy.c>
      ProxyRequests On
      <Proxy *>
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
        RubyOutputFilter ProxyTest.instance REWRITER
        SetOutputFilter REWRITER
      </Proxy>
      ProxyVia On
    </IfModule>
And the proxyTest.rb looks like this:
    require 'singleton'

    class ProxyTest
      include Singleton

      def output_filter(filter)
        # Anything that isn't HTML goes through untouched.
        if filter.req.content_type !~ %r!text\/html!
          filter.pass_on
        else
          # Read the page chunk by chunk, swapping "Ruby" for the mime type.
          s = filter.read
          while s
            filter.write(s.gsub(/Ruby/i, "#{filter.req.content_type}"))
            s = filter.read
          end
          filter.close if filter.eos?
        end
      end
    end
So, yeah, the object must respond to input_filter or output_filter. (Turns to Shugo Maeda.) I think we should start duck typing mod_ruby. Rather than having to explicitly state the various handlers in the httpd.conf, we should use respond_to? in mod_ruby to scan for the capabilities of the class.
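Here is a rough idea of what that respond_to? scan could look like, in plain Ruby. This is only a sketch: FILTER_HOOKS and filter_hooks_for are made-up names for illustration and nothing like them exists in mod_ruby. The only method names taken from this post are input_filter and output_filter.

```ruby
# Hypothetical sketch; none of these names are part of mod_ruby.
# The filter methods an object might implement, per the post.
FILTER_HOOKS = [:input_filter, :output_filter]

# Return the filter hooks an object could serve, judged only by respond_to?.
def filter_hooks_for(obj)
  FILTER_HOOKS.select { |hook| obj.respond_to?(hook) }
end

# An object with just an output filter.
class OnlyOutput
  def output_filter(filter)
    filter.pass_on
  end
end

# An object that could be wired up as both kinds of filter.
class Both
  def input_filter(filter)
    filter.pass_on
  end

  def output_filter(filter)
    filter.pass_on
  end
end

p filter_hooks_for(OnlyOutput.new)  # → [:output_filter]
p filter_hooks_for(Both.new)        # → [:input_filter, :output_filter]
```

With something along these lines, a single directive naming the object might be enough, and the module could decide which hooks to register by asking the object.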
why
Oh, and if mod_proxy were up to it, this would be such an easy way for all the languages to get custom filtering proxies, because mod_python, mod_perl and many of the others all support this kind of filtering. However, mod_io and mod_haskell do not.
Ezra
That’s seriously cool _why. Must… go… play…
Matt
I’m not generally slow or stupid, but I’ll say that Why makes my head spin! I’ll be honest and say that I don’t quite understand exactly what your code does. Generally I can figure out your examples (despite sparse descriptions), but today is just not my day. Can you give me a quick run-down of what you’re actually doing here?
Thanks, and sorry.
M.T.
why
Okay, well, mod_ruby is an extension to the Apache web server, right? So you can run a Ruby interpreter inside Apache. You add directives to Apache, which can let URLs, authentication headers, etc. go through a Ruby object.
This hack adds two new directives: RubyOutputFilter and RubyInputFilter. These filters are used to completely modify the request and response within Apache. (Think filtering like: removing cuss words or eliminating ads from a page.) mod_ruby’s handlers can’t currently do this because they happen before the page ever comes back.
I’m actually not sure why you’d need filters in a traditional Apache setup (without mod_proxy.) But it could be used like Monkeygrease to offer slightly modified versions of your own applications.
Anyway, what I’m advocating is use of mod_proxy with mod_ruby filtering. And in the above example, I’m using mod_proxy to set up a personal proxy at http://127.0.0.1:37004 on my laptop. And then I’m passing all the proxy pages through the ProxyTest object. See, the output_filter method takes an HTML page and replaces the word Ruby with the mime type of the page. It’s a stupid example that’s full of baloney, but it illustrates the basics. If I can get mod_proxy to speed up, then I’m sure you’ll be seeing a lot more of it.
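To watch the replacement happen without Apache in the picture, the filter can be fed a stand-in object. A minimal sketch follows: FakeFilter and FakeRequest are hypothetical stubs that mimic only the calls ProxyTest makes (read, write, pass_on, eos?, close, req.content_type). How the real mod_ruby filter object behaves beyond those calls is an assumption here.

```ruby
require 'singleton'

# Same class as proxyTest.rb, repeated so the sketch runs on its own.
class ProxyTest
  include Singleton

  def output_filter(filter)
    if filter.req.content_type !~ %r!text/html!
      filter.pass_on
    else
      s = filter.read
      while s
        filter.write(s.gsub(/Ruby/i, "#{filter.req.content_type}"))
        s = filter.read
      end
      filter.close if filter.eos?
    end
  end
end

# Hypothetical stand-ins for what mod_ruby would hand to output_filter.
FakeRequest = Struct.new(:content_type)

class FakeFilter
  attr_reader :req, :output

  def initialize(content_type, chunks)
    @req = FakeRequest.new(content_type)
    @chunks = chunks.dup
    @output = String.new
    @closed = false
  end

  # Hand back the next chunk, or nil once the page is exhausted.
  def read
    @chunks.shift
  end

  def write(s)
    @output << s
  end

  # Send whatever is left straight through, unmodified.
  def pass_on
    @output << @chunks.join
    @chunks.clear
  end

  def eos?
    @chunks.empty?
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

html = FakeFilter.new("text/html", ["<p>I like Ruby.</p>", "<p>ruby!</p>"])
ProxyTest.instance.output_filter(html)
puts html.output   # → <p>I like text/html.</p><p>text/html!</p>

png = FakeFilter.new("image/png", ["Ruby bytes"])
ProxyTest.instance.output_filter(png)
puts png.output    # → Ruby bytes
```

The HTML page has every "Ruby" (any case) swapped for its mime type, while the PNG goes through pass_on unchanged.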
MenTaLguY
Wow, interesting. I’m a bit surprised that mod_proxy is slower than WEBrick’s, though…
Matt
Thanks for that excellent elaboration. I guess some of the terms I just wasn’t as comfortable with, such as filtered proxy.
Keep up the creative thinking! It inspires me a great deal in my coding.
M.T.