PickPocket, a Marshal Ransack Hack

by why in inspect

Today’s hack is a Marshal hack. Marshal is a highly common (but quite untapped) language, and one with no formal layout beyond what Ruby’s source code has to say. Most times you only hear about the slight changes between major Ruby versions (1.6 -> 1.8) when something goes kaput. No Ruby book I know of goes near dissecting it. And, strangely enough, Minero’s classic Ruby Hacking Guide doesn’t even touch it.

We are voodoo doctors. Take us to the center of the marsh.

Tasting Just a Few Sample Bytes

Marshal is the ultra-slim encoded bytespeak that Ruby can pull out when siphoning objects through a skinny straw. For when your Ruby shares a malt with someone else’s Ruby. Yeah, well, it’s actually very easy to pick up.

 >> Marshal.dump("Koichi")
 => "\004\010\"\vKoichi" 

Say, that’s not too bad. We dumped out a little Marshal and you can at least see plain old Koichi in there. Other than that there are just four other characters, starting with "\004\010", which is the Marshal 4.8 (current) marshal header.

The other characters are a quote ("), which means “look, a string is coming up”. And \v, which is a string length.

 >> "\v"[0] - 5
 => 6

Yeah, it takes a little math (small lengths are stored with 5 added), but there you can see it: \v means a string length of 6. So, in summary: this is a Marshal 4.8 string, as dumped by Ruby 1.8.4, containing a string with six characters and they are Koichi.
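
On Ruby 1.9 and later, indexing a string gives back a one-character string rather than a number, and most string dumps carry an extra encoding suffix. Here is a minimal sketch of the same dissection for a newer Ruby. It assumes a binary string (made with String#b) so the dump comes out in the plain form shown above; the variable names are only for illustration.

```ruby
# A binary (ASCII-8BIT) string dumps without the encoding suffix
# that newer Rubies append to other strings.
dump = Marshal.dump("Koichi".b)

major, minor = dump.bytes.first(2)   # the two header bytes
type   = dump[2]                     # the quote that announces a string
length = dump.getbyte(3) - 5         # small lengths are stored with 5 added
body   = dump[4, length]             # the string's own bytes

puts "Marshal #{major}.#{minor}, type #{type.inspect}, #{length} bytes: #{body}"
```

The subtraction only holds for short strings, where the whole length fits in that single byte.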

Skipping Bytes

Okay, so, it turns out that everything that gets Marshalled comes with these offset bytes (like \v above) which measure strings and hashes and arrays and floats. (So do Python’s pickle and many other binary serialization formats.)
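
Longer things need more than one byte to measure. What follows is a rough sketch of reading such a count. It assumes the 4.8 packing rules as they appear in Ruby's marshal.c: zero is a lone zero byte, 1 to 122 is stored with 5 added, and anything bigger gets a byte count first, followed by that many little-endian bytes. The helper read_count is hypothetical, not something Ruby ships.

```ruby
# Hypothetical helper, not part of Ruby: decode the packed count that starts
# at byte offset pos. Returns [value, bytes_consumed]. Negative numbers
# (which never occur as lengths) are not handled.
def read_count(data, pos)
  first = data.getbyte(pos)
  if first.zero?
    [0, 1]                       # a lone zero byte means zero
  elsif first.between?(5, 127)
    [first - 5, 1]               # small values are stored with 5 added
  elsif first.between?(1, 4)
    # 1..4 says how many little-endian bytes follow
    value = 0
    first.times { |i| value |= data.getbyte(pos + 1 + i) << (8 * i) }
    [value, 1 + first]
  else
    raise ArgumentError, "negative count byte #{first} is not handled"
  end
end

# Binary strings keep the dumps free of encoding suffixes.
short = Marshal.dump("Koichi".b)
long  = Marshal.dump(("x" * 300).b)

p read_count(short, 3)   # => [6, 1]
p read_count(long, 3)    # => [300, 3]
```

In both dumps the count sits at offset 3, right after the two header bytes and the quote.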

The PickPocket hack is based on these two ideas:

  1. You can skip through marshalled objects pretty easily.
  2. By slapping a header in the middle of a marsh, you can load only certain fragments.

Take this array:

 >> Marshal.dump ["Goto80", "Treewave", "YMCK"]
 => "\004\010[\010\"\vGoto80\"\rTreewave\"\tYMCK" 

This Marshal reads out loud like this:

 Header, Array(3)[ String(6), String(8), String(4) ]

So, if we want to load the second element, we can do some math to find where that element lives in the marsh. Skip the header (2 bytes), skip the Array counter (2 bytes), skip the first string (2 bytes + 6 bytes)... which leaves us at position 12.
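
The same arithmetic, sketched for a current Ruby. It assumes binary strings (String#b) so the dump matches the bytes shown above, and it only works because every count here fits in one byte.

```ruby
# Binary strings keep the dump in the plain form shown above.
dump = Marshal.dump(["Goto80".b, "Treewave".b, "YMCK".b])

pos  = 2                                  # skip the header
pos += 2                                  # skip '[' and the element count
pos += 2 + (dump.getbyte(pos + 1) - 5)    # skip '"', the length byte and "Goto80"

p pos                     # => 12
p dump.byteslice(pos, 2)  # => "\"\r"
```

The two bytes at position 12 are the quote and length byte of the second string.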

Now, will it let us load from the middle of the Array? Or what?

 >> str = "\004\010[\010\"\vGoto80\"\rTreewave\"\tYMCK"[12..-1]
 => "\"\rTreewave\"\tYMCK" 
 >> Marshal.load(str)
 TypeError: incompatible marshal file format (can't be read)
        format version 4.8 required; 34.13 given

Oh, wait! The header!

 >> Marshal.load("\004\010" + str)
 => "Treewave" 

Hey, klawboom!! That worked. It loaded the object and ignored anything after it.
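
Skipping and header-slapping can be rolled into one small routine. This is only a sketch under narrow assumptions: the dump is an Array whose elements are all short binary Strings, and every count fits in one byte. The helpers small_count and pick_string are hypothetical names made up for this example.

```ruby
# Hypothetical helper: decode a one-byte count (0, or 1..122 stored with 5 added).
def small_count(byte)
  return 0 if byte.zero?
  raise ArgumentError, "multi-byte counts are not handled" unless byte.between?(6, 127)
  byte - 5
end

# Hypothetical helper: load only element `index` from a dumped Array
# whose elements are all short binary Strings.
def pick_string(dump, index)
  raise ArgumentError, "not an Array dump" unless dump.getbyte(2) == "[".ord
  count = small_count(dump.getbyte(3))
  raise IndexError, "index #{index} out of range" unless (0...count).cover?(index)

  pos = 4                                    # first element starts after '[' and count
  index.times do
    raise ArgumentError, "expected a String at #{pos}" unless dump.getbyte(pos) == '"'.ord
    pos += 2 + small_count(dump.getbyte(pos + 1))   # hop over one whole string
  end

  header = dump.byteslice(0, 2)              # reuse the dump's own header
  Marshal.load(header + dump.byteslice(pos..-1))
end

dump = Marshal.dump(["Goto80".b, "Treewave".b, "YMCK".b])
p pick_string(dump, 1)
p pick_string(dump, 2)
```

One caveat: a dump can refer back to symbols and objects seen earlier in the stream, and a fragment cut loose from its beginning loses those, which is why this sketch sticks to plain strings.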

Picking Pockets

The final part of this hack is to come up with the code for walking down into the marsh and coming up with the object we want. Here’s what I’ve got in mind.

Let’s use, as our sample corpus, a marshalled dump of the RubyGems repository. It’s of a nice, wieldsome size (2M) and it would be nice to reach in and grab one gem.

  >> PickPocket(File.read('rubygems.m')).gems['hpricot-0.4-mswin32'].get
  => #<Gem::Specification:0x811b124 @name="hpricot" ...>

Instead of actually loading all the objects in the dump, this query is executed when the get method is run. It’ll search the rubygems.m file for a gems instance variable. And then it’ll search that variable for an 'hpricot-0.4-mswin32' key.
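
To give a feel for the key search, here is a toy version. It is not pickpocket.rb and makes no claim about how that file works. It assumes a flat Hash whose keys and values are all short binary Strings, with every count fitting in one byte. The helpers small_count, read_short_string and pick_value are hypothetical, and the sample values are made up.

```ruby
# Hypothetical helper: decode a one-byte count (0, or 1..122 stored with 5 added).
def small_count(byte)
  return 0 if byte.zero?
  raise ArgumentError, "multi-byte counts are not handled" unless byte.between?(6, 127)
  byte - 5
end

# Hypothetical helper: read one short String record ('"', length byte, body)
# at pos. Returns [body, position_after_the_record].
def read_short_string(dump, pos)
  raise ArgumentError, "expected a String at #{pos}" unless dump.getbyte(pos) == '"'.ord
  len = small_count(dump.getbyte(pos + 1))
  [dump.byteslice(pos + 2, len), pos + 2 + len]
end

# Hypothetical helper: walk the key/value pairs of a dumped Hash and load
# only the value stored under `wanted`. Returns nil when the key is absent.
def pick_value(dump, wanted)
  raise ArgumentError, "not a Hash dump" unless dump.getbyte(2) == "{".ord
  pos = 4                                   # pairs start after '{' and the count
  small_count(dump.getbyte(3)).times do
    key, pos = read_short_string(dump, pos)
    if key == wanted
      return Marshal.load(dump.byteslice(0, 2) + dump.byteslice(pos..-1))
    end
    _skipped, pos = read_short_string(dump, pos)   # hop over the value
  end
  nil
end

# Made-up sample data, with binary strings to keep the dump plain.
dump = Marshal.dump({ "Goto80".b => "first".b, "Treewave".b => "second".b, "YMCK".b => "third".b })
p pick_value(dump, "Treewave")
p pick_value(dump, "nobody")
```

Only the wanted value is ever handed to Marshal.load; every other pair is stepped over by its length bytes.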

So far, it all fits in about a hundred lines of code: pickpocket.rb. More marshal hacking tomorrow.

Update: The RubySpec wiki has started a page on the Marshal format which looks to be a good start.

said on 08 Nov 2006 at 03:24

Doesn’t it look like a plain old pointer ladder thrown right in the middle of our Ruby jewelry store? Well, as long as it is done at night-time (which is needed for pick-pocketing), that’s first class robbery.

said on 08 Nov 2006 at 04:09

If we are talking robbery and devious behaviour, I can’t help wondering about buffer overrun attacks, though unless this stuff gets executed I can’t see how to implement, or more importantly, detect such an attack.

(Oh, this interface has a spelling checker now. It doesn’t like my British spelling of /behavio(?:u?)r/.)

said on 08 Nov 2006 at 08:08

I’m curious what you think of the notion of Marshal.dump creating a subclass of String. See ruby-talk:76055 for more of what I mean.

With a MarshalledString class, we could add methods that would make it easier (or at least more obvious) to do what you’re doing here.

said on 08 Nov 2006 at 17:36

Hmm, you think maybe rewriting marsh’d strings is the way forward with proxying in sandbox?

said on 08 Nov 2006 at 17:49

Clever! What I have come to expect!

said on 09 Nov 2006 at 01:48

Can you read only a certain part of a file? If you could, then you could delay reading until something is asked for.

said on 09 Nov 2006 at 05:00

Re: update. So that’s the weirdness with packed integers! And the ; construct is half-way to Lempel-Ziv encoding. Impressive. Thank you for this.
