Speeding Up Builder 2.0
This week I’m trying to break the skids off of Markaby. A recent Camping audit has lit up Markaby as the slowpoke. And I happened to root up a bit of slowness in Builder::XChar. The Range.include? is killing us because the Range gets cast to an Array every time!
My friends, in your lives, in your valuable time, do not, in any case, use Range#include? Use Range#===.
(0x0000..0xFFFF) === 0xBABE
You know it’s true. I have been careless, too.
Update: I’m taking back what I said. Range#=== and Range#include? are one and the same. I guess the fault was the find loop. Anyway, whatever it was, the case is a lot more efficient.
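For a rough picture of the two shapes being compared, here is a minimal sketch. The original find loop is not shown in this post, so `valid_by_find?` is only a guess at its general form, and both method names and the `VALID_POINTS` constant are made up for illustration.

```ruby
# Code points and ranges allowed in XML text (same list as XChar::VALID).
VALID_POINTS = [0x9, 0xA, 0xD, (0x20..0xD7FF), (0xE000..0xFFFD), (0x10000..0x10FFFF)]

# Hypothetical "find loop": walk the list and test every entry by hand.
def valid_by_find?(n)
  !VALID_POINTS.find { |v| v.is_a?(Range) ? v.include?(n) : v == n }.nil?
end

# The case form: `when *VALID_POINTS` splats the list, and `case`
# tries `entry === n` for each entry until one matches.
def valid_by_case?(n)
  case n
  when *VALID_POINTS then true
  else false
  end
end

puts valid_by_find?(0xBABE)
puts valid_by_case?(0xBABE)
```

Both methods should agree on every input; the difference is only in who drives the loop.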
Here’s a replacement:
module XChar
  VALID = [
    0x9, 0xA, 0xD,
    (0x20..0xD7FF),
    (0xE000..0xFFFD),
    (0x10000..0x10FFFF)
  ]
end

class Fixnum
  # XML escaped version of chr
  def xchr
    n = XChar::CP1252[self] || self
    case n
    when *XChar::VALID
      XChar::PREDEFINED[n] or (n<128 ? n.chr : "&##{n};")
    else
      '*'
    end
  end
end
Which gave me a 30% speed up on the tests Atrus was running. The UTF-8 unpack is also taking some time. I wonder.
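The replacement above leans on XChar::CP1252 and XChar::PREDEFINED, which are not shown in this post. To try the idea in isolation, here is a self-contained sketch. The `XCharSketch` module name is made up, the two lookup tables are cut-down stand-ins (assumptions, not Builder's full tables), and it is written as a plain module method because current Ruby no longer has a separate Fixnum class.

```ruby
module XCharSketch
  # Stand-in for the CP1252 remapping table: only the euro sign is listed here.
  CP1252 = { 128 => 8364 }

  # Stand-in for the predefined XML entities.
  PREDEFINED = { 38 => '&amp;', 60 => '&lt;', 62 => '&gt;' }

  VALID = [
    0x9, 0xA, 0xD,
    (0x20..0xD7FF),
    (0xE000..0xFFFD),
    (0x10000..0x10FFFF)
  ]

  # XML escaped version of the character with code point n.
  def self.xchr(n)
    n = CP1252[n] || n
    case n
    when *VALID
      PREDEFINED[n] || (n < 128 ? n.chr : "&##{n};")
    else
      '*' # not a legal XML character
    end
  end
end

puts XCharSketch.xchr(60)   # prints &lt;
puts XCharSketch.xchr(65)   # prints A
puts XCharSketch.xchr(128)  # prints &#8364;
puts XCharSketch.xchr(0)    # prints *
```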
Tom
Where is a good description of what === is supposed to mean semantically? It seems to do everything…
why
It’s not that different from just plain equality. But redefined for Module, Regexp, Range to offer some nice shortcuts in a case..when.
chuck
Its purpose is to provide a comparison operation for case statements, where each class defines it in a manner appropriate for use in case statements. Its exact definition is class-dependent; for instance the regular expression defines it as a pattern match. So yeah, it does a lot of things, depending on class.
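A small sketch of that class-dependent behaviour, using only core classes. The `describe` method and its return strings are made up for illustration.

```ruby
# Each `when` clause is tested as `clause === value`, so the class of
# the clause decides what "matching" means.
def describe(value)
  case value
  when Integer    then "an integer"               # Module#===: is value an Integer?
  when (0.0..1.0) then "a float between 0 and 1"  # Range#===: is value inside the range?
  when /\A\d+\z/  then "a string of digits"       # Regexp#===: does the pattern match?
  when nil        then "nothing"                  # Object#===: plain equality
  else                 "something else"
  end
end

puts describe(7)
puts describe(0.5)
puts describe("42")
puts describe(nil)
puts describe(:sym)
```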
Mike Leddy
Most of my XML uses a very small subset of unicode, so memoizing has its advantages and it doesn’t cost too much memory:
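As a rough illustration of what memoizing could look like here (a sketch only, not the code from this comment): the `XCharMemo` module and its `CACHE` constant are hypothetical, and the lookup tables are trimmed stand-ins.

```ruby
module XCharMemo
  PREDEFINED = { 38 => '&amp;', 60 => '&lt;', 62 => '&gt;' }
  VALID = [0x9, 0xA, 0xD, (0x20..0xD7FF), (0xE000..0xFFFD), (0x10000..0x10FFFF)]

  # A Hash with a default block: the block runs only on a cache miss,
  # stores the escaped form, and later lookups are plain hash reads.
  CACHE = Hash.new do |cache, n|
    cache[n] =
      case n
      when *VALID
        PREDEFINED[n] || (n < 128 ? n.chr : "&##{n};")
      else
        '*'
      end
  end

  def self.xchr(n)
    CACHE[n]
  end
end

puts XCharMemo.xchr(233)     # computed on first use
puts XCharMemo.xchr(233)     # served from the cache
puts XCharMemo::CACHE.size   # one entry per distinct code point seen
```

The cache only grows with the number of distinct code points actually seen, which is why a small Unicode subset keeps the memory cost low.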
pedo
I’m too drunk to know what you’re talking about but I love you.
MenTaLguY
why: Hmm, I wouldn’t ever compare === (triple) to == (double). Unlike == (double), it’s neither commutative nor transitive.
Several times now, I’ve advised people to try SomeClass === foo, and after they insisted it didn’t work, I found they’d gone and written foo === SomeClass instead because it looked nicer, and it’s like ==, right?
I think it’s best to think of === (triple) as the “case match operator” and leave it at that. Sets the right expectations.
MenTaLguY
(It doesn’t help that === (triple) is “exact equals” in certain other languages, either. It might have been better to make it a named method rather than an operator, but I have no idea what it would reasonably be called…)
hgs
MenTaLguY: like File.exist?, maybe Object.match??
honk if foo.match?(SomeClass)
chris2
Is this really much faster than a few <, > and ||? (Probably due to method call costs…)
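One way to check is to time both forms with the standard Benchmark library. This is only a sketch: the method names, the `CHECK_POINTS` and `SAMPLE_POINTS` constants, and the sample size are made up, and the result will depend on the Ruby version and machine, so no numbers are claimed here.

```ruby
require 'benchmark'

CHECK_POINTS = [0x9, 0xA, 0xD, (0x20..0xD7FF), (0xE000..0xFFFD), (0x10000..0x10FFFF)]

# The case form from the post: one === call per entry until a match.
def valid_with_case?(n)
  case n
  when *CHECK_POINTS then true
  else false
  end
end

# The hand-written form: a few <, > and || comparisons.
def valid_with_compare?(n)
  n == 0x9 || n == 0xA || n == 0xD ||
    (n >= 0x20 && n <= 0xD7FF) ||
    (n >= 0xE000 && n <= 0xFFFD) ||
    (n >= 0x10000 && n <= 0x10FFFF)
end

# Arbitrary sample of code points to run through both checks.
SAMPLE_POINTS = (0..0x2FFF).to_a

Benchmark.bm(10) do |x|
  x.report("case:")    { SAMPLE_POINTS.each { |n| valid_with_case?(n) } }
  x.report("compare:") { SAMPLE_POINTS.each { |n| valid_with_compare?(n) } }
end
```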
MenTaLguY
hgs: I think you really meant
honk if SomeClass.match?(foo)
See, we have the same sort of problem with Object#match? as we had with == and ===. In this case, a false analogy with String#match is too easy. It even tripped you up when you wrote your example.
MenTaLguY
chris brings up an interesting point, too: will the picture be the same under YARV? Method call overhead should be much smaller there.
tirins.play
all of this is same, but in a really strange way….
tirins.play
all of this is sane, but in a really strange way….
FlashHater
ActiveRecord::StatementInvalid Mysql::Error: MySQL server has gone away:
hgs
MenTaLguY: I was writing it the way people expected it. Yes, the semantics not being symmetrical is a problem. Maybe foo.instance_of?(Class) is better than .kind_of? insofar as it doesn’t match ancestors, but that’s still the other way round from what you wanted.
MenTaLguY
Doesn’t matter what people expect—since Ruby dispatches on the lhs, for the case match operator the matching criterion has to go on the lhs, and the object to be tested has to go on the rhs.
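A short sketch of why the order matters. The `matches?` helper is made up purely to name the two roles; it just calls === with the criterion as the receiver.

```ruby
# The criterion is the receiver, so its class picks which === runs.
def matches?(criterion, object)
  criterion === object
end

puts matches?(String, "hello")  # true:  Module#=== asks "is "hello" a String?"
puts matches?("hello", String)  # false: String#=== asks "does "hello" equal the class String?"

puts matches?((1..10), 5)       # true:  Range#=== asks "is 5 inside 1..10?"
puts matches?(5, (1..10))       # false: Integer#=== asks "does 5 equal the range object?"
```

A `case foo` / `when SomeClass` expression always evaluates `SomeClass === foo`, which is the first form.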
hgs
MenTaLguY
That might be nice actually. === (triple) does do exactly that for classes, but as a whole it’s got a broader meaning than specifically membership.
chuck
I was under the impression that ===’s purpose is pretty much interface-ual. It provides a standard comparison operator for case expressions, such that for your own classes you should implement it in such a way that case makes sense for them. In some cases that’s object identity, for others equality of value; for ranges it’s membership, for regexps it’s a pattern match. The point is to write your own === methods for your own objects so that they can be sensibly used in case expressions. Or at least, that’s what I sort of derived from pickaxe the first.
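A minimal sketch of writing === for your own class so it works in a case expression. The `Near` class and the `classify` method are hypothetical, invented for this example.

```ruby
# Matches any number within `tolerance` of `target`.
class Near
  def initialize(target, tolerance)
    @target = target
    @tolerance = tolerance
  end

  # Called by `case` as `Near.new(...) === value`.
  def ===(other)
    other.is_a?(Numeric) && (other - @target).abs <= @tolerance
  end
end

def classify(temperature)
  case temperature
  when Near.new(100, 2) then "about boiling"
  when Near.new(0, 2)   then "about freezing"
  else                       "somewhere in between"
  end
end

puts classify(99.5)
puts classify(1)
puts classify(50)
```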