Nikolai's UTF-8 Lib is All Ready
Last week, in the comments, Nikolai Weibull brought up his UTF-8 lib, a lovely creature which meets my own needs much better than what’s already out there. I like it much better than my own efforts. Especially now that he’s had some time to flesh it out.
Namely: It’s small. It’s coded in C. It locks into Ruby’s existing string class. Therefore, it can be efficient with memory and use Ruby’s own regexps.
require 'encoding/character/utf-8'
str = u"hëllö"
str.length #=> 5
str.reverse.length #=> 5
str[/ël/] #=> "ël"
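For a sense of why character-aware length and reverse matter, here is a rough sketch of the byte-versus-character difference. It does not use Nikolai's library at all. It assumes a modern Ruby (2.0 or later) where strings carry an encoding, and uses String#b to imitate the byte-oriented view an old String would give you. The variable names are just for illustration.

```ruby
# Sketch only: plain modern Ruby, not the character-encodings lib.
# Escapes are used so the bytes are unambiguous (precomposed ë and ö).
str = "h\u00EBll\u00F6"            # "hëllö", encoded as UTF-8

# Byte-oriented view: the same bytes, tagged as binary (ASCII-8BIT).
raw = str.b
byte_length = raw.length           # ë and ö take two bytes each

# Reversing byte by byte tears the two-byte sequences apart.
byte_reversed = raw.reverse.force_encoding("UTF-8")
byte_reverse_valid = byte_reversed.valid_encoding?

# Character-oriented view: what a UTF-8 aware string should report.
char_length = str.length
char_reversed = str.reverse
char_match = str[/\u00EBl/]

puts "bytes: #{byte_length}, chars: #{char_length}"
puts "byte-wise reverse still valid UTF-8? #{byte_reverse_valid}"
puts "char-wise reverse: #{char_reversed}"
```

The byte-wise reverse comes out as invalid UTF-8, which is exactly the sort of breakage a character-aware string class is there to prevent.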
If you’d like to follow development, clone this (git-web). I’ve also put up a gem, but obviously it’s not an official release or anything:
gem install character-encodings --source code.whytheluckystiff.net