This post originated from an RSS feed registered with Ruby Buzz
by Eigen Class.
Original Post: About problem formulations and ordered permutations
Feed Title: Eigenclass
Feed URL: http://feeds.feedburner.com/eigenclass
Feed Description: Ruby stuff --- trying to stay away from triviality.
The mere formulation of a problem is far more essential than its solution,
which may be merely a matter of mathematical or experimental skills.
-- Albert Einstein
Here is an interesting coding challenge: write a counter function that
counts from 1 to max but only returns numbers whose digits don’t repeat(...)
Also display the biggest jump and the total count of numbers (...).
It suggests a brute-force solution, with max iterations and a predicate to
verify if the counter satisfies the "no repeated digits criterion".
A vastly better approach (with 1126 times fewer iterations when considering
numbers up to 10^10) becomes apparent if the problem is reworded as follows:
generate permutations of the digits '0' to '9' that do not start with '0'
and whose numerical value is smaller than 'max', in increasing order
The "do not start with '0'" restriction is required to avoid generating a
permutation twice (e.g. 12 and then 012), since a leading zero is ignored.
The easiest way to generate a permutation with m items from a set of n
elements is to pick one element and prepend it to a (m-1) permutation of items
from the resulting set of (n-1) elements. The "not starting with 0" criterion
is fulfilled simply by picking the initial element from the digits different
from zero and then putting zero back in the set used in the recursion.
This is expressed directly in OCaml as follows (you can find the full code of
the OCaml and Ruby versions below):
module S = Set.Make(struct type t = int let compare = (-) end)
let (--) set elm = S.remove elm set
let permutations f zero digits len =
let base = S.cardinal digits in
let rec aux n digits = function
0 -> f n
| len -> S.iter (fun s -> aux (n * base + s) (digits -- s) (len-1)) digits
in S.iter (fun s -> aux s (digits -- s) (len - 1)) (digits -- zero)
The function is general in the sense that it will find unique permutations in
any base (that is, by placing the numbers 10 to 15 in the initial set, a
sequence with the digits '0'..'9' + 'a'..'f' can be generated with the same
code).
This version is not as general as the OCaml one, but that's not its only
problem. This is how long it takes to find the 8877690 numbers under 10^10
with unique digits:
Ruby 1.8.6 (base 10)
73s
Ruby 1.9 (base 10)
63s
OCaml (any base)
0.640s
As usual, Ruby is 100 times slower.
The OCaml version is more general than the strict minimum required by the
original problem (it works with arbitrary bases), and thus presumedly slower
than a more specific one. It doesn't fare badly against the
fastest implementation I've found,
an equivalent solution in Java that uses a bitmask to represent the
sets:
Java (base 10, bitset)
0.680s without JVM startup overhead
The Ruby solution uses Array#-, also know as "the poor man's set
subtraction". Removing an element from an array is an O(n) operation.
In the above-mentioned Java version, iterating over the remaining digits is
O(all digits), instead of O(remaining ones). The OCaml version uses a set
based on balanced trees where the relevant operations are logarithmic (it has
a fairly large constant factor, so an integer set based on a bitset would be
faster in this case). This means that the OCaml version would perform nearly
as well with base-1000 permutations, as opposed to the other solutions which
would become nearly 100 times slower.
Ironically, the OCaml stdlib is richer than Ruby's when it comes down to data
structures. (The irony is that, while OCaml makes it very easy to build a data
structure with good performance, in Ruby you are effectively stuck with the
core classes implemented in C, which means essentially Array and Hash --- Ruby
is so slow that algorithmic improvements hardly make up for the
interpretation overhead.)