Summary
In this installment I will talk about tail call optimization, performance and the R6RS module system.
Advertisement
About tail call optimization (and the module system)
In order to speak of Scheme performance one needs to write at least a
few benchmarks. This is not completely trivial, for at least a couple
of reasons. The first reason is shallow, but it may be baffling for
beginners: since there is no for loop in Scheme, how are we
supposed to write a benchmark, which is usually based on running many
times the same instructions and measuring the total time spent? We
will see in a moment that there is an easy solution to this question
(use recursion). On the other hand, there is a second more serious
question: since there is no fully portable way to write a library in
Scheme, how can we write a benchmark library? There is no real answer,
so we will restrict ourselves to R6RS-compliant
Scheme implementations where there is a standard concept of library.
The for loop is missing in Scheme as a primitive construct since
it is useless in a language that guarantees tail call optimization.
If you studied the concept of tail call at college you know what
I am talking about; on the other hand, if you forgot it, or if you did
study Physics like me, it is worth spending a couple of words on
the subject. The ones who want to know more, may consult
this Wikipedia article.
The important point beyond the tail recursion concept
is that it is always possibile to convert a for
into a recursive function in tail call form, i.e. a recursive
function returning a value or a call to itself. For instance,
the Python loop:
# a loop printing 1 2 3
for i in range(1,4):
print i,
can be converted to a recursive function print_up_to_3:
def print_up_to_3(i):
if i == 4: return
print i,
return print_up_to_3(i+1)
print_up_to_3(1)
Here the last instruction of the function (the tail) is a call to
itself, hence the name tail call.
Tail call optimization is guaranteed by the
Scheme language. Scheme compilers/interpreters are able
to recognize recursive functions in tail call form and to convert
them internally in for loops. As a consequence, the programmer
has no need to write for loops directly: she can just use
recursive function. Our example would look as follows in Scheme:
(define (print-up-to-3 i)
(unless (= i 4)
(display i) (display " ")
(print-up-to-3 (+ i 1))))
(print-up-to-3 1)
This works, but it is not really readable; to improve the situation Scheme
provides a little syntactic sugar called named let:
(let loop ((i 1))
(unless (= i 4)
(display i) (display " ")
(loop (+ i 1))))
Traditionally the function in the named let construct is called loop
to make clear to the programmer that we are emulating a for loop.
In this example loop is exactly equivalent to print-up-to-3.
Let me point out two things before closing this paragraph:
there are other let forms, used to define local variables.
The simplest one is let:
> (let ((a 1) (b 2)) (+ a b)) ; => 3
The scope of a and b is limited to the current S-expression/form;
if a and b are defined outside the let form, internally
a and bshadow the outer names.
there is actually a do loop in the language, but it is cumbersome
to use and redundant because the named let allows you to perform
everything do does. I see it as a useless construct in a language
that would like to be minimalist but it is not.
As I have anticipated, libraries are the weak point of Scheme.
There are few libraries available and it is also difficult to
write portable ones. The reason is that historically Scheme
never had any standard module system until very recently, with
the R6RS document: that means nearly all current implementations
provide different and incompatible module systems.
In order to understand the reason for this serious lack, you must
understand the philosophy behind Scheme, i.e. the so called
MIT approach: things must be done well, or not at all.
For thirty years the Scheme community has not been able to converge on
a well done single module system. It is only in 2007 that a standard module
system has been blessed by the Scheme committee: but even that
was done with a lot of opposition and there are implementors who
said they will never support R6RS.
As a consequence of history and mentality, if you
want to write a library for implementation X, you need to do a lot of
boring and uninspiring work to port the library to implementations Y,
Z, W, ... (there are dozens of different implementations).
Moreover, a few implementations do not have a module system at all, so
you may be forced to solve name clashes issue by hand, changing the
names of the functions exported by your library, if they shadow names
coming from third party libraries (!)
Personally, I picked up Scheme 5 years ago, but never used it
because of the lack of a standardized module system. The main reason why
I have decided to go back to Scheme and to write this series is the
coming of the R6RS document last year. The R5RS standard has lots of defects,
but at least now I can write a library and I can have people using
different implementations install it and use it (nearly) seemlessly.
Since there is some hope for a large diffusion of R6RS module system
in the future, I have decided to use it and to ignore implementations
not supporting it. I should notice however that there are
solutions to run R6RS modules on top of R5RS implementations, like
the psyntax package, but I have not tried it, so I cannot comment
on its reliability.
As first example of usage
of the R6RS module system, I will define a repeat library exporting
a call function which is able to call a procedure n times.
Here is the code, that should be self-explanatory:
The export declaration corresponds to Python's __all__:
only the names listed in export are exported. In this case we
will export only the function (callnproc.args). Notice the dot in the argument list: that means that the functions
accept a variable number of arguments, which are collected in the list
args. In other words, .args is the moral equivalent of
*args in Python, with some difference that we will ignore for
the moment. The apply function applies the argument list to the
input procedure proc, which is called many times until the index
i reaches the value n.
(import(rnrs)) imports all the libraries of the current version of the
"Revised Report on Scheme", i.e. the R6RS report. At the REPL this is
automatically done by the system, but for batch scripts it is mandatory
(as Pythonistas say explicit is better than implicit). It is also
possible to import subsections of the whole library. For instance
(import(rnrsbase)) imports only the base library of the R6RS,
(import(rnrsio)) imports only the I/O libraries, et cetera.
The usage of the libray is trivial: it is enough to put the file
repeat.sls somewhere in the Ikarus search path (specified
by the environment variable IKARUS_LIBRARY_PATH). Then,
you can import the library as follows:
By default (import(repeat)) imports all the names exported by
the module repeat, something that a Pythonista would never do
(it is equivalent to a import*fromrepeat); fortunately it is
possible to list the names to be imported, or to add a custom prefix:
> (import (only (repeat) call)); import only call from repeat
call
#<procedure call>
> (import (prefix (repeat) repeat:)); import all with prefix repeat:
> repeat:call
#<procedure call>
The main advantage of Scheme with respect to Python is the performance.
In order to show the differences in performance I will go back to
the factorial example of episode 4. I will compare the following
Python script:
# fact.py
import sys, timeit
def fact(x):
if x == 0: return 1
else: return x * fact(x-1)
if __name__ == '__main__':
n = int(sys.argv[-1])
t = timeit.Timer('fact(%d)' % n, 'from fact import fact')
print t.repeat(1, number=10000000)
print fact(n)
with the following R6RS-compliant script (written in Ikarus Scheme):
Python manages to compute the factorial of 995, but then it faces
the stack wall and it raises a
RuntimeError:maximumrecursiondepthexceeded whereas Scheme
has no issues whatsoever;
In order to compute the factorial of 995 ten thousands times, Python
takes 15.2 seconds, whereas Ikarus takes 7.2 seconds.
Notice that since the factorial of 995 is a large number, the computation
time is spent in multiplication of large numbers, which are implemented
in C. Python has its own implementation of
long integers, whereas Ikarus uses the GNU Multiprecision library (gmp):
the times measured here mean that the gmp implementation of
long integers is more efficient than the Python one, but they
say nothing on the relative performances of the two languages.
It is more interesting to see what happens for small numbers.
For instance, in order to compute the factorial of 7 for 10 millions
of times, Python takes 30.5 seconds, whereas Ikarus takes
3.08 seconds and thus it is nearly ten times faster than Python.
This is not surprising at all, since function calls in Python
are especially slow whereas they are optimized in Scheme. Moreover
Ikarus is a native code compiler.
It means Ikarus' REPL works by compiling expressions to native code,
whereas Python's REPL compiles to bytecode. The technology is called
incremental compilation and it is commonly used in Lisp languages
from decades, even if it may look futuristic for C/C++ programmers.
The factorial example is not very practical (on purpose), but it
is significant, in the sense that it is legitimate to expect
good performances from Scheme compilers. The fastest
Scheme compiler out there is Stalin, but I would not recommend
it to beginners.
The next episodes will be devoted to the dangers of benchmarks,
do not miss it!
I'm sorry to hear that the next few articles are about benchmarking. What I was hoping to see was more about the Scheme language as seen from the point of view of a Python programmer. I hope you'll write more about that.
> Recursion is not considered the Pythonic way to do > things.
Yes, I say the same in the next episode ;)
> A faster definition of fact will be: > > from operator import mul > > def fact(n): > return reduce(mul, xrange(1, n + 1)) > > On my machine it is twice as fast compared to ikarus > (computing factorial of 995).
I cannot reproduce your results on my MacBook: reduce is indeed faster (from 15.2 to 10.5 seconds) but not as fast as Ikarus (7.2s). This is probably very dependent on the architecture and how the multiprecision library is compiled. Yet another case of tricky benchmark. For more, see the next episode!
Well, here you are comparing an iterative version of the algorithm with a pure recursive one. A more similar Scheme version is the following (tail-recursive):
(define (fact n) (define (fac n out) (if (< n 1) out (fac (- n 1) (* out n)))) (fac n 1))
The "MIT Approach" is *not* the reason that R5RS was so minimalistic and to your point, did not include a module system. Based on what I've read the reason is that the Scheme committee's members were just not interested in specifying things that exactly (there is supposedly undefined behavior all over R5RS, for example).
When you read into the history of the language you will find that the community was simply more interested in teaching and research than standardizing on a one-size-fits-all definition of the language.
Now look at the change in the committee between R5RS and R6RS, the differences between them don't explain the massive change in membership.
For contrast, read into the history of Haskell and look at how they addressed all of the different statically typed lazy languages variants (some incompatible) at the time!
Why do you say that "There is no portable module system" in big bold letters when the whole point of this article is to use R6RS which has a module system?
> Why do you say that "There is no portable module system" > in big bold letters when the whole point of this article > is to use R6RS which has a module system?
Because as of now most Scheme implementations are not R6RS-compliant. Notice that this article was originally written before PLT Scheme added support for R6RS, but even now AFAIK Chicken, Bigloo, Gambit, Gauche, STklos, ... are not supporting R6RS so you cannot write a library and port it everywhere. Moreover, the R6RS module system has unspecified dark corners, so even for R6RS-compliant implementations it is not trivial to write portable code. If you want to know more, I can refer you to recent threads in comp.lang.scheme and in the Ikarus mailing list.
I'm having trouble with Ikarus. I installed it with apt-get but when I try to "(import (only (ikarus) time))" it does it without error, but when I try to use "time" it comes up with an exception. I want to follow your posts closely but I can't without getting time to work. Also my IKARUS_LIBRARY_PATH variable hasn't been set??? I know how to set it but I don't know where the default location is.