Summary
I am exploring the possibility of defining
macros as patterns and transformation applied to an s-expression representation of the syntax. Maybe this is the path to writing a fully extendible language?
Advertisement
I want Heron macros to be completely unrecognizable from other language constructs.
For instance I want to be able to define a repeat (x) {...} construct within a library
So far the most powerful semantics I can come up with is to define macros
as patterns which operate on a typed s-expression representation of the parse tree.
Consider the following code:
repeat (n) { writeln("hello"); }
The first-level s-expression representation would be:
(repeat, (n), (writeln, ("hello"))));
The syntax for declaration of the repeat macro which I am considering would be:
macro ("repeat", list n, code x) = {
def f $x; // avoid potential name collision with i
int i = $n;
while (i > 0) {
f();
--i;
}
);
Now the syntax is a bit strange with "repeat" as part of the argument list.
This is because the macro is defined as a pattern to be matched and a transformation
to be applied to the s-expression when the pattern is found.
To understand why I did it that way consider the if/then/else macro, which I also
want to define in terms of a primitive construct called cond.
Coming back to the repeat macro, I would also want to provide a specialization of the pattern so that it can be unrolled if the parameter is a small compile-time constant:
macro ("repeat", 'int n, code x) =
(cond,
(n > 5), (repeat, $n, x), // too many iterations for unrolloing
(n == 1), $x,
(n == 0), {},
true, { $x; (repeat, (n - 1), x; }
);
If this idea has any merit, it may be conciveable to define very different syntactic structures for the same language. One possible application of this technology would be for defining domain specific languages.
I have been also looking into way to map symbols to macros while also expressing operator precedence. One approach I am considering is as follows:
macro (a, "=", b) = (invoke, (_eq, a, (b)));
macro ('type a, 'name b, "=", ...)
= ((let, (a, b)), (invoke, (a, _init, (...))));
macro (a, "+", b) = (invoke, (_plus, a, (b)))
| (a, "-", b) = (invoke, (_plus, a, (b)));
macro (a, "*", b) = (invoke, (_star, a, (b)))
| (a, "/", b) = (invoke, (_slash, a, (b)))
| (a, "%", b) = (invoke, (_slash, a, (b)));
The precedence of a macro is defined bottom-up (the most recently defined macro has the highest precedence), but a single macro can consist of multiple transformations separated by "|", thus representing transformations with similar precedence.
To try and clarify what the above syntax means:
"invoke" is a primitive construct which calls a function method. The first argument is the name of the method, the second argument is the object upon which to invoke the method, and the third argument is a list of arguments passed to the method.
"let" is a primitive construct which declares a variable. The first argument is the type, and the second argument is the name.
Why not few steps further - create macro language strong enough to be able to express rest of the language? Well I do not argue this may be afar from what you are trying to achieve with Heron. :)
> Why not few steps further - create macro language strong > enough to be able to express rest of the language? Well I > do not argue this may be afar from what you are trying to > achieve with Heron. :)
If I understand you correctly this is what I am reaching towards. Do you have any suggestions on how specifically I could strengthen the macro language?
I'm currently exploring possibilities to write extension languages for Python. I've chosen an AST transformer approach that seemed reasonable because it is easy to add new grammar rules to the existing grammar file and pure Python parsers are already available. Once you have parsed an extension language you can transform the AST* of your extension into an AST of the host language e.g. Python or Heron. The target AST can finally be compiled into native code or bytecodes ( depending on the language ). Allthough "AST surgery" can be awkward I try to find powerfull and simple operators for many common use cases. For this purpose I created a library of functions that enable creating ASTs from the scratch. The result will be a canonical meta-extension system that is completely defined by the underlying language grammar. I don't start thinking about representing the operations using S-expressions ( which may be natural if the language and the meta-language are one as in LISP and dervivaties ) but represent them simply as Python functions. Instead of defining the meta-system as a new language I'm creating just a library of transformers. Maybe this is the most direct and powerfull approach that works without changing the runtime which is clearly one of the requirements.