Summary
I thought I would share my latest attempt at writing a white paper describing the implicit behavioral subtyping mechanism I am using for the current Heron prototype (yes you can download it at http://www.heron-language.com/ ). Any comments would be appreciated.
Advertisement
Abstract
The primary problem with structural subtyping, semantic malformance (e.g. contract violations), can be alleviated by introducing behavioral constraints into the specification of the subtype. This paper describes a mechanism to implement implicit behavioral subtyping, by applying structural subtyping as well as precondition and postcondition assertions to traits.
Introduction
This work is demonstrated using the Heron programming language. Heron is a statically typed experimental programming language, which is implemented using a Heron to C++ translator, HeronFront. The HeronFront tool translates trait constructs written in Heron to classes with the implicit behavioral subtyping semantics described in this paper. Apart from the introduction of traits and some rather minor syntactic difference, the Heron langauge shares C++ syntax and semantics.
Traits
Traits were first described by Schärli et al. [Sch03] as a compositional model for structuring object oriented programs. A trait is a construct similar to Java interfaces in that it is used to characterize the function signatures (sometime described as a method dictionary) for a set of types, and can be used as a polymorphic reference which binds to a conforming object. The Heron trait has similar syntax to traits in Scala 1.0 [Ode05] which has since been replaced by mixin-classes. The function signatures that it provides are known as the required functions, and are sometimes referred to as "super-sends" because they delegate the implementation to the super type.
Unlike a Java interface, a trait may also have member functions which call other member functions or one of the required functions. The member functions of a trait are referred to as extension functions to clearly delineate them from class member functions.
Structural Subtyping
Structural subtyping (also known as implicit subtyping, signature based polymorphism, and colloquially as duck-typing), is a method of deducing a subtype relationship at either run-time or compile-time. Trait instances in Heron are associated with super types using a structural subtyping mechanism.
Implicit subtyping has several advantages:
separation of mechanisms of inheritance and subtyping - the problems with this are explored in depth with regards to the C++ language by Baumgartner and Russo [Bau95].
subtyping without recompilation - new subtype relationships can be created without requiring recompilation or rewriting of the super-type. This not only increases compilation times significantly, it makes binary-only distribution of components more feasable and useful, and cross-language development more feasable.
more flexible and extensible designs - all possible subtype relationships don't have to be decided a priori, and can be restricted in scope to particular modules.
mock objects - it is easier to create mock objects for testing, a crucial component of TDD methodologies
Implicit subtyping can be achieved, rather surprisingly, in the C++ programming language using only template meta-programming techniques [Dig04], and more conveniently with macros [Tur05].
Several dynamically typed languages support structural subtyping but only a few statically typed language provide support for it. The main languages being the various ML dialects. A now discontinued extension to the GCC C++ compiler, did support a similar structure also called signatures [Bau97].
Heron Syntax
Except where stated, Heron statements have precisely the same syntax and semantics as C++. HeronFront
does a literal mapping of most statements from Heron to C++ with no checking.
Heron traits are divided up into four sections (inherits, requires, public, private), and optional template parameters.
Heron also introduces two new keywords: _result and _override.
The _result keyword corresponds to a variable which is implicitly declared at the beginning of every non-void function in a trait.
Every non-void function in a trait also contains the implicit statement "return _result;" as a last line.
A return statement takes on a new meaning in Heron, it assigns the expression to _result, and then jumps to the end of the
function, so that the postconditions can still be evaluated.
The _override keyword corresponds to a delegation of the current function to the appropriate function in the super-type.
This means that a required function in a trait with no preconditions and posconditions can be written equivalently
either as:
1)
trait T {
requires {
def f() : int;
}
}
2)
trait T {
requires {
def f() : int {
_override;
}
}
}
The actual C++ code generated for the _override keyword would resemble more or less:
_result = _get()->f();
The _get() reserved member function returns a pointer to a class which forwards the function
to the bound object.
Trait Semantics
An instance of a trait in Heron can bound to any concrete object which provides the appropriate
function signatures, for example:
/*
Heron code which we assume is translated into a C++
header file using HeronFront and #included
trait MyTrait {
require {
def set(int n);
def get() : int;
}
}
*/
class MyClass {
public:
MyClass() : m(0) { }
void set(int n) {
data = n;
}
int get() {
return data;
}
private:
int data;
};
int main() {
MyTrait t;
MyClass c;
t = c;
printf("%d\n", t.get()); // outputs 0
printf("%d\n", c.get()); // outputs 0
t.set(1);
printf("%d\n", t.get()); // outputs 1
printf("%d\n", c.get()); // outputs 1
}
Heron Trait Grammar
statement ::==
any legal C++ statement
function-param ::==
type identifier
function-return-type ::==
: type
function ::==
def identifier ( function-param*) function-return-type?{ statement*}
function-or-type-or-macro ::==
function | class | trait | macro
template-parameters ::==
[ meta-type identifier ]
inherits-section ::==
inherits{ type-specifier*}
requires-section ::==
requires{ function*}
public-section ::==
public{ function-or-type-or-macro*}
private-section ::==
private{ function-or-type-or-macro*}
trait ::==
trait identifer template-parameters?{
inherits-section?
requires-section?
public-section?
private-section?}
Same Structure, Different Semantics
Implicit subtyping has the disadvantage that syntactic conformance can occur in types which do not share semantics. An
illustrative example can be found when comparing LIFO stacks and FIFO stacks. This is illustrated in Example X.
The problem occurs when we try to execute some algorithm which expects a FifoStack, but instead receives
a LifoStack.
The proposed solution to the problem of structural conformance with semantic malformance, is to introduce
contract verification, in the form of precondition and postcondition assertions. In other words implicit
behavioral subtyping.
Contract Verification
A contract is the name collectively given to the preconditions, postconditions, and invariants, which a
type is expected to satisfy. A contract can be used to characterize the behavior of a type [Mey97], [Lis99].
Rather than creating a brand new syntax, including keywords, Heron relies solely on the C++ language facilities
to enable the programmer to define how they choose to verify the preconditions and postconditions.
Contract verification can be achieved by writing the required function using the explicit syntax and the _override
keyword, and inserting precondition and postcondition checks before and after respectiviely the _override call,
as follows:
How the pre and post functions are implemented is up to the programmer, but they can be implemented as macros create
two new macros as follows:
Implicit behavioral subtyping is a stronger form of structural subtyping which introduces
precondition and postcondition assertions into the verification. The following example
shows the differentiation between a FIFO stack and LIFO stack
The difference in behavior between the two kinds of stack ADT's, can be expressed as whether or not push
affects the status of the top, in the case of a non-empty stack. This is a non-trivial assertion, and
to express it we need the full the range of the language. To achieve this, Heron allows any C++ statement
to be used before or after the super-send (i.e. the _override keyword).
Verifying Invariants
Heron currently provides no convenient syntax for verifying invariants, however this should change in the
next major release of HeronFront. See the section "Future Direction" for more information. For the time
being an invariant can be simply explicitly checked after each required function along with the postconditions.
External Function Dispatch versus Virtual Function
Common OOPL implementations implement runtime polymorphism using the mechanism of virtual functions.
One difference with Heron is that it implements runtime function dispatch externally to the class, rather
than through the use of virtual function. This means that a pointer to a function dispatch table is
associated with the trait instance rather than the object instance. This approach carries several advantages
intra-method calls can be more easily inlined
the memory and cpu overhead associated with polymorphism only occurs when the polymorphism is needed
rather than throughout the lifetime of every single object.
no more undiscovered bugs due to unanticipated virtual function overrides - testing of OO designs
frequently overlooks the various possibilities of virtual function override abuse.
The Generated C++ Code
HeronFront generates C++ classes which support signature matching semantics. This is accomplished using template
meta-programming techniques. Every trait is associated with a single class which has a template constructor, a
template assignment operator, and a dispatcher object along with the expected required and extension functions.
The dispatcher object is implemented as a concrete subtype inheriting from an abstract supertype which contains a pure
virtual function for each required function in the trait. The extension functions are implemented directly in the trait class.
Example X contains an example of code generated for a simple trait.
This code is quite complex, but the most important characteristics are:
it has the same memory footprint as two void pointers
it uses the C++ virtual function dispatch mechanism internally to dynamically dispatch functions rather
than the more unweildy function pointer table approach first described in [Dig04].
it uses placement new to construct the correct data type internally
It should be noted that the technique does make some assumptions about alignment of pointers, which will
affect the portability of the technique in some cases. The trade-off however is that the method used
is very compact and efficient.
Future Directions
The largest gap in this work is the inability to express a class invariant in a single location. Class invariants currently have to be verified with the preconditions and postconditions (arguably either one is sufficient). A class invariant is an example of a crosscutting concern. Direct support for class invariant verification can be added to a programming language such as Java or C++ with relative ease, but this ignores the more general problem of the inability to express crosscutting concerns. Ideally a language would provide such a mechanism which can then be used to implement invariant verification, along with other crosscutting concerns.
It is the author's desire to introduce an extremely minimal form of aspect oriented programming support [Kic97] into Heron in the form of macros which are automatically inserted around each member function. These macros would be ideally associated with a class, or a module, and can be inherited from one class to another.
There is a possibility of introducing a mutable state into traits in the context of Heron.
Acknowledgements
Special thanks to Peter Grogono. Thanks also to the many people at Artima.com who have patiently discussed the topics of structual and behavioral subtyping with me.
References
[Lis99] Behavioral Subtyping Using Invariants and Constraints, July 1999, Barabara H Liskov, Jeanette M Wing, CMU-CS-99-156
[Mey97] B. Meyer. Object-Oriented Software Construction, Second Edition. Prentice Hall, 1997.
[Bau95] Signatures: A Language Extension for Improving Type Abstraction and Subtype Polymorphism in C++ (1995) Gerald Baumgartner,
Vincent F. Russo Software--Practice \& Experience
[Bau97] Implementing signatures for C++, Gerald Baumgartner, Vincent F. Russo, ACM Transactions on Programming Languages and Systems
(TOPLAS) archive volume 19, Issue 1 (January 1997) Pages: 153 - 187
[Kic97] Kiczales, Gregor, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Lopes, Jean-Marc Loingtier, and John Irwin (1997).
"Aspect-Oriented Programming" Proceedings of the European Conference on Object-Oriented Programming, vol.1241, pp.220242.
[Sch03] Traits: Composable Units of Behavior,
Nathanael Schärli, Stéphane Ducasse, Oscar Nierstrasz, and Andrew P. Black.
University of Berne / OGI School of Science & Engineering, May 2003 Proceedings of ECOOP 2003
[Dig04] C++ with Interfaces, Christopher Diggins, September 2004 C++ Users Journal
Well, it's a nice article. I mind one thing, it gives reader the impression that traits solve some serious problems of multiple inheritance easily, which is not true. So in this reply I'm realy going to attack traits and "structural subtyping" (I hate structural subtyping).
> more flexible and extensible designs - all possible > subtype relationships don't have to be decided a priori, > and can be restricted in scope to particular modules.
I must comment on this one, because this is important. The other advantages outlined here can be easily added to nominal subtyping. I think that it is impossible to specify all subtype relations a priori because there are usually so many of them, and developers often specify only subtype relations they really need for program to work. Specifying all relations would be a big waste of development time. And also, when using libraries, it may be impossible to add new subtype relations, end that usually ends up in creation of (one,separate) wrapper for every library for every program that uses it.
I'm really sick of writing all those wrappers. And there are some things that wrappers can not do compared to original library.
First, I must say that traits are not a solution for this problem. Because they depend on name matches (same structure). What's the chance that two developers, working on the same project, will choose same names where it is required? Chances are very small. So they need to collaborate on which subtype relations they need, and consequently decide the names of methods. This is, again, subtyping a-priori. What are the chances that two library vendors will choose the "same structure" for classes where functionality of their libraries overlap (so that there are subtype relations)? Very small. So there go wrappers again. To make a comparison I'll say this: I might have won 500$ on lottery yesterday, but I dont plan my budget on being that lucky every day. Similarly, there might be some lucky "structure matches" in classes made by different developers, but I won't plan development of my programs on that happening each and every time I need a subtype relation. In short, subtyping with traits is subtyping a-priori. Oh, yes it is.
Second, people usually think that with nominal subtyping you need to specify all subtype relations a-priori. This is not true. The problem is that today's programming languages dont allow you to specify subtype relations a-posteriori. The problem isn't intrinsic to nominal subtyping. And I know a way how to implement subtyping a-posteriori with nominal subtyping. And I think that subtyping a-posteriori is great.
> The _result keyword corresponds to a variable which is > implicitly declared at the beginning of every non-void > function in a trait.
I always wanted something like _result keyword added to C++. I like this _result keyword. Great.
> The proposed solution to the problem of structural > conformance with semantic malformance, is to introduce > contract verification, in the form of precondition and > postcondition assertions. In other words implicit > behavioral subtyping.
I reject this "proposed solution". I will argue that this certainly doesn't solve the problem. It might be alleviating the problem a little, but just a little.
The problem with behavioral subtyping is that, in order for it to avoid "semantic malformance", the developer must specify ALL postconditions. This might not be possible given some specific interface. And has other problems.
For my argument, I will take your example from section "Implicit Behavioral Subtyping". Of course, you didn't specify ALL postconditions (think why you didn't. I'll answer: Because it's tedious). So it is easy to construct a concrete class that demonstrates "semantic malformance". Intentionally, I will make it artificial.
class StrangeLifoStack { int a[100]; int n; public: StrangeLifoStack() : n(0) {} void push(int x){ a[n++]=x; a[n++]=x; } int pop(){return a[--n];} int& top() {return a[n-1];} int& top() {return a[n-1];} bool is_empty(){ return n==0;} }
Now trait LifoStack can be easily added to class StrangeLifoStack. This is wrong because the user of LifoStack will (i guess) presume that push() pushes single element.
Now, I think that you are thinking :) something on the following lines: a) Postconditions and preconditions are a good thing. b) Every function should have postconditions and preconditions c) Since b) is true, then behavioral subtyping can alleviate "semantic malformance". This solves biggest problem of structural subtyping, so we can use it instead of nominal subtyping.
I agree on a). I dont agree on b). First, it's ambiguous. It's ambiguous because it doesn't say should function have all postconditions or some postconditions. I think that functions should have some postcondiotions, where it is convenient. But they shouldn't have all. And you need every function to have all postconditions for c) to be true.
Secondly, verifying all postconditions is nearly impossible, meaning its tedious. The class may not have interface required to check all postconditions. And also, writing all postconditions is practically equivalent of reimplementing the whole class, if not worst. Thus, you might need to reimplement the whole class in every interface that a class is subtype of. O-o.
I think b) is simply not viable. Then it looks like c) is not true.
....
I was also looking at how Heron is translated to C++. Trait has pointer to object. So, I was thinking, if someone needs mixin of mixin of mixin, that would require following pointer to pointer to pointer. This is very inefficient. Or maybe I'm wrong somewhere?
And did you actually compile those examples? I have noticed a possible error in trait Stack, where pop() method returns nothing?
> @Christopher, > > I still don't get the difference between public and > requires. You appear to be able to use _override in each > case. Can you enlighten me?
Sorry still for the confusion.
You can't use _override in a public function. A required function must occur in the super-type in order for the strcutural conformance to occur. A public function in a trait, which I refer to as extension functions, is a function which is not delegated to the super-type but instead calls required function, or other extension functions.
> I mind one thing, it gives > reader the impression that traits solve some serious > problems of multiple inheritance easily, which is not > true.
I really don't see how I do that, I don't even mention multiple inheritance. Could you explain how I give that impression?
> So in this reply I'm realy going to attack traits > and "structural subtyping" (I hate structural subtyping). > > > more flexible and extensible designs - all possible > > subtype relationships don't have to be decided a > priori, > > and can be restricted in scope to particular modules. > > I must comment on this one, because this is important. The > other advantages outlined here can be easily added to > nominal subtyping.
Ok.
> I think that it is impossible to > specify all subtype relations a priori because there are > usually so many of them
I don't agree that it is impossible.
>, and developers often specify only > subtype relations they really need for program to work. > Specifying all relations would be a big waste of > development time. And also, when using libraries, it may > be impossible to add new subtype relations, end that > usually ends up in creation of (one,separate) wrapper for > every library for every program that uses it. > > I'm really sick of writing all those wrappers. And there > are some things that wrappers can not do compared to > original library. > > First, I must say that traits are not a solution for this > problem. Because they depend on name matches (same > structure). What's the chance that two developers, working > on the same project, will choose same names where it is > required? Chances are very small. So they need to > collaborate on which subtype relations they need, and > consequently decide the names of methods. This is, again, > subtyping a-priori.
I see, this is a good argument.
> What are the chances that two library > vendors will choose the "same structure" for classes where > functionality of their libraries overlap (so that there > are subtype relations)? Very small. So there go wrappers > again. To make a comparison I'll say this: I might have > won 500$ on lottery yesterday, but I dont plan my budget > on being that lucky every day. Similarly, there might be > some lucky "structure matches" in classes made by > different developers, but I won't plan development of my > programs on that happening each and every time I need a > subtype relation. In short, subtyping with traits is > subtyping a-priori. Oh, yes it is.
You could use adapter classes, which repair mismatched names, and allow you to construct hierarchies after the fact. In fact I believe this would be similar to how any a-posteriori subtyping mechanism would have to work as well, to compensate for the same problem of name mismatches.
> Second, people usually think that with nominal subtyping > you need to specify all subtype relations a-priori. This > is not true. The problem is that today's programming > languages dont allow you to specify subtype relations > a-posteriori. The problem isn't intrinsic to nominal > subtyping. And I know a way how to implement subtyping > a-posteriori with nominal subtyping. And I think that > subtyping a-posteriori is great.
How do you overcome the method naming problem, where the supertyep and subtype names are matched.
> > The _result keyword corresponds to a variable which is > > implicitly declared at the beginning of every non-void > > function in a trait. > > I always wanted something like _result keyword added to > C++. I like this _result keyword. Great.
Thanks, I stole it from Pascal & Eiffel.
> > The proposed solution to the problem of structural > > conformance with semantic malformance, is to introduce > > contract verification, in the form of precondition and > > postcondition assertions. In other words implicit > > behavioral subtyping. > > I reject this "proposed solution". I will argue that this > certainly doesn't solve the problem. It might be > alleviating the problem a little, but just a little. > > The problem with behavioral subtyping is that, in order > for it to avoid "semantic malformance", the developer must > specify ALL postconditions. This might not be possible > given some specific interface. And has other problems.
Yes for it to work it depends on a sufficiently complete interface. The tiny interfaces we are using for examples are extremely poor. The more sophsticated a trait, the easier it is to specify pre/postconditions. This means that a programmer can work with the system to provide sufficient interfaces in order to specify the contract.
However, even at that, contract verification is not a complete solution. But I think it is a darn good partial solution.
> For my argument, I will take your example from section > "Implicit Behavioral Subtyping". Of course, you didn't > specify ALL postconditions (think why you didn't. I'll > answer: Because it's tedious).
And the interface is insufficient.
> So it is easy to construct > a concrete class that demonstrates "semantic malformance". > Intentionally, I will make it artificial.
[example cut]
I have to agree that your example is correct. However on a bigger scale question: is this really a significant enough problem to dismiss the idea of implicit behavioral subtyping altogether? The real issue is how likely are these examples in carefully designed software.
> Now, I think that you are thinking :) something on the > following lines: > a) Postconditions and preconditions are a good thing.
Yes.
> b) Every function should have postconditions and > preconditions
I assume you mean postcondition and precondition checks. The answer is yes, in an ideal world, but in practice it isn't always possible.
> c) Since b) is true, then behavioral subtyping can > alleviate "semantic malformance". This solves biggest > problem of structural subtyping, so we can use it instead > of nominal subtyping.
I would have state (c) in a weaker form, e.g. implicit behavioral subtyping, can help alleviate semantic malformance.
> I agree on a). I dont agree on b). First, it's ambiguous. > It's ambiguous because it doesn't say should function have > all postconditions or some postconditions. I think that > functions should have some postcondiotions, where it is > convenient. But they shouldn't have all. And you need > every function to have all postconditions for c) to be > true. > > Secondly, verifying all postconditions is nearly > impossible, meaning its tedious. The class may not have > interface required to check all postconditions. And also, > writing all postconditions is practically equivalent of > reimplementing the whole class, if not worst. Thus, you > might need to reimplement the whole class in every > interface that a class is subtype of. O-o. > > I think b) is simply not viable. Then it looks like c) is > not true.
In the strong form, I agree.
> I was also looking at how Heron is translated to C++. > Trait has pointer to object. So, I was thinking, if > someone needs mixin of mixin of mixin, that would require > following pointer to pointer to pointer. This is very > inefficient. Or maybe I'm wrong somewhere?
I would not call them mixins, but your point is correct. This is a limitation of the current implementation. This can be overcome with a relatively small amount of work on my part to fix the compiler.
> And did you actually compile those examples?
Yes.
> I have > noticed a possible error in trait Stack, where pop() > method returns nothing?
pop() does return nothing in all the examples consistently much like the C++ standard library. It is a more efficient implementation, where if a return value is desired, the user must call top().
I want to turn the tables on you a bit, AFAICS there is only one issue you have brought up against implicit behavioral subtyping which is the possibility of unchecked unintentional subtyping relationships. I admit this is a distinct possibility, however, mistakes of this nature can be made with a nominal system as well. So why do you feel so strongly about structural subtyping, and implicit behavioral subtyping.
Thanks for taking the time to discuss this at such length with me.
> > I mind one thing, it gives > > reader the impression that traits solve some serious > > problems of multiple inheritance easily, which is not > > true. > > I really don't see how I do that, I don't even mention > multiple inheritance. Could you explain how I give that > impression?
Maybe my statement is over exaggerated because of my personal bias against structural subtyping. Basically, I mind the section "Structural Subtyping", where you present it's advantages. And those advantages are based on implementations of multiple inheritance in contemporary programming languages.
> You could use adapter classes, which repair mismatched > names, and allow you to construct hierarchies after the > fact. In fact I believe this would be similar to how any > a-posteriori subtyping mechanism would have to work as > well, to compensate for the same problem of name > mismatches.
Adapter classes (I like to call them "wrappers") cannot solve the problem completely. There are some pathological cases where they simply cannot substitute original classes. And they might be inefficient (I'm mot talking about cost of delegation one pointer dereference and one function call, I'm talking about some serious inefficiency problems). The problem arises when trying to use value semantics. That means that structures from mathematics (integer, real, matrices etc.) are most endangered.
> How do you overcome the method naming problem, where the > supertyep and subtype names are matched.
The truth is that, in the end, some kind of adapter is required to fix the name mismatches and add missing functions. But the mechanism must be more powerful than simple inheritance as implemented in contemporary programming languages. It must be able to alias and rename methods in original class (not the adapter) and add functions to original class, all of that a-posteriori. It is possible.
The real problem isn't in method names. It's always easy to add renaming. The problem is in adding the subtype relation, since vtable of supertype and subtype will not be flexible enough (given the incremental implementation of inheritance in C++ for example).
Argh... this requires long discussion. But, in short, you dont compile class definitions. You only compile functions. This requires a very complicated linker. This might sound stupid at first glance, but think of this: why do contemporary programming languages compile class definitions? Class definition is basically: interface, function definitions and fields. There is no need to compile interface, its declaration only. Functions should be compiled. Fields should be accessed through getters and setters, making them (getters and setters) only thing that needs recompiling. And the vtable needs to be recompiled, of course, that's the whole point (meaning: subtype relationships need recompiling).
Also, not all classes need recompiling. The program could be divided into modules where some classes are visible only within their modules. And if you still think it's slow, I have a solution how to implement this ultra-fast (complicated solution).
There is also another way. Use the same mechanism that's available to dynamically typed languages. Build a big table of classes and functions. Compress it. For speed, use polymorphic inline cache. However, I dont like this because I think it's still too slow. That's probably because I'm a game developer and performance matters to me a lot.
> Yes for it to work it depends on a sufficiently complete > interface. The tiny interfaces we are using for examples > are extremely poor. The more sophsticated a trait, the > easier it is to specify pre/postconditions. This means > that > a programmer can work with the system to provide > sufficient interfaces in order to specify the contract. > > However, even at that, contract verification is not a > complete solution. But I think it is a darn good partial > solution.
> I have to agree that your example is correct. However on a > bigger scale question: is this really a significant enough > problem to dismiss the idea of implicit behavioral > subtyping altogether? The real issue is how likely are > these examples in carefully designed software.
At this moment, I'm not even considering how likely these examples are. My reasoning is the following: S=structural subtyping, N=nominal subtyping, but you can also generalize. - N can do all that S can (in equally nice, efficient manner) - S can do all that N can (in equally nice, efficient manner), except for issue X. Irregardless of how big X is, why implement S instead of N?
Now X here means a) switching some error checks from compile time to run time. This is bad, if not bad enough. b) Some other potential problems (efficiency, increased development time, type errors) which we dont know yet how bad they are, but they might turn out really bad. Why would anyone take the risk?
> I want to turn the tables on you a bit, AFAICS there is > only one issue you have brought up against implicit > behavioral subtyping which is the possibility of unchecked > unintentional subtyping relationships. I admit this is a > distinct possibility, however, mistakes of this nature can > be made with a nominal system as well. So why do you feel > so strongly about structural subtyping, and implicit > behavioral subtyping.
It's not the same thing. With nominal subtyping, you can make mistake only at one place, at class definition. With structural subtyping, you can make the mistake every time you use the object.
However, this made me wonder about another thing. Please, first consider following example:
When declaring variables in C++ (not when declaring functions or declaring pointers), there is keyword const. Like in: int main(){ const int x;
This "const" keyword here seems completely useless. The compiler can easily figure if variable x is const or not, depending on usage. The const keyword here is to remind programmer not to make a stupid mistake. Programmer puts this keyword as some kind of marker. Const used in this context is completely useless to compiler. But it is useful to programmer because compiler can find some serious errors for him. Accidentally, const here also makes code more readable because programmer's intentions are clear. To quote wikipedia: "Such proactive use of const makes values "easier to understand, track, and reason about,"[1] and thus, it increases the readability and comprehensibility of code and makes working in teams and maintaining code simpler because it communicates something about a value's intended use."
I was thinking: what if a compiler could, with 100% certainty, deduce (or guess, doesn't matter) all subtype relationships. Then there would be no need for nominal subtyping. But is there something else in nominal subtyping which makes it good? I came to conclusion that there might be just one thing: what if someone wants to specify subtype relation a-priori, before he even writes the classes/interfaces, maybe like some kind of marker or qualifyer. Like const example, where programmer specifies that he wants x to be constant, but he hasn't even used x yet. So with nominal subtyping, programmer can specify intended use, which might be good.
Continuing on the same lines of thought, I got to following conclusion: The subtype specification is actually part of the interface. Because it enables automatic type conversion (pointer-to-typeA to pointer-to-typeB). I will put that another way: subtype specification adds functionality to pointers, meaning that subtype specification modifies pointer's interface. If you leave out subtype specification someone may attempt to change your class, and in doing that he may break some subtype relations that are required in order for program to work. And with behavioral subtyping, compiler won't know that there is an error until program is run and program actually tries to use automatic conversion which is now missing (it will end up in failed postcondition check).
So, in conlusion, structural subtyping in neither a-priori, neither a-posteriori. Nominal subtyping is a-priori, and can also be made to work a-posteriori.
> > Thanks for taking the time to discuss this at such length > with me.
Thats because I like Heron, don't like structural subtyping, so maybe you will drop it, or I'll learn something new and start to like it.
You said that structural malformance might not be likely in well designed systems. Now, it looks to me that it might be likely. Just look at types like const unsigned, const int, const int8, const int16, const int32, const long, maybe even const float. They are all syntactically conformant This even reminds me of C programming language where you can do:
int a=0; unsigned b=0; char c=0; float d=0; a=b; b=a; c=b; a=d; //this one produces warning. // .... and others. It all works in my compiler.
I buy the argument that nominal subtyping is useful for a programmer to restrict certain sub-type relationships.
I am now seriously considering adding a feature to a later version to Heron, such as a "nominal" tag one may add to a trait, to assure that it can only be explicitly subtyped, and I would add a section to classes, such as "implements x", indicating that the compiler can assume the subtype relationship.
> I buy the argument that nominal subtyping is useful for a > programmer to restrict certain sub-type relationships.
> I am now seriously considering adding a feature to a later > version to Heron, such as a "nominal" tag one may add to a > trait, to assure that it can only be explicitly subtyped, and I > would add a section to classes, such as "implements x", indicating > that the compiler can assume the subtype relationship.
This looks ok to me. Developer can either choose to do nominal subtyping, or structural if he is too lazy to write subtype specification, doesn't care or it doesn't matter to him. Personally I would always use that nominal keyword.
With this addition, Heron does better than any contemporary programming language that I know. So thats good, at least good enough for me. But I want more. There are still some things bothering me.
I want to put up two more arguments against structural subtyping/traits. I hope I didn't bore you to death with all this requirements. To be honest, I don't expect you to accept this two because I noticed that most programming language designers don't share my opinion about them.
First thing is that I like renaming. My argument is that we give names to indentifiers based on real world objects that we perceive to have properties similar to structures that we implement in our programs. To do that, we use natural language. But it has synonyms, which can cause name mismatches, and homonyms which cause name conflicts. Also, meaning of words depends on context. For example, queue may have method pop, which is called pop_front when in deque and may be called next_patient in software for hospital. Renaming and aliasing can be used to resolve those conflicts, especially when there are multiple developer groups working on the same software and using libraries from multiple vendors. And even within the same developer group, there may be need to switch names as context changes, for example there could be multiply_matrix function in module for matrix operations which can be renamed to transform_coordinates in 3D geometry module.
My second example is one pathological case (but this case might be frequent in libraries). Let's say that there is a library L1 that manipulates some objects that cannot be copied (for example, large matrices). This library is some kind of container responsible for creation and destruction of those objects (nothing strange- all STL containers do that). Also, this is not a generic library, meaning that it needs matrices of type M1, which is defined by the library. This is nothing unusual, because this library may need some specific information from those matrices in order to manipulate them. Like: it may sort matrices by determinant. In order for developer to access the matrices, library provides pointer to them to avoid copying.
But, developer needs to use another library L2 that also manipulates large matrices of type M2, defined by library L2. M1 and M2 are actually the same type (I feel controversial today, so I'll say it this way: M1 is subtype of M2 and M2 is subtype of M1), but those libraries provide no subtype relations M2<M1 or M1<M2. L2 has methods to manipulate matrices that accept pointer-to-M2. This means that developer needs to create relation M1 is subtype of M2 a posteriori. Of course, he can not do that because M1 is defined by the library L1 and cannot be changed. To make things more interesting lets also say that method names of M1 and M2 don't match, but also M2 misses some functions that M1 has (that can be easily implemented given public interface of M2), and M1 misses some functions that M2 has (that can be easily implemented given public interface of M1).
What now?
First, my opinion is that there is nothing wrong with those libraries. Maybe the whole situation could be circumvented if L1 doesn't create objects itself, but delegates this task to developer. But here thats not the case, and also it's ugly (imagine that stl::vector delegates creation of it's objects to user- it would be ugly).
The whole situation is easily solved by subtyping a-posteriori that I described earlier.
So, my last problem is that it would be really hard to add renaming or subtyping a-posteriori to programming language having traits/structural subtyping, but with multiple inheritance that could be added relatively easy.
May I ask you that, if you dont agree with me about this two controversial ideas, to explain why because I would be very interested to hear some opinions.
> With this addition, Heron does better than any > contemporary programming language that I know. So thats > good, at least good enough for me. But I want more. There > are still some things bothering me.
That is nice of you to say.
[taken from bottom] > May I ask you that, if you dont agree with me about this > two controversial ideas, to explain why because I would be > very interested to hear some opinions.
Okay.
You keep talking about comparing the traits mechanism I am proposing with multiple inheritance. Inheritance and subtyping are fundamentally different ideas. Any subtyping mechanism which depends on inheritance is flawed because it results in undesirable structural dependencies between subtypes.
> First thing is that I like renaming.
Renaming is completely possible within the proposed Heron traits system:
> method names of M1 and M2 don't match, but also M2 misses > some functions that M1 has (that can be easily > implemented given public interface of M2), and M1 misses > some functions that M2 has (that can be easily > implemented given public interface of M1).
You said they are the same type, but with different methods can you really say they are the same type? Perhaps they share characteristics and internal structure, but I can't see how one can claim they are the same type, or subtypes of each other.
Anyway I see no problem here, I would use a wrapper trait like I describe above. A wrapper trait can change, or even introduce new behaviour.
class L1 { public { class M1 { public { def f1(); } } trait T1 { requires { def f1(); } } } def multiply(T1 x, T1 y) { ... } }
class L2 { public { class M2 { public { def f2(); } } trait T2 { requires { def f2(); } } } def multiply(T1 x, T1 y) { ... } }
I guess you probably assumed that the libraries would be written to manipulate classes and not traits. In my mind that would be far too constricted of a design. The rule of thumb for programming in Heron is to use traits wherever possible, and classes or templates only when needed.
If efficiency is of the utmost concern (because traits are only marginally slower than classes), one should really use generics, and the whole issue of subtyping relationships becomes moot.
There is this one thing I don't understand, and I cant reply until this is clear to me.
def main() { CA a; TB b = Renamer(a); }
Now, I understand what you are trying to do, but I'm thinking about how would this be translated by compiler. My understanding of this code is that there is object a, then you create trait of type "Renamer" that points to object a, then you create trait of type TB that points to that trait of type Renamer.
Now, as far as I know, heron won't have garbage collector? So, then, who is responsible for deallocating that trait of type Renamer?
With no garbage collector, programs will be plagued with this problem. So I must have misunderstood something.
> There is this one thing I don't understand, and I cant > reply until this is clear to me. > > def main() { > CA a; > TB b = Renamer(a); > } > > Now, I understand what you are trying to do, but I'm > thinking about how would this be translated by compiler. > My understanding of this code is that there is object a, > then you create trait of type "Renamer" that points to > object a, then you create trait of type TB that points to > that trait of type Renamer. > > Now, as far as I know, heron won't have garbage collector? > So, then, who is responsible for deallocating that trait > of type Renamer? > > With no garbage collector, programs will be plagued with > this problem. So I must have misunderstood something. > > Please, can you explain?
The object "a" will be deleted when it goes out of scope. Traits works more or less like references in C++.
/* Now, as far as I know, heron won't have garbage collector? So, then, who is responsible for deallocating that trait of type Renamer? */
Especially given the follow-up:
/* Let me ask the question this way ... would this work:
def main() { CA x; Renamer *y = new Renamer(x); TB z = *y;
delete y;
//will this work? z.b(); } */
Heron does not have garbage collection, *but* its memory management is a little more structured than C++'s. Basically, you have different kinds of pointers -- the two important ones here are weak refrences and strong references. The delete y operation will only work if y is declared as a strong reference, meaning "y owns the object pointed to, and can delete the object." Generally you won't want to have many strong/ownership references to objects. Instead you'd create weak references, which *cannot* delete the object (compile-time error, so the programmer is still around to decide if he really wants to delete the pointer).
I finally read the Traits thesis. As presented in the thesis, traits are stateless (although they can demand that the object being wrapped have get_y or set_z methods). Will Heron follow suit?
Flat View: This topic has 35 replies
on 3 pages
[
123
|
»
]