This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: Smalltalk code completion
Feed Title: Travis Griggs - Blog
Feed URL: http://www.cincomsmalltalk.com/rssBlog/travis-rss.xml
Feed Description: This TAG Line is Extra
People like bugging me about this. I wrote AutoComplete once upon a time, and I guess they hope I'd be interested in doing something more powerful. AutoComplete was inspired by Linux shell tab completion. It was never really meant to be anything more. I wanted it to work through all text editors in the system, because at the time, we still used simple text dialogs for browsing senders of selectors and classes, etc.
I am not immune to wondering how some sort of IntelliSense-like thing for Smalltalk might work, though. First and foremost, you have to specialize on the code editor rather than take a general, every-text-widget approach. After that, it seems like there are four general styles of completion:
1. Variables. This is pretty straightforward. If you know the compilation environment of the method target, then it's just a hierarchical search upwards.
2. Messages that directly follow variables. If you know the type of the variable, then you can confine the search to just the messages the receiver would respond to.
3. Messages that have as their receiver the result of another message send (e.g. Object new yourself). Those so called "Law of Demeter violations."
4. Literals and other miscellany. There's nothing to really complete here, so we'll leave this out of consideration.
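The first two styles can be sketched in a few lines. This is only an illustration in Python, not how AutoComplete or any Smalltalk tool works: the `Scope` class, both function names, and the sample variable names are hypothetical, and Python's own `str` class stands in for a receiver whose type is already known.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Scope:
    """One level of a hypothetical compilation environment."""
    names: Set[str]
    parent: Optional["Scope"] = None


def complete_variable(prefix: str, scope: Optional[Scope]) -> List[str]:
    """Variable completion: search upwards through the enclosing scopes."""
    found: List[str] = []
    while scope is not None:
        # Innermost scope first, so nearer names are offered before outer ones.
        for name in sorted(scope.names):
            if name.startswith(prefix) and name not in found:
                found.append(name)
        scope = scope.parent
    return found


def complete_message(prefix: str, receiver_class: type) -> List[str]:
    """Message completion: a known receiver type confines the candidates."""
    return sorted(
        name
        for name in dir(receiver_class)
        if name.startswith(prefix) and not name.startswith("_")
    )


# Assumed sample environment: method temporaries, then instance variables,
# then globals.
globals_scope = Scope({"Transcript", "Object", "OrderedCollection"})
instance_scope = Scope({"items", "owner"}, parent=globals_scope)
method_scope = Scope({"index", "other"}, parent=instance_scope)

print(complete_variable("o", method_scope))
print(complete_message("st", str))
```

The third style is the hard one precisely because there is no `receiver_class` to hand to something like `complete_message` until the result type of the inner send is known.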
Roel wanted me to look at using his type inferencer at OOPSLA. I think the idea was that it could be used for #2. I've been noodling about this for a while now (way back burner). Indeed something would be better than nothing. But how much would that get us? I've been curious. What is the split between #'s 2 and 3?
So I slapped together a little RBProgramNodeVisitor subclass: ReceiverClassifier (two 'er' endings in one class name. shudder). I even threw it in the Open Repository; I'm kind of curious what others might find (inspect ReceiverClassifier parseSystem, and take a break while it runs). Basically, it parses methods, looks at the receiver of each message send, and classifies it by kind. Here's what I got for my normal development image (sorted highest to lowest).
Receiver    Sends
message     145977
local       143469
self        84724
outer       47215
literal     17382
instance    17209
super       4031
block       3088
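The classification pass described above can be approximated with a small tree walk. The following is a rough Python sketch, not the ReceiverClassifier source: the node classes are hypothetical stand-ins for parse nodes, and since the post does not define "outer", the sketch assumes it means any variable that is neither a local nor an instance variable (a global or class name, for example).

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Set


@dataclass
class VariableNode:
    name: str


@dataclass
class LiteralNode:
    value: object


@dataclass
class BlockNode:
    body: List[object]


@dataclass
class MessageNode:
    receiver: object
    selector: str
    arguments: List[object]


def classify(receiver, local_names: Set[str], instance_names: Set[str]) -> str:
    """Name the kind of receiver a message send has."""
    if isinstance(receiver, MessageNode):
        return "message"
    if isinstance(receiver, LiteralNode):
        return "literal"
    if isinstance(receiver, BlockNode):
        return "block"
    if receiver.name in ("self", "super"):
        return receiver.name
    if receiver.name in local_names:
        return "local"
    if receiver.name in instance_names:
        return "instance"
    return "outer"  # assumption: everything else counts as "outer"


def tally(node, local_names, instance_names, counts: Counter) -> None:
    """Visit every send in the tree and count its receiver kind."""
    if isinstance(node, MessageNode):
        counts[classify(node.receiver, local_names, instance_names)] += 1
        tally(node.receiver, local_names, instance_names, counts)
        for argument in node.arguments:
            tally(argument, local_names, instance_names, counts)
    elif isinstance(node, BlockNode):
        for statement in node.body:
            tally(statement, local_names, instance_names, counts)


# Object new yourself
first = MessageNode(MessageNode(VariableNode("Object"), "new", []), "yourself", [])
# (items select: [:each | each isNil]) isEmpty
inner_block = BlockNode([MessageNode(VariableNode("each"), "isNil", [])])
second = MessageNode(
    MessageNode(VariableNode("items"), "select:", [inner_block]), "isEmpty", []
)

counts = Counter()
for tree in (first, second):
    tally(tree, {"each"}, {"items"}, counts)
print(counts.most_common())
```

For simplicity the block parameter `each` is passed in as a plain local; a real visitor would track block and method scopes as it descends.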
Another interesting statistic I tracked is how often there was a side effect store to a variable (e.g. self foo: (bar := self compute)). I had 2789 of those.
Some observations I made while putting this together. There is some pure dagnasty code in the base image. Inlined uses of the results of a cascade for a test that moves a stream in the optional part of an and: expression are cool.
Messages to messages are the highest count when presented as above. Some of them are 'control' messages (e.g. ifTrue:, and:, etc). Or equality testers. I did not tabulate them, since I couldn't decide how to draw a line in the sand for messages that produce known results.
A number of the messages could be scoped without any sort of inferencing: outers, blocks, selfs, supers, and literals. These have a total count of 156440. If one used a type inferencer for local and instance variables, one could scope the search for another 160678 of the sends. And the remaining 145977, the messages sent to the results of other messages, would require something more involved. That comes down to about 1/3 for each kind.
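The three-way split can be checked directly from the counts reported above. This is just the arithmetic written out in Python; the variable names are made up for the illustration.

```python
# Receiver counts as reported in the post.
counts = {
    "message": 145977,
    "local": 143469,
    "self": 84724,
    "outer": 47215,
    "literal": 17382,
    "instance": 17209,
    "super": 4031,
    "block": 3088,
}

# Receivers that can be scoped with no inferencing at all.
no_inference = sum(counts[k] for k in ("outer", "block", "self", "super", "literal"))
# Receivers that a type inferencer for variables could scope.
needs_inferencer = counts["local"] + counts["instance"]
# Receivers that are themselves the result of another send.
needs_more = counts["message"]

total = sum(counts.values())
for label, n in (
    ("no inferencing", no_inference),
    ("variable type inferencer", needs_inferencer),
    ("message results", needs_more),
):
    print(f"{label}: {n} of {total} ({n / total:.1%})")
```

The groups come to 156440, 160678 and 145977 out of 463095 sends, or roughly 33.8%, 34.7% and 31.5%.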