Michael Lucas-Smith asks "Where do you put your Tests?" He blamed me for some of the, er, uh, inspiration for this.
The arguments for doing so are some I've rehearsed myself to others. But the actual real reason that started it all is not listed. We have to step into the WayBackMachine for that.
Sometime around 1997, we started using VisualWorks 3.0. VisualWorks 3.0 added these new things called Parcels. They were kinda like VSE SLLs. But they did some cool things. They had pre/post load/save actions. They could store arbitrary objects. They had prerequisites. They had this new idea called overrides. And an analog to that, partial loading. These were like C DLLs on some serious steroids. I fell in love with them right away. They allowed us to finally start evolving a process that deployed applications as "built up" behavior, rather than stripped images. I was so excited about this. It was visionary (at least for Smalltalk). Run time imperative dynamic application construction.
In 1998, we were converted to the fad^H^H^Hgospel of XP and were writing SUnit tests by early 1999. In those days, Store didn't exist (except maybe at Andersen Consulting); we used ENVY. There was some limited support for turning ENVY Applications into Parcels. We took it and robustified it, making it able to do incremental deployment (and thus going a lot faster). And somewhere in there, we stumbled upon the dilemma of how close to place our tests to the code. We found that true Unit tests--the type that test the Unit that is most natural to Smalltalk, an object--were best maintained and co-developed with their unit when you could put them in the browser side by side.
On the other hand, it didn't feel Right(tm) to put TestCases into our production system. At least not then. Too foreign an idea. That's where I really came to think highly of whoever had "the vision" of what parcels were. Partial Loading. That was the answer. There was a notification or two we had to make conditional and that was it. We cut our Applications, tests and all, to disk. We used a small bootstrap image that loaded parcels. And if there wasn't a TestCase superclass, it just didn't instantiate those behaviors. The really cool thing was that if we wanted to sanity check a running system, we could telnet into our own little interpreter, tell it to load SUnit via command line, and then run all the tests, because they'd just "show up" at that point. It seemed like such an elegant solution. I still contend it was. Eventually, though, we asked ourselves the question: what's the difference between latent loaded code and just preloading it? The idea didn't seem so distasteful anymore; machines were faster and had more memory. So we just took to deploying the SUnit package as a parcel along with all of the others.
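For anyone who never saw partial loading in action, here is a rough sketch of the idea. It is written in Python rather than Smalltalk, and it is a toy model only: the `Image` class, `load_parcel`, and the `Account`/`AccountTest` names are all made up for illustration and are not how the VisualWorks parcel loader is actually implemented. The point is just that a definition whose superclass is missing stays latent, and gets installed once that superclass shows up.

```python
# Hypothetical sketch of "partial loading"; none of these names come from VisualWorks.

class Image:
    """A toy 'image': installed classes plus definitions that are still latent."""

    def __init__(self):
        self.classes = {}   # class name -> superclass name (None for a root)
        self.pending = []   # (name, superclass) pairs whose superclass is missing

    def load_parcel(self, definitions):
        """Install every (name, superclass) pair we can; keep the rest latent."""
        self.pending.extend(definitions)
        self._install_ready()

    def _install_ready(self):
        # Keep sweeping until a full pass installs nothing new.
        progress = True
        while progress:
            progress = False
            for name, superclass in list(self.pending):
                if superclass is None or superclass in self.classes:
                    self.classes[name] = superclass
                    self.pending.remove((name, superclass))
                    progress = True


image = Image()
image.classes["Object"] = None

# An application parcel cut to disk, tests and all.
image.load_parcel([
    ("Account", "Object"),
    ("AccountTest", "TestCase"),  # TestCase is absent, so this stays latent
])
print(sorted(image.classes))  # ['Account', 'Object']

# Later, someone loads SUnit into the running system.
image.load_parcel([("TestCase", "Object")])
print(sorted(image.classes))  # ['Account', 'AccountTest', 'Object', 'TestCase']
```

In this toy version the test class simply "shows up" after the second load, which is the behavior we were leaning on when we telnetted into a running system and loaded SUnit.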
I still believe in the "dream" or the "vision" behind parcels. Things like partial loading and really doing interesting things with it. I think partial loading remains an unappreciated feature of parcels because Store doesn't have a graceful analog; heck, it doesn't even have an ugly analog. Also, the mechanisms for dealing with partial loading are a bit obtuse/hard to get at.
Over time, a lot (just about all, actually) of my zeal for following the partial loading vision has abated. Package after package I published with Tests, I found them split into two packages. The move to "build up" style deployment has been Glacial(TM). I gave up finally. So it was with ironic amusement that I read of Michael's growing appreciation for the idea; turns out I'll be publishing another tool soon, and finally this time... I gave up and put the tests in a separate package.