Agile Buzz Forum - Lots of Little Processes in VW

One of the things people ask about a lot is the VisualWorks (VW) process model. The simple answer is that the VM is single threaded, so all VW processes are managed at the Smalltalk level - i.e., they aren't OS level threads. You can create OS level threads, but only in the context of threading an external API call. A good example of this can be seen in the various database connects that we ship - you'll note that we ship threaded and non-threaded (i.e., blocking) versions.

So given that, what are Smalltalk level processes good for? Well, bear in mind that you (as the developer) have full control over their semantics. That means that an application deployed on Windows will run exactly like one deployed on Linux (or Mac, or Unix) - a VW process is a Smalltalk artifact, so it's not going to be unpredictable. Let me walk through a simple example, using the BottomFeeder update loop. I subscribe to 315 feeds at the moment, so when the update loop fires, I get 315 VW level processes doing HTTP queries. If those were all OS level threads, the system would fall to its knees in seconds - I'd have to use a thread pool. Incidentally, I implemented one as an option for Bf - but I digress. Here's the main update loop (somewhat simplified for space reasons):

feedsToUpdate do: 
			[:aFeed | 
			| updater delay |
			updater := 
					[self 
						updateFeed: aFeed
						shouldForce: shouldForce
						totalFeeds: numberOfFeeds].
			self settings runThreadedUpdates 
				ifTrue: [self runThreadedUpdateFor: aFeed updateBlock: updater]
				ifFalse: [updater value].

			"other code here..." ].

If I have threading turned off (useful on slow connections, where I don't want the queries competing for bandwidth), I just iterate over the list. The interesting piece is in the threaded updates:

runThreadedUpdateFor: aFeed updateBlock: updater 
	self settings shouldThrottleThreads
		ifTrue: [self runWithThrottling: updater for: aFeed]
		ifFalse: [self runWithoutThrottling: updater for: aFeed]

That checks another setting, which controls whether the app should use a thread pool or not. The "throttling" code implements a pool; the non-throttled code just keeps forking off a new process for every feed. The non-throttled path is how I run Bf, and it works fine (with a fast connection). Drilling into the throttled code:

runWithThrottling: updater for: aFeed
	self updateCounter addProcess: updater atPriority: self settings getUpdateLoopPriority.


addProcess: aBlock atPriority: aPriorityOrNil
	"add the process to the wait pool"

	self sem critical: [self waitingCollection add: aBlock->aPriorityOrNil]

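That code simply adds the block (paired with its priority) to a queue, guarded by a semaphore so that concurrent adds don't step on each other. The pool then runs a limited number of the queued blocks as processes at once. The non-throttled code?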
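
For illustration, here's a minimal workspace sketch of how a pool like that could drain its queue. This is not the actual Bf code: the names permits, queueLock, waiting and dispatcher are made up for the example, the limit of 4 is arbitrary, and the real pool may well be organized differently. The idea is just a counting semaphore that hands out a fixed number of "slots":

	| permits queueLock waiting dispatcher |
	"Hypothetical names throughout; not the BottomFeeder implementation."
	permits := Semaphore new.
	4 timesRepeat: [permits signal].	"allow at most 4 jobs to run at once"
	queueLock := Semaphore forMutualExclusion.
	waiting := OrderedCollection new.

	"Producer side: queue a block together with its priority."
	queueLock critical: 
		[waiting add: [Transcript show: 'job done'; cr] -> Processor userBackgroundPriority].

	"Dispatcher: take the next job, wait for a free slot, then fork it."
	dispatcher := 
		[[| job |
		job := queueLock critical: 
			[waiting isEmpty ifTrue: [nil] ifFalse: [waiting removeFirst]].
		job isNil 
			ifTrue: [(Delay forMilliseconds: 100) wait]
			ifFalse: 
				[permits wait.
				"give the slot back even if the job fails"
				[[job key value] ensure: [permits signal]] forkAt: job value]] repeat] 
			forkAt: Processor userSchedulingPriority.

The dispatcher loops forever, so send it terminate when you're done experimenting. Polling with a Delay keeps the sketch short; a second semaphore signalled on every add would avoid the polling.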

runWithoutThrottling: updater for: aFeed 
	| proc |
	proc := updater newProcess.
	self updateCounter addThread: proc url: aFeed url.
	proc priority: self settings getUpdateLoopPriority.
	proc resume

Now that demonstrates something useful about the level of control you have over a VW process. A VW process is defined simply as a block (the updater block in the snippet all the way at the top); sending newProcess to that block creates a process in the suspended state. I then set its priority (by default, priorities range from 1 to 100, with 8 "named" levels) and resume it, which is the point where it actually gets scheduled to run. I'm also holding a reference to the process, so that it can be killed (for instance, if you take BottomFeeder offline, the system goes ahead and whacks all the in progress threads in that loop, along with the update loop itself).
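
If you want to try that life cycle on its own, here's a small workspace sketch. The block body and the choice of priority are just placeholders I picked for the example; the messages themselves (newProcess, priority:, resume, terminate) are the same ones used above, plus terminate for the "kill" part:

	| proc |
	proc := 
		[(Delay forSeconds: 30) wait.
		Transcript show: 'finished'; cr] newProcess.	"created suspended"
	proc priority: Processor userBackgroundPriority.	"one of the named levels"
	proc resume.	"now it is scheduled to run"

	"Later, e.g. when going offline, kill it through the saved reference:"
	proc terminate.

Evaluate the last line separately (or drop it) if you want to see the process finish; run all at once, the terminate gets there long before the delay expires. When you don't need to touch the process before it starts, forkAt: on the block does the create, set priority and resume steps in one go.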

The priority levels I mentioned are used by the default process scheduler - which is written in Smalltalk. What does that mean? It means that you have full control over the way processes run in Smalltalk. The default model runs the highest priority process that is ready to run, but - at a given priority level - no process will preempt another of the same priority. In other words, it's not time-slicing. Say you wanted it to be? Well, that's simple - to timeslice a given set of processes, you simply have a higher level process manage them (which is what my throttle does to some extent). If you want to timeslice the entire system? Have a look at class ProcessorScheduler and change the way it manages things.
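
Here's a small workspace sketch of the "no preemption at the same priority" rule, using made-up names (log, worker). Two processes are forked at the same priority, and each one gives up the processor voluntarily with Processor yield after every step. That's cooperative scheduling rather than real timeslicing, but it shows where the control sits:

	| log worker |
	log := OrderedCollection new.
	"worker answers a block that records its name three times, yielding each time"
	worker := [:name | [3 timesRepeat: [log add: name. Processor yield]]].
	(worker value: 'A') forkAt: Processor userBackgroundPriority.
	(worker value: 'B') forkAt: Processor userBackgroundPriority.
	"Both run in the background once the UI is idle; look at (or refresh) the inspector afterwards."
	log inspect.

With the yield in place I'd expect the entries to alternate between A and B. Take the Processor yield out and all of A's entries should land before any of B's, since nothing at that priority ever interrupts A.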

It's a nice system, and it gives you a very high level of control over how your system runs.

