I mentioned in my last blog post about Xtreams that I wanted to devise a way of sending object messages between images while also allowing file transfers and other large data transfers to happen without interruption. This is not something that Opentalk has typically been good at - if you have a giant collection of data and you start returning it on your connection, that connection is locked up until you're done. I've started a new Xtreams-Xperiments addition which I'm calling Shared Substreams.
The idea is this: you create your socket or pipe or whatever connection, and then you start "sharing" it. On the write side, you send #sharer to the write stream, then send #get to the result, and that gives you a new shared write substream you can write to. Data you send down that substream is identified by a substream id that is internal to the protocol. You can open as many substreams as you want (the actual limit is 2^32, however ids get recycled).
Next, on the read side you send #sharer to it as well, and when you send #get, it will wait for a new substream connection to be opened from the write side. Data that arrives for existing substreams while you're waiting is pushed into those active substreams. Here's a simple file transfer example that sends all the files in a directory from the server to the client:
"Server code"
| writing |
writing := socket writing sharer.
(Filename currentDirectory filenamesMatching: '*') do: [:filename |
	| substream filestream |
	substream := writing get.
	["Write out the file header"
	substream put: filename tail size.
	(substream encoding: #utf8) write: filename tail.
	"Write out the body"
	filestream := filename reading.
	substream write: filestream.
	filestream close.
	substream close] fork]
"Client code"
| reading |
reading := socket reading sharer.
reading do: [:substream |
	[ | filename filestream |
	"Read the header"
	filename := ((substream encoding: #utf8) read: substream get) asFilename.
	filestream := filename writing.
	filestream write: substream.
	filestream close] fork]
In both server and client code, each file is sent and received in parallel over a single socket connection, via shared binary substreams. Not only that, but the substreams are handled safely in separate processes. It's possible that Processor yield statements may be needed to ensure that the files really do get to share the connection fairly.
If I wanted a more sophisticated example, I'd also include in the header of each substream the kind of substream it is, e.g. a file transfer, an object message from the ObjectMarshaler, or perhaps something else. This would allow these two methods to become top-level dispatchers for whatever kinds of streams I want to write out.
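A minimal sketch of what that client-side dispatch might look like; the #fileTransfer and #objectMessage tags and the handler methods are hypothetical, invented here for illustration:
| reading |
reading := socket reading sharer.
reading do: [:substream |
	[ | kind |
	"First item in each substream's header is its kind"
	kind := substream marshaling get.
	kind = #fileTransfer
		ifTrue: [self readFileFrom: substream].
	kind = #objectMessage
		ifTrue: [self dispatchObjectMessageFrom: substream]] fork]
Each handler would then consume the rest of its substream exactly as the client code above does for files.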
In fact, maybe we don't even need a protocol for specifying the contents of the substream. What if, instead, we use the object marshaler to transmit a piece of code to interpret the data being sent? If we're on a trusted connection already and we're happy to accept code from a foreign source, why not let it control us as a dumb client?
"Server code"
| writing |
writing := socket writing sharer.
(Filename currentDirectory filenamesMatching: '*') do: [:filename |
	| substream filestream |
	substream := writing get.
	["Write out interpreter"
	substream marshaler put: [:reading |
		| foreignFilename |
		foreignFilename := ((reading encoding: #utf8) read: reading get) asFilename.
		filestream := foreignFilename writing.
		filestream write: reading.
		filestream close].
	"Write out the file header"
	substream put: filename tail size.
	(substream encoding: #utf8) write: filename tail.
	"Write out the body"
	filestream := filename reading.
	substream write: filestream.
	filestream close.
	substream close] fork]
"Client code"
| reading |
reading := socket reading sharer.
reading do: [:substream | [ substream marshaling get value: substream ] fork]
The client code is notably short: it opens up substreams, reads a block off each using the marshaler, and then runs it with the remainder of the substream, which is the body of whatever message the server sent us. But why stop there? We're marshaling the filename the hard way; let's let block closures take care of it for us:
"Server code"
| writing |
writing := socket writing sharer.
(Filename currentDirectory filenamesMatching: '*') do: [:filename |
	[ | substream filestream filenameTail |
	substream := writing get.
	"Write out interpreter"
	filenameTail := filename tail.
	substream marshaler put: [:reading |
		filestream := filenameTail asFilename writing.
		filestream write: reading.
		filestream close].
	"Write out the body"
	filestream := filename reading.
	substream write: filestream.
	filestream close.
	substream close] fork]
"Client code"
| reading |
reading := socket reading sharer.
reading do: [:substream | [ substream marshaling get value: substream ] fork]
This is still an experiment, but so far it's worked out well. It could be the beginnings of a Polycephaly or grid computing or Opentalk-STST replacement. I'm not sure I'd go that far yet, but it is interesting to see how much power we can get out of a few simple concepts.