Summary
My previous post led me to this library, which appears to solve the coarse-grained parallelism problem quite elegantly.
You can find the library here; it was written by Vitalii Vanovschi, a Russian chemist now doing graduate work at USC. It appears he created the library to serve his own computational needs and designed it to be simple enough for his colleagues to use.
Parallel Python is based on a functional model; you submit a function to a "job server" and then later fetch the result. It uses processes (just like I requested in my previous post) and IPC (InterProcess Communication) to execute the function, so there is no shared memory and thus no side effects.
The pp module will automatically figure out the number of processors available and by default create as many worker processes as there are processors.
You provide the function, a tuple of that function's arguments, and tuples of the dependent functions and modules that your function uses. You can also provide a callback function that is invoked when your function completes. Here's what the syntax looks like:
import pp
job_server = pp.Server() # Uses number of processors in system
f1 = job_server.submit(func1, args1, depfuncs1, modules1)
f2 = job_server.submit(func1, args2, depfuncs1, modules1)
f3 = job_server.submit(func2, args3, depfuncs2, modules2)
# Retrieve results:
r1 = f1()
r2 = f2()
r3 = f3()
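To make that concrete, here is a small self-contained sketch of how I understand the interface, including the optional callback argument; the functions isprime, sum_primes, and report are my own illustrations, not part of the library:

import pp

def isprime(n):
    # Trial-division primality test; shipped to the workers via depfuncs.
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

def sum_primes(n):
    # Sum of all primes below n.
    return sum(x for x in range(2, n) if isprime(x))

def report(result):
    # Runs in the submitting process as soon as the job finishes.
    print("job finished with result %s" % result)

job_server = pp.Server()  # one worker process per detected processor

# submit(function, argument tuple, dependent functions, modules, callback)
job = job_server.submit(sum_primes, (100000,), (isprime,), ("math",), callback=report)

print("sum of primes below 100000: %s" % job())  # calling the job blocks until it is done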
What's even more interesting is that Vitalii has already solved the scaling problem. If you want to use a network of machines to solve your problem, the change is relatively minor. You start an instance of the Parallel Python server (the ppserver.py script) on each node machine and tell the job server about those nodes, roughly like this (the hostnames are placeholders):
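# On each node machine, start the standalone Parallel Python server:
#   node-1$ ./ppserver.py
#   node-2$ ./ppserver.py

# On the machine that submits the jobs:
import pp

ppservers = ("node-1", "node-2")             # hostnames of the node machines
job_server = pp.Server(ppservers=ppservers)  # local workers plus the remote servers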
Submitting jobs and getting results works the same as before, so switching from multiple cores to a cluster of computers is virtually effortless. Notice that it transparently handles the problem of distributing code to the remote machines. It was not clear to me, however, whether ppserver.py automatically makes use of multiple cores on the node machines, but you would think so.
This library allows you to stay within Python for everything you're doing, although you can easily do further optimizations by writing time-critical sections in C, C++, Fortran, etc. and effortlessly and efficiently linking to them using Python 2.5's ctypes.
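As a rough illustration of the ctypes side (the shared library libfast.so and its dot_product function are made up for the example):

import ctypes

# Load a hypothetical compiled library containing the time-critical routine.
fast = ctypes.CDLL("./libfast.so")

# Declare the C signature: double dot_product(double *a, double *b, int n)
fast.dot_product.restype = ctypes.c_double
fast.dot_product.argtypes = (ctypes.POINTER(ctypes.c_double),
                             ctypes.POINTER(ctypes.c_double),
                             ctypes.c_int)

# Build C arrays from Python values and call straight into the compiled code.
Vec3 = ctypes.c_double * 3
a = Vec3(1.0, 2.0, 3.0)
b = Vec3(4.0, 5.0, 6.0)
print(fast.dot_product(a, b, 3))  # -> 32.0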
This is an exciting development for anyone doing parallel processing work, and something I want to explore further once my dual-core machine comes online.
I don't like Parallel Python all that much; it's too "job oriented" and feels too much like a job dispatcher. It's clearly aimed at HPC or compute-intensive applications.
When I want concurrency in a more general-purpose way (or something closer to classical threading), I prefer these two projects:
Both of these provide an API closer to threading, but with the advantage of explicit shared data (as opposed to the implicit sharing of traditional multithreaded applications).
Hi, I'm a little new to this whole concept... when I mentioned Erlang to a friend, he informed me of Stackless Python, which seems to have a fair history in this sort of thing and is used for projects such as the EVE Online game.
Finally, many people already use this module to make effective use of multiple cores. I think it performs well and is stable enough; it is up to whoever is in charge to make it part of Python's distribution.
Just imagine: with this solution in the STANDARD library, who would keep fighting over the GIL?
But no shared memory at all is often inconvenient; imagine a raytracing-like problem that parallelizes efficiently as long as you can share the scene graph and textures.
An easy way to have read-only shared memory could be a boon. It'd be fast because there is no locking overhead, and most OSes have built-in support for this.
Again, this could probably be done at the library level, without changing the language; ditto for properly synchronizable, thread-safe containers.
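For what it's worth, here is a minimal sketch of that idea using the standard mmap module (the file name scene.dat is made up); each worker process maps the same file read-only and the OS shares the underlying pages:

import mmap

# One process writes the large read-only data set to disk once.
out = open("scene.dat", "wb")
out.write(b"...serialized scene graph and textures...")
out.close()

# Every worker process maps the same file read-only; the OS shares the
# physical pages between processes, and no locks are needed because
# nothing can write through the mapping.
f = open("scene.dat", "rb")
scene = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
header = scene[:16]  # reads go straight to the shared pages
scene.close()
f.close()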
Now THAT is an impressive piece of work. Thank you very much for the link. I really love the approach of mirroring the existing threading module; it makes it very easy to use. It has all the nice features of the threading module, like locks and such, too. Wow.
Python is a scripting language, yet we seem to envisage parallel processing solutions in only one language. How about making it easy to execute in parallel and share data with programs written in other languages as well as Python, within a common framework? Maybe PyLinda talking to Linda and sharing tuple space (http://www-users.cs.york.ac.uk/~aw/pylinda/doc/index.html, http://en.wikipedia.org/wiki/Linda_(coordination_language)).
Fold in Google's MapReduce acting on cores as well as computers, and maybe AMD & Nvidia graphics accelerators and the IBM Cell, and Python would be second to none :-) - Just dreaming.
I think this kind of architecture - using a "space" to integrate heterogeneous services written in different languages - is at the core of the GigaSpaces infrastructure (and of what is called "space-based computing"). Talking about Python, maybe NetWorkSpaces for Python (http://nws-py.sourceforge.net/) is of some interest; it is developed by Scientific Computing Associates (http://www.lindaspaces.com), the original Linda spin-off by Carriero and Gelernter.
Sadly, I now realize that for a well-rounded library you would also need to cater for cases where you have very little data to exchange between jobs and good job control is paramount. You might as well embed the Sun Grid Engine as a library!