Python Buzz Forum - Think, Sync and Wink (part two)



Sidnei da Silva


Sidnei da Silva is a dirty little Brazilian Python hacker.


Posted: Oct 15, 2003 12:25 PM

This post originated from an RSS feed registered with Python Buzz by Sidnei da Silva.
Original Post: Think, Sync and Wink (part two)
Feed Title: dreamcatcher.homeunix.org
Feed URL: http://dreamcatcher.homeunix.org/categories.rdf?category=Python
Feed Description: making your dreams come true

Think, Sync and Wink (part two)

One of the first things I noticed when I first looked at IndexedCatalog was that it stored its indexes as OOBTrees whose values were references to the objects themselves. I suspected this would cause a significant slowdown when querying, because it could wake up lots of objects unnecessarily. This proved to be true when we made the first profile: there were around 2000 calls to __setstate__ on a normal query, which accounted for around 75% of the total time. The query also involved an intersection between OOSets (containing object references), which is undoubtedly slower than an intersection between IISets.
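A minimal sketch of how a call count like that can be pulled out of a profile with the standard library's cProfile and pstats modules. FakePersistent and naive_query are hypothetical stand-ins invented for this example, not IndexedCatalog or ZODB code, and the 2000 objects are only there to mirror the figure above.

```python
import cProfile
import pstats


class FakePersistent:
    """Hypothetical stand-in for a persistent object (not the ZODB class)."""

    def __init__(self, oid):
        self.oid = oid
        self.loaded = False

    def __setstate__(self, state):
        # In ZODB, this is where a ghost object receives its stored state.
        self.__dict__.update(state)
        self.loaded = True


def naive_query(objects):
    # Touches every candidate, forcing each one to be "woken up".
    for obj in objects:
        obj.__setstate__({"title": "doc %d" % obj.oid})
    return [obj for obj in objects if obj.oid % 2 == 0]


def count_calls(profile, func_name):
    """Sum the call counts of every profiled function named func_name."""
    stats = pstats.Stats(profile)
    return sum(
        ncalls
        for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers)
        in stats.stats.items()
        if name == func_name
    )


objects = [FakePersistent(i) for i in range(2000)]

profile = cProfile.Profile()
profile.enable()
result = naive_query(objects)
profile.disable()

setstate_calls = count_calls(profile, "__setstate__")
```

Sorting the same Stats object by cumulative time would then show how much of the query is spent inside those calls.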

So we decided to go ahead with the plan of converting the OO*s to II*s and, after a discussion over Chinese food, added a new feature to the plan: we would try to delay loading the objects until it was strictly necessary. That is possible because the objects are normally fetched from a search result, and storing only OIDs in the indexes lets us return the object for a given OID when the user iterates through the search results.
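To give an idea of what working with integer sets looks like, here is a small sketch using IISet and intersection from BTrees.IIBTree. The ids and the variable names are made up for illustration, and the fallback to frozenset is only there so the snippet still runs where the BTrees package is not installed.

```python
try:
    from BTrees.IIBTree import IISet, intersection
except ImportError:
    # Fallback so the sketch runs without the BTrees package installed.
    IISet = frozenset

    def intersection(a, b):
        return a & b

# Hypothetical index results: each one is a set of plain integer ids,
# so nothing here holds a reference to a stored object.
ids_matching_author = IISet([1, 3, 5, 8])
ids_matching_year = IISet([2, 3, 5, 9])

# Intersecting integers never needs to load the objects they stand for.
hits = intersection(ids_matching_author, ids_matching_year)
hit_ids = sorted(hits)
```

Only the ids in hit_ids would ever need to be turned back into objects, and only when somebody asks for them.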

So, the workflow is more or less like this now:

User does a query

Catalog delegates query to the indexes

Indexes return a set of OIDs (actually an IISet)

Catalog builds a Result object with the intersection of the OIDs received from the Indexes

Result, when asked for an item, does a lookup by the OID and returns the actual object.
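The workflow above can be sketched roughly as follows. This is not the IndexedCatalog code: Catalog, LazyResult, load and the toy storage dict are hypothetical names for this example, and plain Python sets of integers stand in for the IISets.

```python
class LazyResult:
    """Holds only integer ids; loads an object the first time it is asked for."""

    def __init__(self, oids, loader):
        self._oids = sorted(oids)   # stable order so items can be indexed
        self._loader = loader       # hypothetical callable: id -> object
        self._cache = {}

    def __len__(self):
        return len(self._oids)

    def __getitem__(self, position):
        oid = self._oids[position]
        if oid not in self._cache:
            self._cache[oid] = self._loader(oid)
        return self._cache[oid]


class Catalog:
    def __init__(self, indexes, loader):
        self._indexes = indexes     # field name -> {value -> set of ids}
        self._loader = loader

    def query(self, **criteria):
        # Delegate to each index, then intersect the id sets they return.
        id_sets = [self._indexes[field].get(value, set())
                   for field, value in criteria.items()]
        oids = set.intersection(*id_sets) if id_sets else set()
        return LazyResult(oids, self._loader)


# Toy storage standing in for the object database.
storage = {1: "report A", 2: "report B", 3: "memo C"}
loads = []


def load(oid):
    loads.append(oid)               # record every real load
    return storage[oid]


indexes = {
    "kind": {"report": {1, 2}, "memo": {3}},
    "year": {2003: {2, 3}, 2002: {1}},
}

catalog = Catalog(indexes, load)
result = catalog.query(kind="report", year=2003)
loads_before_access = len(loads)    # the query itself loads nothing
first = result[0]                   # the object is loaded only here
```

Because LazyResult defines __len__ and integer __getitem__, it can be iterated with a plain for loop, and each object is loaded at most once.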

Needless to say, the improvement was overwhelming. Not only was the query blazingly faster, but the database, after replacing the indexes, was also 20% smaller.

I must admit: the BTrees package is one of the most amazing ones I've used in all the time I've been involved with Python, and when you deploy it the right way, it can make a world of difference.
