The Artima Developer Community
Agile Buzz Forum
BerkeleyDB for VisualWorks

James Robertson

BerkeleyDB for VisualWorks Posted: Aug 2, 2006 5:55 PM

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: BerkeleyDB for VisualWorks
Feed Title: Michael Lucas-Smith
Feed URL: http://www.michaellucassmith.com/site.atom
Feed Description: Smalltalk and my misinterpretations of life

BerkeleyDB is a neat little database built by Sleepcat Software, which was recently acquired by Oracle. BerkeleyDB is not a relational database or an object database; it is just a regular old-fashioned 'data' database. It works, essentially, as a key-value file on disk, using access methods such as Hash and BTree, which you, as a developer, get to pick and tune.

Over the years this database has picked up some pretty amazing tricks, such as the XA Transactions architecture, Replication, Secondary Database linking, etc. In fact, BerkeleyDB is capable enough that MySQL offers it as one of its storage engines (the BDB engine).

I needed to store word indexes on disk because I had too many of them to keep in memory, so I needed somewhere to put them that allowed both fast retrieval and fast updates. This called for some sort of disk-based database. BerkeleyDB, suggested by a coworker, was the perfect choice for this task.

I searched around and found that there was once a BerkeleyDB implementation for Squeak, but it seems to have disappeared into the netherworld. So, moving on, it was time to implement it myself in VisualWorks. As usual, the DLLCC header file parser was completely useless.

I built the structures and procedure definitions myself, which took far too much effort, then wrapped up the instance-based procedures, etc. I've published my efforts to Public Store under the name BerkeleyDB. It acts like a Dictionary (it's subclassed off KeyedCollection), so you can literally pretend it's just a regular in-memory Dictionary that runs slightly slower because it fetches its data from disk.
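Since the class is meant to behave like a Dictionary, a word index of the kind described earlier could presumably be read back with ordinary collection messages. This is only a sketch: the file name 'words.db' and the sample data are made up, and it assumes the package answers at:ifAbsent: and includesKey: the way a Dictionary does, which the post does not confirm.

```smalltalk
"Sketch only: 'words.db' and the sample data are hypothetical."
| index |
index := BerkeleyDB.BTree in: 'words.db'.

"Store a posting list for a word (at:put: is used in the post's own example)."
index at: 'smalltalk' put: 'doc1 doc7 doc42'.

"Assumed: at:ifAbsent: and includesKey: behave as they do on a Dictionary."
Transcript showCR: (index at: 'smalltalk' ifAbsent: ['']) printString.
Transcript showCR: (index includesKey: 'cobol') printString.
```

The results are printed with printString so the sketch does not depend on whether values come back as Strings or as raw bytes.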

To make sure it scales, I implemented the DB_MULTIPLE APIs in a BulkCursor, which is subclassed off the regular BerkeleyDB Cursor. This fetches data in 5 MB chunks. If you need any more than that, you can specialise it further by subclassing my BulkCursor.

So, I've got Berkeley Cursors in there too, for iterating over all the records. I've also got Stream APIs in there for reading and writing a record. These sit alongside the regular Dictionary API as streamAt: key ifAbsent: [] and at: key putStream: aStream. BerkeleyDB lets you have up to 4 GB of data in a single record and up to 256 terabytes of data in the database overall. Very impressive.
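A rough idea of how the two stream messages might be used together is sketched here. Only streamAt:ifAbsent: and at:putStream: come from the package as described; the file name, key and text are invented. It also assumes that at:putStream: accepts an ordinary ReadStream, that the ifAbsent: block's value is answered for a missing key, and that the answered stream understands upToEnd.

```smalltalk
"Sketch only: the file name, key and text are hypothetical."
| btree in out |
btree := BerkeleyDB.BTree in: 'mybtree.db'.

"Write a record from a stream (at:putStream: is named in the post)."
in := ReadStream on: 'a large value, normally read from a file'.
btree at: 'report' putStream: in.

"Read it back as a stream. Assumed: a missing key answers the block's value, here nil."
out := btree streamAt: 'report' ifAbsent: [nil].
out isNil ifFalse: [Transcript showCR: out upToEnd printString].
```

For records approaching the size limit, reading the stream in pieces rather than with upToEnd would avoid pulling the whole value into memory at once.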

I've not done any of the Replication, Secondary Database, Sequences, Transactions or Environment code, just enough that I can treat a Hash or a BTree as if it were a Smalltalk Dictionary. Feel free to contribute further if you have a need, though most of those functions are generally not required when making a simple disk db.

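| hash btree |
"Open a Hash and a BTree database, each backed by its own file."
hash := BerkeleyDB.Hash in: 'myhash.db'.
btree := BerkeleyDB.BTree in: 'mybtree.db'.
"Store entries with the usual Dictionary protocol."
hash at: 'a' put: 'b'.
btree at: 'c' put: 'd'.
"Inspect both databases."
hash inspect.
btree inspect.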
