This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: To Number Or Not To Number, that is the question
Feed Title: Travis Griggs - Blog
Feed URL: http://www.cincomsmalltalk.com/rssBlog/travis-rss.xml
Feed Description: This TAG Line is Extra
atoi() and its descendants, that's the topic. Any dabbling in C programming will give even the newbie exposure to this useful little function. It takes a string and converts it to an integer. If you plan on using it for anything serious, you'll find it lacking in robustness, at which point you'll discover things like scanf(). atoi() does a very common thing: any program that derives numbers from text files or user input fields will use such facilities.
The problem you encounter with atoi() and friends is what happens when you have a parse error. What happens, for example, if you prompt the user for his age and he enters 'abc'? The common approach seems to be to return 0. For some cases this works, particularly those where 0 is not in the expected number domain. There you can make your program a little more robust by checking for 0 and notifying the user somehow that an invalid input was entered.
But what if 0 is a valid value? What if your app prompts the user for the amount of money he wants to donate to your cause? He means to type '100', but accidentally types 'l00'. Depending on your font, you may not even notice that I put a lowercase L in place of the 1. atoi() and friends would turn that into 0. Contributing 0 to your cause is a valid user input, but given that this is your source of revenue, you'd really like it if this kind of error were in your favor (granted, any money donation app is going to be rife with confirmation stuff catching this further downstream, but work with me here).
To get around this, you end up needing new string-to-number parsing functions, because you need to return basically two pieces of information: Did it parse OK? What was the value? A common approach is to pass the function a holder in which to place the number, and have the function return whether it was successful or not.
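To make the "holder plus success flag" idea concrete, here is a minimal sketch in C. The name parse_long is made up for illustration (it is not a standard function); it is built on the standard strtol(), and the strictness rules (no trailing junk, no overflow) are my own assumptions about what "parsed OK" should mean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the C library.
   Parses s as a base-10 integer. On success, stores the value in *out
   and returns true. On any parse error, returns false and leaves *out
   untouched. Leading whitespace is accepted because strtol() skips it. */
bool parse_long(const char *s, long *out)
{
    char *end = NULL;
    long value;

    if (s == NULL || out == NULL)
        return false;

    errno = 0;
    value = strtol(s, &end, 10);

    if (end == s)           /* nothing was converted, e.g. "abc" or "l00" */
        return false;
    if (*end != '\0')       /* trailing junk after the digits, e.g. "100x" */
        return false;
    if (errno == ERANGE)    /* value does not fit in a long */
        return false;

    *out = value;
    return true;
}
```

With something like this, parse_long("l00", &amount) returns false, while parse_long("0", &amount) returns true and sets amount to 0, so the caller can tell a typo from a genuine zero.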
So what's this got to do with Smalltalk anyway? Well, at least in VisualWorks (I'd like to know what other Smalltalks do), you basically have the same dumb approach to string-to-number conversion. The method is asNumber, and that's it. It behaves basically the same way atoi() does: 'abc' asNumber will happily hand you a zero. For C, I guess I can see why this happens: fixed, primitive data types and so on. But this is Smalltalk, where there are a variety of ways we can solve the problem. Over the years I've written a handful of them.
The first way I did it was to use the MetaNumerics VW package. This package adds a NotANumber object to the system, which can be tested for. Since it's Smalltalk, one method can return objects of different polymorphic types, so there's no need for a separate holder. So asNumberOrNan was born.
After doing that for a while, I got tired of having to load the MetaNumerics package just for that one thing. I decided that asking 'abc' to be a number could semantically be argued to be "undefined", so returning the UndefinedObject, or nil, via the method asNumberOrNil was the next incarnation. Similar approach, just a different object. I liked this a little better, since there's strong support for nil testing throughout the system.
The last approach is to treat the attempt to turn 'abc' into a number as exceptional, and so the method asNumberOrError was born. This has been my latest, and I think I like it best so far. In fact, you can find such an addition by loading NumericCollections from the open repository. That's not really the best place for it, since it has nothing to do with treating collections as math vectors, but it was the closest I had at the time.
So what's your take? What's the right way to solve the problem? Sometimes I'd like to just change the base asNumber method to behave as above. Doing so would not be entirely backwards compatible, though: silently ignored errors would no longer be ignored. Would it be worth having this as its own package? Should we expect more from the base image in the area of string-to-number conversion?
BTW, this marks my attempt to return to the land of blogging. I've decided, however, that I'm going to stick pretty strictly to just Smalltalk stuff in here.