Ruby Buzz Forum - Windows, Unicode and C programming tips

Daniel Berger

Posts: 1383
Nickname: djberg96
Registered: Sep, 2004

Daniel Berger is a Ruby Programmer who also dabbles in C and Perl

Windows, Unicode and C programming tips

Posted: May 26, 2005 8:32 AM

This post originated from an RSS feed registered with Ruby Buzz by Daniel Berger.
Original Post: Windows, Unicode and C programming tips
Feed Title: Testing 1,2,3...
Feed URL: http://djberg96.livejournal.com/data/rss
Feed Description: A blog on Ruby and other stuff.

I've recently been going back through my C extensions for Windows, updating them to be Unicode friendly. In part, this was inspired by Austin Ziegler, who rightly points out that several of the current core classes choke if they come across a file that isn't ASCII.

Austin's vision (I think - correct me if I'm wrong, Austin) is that extensions would be written in such a way as to use the ASCII or Wide versions of functions, based on a command line option. Let's say, a nonexistent "-U". So you, as an extension writer, would be expected to write your (pseudo) code like so:

if(unicode_option_set){   // hypothetical: true when "-U" was passed
   SomeFuncW();  // Wide character version
}
else{
   SomeFuncA();  // Standard (ANSI) version
}

This would work, but I have a problem with it. First, it's a pain in the arse to write code this way - it makes my code longer. Second, we would have to rewrite a *ton* of code (which we'll have to do anyway, though), and enforce this style on 3rd party developers. Lastly, there is no "-U" option. We'll have to add it to the core, or rely on something that already exists, such as -Ku, though that option is normally meant for Japanese encoding.

My solution recently, instead, has been to adopt an "always on" approach, meaning I define the UNICODE macro in my C extensions. Since the wide character versions of functions still work just fine with plain ASCII, I don't see a downside. I'm sure someone will jump in here and scold me for this, so I've worn my flame retardant underpants today, just in case.
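To make the "always on" idea concrete, here is a minimal portable sketch of how a generic function name resolves at compile time once UNICODE is defined. SomeFuncA, SomeFuncW and which_version are hypothetical stand-ins, and the TEXT definition is a simplified imitation of what the Windows headers provide, not a copy of them.

```c
#include <wchar.h>

#define UNICODE  /* the "always on" switch; normally defined before including windows.h */

/* Hypothetical stand-ins for an A/W function pair such as CreateFileA/CreateFileW. */
static char SomeFuncA(const char *s)    { (void)s; return 'A'; }
static char SomeFuncW(const wchar_t *s) { (void)s; return 'W'; }

/* Simplified imitation of the mapping the Windows headers perform. */
#ifdef UNICODE
#  define SomeFunc SomeFuncW
#  define TEXT(s)  L##s      /* pastes an L prefix onto a literal */
#else
#  define SomeFunc SomeFuncA
#  define TEXT(s)  s
#endif

/* Calls whichever version the preprocessor selected; no runtime branch needed. */
static char which_version(void) {
    return SomeFunc(TEXT("hello"));
}
```

With the #define in place, which_version() ends up calling the wide version; removing that one line switches the same source back to the A version, which is why no if/else is needed in the extension itself.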

Whether or not you agree with my approach, there are a couple of things you'll always want to do in your C extensions for Windows:

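Always use TCHAR, not char
Wrap your string literals in the TEXT macro

How each of these behaves depends on whether or not the UNICODE macro is set, and each does the right thing either way. One caveat: TEXT works by pasting an L prefix onto a literal at compile time, so it cannot be applied to a pointer returned at runtime. A string coming from Ruby is still a char*, and has to be converted (with MultiByteToWideChar, for example, using a code page that matches the string's encoding) before it is handed to a wide character function. So, your Ruby extension should look something like this:

static VALUE some_func(VALUE self, VALUE rbString){
   const TCHAR* label = TEXT("some_func");   // literal: wide if UNICODE is defined
   char* string = StringValuePtr(rbString);  // Ruby string: bytes, convert before calling a W function
   ...
}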
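Since TEXT only operates on literals, a string that arrives at runtime (such as the char* from StringValuePtr) needs an explicit conversion when UNICODE is defined. The following is a rough portable sketch of that step, not a definitive implementation: to_tchar is a hypothetical helper, the TCHAR and TEXT definitions are simplified stand-ins for the Windows headers, and mbstowcs is used only as a portable analogue of MultiByteToWideChar.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define UNICODE  /* assumption: the extension is built "always on" */

/* Simplified stand-ins for the definitions in the Windows headers. */
#ifdef UNICODE
typedef wchar_t TCHAR;
#  define TEXT(s) L##s
#else
typedef char TCHAR;
#  define TEXT(s) s
#endif

/* Hypothetical helper: copy a runtime char* string into a TCHAR buffer.
 * Returns the number of characters written (excluding the terminator),
 * or (size_t)-1 on failure. Output is always null terminated on success. */
static size_t to_tchar(const char *src, TCHAR *dst, size_t dst_len) {
    if (dst_len == 0) return (size_t)-1;
#ifdef UNICODE
    /* Portable analogue; on Windows MultiByteToWideChar would be the usual choice. */
    size_t n = mbstowcs(dst, src, dst_len - 1);
    if (n == (size_t)-1) return n;   /* invalid multibyte sequence */
    dst[n] = L'\0';
    return n;
#else
    size_t n = strlen(src);
    if (n >= dst_len) n = dst_len - 1;   /* truncate to fit the buffer */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
#endif
}
```

In an extension, the src argument would be the result of StringValuePtr, and the filled buffer could then be passed to the generic (TCHAR based) Windows function.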

That's about it, really, but a little can go a long way. :)
