The Artima Developer Community

Ruby Buzz Forum
Windows, Unicode and C programming tips

0 replies on 1 page.

Daniel Berger

Posts: 1383
Nickname: djberg96
Registered: Sep, 2004

Daniel Berger is a Ruby Programmer who also dabbles in C and Perl
Windows, Unicode and C programming tips Posted: May 26, 2005 8:32 AM

This post originated from an RSS feed registered with Ruby Buzz by Daniel Berger.
Original Post: Windows, Unicode and C programming tips
Feed Title: Testing 1,2,3...
Feed URL: http://djberg96.livejournal.com/data/rss
Feed Description: A blog on Ruby and other stuff.

I've recently been going back through my C extensions for Windows, updating them to be Unicode friendly. In part, this was inspired by Austin Ziegler, who rightly points out that several of the current core classes choke if they come across a file name that isn't ASCII.

Austin's vision (I think - correct me if I'm wrong, Austin) is that extensions would be written in such a way as to use either the ANSI or the wide versions of functions, based on a command line option - say, a non-existent "-U". So you, as an extension writer, would be expected to write your (pseudo) code like so:
if(unicode_option){  /* the hypothetical "-U" flag */
   SomeFuncW();      /* wide-character version */
}
else{
   SomeFuncA();      /* ANSI version */
}

This would work, but I have a few problems with it. First, it's a pain in the arse to write code this way - it makes my code longer. Second, we would have to rewrite a *ton* of code (which we'll have to do anyway, admittedly) and enforce this style on third-party developers. Lastly, there is no "-U" option. We'd have to add it to the core, or rely on something that already exists, such as -Ku, though that option is really meant for Japanese encodings.

My solution recently, instead, has been to adopt an "always on" approach, meaning I define the UNICODE macro in my C extensions. Since the wide character versions of functions still work just fine with plain ASCII, I don't see a downside. I'm sure someone will jump in here and scold me for this, so I've worn my flame retardant underpants today, just in case.

Whether or not you agree with my approach, there are a couple of things you'll always want to do in your C extensions for Windows:

  • Always use TCHAR, not char
  • Wrap your string literals in the TEXT macro (it only works on literals, so runtime strings coming from Ruby need an explicit conversion)

How each of these behaves depends on whether or not the UNICODE macro is defined, and they do the right thing either way. The one wrinkle is that TEXT() only expands string literals, so a string handed over from Ruby has to be widened explicitly. Your Ruby extension should look something like this:
static VALUE some_func(VALUE self, VALUE rbString){
   TCHAR path[MAX_PATH];
#ifdef UNICODE
   MultiByteToWideChar(CP_ACP, 0, StringValuePtr(rbString), -1, path, MAX_PATH);
#else
   strncpy(path, StringValuePtr(rbString), MAX_PATH);
#endif
   ...
}

That's about it, really, but a little can go a long way. :)

Read: Windows, Unicode and C programming tips


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use