The Artima Developer Community

Python Buzz Forum
String Munging and Webware Munging

Victor Ng

Victor Ng programs Python for money, but he'd be programming Python anyway if he was a bum.
String Munging and Webware Munging Posted: Apr 13, 2004 1:50 AM

This post originated from an RSS feed registered with Python Buzz by Victor Ng.
Original Post: String Munging and Webware Munging
Feed Title: Victor Ng's Weblog
Feed URL: https://blog.crankycoder.com/feed/atom/
Feed Description: Python Feed

I had the wonderful chance to do some Unicode -> ASCII translation. I had thought that this was going to be easy, but I got confounded.

So, an exercise for the reader:

Given a Unicode string with characters that cannot be represented as US-ASCII, how do you get all the valid ASCII characters out of the input string in Java?

The Python version is this:

someUnicodeString.encode('ascii', 'replace') or someUnicodeString.encode('ascii', 'ignore').
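As a minimal sketch of the difference between the two error handlers: the sample string below is made up for illustration, and the byte-string results assume Python 3, where `encode` returns `bytes` (in the Python 2 of this post's era it would return a plain `str`).

```python
# Hypothetical sample input: "café naïve ☃" written with escapes.
text = u"caf\u00e9 na\u00efve \u2603"

# 'replace' substitutes a '?' for every character ASCII cannot represent.
replaced = text.encode("ascii", "replace")

# 'ignore' silently drops every character ASCII cannot represent.
ignored = text.encode("ascii", "ignore")

print(replaced)  # b'caf? na?ve ?'
print(ignored)   # b'caf nave '
```

Which handler is right depends on whether the positions of the lost characters matter: 'replace' keeps the string length, 'ignore' keeps only the valid ASCII characters.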

Luke pointed me to a solution in Java, but as it turns out, I still don't get it. Which concrete implementation of OutputStream was I supposed to use?

As an aside, I've been busy refactoring Webware. The goal has been to get Webware into a distutils-friendly state. There are a few reasons for this:

  1. distutils would make installing Webware almost trivial.
  2. I personally find writing unit tests to be a lot simpler in a distutils-friendly setup. Test cases go in a proper directory and the Python path can be munged inside of a distutils 'test' custom class.
  3. Testing of Webware plugins is too hard right now. We've made modifications to FormKit using FormEncode, but testing the changes has been neglected.
  4. It's a good excuse to get my hands dirty with the Webware internals to see how they work.

That last point is really my main reason - I've found that the best way to figure out how non-trivial code works is to gut it and put it back together. The paranoid freak in me never really trusts the API docs.

Current status:

Servlets basically work now - all the WebKit examples run. Minor "yay!".

I still need to redo the plugin packaging for PSP, MiddleKit, MiscUtils, and all that other stuff that sits as a peer to WebKit though. Code and data are mingled between the plugins and the examples in each plugin, which is causing me some grief. One example is the WebKit/Testing/Main.py servlet, which loads test cases from a data file. I'd really rather just push all the testing into the unit test suite and have some way of instrumenting Webware to load my testing servlet and then drive some tests against it.

With another two weekends of hacking, I should hopefully have something that isn't entirely embarrassing to show off.

I've noticed that the way I refactor Python code is different from the way I refactor Java code. In Java, I tend to lean harder on the refactoring browsers built into Eclipse. In Python, I tend to lean harder on my unit tests, sed, grep and vim.

Truth be told, I prefer the Java way of refactoring. Static type checking may be a pain in the ass when you're working on 'new' code, but I like the safety of an automated refactoring when I can get it.



Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use