This post originated from an RSS feed registered with Java Buzz
by Marc Logemann.
Original Post: CP850 charset - still in use :(
Feed Title: Marc's Java Blog
Feed URL: http://www.logemann.org/day/index_java.xml
Feed Description: Java related topics for all major areas. So you will see J2ME, J2SE and J2EE issues here.
I am currently developing a program that works with data from Deutsche Post World Net, one of the largest logistics providers worldwide. For this program to work, I have to read in about 500 MB of flat-file data I got on CD from Deutsche Post. I thought this was a perfect use case for NIO (I had not used NIO so far and was excited).
So I wrote a small test program to read the data from the filesystem and wondered why I didn't get the German umlauts like öäü correctly. A quick check with a hex editor showed that the "ü", for example, was stored as hex 81. I was quite sure that in ISO-8859-1 the "ü" is not at 0x81 (it lives at 0xFC). And indeed, it turned out I was dealing with a different charset. After some more investigation I found out that they used CP850, a charset whose heyday was in MS-DOS times. Great.
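The mismatch is easy to reproduce: decode the same byte with both charsets and compare. A minimal sketch, assuming the JRE knows the alias "Cp850" (it may also be registered as "IBM850"):

```java
public class Cp850Check {
    public static void main(String[] args) throws Exception {
        byte[] raw = { (byte) 0x81 };            // the byte found in the hex editor

        // In ISO-8859-1, 0x81 is an unprintable C1 control character...
        String asLatin1 = new String(raw, "ISO-8859-1");
        // ...but in CP850 it is the "ü" we were looking for.
        String asCp850  = new String(raw, "Cp850");

        System.out.println("ISO-8859-1: " + asLatin1);
        System.out.println("CP850:      " + asCp850);
    }
}
```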
I thought I could just switch the encoding in my source code, but then I realized that NIO doesn't support CP850 in my environment; only plain java.io does. That's the end of the story regarding NIO usage, and it's even more frustrating because reading in 500 MB of flat-file data could use every performance boost it can get, but OK.
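Falling back to plain java.io, the read loop ends up looking roughly like this. A sketch under assumptions: the file name is made up, and checking Charset.isSupported up front avoids a runtime UnsupportedEncodingException on JREs that lack the mapping:

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class FlatFileReader {
    public static void main(String[] args) throws Exception {
        String file = "postdata.txt"; // hypothetical name for the flat file from the CD

        // Fail early if this JRE does not know the CP850 mapping at all.
        if (!Charset.isSupported("Cp850")) {
            System.err.println("Cp850 not available in this JRE");
            return;
        }

        // InputStreamReader does the byte-to-char decoding; a large buffer
        // helps a little when churning through hundreds of megabytes.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "Cp850"),
                64 * 1024);
        String line;
        while ((line = in.readLine()) != null) {
            // process the record; umlauts like öäü now decode correctly
        }
        in.close();
    }
}
```

The buffer size and the early capability check are just defensive choices, not anything Deutsche Post prescribes.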
It seems they haven't changed their way of distributing data since the beginning of computing. I recently heard that they offer an alternative way of obtaining the data, perhaps via FTP, and perhaps they can provide the files in a different charset via that route. Let's see. Dealing with encoding issues is always a pleasure, because it's never quick to solve and always involves checking charset tables on some obscure sites on the internet.