Thursday 2 September 2010

ASCII-escaping UTF-8 encoded files

Every now and again I need to convert UTF-8 encoded files into their ASCII-escaped equivalents, and then I find myself reaching for the search engine to try to remember what I used last time. So I'm writing it down this time...

Specifically, I'm talking about converting natively encoded translations into the ASCII-escaped form (non-ASCII characters written as \uXXXX escapes) that can be used as Java resource bundles.

There are lots of ways to do this, and you can even roll your own code, but the simplest is to use the tool that ships with the JDK. It's called native2ascii.

For a full description of how the tool is used, go here: http://download.oracle.com/javase/1.5.0/docs/tooldocs/windows/native2ascii.html

As a quick reference, you just do this (assuming $YOUR_JDK_HOME/bin is on your PATH), with the UTF-8 input file first and the escaped output file second:

native2ascii -encoding utf-8 translated_ru.txt messages_ru.properties
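
As for rolling your own: here is a minimal sketch of what that might look like, for those times when native2ascii isn't to hand (as far as I know it was dropped from the JDK from version 9 onwards). It assumes Java 7 or later, and the class name AsciiEscaper and its escape method are my own invention, not anything from the JDK. It reads the input as UTF-8 and writes every character outside 7-bit ASCII as a four-digit hex escape; characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair). It makes no attempt to handle a leading byte order mark, so treat it as a starting point rather than a drop-in replacement.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, named here for illustration only.
public class AsciiEscaper {

    // Copy 7-bit ASCII characters through unchanged and write every other
    // UTF-16 code unit as a backslash, a 'u' and four lowercase hex digits.
    public static String escape(String line) {
        StringBuilder out = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java AsciiEscaper <utf8-input> <ascii-output>");
            return;
        }
        // Decode the input as UTF-8; the output only ever contains ASCII.
        try (BufferedReader in = Files.newBufferedReader(
                     Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(
                     Paths.get(args[1]), StandardCharsets.US_ASCII)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(escape(line));
                out.newLine();
            }
        }
    }
}
```

Once compiled, it is called much like the tool itself: java AsciiEscaper translated_ru.txt messages_ru.properties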