Showing posts with label Unicode. Show all posts

Tuesday, 25 June 2013

Go: Unicode on the Windows command prompt (golang)

Go (golang.org) has pretty good Unicode support on the Windows command line. I've written about Unicode and cmd.exe before in the context of C++, C# and Java.

Versions: Windows 7 (64-bit); go1.1.1 windows/amd64; Java 1.7.0_21

Thursday, 6 June 2013

Java: detecting JSON character encoding

JSON documents are generally encoded using UTF-8, but the format also supports four other encoding forms. This post covers the mechanics of character encoding detection for JSON parsers that do not handle those forms themselves, such as Gson and JSON.simple.

EDIT (2014): a version of this library has been published to Maven Central.
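As a rough illustration of the detection idea, here is a minimal sketch. It is not the code from the post or the published library. It assumes the RFC 4627 rule that a JSON text starts with two ASCII characters, so the pattern of zero bytes in the first four octets reveals the encoding form, and it assumes there is no byte order mark. The class name JsonEncodingSketch and the method detect are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class JsonEncodingSketch {

    /**
     * Guesses the encoding of a JSON text from the zero-byte pattern
     * of its first four octets (assumption: RFC 4627 rules, no BOM).
     * Returns a charset name rather than a Charset because the UTF-32
     * charsets are not guaranteed to exist on every Java runtime.
     */
    static String detect(byte[] json) {
        if (json.length >= 4) {
            boolean z0 = json[0] == 0;
            boolean z1 = json[1] == 0;
            boolean z2 = json[2] == 0;
            boolean z3 = json[3] == 0;
            if (z0 && z1 && z2 && !z3) return "UTF-32BE"; // 00 00 00 xx
            if (!z0 && z1 && z2 && z3) return "UTF-32LE"; // xx 00 00 00
            if (z0 && !z1 && z2 && !z3) return "UTF-16BE"; // 00 xx 00 xx
            if (!z0 && z1 && !z2 && z3) return "UTF-16LE"; // xx 00 xx 00
        }
        // Anything else, including very short input, is treated as UTF-8.
        return "UTF-8";
    }

    public static void main(String[] args) {
        String doc = "{\"a\":1}";
        System.out.println(detect(doc.getBytes(StandardCharsets.UTF_8)));
        System.out.println(detect(doc.getBytes(StandardCharsets.UTF_16BE)));
        System.out.println(detect(doc.getBytes(StandardCharsets.UTF_16LE)));
    }
}
```

Once the name is known, the bytes can be wrapped in an InputStreamReader with that encoding before being handed to the parser.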

Saturday, 18 July 2009

Java: effective Unicode

This is my attempt at a list of maxims to abide by when working with text in Java, in the vein of Effective Java or The Ten Commandments of Unicode. It is also a summary of another post on character encoding. The list is in no way comprehensive.

Friday, 1 May 2009

Java: a rough guide to character encoding

It can be tricky figuring out the difference between character handling code that works and code that just appears to work because testing did not encounter cases that exposed bugs. This is a post about some of the pitfalls of character handling in Java.
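One example of code that appears to work is decoding bytes with the wrong charset: ASCII-only test data survives, and the bug only shows once other characters turn up. The sketch below is illustrative only and is not taken from the post; the class name EncodingPitfallSketch and the method decodeAs are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingPitfallSketch {

    /** Decodes bytes using an explicitly named charset. */
    static String decodeAs(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        String ascii = "cafe";
        String accented = "caf\u00e9"; // "cafe" with an acute accent on the e

        byte[] asciiBytes = ascii.getBytes(StandardCharsets.UTF_8);
        byte[] accentedBytes = accented.getBytes(StandardCharsets.UTF_8);

        // ASCII data looks fine even when decoded with the wrong charset...
        System.out.println(
            ascii.equals(decodeAs(asciiBytes, StandardCharsets.ISO_8859_1)));

        // ...but the accented character does not survive the mismatch.
        System.out.println(
            accented.equals(decodeAs(accentedBytes, StandardCharsets.ISO_8859_1)));

        // Decoding with the charset that was used for encoding is lossless.
        System.out.println(
            accented.equals(decodeAs(accentedBytes, StandardCharsets.UTF_8)));
    }
}
```

Naming the charset at every byte/character boundary, rather than relying on the platform default, makes this kind of mismatch visible in the code.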

Friday, 10 April 2009

Java: Unicode on the Windows command line

By default, Java encodes Strings sent to System.out in the default code page. On Windows XP, this means a lossy conversion to an "ANSI" code page. This is unfortunate, because the Windows Command Prompt (cmd.exe) can read and write Unicode characters. This post describes how to use JNA to work round this problem.
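The lossy conversion can be illustrated without touching the console. The sketch below is not the JNA workaround the post describes; it only shows what happens when a character has no mapping in a single-byte encoding. ISO-8859-1 is used as a stand-in for an "ANSI" code page because it is guaranteed to exist on every Java runtime, and the class name LossyEncodingSketch and its methods are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncodingSketch {

    /** Reports whether every character of the text can be encoded. */
    static boolean isEncodable(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    /** Encodes and decodes again, as a single-byte console would see it. */
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String cyrillic = "\u0416"; // CYRILLIC CAPITAL LETTER ZHE
        Charset ansiStandIn = StandardCharsets.ISO_8859_1; // assumption: stand-in code page

        // The character has no mapping in the single-byte encoding...
        System.out.println(isEncodable(cyrillic, ansiStandIn));

        // ...so String.getBytes substitutes the charset's replacement byte.
        System.out.println(roundTrip(cyrillic, ansiStandIn));

        // A Unicode encoding form keeps the character intact.
        System.out.println(
            cyrillic.equals(roundTrip(cyrillic, StandardCharsets.UTF_16LE)));
    }
}
```

Checking canEncode before printing is one way to notice that output is about to be degraded, although it does nothing to fix the console itself.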

This post is a follow-up to I18N: Unicode at the Windows command prompt (C++; .Net; Java), so you might want to read that first.

Thursday, 9 April 2009

I18N: Unicode at the Windows command prompt (C++; .Net; Java)

Strange things can happen when working with characters. It is important to understand why problems occur and what can be done about them. This post is about getting Unicode to work at the Windows command prompt (cmd.exe).