Using wchar_t to print Japanese, Chinese, and Russian in the console


The wchar_t type is used for Unicode text, and it is mainly Windows programs that rely on it. If you don't know what the wchar_t type is, you can find some information here (wchar_t and wstring). In this post we will discuss some ways to print Japanese, Russian, Chinese, Spanish, etc. in Linux and Windows console applications.

I) Linux
II) Windows

Link: wchar_t type and wstring type



Linux

On Linux we don't need wchar_t or any special functions to print Japanese, Chinese, Russian, etc. in the console/terminal. Most Linux terminals use UTF-8, so as long as the source file is saved as UTF-8, a plain char array holds the UTF-8 bytes of the string and the terminal displays them correctly. A program is given below.

#include <iostream>

using namespace std;

int main( )
{

char jap[]="かんじ" ,
    chinese[]="汉字" ,
    viet[]="mương" ,
    korean[]="한자" ,
    hindi[]="बईबईसई" ;

cout << jap << endl
    << chinese << endl
    << viet << endl
    << korean << endl
    << hindi ;

cin.get( ) ;
return 0 ;

}
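To see why plain char is enough here, it helps to look at what the arrays actually hold. The sketch below assumes the strings are UTF-8 encoded (which is the case for a UTF-8 source file on a typical Linux setup). The helper name utf8_code_points is made up for illustration. It counts characters by skipping UTF-8 continuation bytes, which shows that each non-ASCII character occupies several char elements.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 encoded string.
// Assumes the input is valid UTF-8; continuation bytes look like 10xxxxxx.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {   // not a continuation byte: a new character starts here
            ++count;
        }
    }
    return count;
}
```

For example, the Chinese string 汉字 is stored as six bytes (E6 B1 89 E5 AD 97), so its std::string size is 6 while utf8_code_points reports 2. The terminal reassembles those bytes into characters, which is why cout needs no special handling.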






Windows

On Windows there are two ways to get Unicode output in a console application:
i) without changing the system locale of the computer, and
ii) by changing the system locale of the computer.



Without changing the system locale

This is the preferred method for printing characters of different languages in the Windows console. We can use either Code::Blocks or the Microsoft Visual Studio IDE, but the code differs between the two, so we will discuss the code for Code::Blocks and Visual Studio separately. First let's look at working code for Visual Studio (if you are developing apps for Windows, this IDE is recommended).


Visual Studio

In Visual Studio you need to include some header files that are specific to Windows, so these header files will not work in Code::Blocks. The program given below prints a Russian string.

***Note: To get the desired output you have to set Lucida Console as your console font. To change the font, first run the program given below, then right-click the upper-left corner of the console window, select Properties->Font, and change the font to Lucida Console.

The screen shot is given below.
 

 

#include "stdafx.h"
#include <iostream>
#include <io.h> ///Necessary
#include <fcntl.h> ///Necessary
#include <Windows.h> ///Necessary

using namespace std ;

int main( )
{
_setmode(_fileno(stdout), _O_U16TEXT) ; ///switch stdout to UTF-16 mode

wchar_t str[] = L"лдощдффыкйцн" ; ///Russian string

wcout << str << flush ; ///flush before the mode is switched back

_setmode(_fileno(stdout), _O_TEXT); //without this it will cause debug assertion failed error

cin.get();
return 0;
}

When you run the program, the Russian string appears in the console. (If you do not see the Russian string, try running the program again after changing the console font.)
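The _O_U16TEXT mode used in the Visual Studio program puts stdout into UTF-16 mode, which matches wchar_t on Windows, where each element is a 16-bit UTF-16 code unit. As a rough, portable illustration (not part of the original program), the sketch below encodes a single code point as UTF-16. The function name to_utf16 is hypothetical, and it uses char16_t so that it behaves the same on platforms where wchar_t is wider than 16 bits.

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-16 code units.
// Assumes cp is a valid code point (not a surrogate, at most 0x10FFFF).
std::u16string to_utf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit unit
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split into a high and a low surrogate
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}
```

Every letter of the Russian string fits in one unit (for example, л is U+043B), so each element of str[] is one character. Characters above U+FFFF take two units.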

If you want to print Spanish, Greek, etc., assign a string in the corresponding script to str[]. For Chinese, Japanese, Hindi, etc. you will get boxes as the output, but if you copy the boxes and paste them into Notepad++ you will see your string. This means that the console supports those scripts but cannot find a suitable font to map the characters to and display them.

So to make the console display every script, you need to find a suitable font that the console accepts and make it available to the console. Note that not every font is accepted by the console; to find out which fonts are accepted, see this link (https://support.microsoft.com/en-us/kb/247815). I could not find any free font that the console would accept, and making such a font is a lengthy process, so I gave up on this method. But there is another way to make many fonts available to the console for mapping different writing systems. This method is known as font linking (explained below).


 



Font Linking

Font linking is a method that makes additional fonts available to the console for characters that are missing from the font the console is using. In other words, with font linking the console can use fonts it would not accept directly, by going through a font it does accept. For instance, the console accepts Lucida Console, so we will use this font to link to other fonts that the console does not accept directly. Once they are linked, if a character is not found in Lucida Console, the console will look for it in the linked fonts.

To link the fonts, first open the Registry Editor; you can do this by entering regedit in the Windows search bar. Then go to the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink. The screen shot is given below.

In my registry I have already added Lucida Console, so you can see it there. To add it in your Registry Editor, go to Edit->New->Multi-String Value in the upper-left corner, set the name to Lucida Console, then double-click it and add the strings given below.

Cour.TTF,Courier New
LucidaTypewriterRegular.TTF,Lucida Sans Typewriter
MSGothic.TTC,MS Gothic
SimSun.TTC,NSimSun
Gulim.TTC,GulimChe
Consola.TTF,Consolas
Kokila.TTF,Kokila

 
The new window will look like this.
 

 
Click 'OK' and restart your computer. To test the fonts linked to Lucida Console, add this code to the program given above.
 

wchar_t jap[] = L"カイェウアンモン、ア" ,
    chinese[]=L"考试的空间的方式" ,
    hindi[]=L"ऐधभषझई" ,
    german[]=L"dflääbwejä#fgäö" ;

wcout << L"Japanese " << jap << endl
    <<L"Chinese " << chinese << endl
    <<L"Hindi " << hindi << endl
    << L"German " << german ;

Run the program; the output will look like this:
 

 
In the next topic we will see how to output characters of some languages in Code::Blocks.






Code::Blocks

The code given below may work for some languages and not for others, so I cannot guarantee that it will work for your native language, even if your language has a writing system. With Code::Blocks on Windows, we can print some scripts in a console application by using a code page. Here is Microsoft's definition of a code page from its official website: "A code page is a mapping of 256 character codes to individual characters. Different code pages include different special characters, typically customized for a language or a group of languages." You can find more information here (https://msdn.microsoft.com/en-us/library/windows/desktop/ms682064(v=vs.85).aspx).

In the program below we use a specific code page for each script; for Russian, say, the code page is 866. Some code pages that work are listed below.

Code page    Language
866          Russian
1253         Greek
1251         Cyrillic
860          Español
You can find more code page values here (https://msdn.microsoft.com/en-us/library/windows/desktop/dd317754(v=vs.85).aspx). Copy and paste the code below and substitute CODE_PAGE with a code page value given above. You can try strings such as "лдощдффыкйцн" (for Russian), "Español" (Spanish), etc. by assigning them to s[]; the program given below shows how to print a Spanish string.

Note: Set your console font to Lucida Console. To set the font you need the console window, so substitute the code page value and run the program first. Then right-click the upper-left corner of the console window, go to Properties->Font, and select Lucida Console. The screen shot is given below. (These steps are the same as for Visual Studio.)


 

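#include <iostream>
#include <windows.h> ///inevitable

using namespace std;

#define CODE_PAGE 860 ///substitute the code page value for your language

int main( )
{
SetConsoleCP( CODE_PAGE ) ;
SetConsoleOutputCP( CODE_PAGE ) ;

wchar_t s[] = L"Español" ;

///first call only returns the required buffer size(including the terminating null)
int bufferSize = WideCharToMultiByte(CODE_PAGE , 0, s, -1, NULL, 0, NULL, NULL) ;

char* m = new char[bufferSize] ;

///second call converts the wide string to the bytes of the chosen code page
WideCharToMultiByte(CODE_PAGE , 0, s, -1, m, bufferSize, NULL, NULL) ;

cout << m << endl ; ///m is a narrow(char) string so it is written with cout

delete[] m ;

cin.get();
return 0 ;
}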

For Hebrew the code page is 1255. If you substitute CODE_PAGE with 1255 and run the program (after assigning a Hebrew string to s[] in the program above), you will see boxes as the output, something like this.

Copy those boxes and paste them into Notepad++ or any other editor that supports Unicode, and you will see your Hebrew string. This means that the console supports the Hebrew script but does not have a compatible font to display the characters. To solve this problem you can either get a suitable font for Hebrew or link fonts (discussed under the Visual Studio section > Font Linking).

For Simplified Chinese, Japanese, Korean, Hindi, German, etc. you will also get some weird characters, but if you copy and paste them into Notepad++ you will not see your string. This means that with this method we cannot output Chinese, Japanese, Hindi, etc. in the console, even with linked fonts. However, there is a sure way to display these scripts in the Windows console. We will discuss that in the next topic.



Changing the system locale

By changing the system locale you can output the script of any language from Visual Studio or Code::Blocks without adding new fonts or linking fonts. To change the system locale, go to Control Panel->Clock, Language, and Region->Region. In the window that opens, select the Administrative tab and click Change system locale. The screen shot is given below.

Then select Simplified Chinese from the list, press 'OK', and restart your computer. To print Simplified Chinese characters in Code::Blocks you can use the code given in the previous topic, and for Visual Studio you can use the code given above.

With this method we can print only the characters of the language set as the system locale, which means that to print Japanese we have to change the system locale to Japanese and restart the computer again. So switching between languages takes a lot of time, and changing the system locale will also change the language of some of your installed software. If you test this method, make sure you change the system locale back to the original one afterwards.

