How to Convert ASCII C program to UTF-8


I need to convert a program written in C (ASCII) to UTF-8 encoding, running on Solaris 9, so that it can accommodate Chinese/Japanese/Korean characters. What areas should I pay attention to?  Should I convert all variables whose data types are char or char* to wchar_t?  What functions should I use to manipulate the strings?

Here is one more question.  I have a main entry point like this:

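int main(int argc, char *argv[]){
    //locale names take the form language_TERRITORY.codeset
    setlocale(LC_ALL, "en_US.UTF-8");

    //like to know which of the two below is correct
    //CASE I:
    char buf[256];
    snprintf(buf, sizeof buf, "Here is the first arg value = %s\n", argv[1]);

    //CASE II:
    wchar_t wbuf[512];
    //swprintf writes into the wide buffer and takes its size in wide chars;
    //argv[1] is a char*, so the conversion is %s rather than %ls
    swprintf(wbuf, 512, L"Here is the first arg value = %s\n", argv[1]);

    return 0;
}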

Should I declare a wchar_t variable to hold the value passed in argv[1]?  Currently, my program uses a char buffer for this, as shown above.

I am really new to i18n.  I have read and searched on the Internet, but I still want to understand this better.

Any help is greatly appreciated.


RE: How to Convert ASCII C program to UTF-8

Yes, change it to use the wide-char versions of your character types and functions.
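For the argv[1] case in the question, one way to get from char* to wchar_t is the standard mbstowcs() function. The sketch below is only an illustration: the helper name to_wide is made up, it relies on the POSIX behaviour that mbstowcs() with a NULL destination returns the number of wide characters needed, and it assumes setlocale() has already selected a UTF-8 locale so that the bytes are interpreted as UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string (UTF-8 when a UTF-8
   locale is active) to a newly allocated wide string.
   Returns NULL on an invalid byte sequence or allocation failure.
   The caller frees the result. */
wchar_t *to_wide(const char *mb)
{
    size_t n;
    wchar_t *w;

    assert(mb != NULL);

    /* POSIX behaviour: a NULL destination means "only tell me the length". */
    n = mbstowcs(NULL, mb, 0);
    if (n == (size_t)-1)
        return NULL;                    /* invalid multibyte sequence */

    w = malloc((n + 1) * sizeof *w);    /* +1 for the terminating L'\0' */
    if (w == NULL)
        return NULL;

    if (mbstowcs(w, mb, n + 1) == (size_t)-1) {
        free(w);
        return NULL;
    }
    return w;
}
```

Typical use would be to call setlocale(LC_ALL, "") once at startup and then to_wide(argv[1]). Note that the buffer is sized in wide characters, not bytes.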

You should also get used to the idea that the number of characters != the number of bytes.  UTF-8 is a multi-byte encoding, so a single character can be 1 to 4 bytes long (most Chinese, Japanese, and Korean characters take 3).  This has implications when you allocate memory, write loops, and do pointer arithmetic.
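To make the characters-versus-bytes point concrete, here is a minimal sketch that counts characters (code points) in a UTF-8 string held in a plain char buffer. The function name utf8_char_count is made up for illustration, and the code assumes the input is already valid UTF-8. It works because every byte of a multi-byte character after the first has the bit pattern 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string.
   Continuation bytes (10xxxxxx) are skipped, so each character is counted
   once, at its lead byte. Assumes the input is valid UTF-8. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the two-character string "\xE6\x97\xA5\xE6\x9C\xAC" (the UTF-8 bytes of two CJK characters), strlen() reports 6 while utf8_char_count() reports 2.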

You should also move any string literals in your code that get shown to the user into resource files.  This will facilitate translation, because the translation services company will only need the resource files and won't need your full source tree.

Don't forget to make adjustments to your UI as well.  Foreign languages like German need a lot more screen real-estate because they have longer words.  

And don't forget to take cultural norms into account.  The typical US symbol for email is a mailbox mounted on a post by the street.  But this doesn't mean anything to Europeans, who receive their mail via a box or slot that is mounted on their house by the front door and looks nothing like a round-top metal box.

Chip H.

If you want to get the best response to a question, please read FAQ222-2244 first

RE: How to Convert ASCII C program to UTF-8

Thanks so much, Chip H.  The program has no GUI.  All it does is listen on a socket connection for incoming data and write that data to an XML file.

Any more detailed advice to minimize the conversion work?


RE: How to Convert ASCII C program to UTF-8

Just be careful with your memory allocations, and remember that chars != bytes.
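One place this bites in practice is copying UTF-8 data into a fixed-size char buffer: cutting at an arbitrary byte can leave half a character at the end. The following is a rough sketch of a copy that truncates on a character boundary instead. The name utf8_copy_trunc is hypothetical, and the code assumes the source is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which holds dst_size bytes
   including the terminating NUL. If src does not fit, truncate at a
   character boundary so no partial multi-byte sequence is left behind.
   Returns the number of bytes copied, not counting the NUL. */
size_t utf8_copy_trunc(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_size > 0);

    len = strlen(src);
    if (len >= dst_size) {
        len = dst_size - 1;
        /* src[len] is the first byte that will be dropped. If it is a
           continuation byte (10xxxxxx), the cut falls inside a character,
           so back up to that character's lead byte and drop it whole. */
        while (len > 0 && ((unsigned char)src[len] & 0xC0) == 0x80)
            len--;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

With a 6-byte buffer and the 7-byte input "a\xE6\x97\xA5\xE6\x9C\xAC", this keeps "a" plus the first 3-byte character (4 bytes) rather than splitting the second character.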

Chip H.
