ReadString not reading Unicode text properly

ProgrammerJoe · Mar 6, 2003

I have a Unicode text file. (I can tell because the bytes 0xFF and 0xFE are the first two bytes in the file and, looking at it in a binary editor, I can see that each character is followed by a null byte.) My problem is this: when I use ReadString, it takes each individual byte from the file and expands it into a two-byte character. So each line that I read in is only one character long, since the second character is a null character. How can I read the Unicode file? I've been tempted to read the file byte by byte and copy it into memory typecast as a pointer to an array of wchar_t, since it's the only solution I can come up with. Surely there's got to be something more elegant than that.

Here's my code for reading the file (in two ways!):

Code:

void CodeSnippet()
{
	CStdioFile	MFCFile;
	CString		CurrLine;
	
	FILE*	RegularFile;
	wchar_t UnicodeLine[200];
		
	if (!MFCFile.Open(_T("c:\\temp\\unicode.txt"), CFile::modeRead | CFile::typeText)) 
	{
		MessageBox(NULL, _T("Cannot open file"), _T("Failure"), MB_ICONEXCLAMATION);
		return;	// do not try to read from a file that failed to open
	}

	// ReadString treats the file as single-byte text, which is the problem described above
	while (MFCFile.ReadString(CurrLine))
	{
		MessageBox(NULL, CurrLine, _T("Line Read"), MB_ICONINFORMATION);
	}
	MFCFile.Close();

	// alternate way
	RegularFile = fopen("c:\\temp\\unicode.txt", "r");
	if (RegularFile) {	// was !RegularFile, which only read when the open had failed
		if (fgetws(UnicodeLine, 100, RegularFile))
			MessageBox(NULL, UnicodeLine, _T("Line Read"), MB_ICONINFORMATION);
		fclose(RegularFile);	// close only a handle that was actually opened
	}
}

LazyMe · Mar 6, 2003

You can read the file into memory using ReadFile, passing it a WCHAR buffer.
Greetings,
Rick

palbano · Mar 6, 2003

Do you have _UNICODE defined?

-pete

LazyMe · Mar 6, 2003

That would reverse the problem when trying to read a non-unicode file.
Greetings,
Rick
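
Since a single build setting cannot suit both kinds of file, one option is to look at the first two bytes and pick the reading strategy per file. The following is only a rough sketch in portable C++, not something posted in this thread: StartsWithUtf16LeBom is a made-up helper name, and it assumes the file carries the 0xFF 0xFE byte-order mark described in the original question. A UTF-16 file saved without that mark would not be detected.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the thread): reports whether the file
// begins with the UTF-16 little-endian byte-order mark, 0xFF 0xFE.
bool StartsWithUtf16LeBom(const std::string& path)
{
    // Open in binary mode so the runtime does not translate any bytes.
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f)
        return false;   // an unreadable file is reported as "no mark"

    unsigned char bom[2] = {0, 0};
    const std::size_t got = std::fread(bom, 1, sizeof bom, f);
    std::fclose(f);

    return got == 2 && bom[0] == 0xFF && bom[1] == 0xFE;
}
```

If it returns true, the file can be handed to a wide-character reading path; otherwise the existing single-byte ReadString path can stay as it is.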

palbano · Mar 6, 2003

>> That would reverse the problem when trying to read a non-unicode file.

DOH! [cannon]

.... (pete)

LOL

Thanks for the save, Rick
-pete

ProgrammerJoe · Mar 6, 2003

Thank you for your help. That function, combined with a wchar_t string did the trick. Here's the test code I wrote:

Code:

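HANDLE	NewFile;
wchar_t WCharBuffer[101];
DWORD	ReadBytes = 0;

NewFile = CreateFile(_T("c:\\temp\\unicode.txt"), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 
FILE_FLAG_RANDOM_ACCESS, NULL);
if (NewFile != INVALID_HANDLE_VALUE)
{
	// Read at most 100 wide characters (200 bytes), leaving room for the terminator
	if (!ReadFile(NewFile, WCharBuffer, 200, &ReadBytes, NULL))
		ReadBytes = 0;
	CloseHandle(NewFile);
}
// ReadFile does not null-terminate; WCharBuffer[0] holds the byte-order mark (0xFEFF)
WCharBuffer[ReadBytes / sizeof(wchar_t)] = L'\0';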

novrain · Mar 9, 2003

You could also try opening the file with the CFile::typeBinary flag.
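
To illustrate the binary-mode idea without tying it to MFC, here is a rough sketch in standard C++. It is an assumption-laden example rather than the approach anyone in this thread actually used: ReadUtf16LeLines is a hypothetical helper, it assumes the file is UTF-16 little-endian (with or without the 0xFF 0xFE mark), and it loads the whole file into memory, which is only reasonable for small files.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): reads a UTF-16LE file in
// binary mode and returns its lines as UTF-16 strings.
std::vector<std::u16string> ReadUtf16LeLines(const std::string& path)
{
    std::vector<std::u16string> lines;
    std::ifstream in(path, std::ios::binary);   // binary: bytes arrive untouched
    if (!in)
        return lines;                           // empty result if the open fails

    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());

    // Skip the byte-order mark if it is present.
    std::size_t pos = 0;
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE)
        pos = 2;

    std::u16string current;
    for (; pos + 1 < bytes.size(); pos += 2) {
        // Combine two bytes into one code unit, low byte first.
        const char16_t unit = static_cast<char16_t>(
            static_cast<unsigned char>(bytes[pos]) |
            (static_cast<unsigned char>(bytes[pos + 1]) << 8));

        if (unit == u'\n') {
            // Drop the carriage return of a CR LF pair.
            if (!current.empty() && current.back() == u'\r')
                current.pop_back();
            lines.push_back(current);
            current.clear();
        } else {
            current.push_back(unit);
        }
    }
    if (!current.empty())
        lines.push_back(current);               // last line without a newline
    return lines;
}
```

A trailing odd byte, if the file has one, is ignored. On Windows, where wchar_t is 16 bits wide, each returned line could be copied into a std::wstring or CStringW for display.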
