Unicode end of line character

Status
Not open for further replies.

richardii

Programmer
Jan 8, 2001
104
GB
I'm having problems with the Unicode end-of-line character. AWK doesn't seem to want to accept it as the RS, so it thinks I'm dealing with a single record. I can't paste the file I want to process here, because pasting converts the end-of-line characters to carriage returns, and then it looks fine! However, when I open the file in Notepad they show up as black squares, and the whole thing reads as one record.

Any help much appreciated.
 
A trick I use in vi is to enter this type of character as the control sequence ctrl-V ctrl-M. You could try specifying that as your RS in the awk script.

Pressing ctrl-V tells it to expect a control character next; here that is ctrl-M, the carriage return (ASCII 13).

Pressing ctrl-V ctrl-M should show up as ^M on the command line.
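
If typing the literal control character is awkward, awk's "\r" string escape stands for the same carriage return. The following is only a rough sketch, assuming a POSIX-style shell and that the records really are separated by bare carriage returns; the function name number_cr_records and the sample input are made up for illustration.

```shell
# Hypothetical helper: number each carriage-return-terminated record on stdin.
number_cr_records() {
    # "\r" is the awk escape for carriage return (ASCII 13, shown as ^M)
    awk 'BEGIN { RS = "\r" } { print NR ": " $0 }'
}

# Made-up sample input: three records separated by bare carriage returns
printf 'alpha\rbeta\rgamma\r' | number_cr_records
```

With that sample input, each of the three records should come out on its own numbered line rather than as one long record.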

Greg.
 
Thanks for that. I'm running gawk on my PC, so all this talk of vi makes me shake!!

I used this code (from the guide):

function chr(c)
{
    # force c to be numeric by adding 0
    return sprintf("%c", c + 0)
}

and specified RS=chr(13)
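
For anyone wanting to see the pieces put together, here is a minimal sketch of how the chr() function above and RS=chr(13) might sit in one script. It assumes a POSIX-style shell for the quoting (on a PC it may be easier to save the awk program to a file and run it with -f), and the wrapper name count_cr_records and the sample input are hypothetical.

```shell
# Hypothetical wrapper: count carriage-return-separated records on stdin.
count_cr_records() {
    awk '
    # chr() as quoted in the post above
    function chr(c)
    {
        # force c to be numeric by adding 0
        return sprintf("%c", c + 0)
    }

    # set the record separator before any input is read
    BEGIN { RS = chr(13) }

    # NR holds the number of records read so far
    END { print NR }
    '
}

# Made-up sample input: three records, each ended by a carriage return
printf 'a b\rc d\re f\r' | count_cr_records
```

Setting RS in a BEGIN block matters here: the separator has to be in place before awk reads the first record.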

 