How to handle hidden characters?

Status
Not open for further replies.

lcs01

Programmer
Joined
Aug 2, 2006
Messages
182
Location
US
Hi, Experts,

I am having trouble parsing a client ASCII file, because I don't know how to handle hidden characters in the file.

Below are the details of two files, named file1 and file2. File1 is the client file. File2 was edited in house using vi on Linux (Ubuntu).

Code:
[% 569] => file file1 file2
text/plain; charset=us-ascii
text/plain; charset=us-ascii
[% 570] => ls -l file1 file2
-rw-r--r-- 1 nobody nobody 24 2006-11-30 10:41 file1
-rw-r--r-- 1 nobody nobody 22 2006-11-30 10:41 file2
[% 571] => cat file1
name=test
endCaseInfo
[% 572] => cat file2
name=test
endCaseInfo

So, as you can see, file1 has two more characters than file2.

I wrote a small, simple Perl program named 'tt.pl' to parse these two files:

Code:
#!/usr/bin/perl
use strict;
use warnings;

# For each line of the given file, print its length after chomp,
# the line itself, and the line followed by a '##' marker.
sub dump_lines {
  my ($srce) = @_;
  open(my $fh, '<', $srce) || die "Can not open file '$srce': $!\n";
  while (<$fh>) {
    chomp($_);    # strips the trailing newline only
    my $len = length($_);
    print "\$len = $len\n";
    print "$_\n";
    print "$_##\n";
  }
  close($fh);
}

dump_lines("file1");
print "\n";
dump_lines("file2");
exit;

Here is the output:

Code:
[% 573] => ./tt.pl
$len = 10
name=test
##me=test
$len = 12
endCaseInfo
##dCaseInfo

$len = 9
name=test
name=test##
$len = 11
endCaseInfo
endCaseInfo##

Could someone please tell me what kind of hidden characters are in file1 and how to handle them? Many thanks!
 
It looks like file1 was created on Windows, with CRLF at the end of each line. file2 was created on your Linux box with vi, and hence uses a single LF character to mark the end of each line. To check, make a copy of file1 and run it through dos2unix, which you ought to have on your distro, to see if that fixes it...

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::PerlDesignPatterns)[/small]
 
Yeah, probably one file has '\r\n' line endings and the other has '\n'.

- Kevin, perl coder unexceptional!
 
It indeed has '\r\n' line endings. I just learnt that I could use 'hexdump -c {$filename}' to see hidden characters.
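
For anyone who wants to reproduce this, here is a minimal sketch of how the hidden characters can be made visible. It assumes the file contents are exactly what the listing earlier in the thread shows, and it uses od -c rather than hexdump -c only because od is specified by POSIX; the temporary directory and the variable names are made up for illustration. The carriage return also explains the odd tt.pl output: printing "\r" moves the cursor back to the start of the line, so the "##" overwrites the first two characters.

```shell
# Recreate the two files from the thread in a scratch directory
# (contents are an assumption based on the listing in the question).
work=$(mktemp -d)
printf 'name=test\r\nendCaseInfo\r\n' > "$work/file1"   # CRLF endings, 24 bytes
printf 'name=test\nendCaseInfo\n' > "$work/file2"       # LF endings, 22 bytes

# od -c prints every byte; carriage returns show up as \r.
od -c "$work/file1"
od -c "$work/file2"

# Count the carriage returns in each file by deleting everything else.
cr1=$(tr -cd '\r' < "$work/file1" | wc -c | tr -d ' ')
cr2=$(tr -cd '\r' < "$work/file2" | wc -c | tr -d ' ')
echo "file1 CR count: $cr1"
echo "file2 CR count: $cr2"
```

The two extra bytes in file1 are the two carriage returns, one per line, which matches the 24 versus 22 byte sizes reported by ls -l.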
 
dos2unix would also fix it. Thank you both.
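
If dos2unix happens not to be installed, a rough equivalent can be sketched with tr. This is only an illustration under the assumption that the file looks like the client's file1 from the question; the scratch directory and the output name file1.unix are hypothetical. Note that tr -d '\r' removes every carriage return in the file, not only those at line ends, which is fine for a file like this one but may not suit every input.

```shell
# Recreate a CRLF file like the client's file1
# (contents are an assumption based on the listing in the question).
work=$(mktemp -d)
printf 'name=test\r\nendCaseInfo\r\n' > "$work/file1"

# Delete every carriage return; the original file is left untouched.
tr -d '\r' < "$work/file1" > "$work/file1.unix"

# Compare sizes before and after the conversion.
before=$(wc -c < "$work/file1" | tr -d ' ')
after=$(wc -c < "$work/file1.unix" | tr -d ' ')
echo "before=$before after=$after"
```

To handle such files inside the Perl script itself rather than converting them first, replacing chomp($_) with s/\r?\n$// strips either kind of line ending.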
 