"Special" Binary stuff

Status
Not open for further replies.

drkestrel

I have some files which I need to read. Each file has a checksum listed in a separate checksum file, which I obtain. I need to calculate the checksum of each file and check whether it matches the one listed in the checksum file.
1) I need to read in each 'byte' of the file and do some arithmetic with it.
2) The file is a text file.

Questions
1) How do I read each byte value in a text file? I saw the documentation for 'binmode', but how do I obtain all the byte values? Would I have to worry about 'virtual' and 'physical' linefeeds?
2) Assuming I could obtain the byte values from question 1), would they be in normal base-10 notation or something else? Can I use the + operator for addition and trust that the result is 'correct'?
P.S.: the system in use is Solaris.
 
1. If you are reading a file in binary mode you don't have to worry about linefeed characters at all. You can use substr to get at each byte, or split on '' and it will split at each character.

2. The bytes would be in binary, just as they are in the file. If you need to do math on them you could use the ord function to get the numeric ASCII value of each character.

Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
 
I suppose the ASCII value is the same as the unsigned char byte value of a character.
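
That assumption can be checked directly. The following is a minimal sketch, assuming the data has been read from a filehandle in binary mode so that each character is one byte; the sample string is made up for illustration. It gets the byte values with split plus ord, with substr plus ord, and with unpack "C*", and all three lists should match.

```perl
use strict;
use warnings;

# Made-up sample data; in practice this would come from read() on a
# filehandle that has had binmode() applied.
my $data = "AB\n";

# Three ways to get one numeric value (0-255) per byte.
my @via_split  = map { ord($_) } split //, $data;
my @via_substr = map { ord(substr($data, $_, 1)) } 0 .. length($data) - 1;
my @via_unpack = unpack "C*", $data;

print "@via_split\n";
print "@via_substr\n";
print "@via_unpack\n";
```

The values are ordinary Perl numbers, so + and - work on them as usual.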

My next question relates to binary arithmetic.

Say I use the ord function and get the decimal value 65 (binary 1000001, octal 101).

I could add or subtract 65 to or from anything. However, how do I handle overflow and underflow? I guess normal arithmetic won't get me into an overflow/underflow situation. But say I want to add 99 to another number (to be decided) and interpret the total as a 1-byte unsigned integer. How could I do that in binary in Perl and convert the value back to decimal? Sorry, but I am not that 'binary', and I could barely manage this even with the Windows Scientific calculator :-) !
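
One way to get 1-byte unsigned behaviour is to do the arithmetic normally and then reduce the result modulo 256. The sketch below assumes that wrap-around is the behaviour wanted; add_u8 is a hypothetical helper name, and the operands other than 65 and 99 are made up. sprintf is used to show the same value in decimal, octal and binary.

```perl
use strict;
use warnings;

# Hypothetical helper: add two numbers and keep only the low 8 bits,
# which is what an unsigned 1-byte value would hold after wrapping.
sub add_u8 {
    my ($x, $y) = @_;
    return ($x + $y) % 256;
}

my $no_wrap   = add_u8(65, 99);     # 164, still fits in one byte
my $overflow  = add_u8(200, 99);    # 299 wraps around to 43
my $underflow = add_u8(5, -10);     # -5 wraps around to 251

# The number itself has no base; %d, %o and %b only change how it prints.
printf "%d decimal = %o octal = %08b binary\n", $_, $_, $_
    for $no_wrap, $overflow, $underflow;
```

For non-negative totals, `& 0xFF` gives the same result as `% 256`.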
 
DrKestrel,
I did something similar recently. I opened the file normally and then read a 16k chunk of data. I then pushed all the data onto a list (I had some pre-processing to do which I won't go into). Anyway, I was left with the bytecode in a list, thus:
Code:
sub someFuncToReadBytesFromAFile {
  my ($filename) = @_;
  open (INF, "$filename") or die("Could not open file $filename\n");
  # read at most the first 16k of the file into $byteCode
  read INF, $byteCode, 16384, 0;
  close INF;
  # unpack "C*" yields one unsigned value (0-255) per byte
  foreach $byte (unpack "C*", $byteCode)
  {
    push @bC, $byte;
  }
  print "Read byteCode from file:[";
  print "@bC]\n\n";
  return(@bC);
}
So now you should have the list @bC containing your bytecode. Next you can process it one byte at a time by popping/shifting the values off the list. Now you can do your arithmetic at leisure. Lastly, to add up the bytes but keep the value as an 8-bit byte:
Code:
# $data_len is the number of bytes in @bC
# $current_byte usually starts at 0
# This loop sums the values of all the bytes in the data
# and keeps the running total within one byte
while ($current_byte < $data_len)
{
  # add the next datum from the list to the checksum
  $chksum_val += shift(@bC);
  # bitwise AND with %11111111 to keep only one byte's worth
  $chksum_val &= 0xFF;
  # move on to the next datum
  $current_byte++;
}
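
The same wrapped sum can be written more compactly. This is only a sketch: byte_sum is a hypothetical helper, and the sample bytes are made up. It also compares the result with unpack's built-in checksum prefix ("%8" before the template), which sums the unpacked values modulo 2**8.

```perl
use strict;
use warnings;

# Hypothetical helper: sum every byte of $data, wrapping at $bits bits.
sub byte_sum {
    my ($data, $bits) = @_;
    my $mask = (1 << $bits) - 1;    # 0xFF for 8 bits, 0xFFFF for 16
    my $sum  = 0;
    for my $byte (unpack "C*", $data) {
        $sum = ($sum + $byte) & $mask;
    }
    return $sum;
}

my $data = pack "C*", 200, 100, 65;    # made-up sample bytes

my $by_loop   = byte_sum($data, 8);     # (200 + 100 + 65) & 0xFF
my $by_unpack = unpack "%8C*", $data;   # unpack's own 8-bit checksum

print "$by_loop $by_unpack\n";
```

Masking after every addition and masking once at the end give the same answer here; masking each time simply keeps the running total small.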
And just in case you want to express your 8-bit number as two hex digits (in ASCII), start by splitting it into its two nibbles:
Code:
my $bCcK1 = (($chksum_val & 0xF0) >> 4);   #hi-nibble
my $bCcK2 = ($chksum_val & 0x0F);          #lo-nibble
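
The two nibbles are still numbers in the range 0 to 15, so one more step is needed to get printable hex characters. A small sketch, using a made-up checksum value: either look each nibble up in a digit table, or let sprintf format the whole byte.

```perl
use strict;
use warnings;

my $chksum_val = 0xAB;    # made-up example value

my $hi = ($chksum_val & 0xF0) >> 4;    # hi-nibble, 10 here
my $lo =  $chksum_val & 0x0F;          # lo-nibble, 11 here

# Look each nibble up in a digit table to get its ASCII hex character.
my @digits  = split //, "0123456789ABCDEF";
my $by_hand = $digits[$hi] . $digits[$lo];

# sprintf does the same in one step: two uppercase hex digits, zero padded.
my $by_sprintf = sprintf "%02X", $chksum_val;

print "$by_hand $by_sprintf\n";
```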
Hope that helps
Loon
 
Loon,
Thanks.
I have the following code now (I just modified a few minor bits). Note that I am now keeping two bytes instead of one.

Could you explain a bit more, as I don't know much about binary stuff?

unpack: I have read the docs, but what does the C* mean? I take it that is a 'template'? I saw in the docs that C means unsigned char, but what about the *?
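
For reference, the part after the C in an unpack template is a repeat count, and * means "as many as are left". A minimal sketch with a made-up three-byte string:

```perl
use strict;
use warnings;

my $data = "ABC";    # made-up sample: bytes 65, 66, 67

my @one = unpack "C",  $data;    # no count: the first byte only
my @two = unpack "C2", $data;    # count of 2: the first two bytes
my @all = unpack "C*", $data;    # *: every remaining byte

print "C:  @one\n";
print "C2: @two\n";
print "C*: @all\n";
```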


Code:
  open (INF, "$ARGV[0]") or die("Could not open file $ARGV[0]\n");
  read INF, $byteCode, 9999999, 0;
  close INF;
  foreach $byte (unpack "C*", $byteCode)
  {
    push @bC, $byte;
  }

  # @bC now holds one value per byte of the file

  # $data_len is the number of bytes
  # $current_byte starts at 0
  # This loop sums the values of all the bytes in the data
  # and keeps the running total within two bytes
  $data_len = scalar(@bC);
  $current_byte = 0;

  while ($current_byte < $data_len)
  {
    # add the next datum from the list to the checksum
    $chksum_val += shift(@bC);
    # bitwise AND with 0xFFFF to keep only two bytes' worth
    $chksum_val &= 0xFFFF;
    # move on to the next datum
    $current_byte++;
  }

print "------------" . $chksum_val . "--------------";
 
How are linefeeds handled under ActivePerl 5.60.623 for Windows NT and under Solaris?

When testing on some sample files where checksum computation is required, I noticed that for files with more than 4 lines, the computed checksum is less than the expected checksum.

Any good suggestions for finding out what is going on?
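
One thing that may be worth checking is whether the number of bytes actually read matches the file's size on disk. On Windows, a filehandle that has not had binmode applied translates CR/LF pairs into a single LF, so fewer bytes would be summed; on Solaris no translation happens. Whether that explains the difference seen here is only a guess. The sketch below uses made-up sample data and a hypothetical helper, slurp_raw, to compare the two counts.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample file with DOS-style line endings (made-up data).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "ab\r\ncd\r\n";
close $out or die "close failed: $!";

# Hypothetical helper: read the whole file as raw bytes.
sub slurp_raw {
    my ($file) = @_;
    open my $in, "<", $file or die "Could not open $file: $!";
    binmode $in;    # no CR/LF translation on any platform
    local $/;       # slurp mode: read everything in one go
    my $data = <$in>;
    close $in;
    return $data;
}

my $data    = slurp_raw($path);
my $on_disk = -s $path;                   # size reported by the file system
my $read    = length($data);              # bytes actually read
my $sum16   = unpack "%16C*", $data;      # 16-bit sum of all bytes

print "on disk: $on_disk bytes, read: $read bytes, sum16: $sum16\n";
```

If the two byte counts differ when the binmode line is removed, newline translation is affecting the checksum.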
 