Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations bkrike on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Number of lines in a file 1

Status
Not open for further replies.

icrf

Programmer
Dec 4, 2001
1,300
US
Is there a way to find out the number of lines in a file other than:

@lines = <FILE>;
$count = scalar(@lines);

The file can be rather large and I'd like to not waste the system's I/O time and memory with such a thing. As the second part of this question, is there some way to seek to a line, not number of bytes? I'd like to read in the last five or so lines in from a file that has several thousand without going through the entire beginning.

If there was any doubt, a line here is defined as the characters between newline characters, a length which is variable. I imagine there is no other way to do this because the file would have to be searched for a delimiter(in this case, a newline). Would a possible solution be set a buffer of, say, five thousand bytes to seek from the end of the file, assuming that such a number is longer than the last x lines? I guess I'm just looking for someone to agree with me or tell me I'm stupid, possibly offering a better soln. ----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 
&quot;I'd like to read in the last five&quot;

@lines = <FILE>;
@reverse(@lines);
# now if u call $lines[0] it will be the last line of the file
wmail.jpg


someone knowledge ends where
someone else knowledge starts
 
&quot;file can be rather large and I'd like to not waste the system's I/O time and memory&quot;

then read the file like
Code:
while (<FILE>) {
 # statement;
}

im looking for that &quot;
Code:
statement
&quot;....
wmail.jpg


someone knowledge ends where
someone else knowledge starts
 
Code:
NAME 
File::ReadBackwards.pm -- Read a file backwards by lines. 




--------------------------------------------------------------------------------

SYNOPSIS 
    use File::ReadBackwards ;


    # Object interface


    $bw = File::ReadBackwards->new( 'log_file' ) or
			die &quot;can't read 'log_file' $!&quot; ;


    while( defined( $log_line = $bw->readline ) ) {
	    print $log_line ;
    }


    # ... or the alternative way of reading


    until ( $bw->eof ) {
	    print $bw->readline ;
    }


    # Tied Handle Interface


    tie *BW, File::ReadBackwards, 'log_file' or
			die &quot;can't read 'log_file' $!&quot; ;


    while( <BW> ) {
	    print ;
    }




--------------------------------------------------------------------------------

DESCRIPTION 

This module reads a file backwards line by line. It is simple to use, memory efficient and fast. It supports both an object and a tied handle interface. 

It is intended for processing log and other similar text files which typically have their newest entries appended to them. By default files are assumed to be plain text and have a line ending appropriate to the OS. But you can set the input record separator string on a per file basis. 




--------------------------------------------------------------------------------

OBJECT INTERFACE 

There are only 2 methods in Backwards' object interface, new and readline. 

new( $file, [$rec_sep], [$sep_is_regex] ) 
New takes as arguments a filename, an optional record separator and an optional flag that marks the record separator as a regular expression. It either returns the object on a successful open or undef upon failure. $! is set to the error code if any. 

readline 
Readline takes no arguments and it returns the previous line in the file or undef when there are no more lines in the file. If the file is a non-seekable file (e.g. a pipe), then undef is returned. 

eof 
Eof takes no arguments and it returns true when readline() has iterated through the whole file. 

close 
close takes no arguments and it closes the handle 




--------------------------------------------------------------------------------

TIED HANDLE INTERFACE 
tie( *HANDLE, File::ReadBackwards, $file, [$rec_sep], [$sep_is_regex] ) 

The TIEHANDLE, READLINE, EOF and CLOSE methods are aliased to the new, readline, eof and close methods respectively so refer to them for their arguments and API. Once you have tied a handle to File::ReadBackwards the only I/O operation permissible is <> which will read the previous line. You can call eof() and close() on the tied handle as well. All other tied handle operations will generate an unknown method error. Do not seek, write or perform any other unsupported operations on the tied handle. 



--------------------------------------------------------------------------------

LINE AND RECORD ENDINGS 

Since this module needs to use low level I/O for efficiency, it can't portably seek and do block I/O without managing line ending conversions. This module supports the default record separators of normal line ending strings used by the OS. You can also set the separator on a per file basis. 

The record separator is a regular expression by default, which differs from the behavior of $/. 

Only if the record separator is not specified and it defaults to CR/LF (e.g, VMS, redmondware) will it will be converted to a single newline. Unix and MacOS files systems use only a single character for line endings and the lines are left unchanged. This means that for native text files, you should be able to process their lines backwards without any problems with line endings. If you specify a record separator, no conversions will be done and you will get the records as if you read them in binary mode. 



--------------------------------------------------------------------------------

DESIGN 
It works by reading a large (8kb) block of data from the end of the file. It then splits them on the record separator and stores a list of records in the object. Each call to readline returns the top record of the list and if the list is empty it refills it by reading the previous block from the file and splitting it. When the beginning of the file is reached and there are no more lines, undef is returned. All boundary conditions are handled correctly i.e. if there is a trailing partial line (no newline) it will be the first line returned and lines larger than the read buffer size are handled properly. 




--------------------------------------------------------------------------------

NOTES 

There is no support for list context in either the object or tied interfaces. If you want to slurp all of the lines into an array in backwards order (and you don't care about memory usage) just do: 

	@back_lines = reverse <FH>.


This module is only intended to read one line at a time from the end of a file to the beginning. 



--------------------------------------------------------------------------------

AUTHOR 

Uri Guttman, uri@sysarch.com 



--------------------------------------------------------------------------------

COPYRIGHT 

Copyright (C) 2000 by Uri Guttman. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
wmail.jpg


someone knowledge ends where
someone else knowledge starts
 
Whoa, that's neato. Thanks. ----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top