Split Large Files into Several Small Files

Status
Not open for further replies.

V00D00

Technical User
Jul 11, 2005
78
US
I need to split large comma-quote-delimited text files of varying length on an ongoing basis. I would like to take each large file and write it out in exactly the same format, but as several smaller files of 4000 lines each, named after the original file with A, B, C, and so on appended in sequence.

Any and all help is greatly appreciated.

P
 
I should add that this is on WinTel

P
 
OK, typical.
Well, I wrote a quick script for you. Hope it's of some use:
Code:
#!/usr/bin/perl
use strict;
use warnings;

my $sourcefile = shift or die "Usage: $0 sourcefile\n";
my $rowlimit   = 11;    # lines per output file (4000 for the original request)
open(my $in, '<', $sourcefile) or die "Failed to open $sourcefile: $!";
my $outrecno = 1;
my @file_endings = ('A' .. 'Z');
my ($suffix, $out);
while (my $line = <$in>) {
  if ($outrecno == 1) {
    # First line of a new chunk: open the next output file.
    die "Ran out of file extensions" unless @file_endings;
    $suffix = shift @file_endings;
    open($out, '>', "$sourcefile$suffix") or die "Failed to create $sourcefile$suffix: $!";
  }
  print $out $line;
  if ($outrecno++ == $rowlimit) {
    # Chunk is full: close it and start counting again.
    $outrecno = 1;
    close $out;
  }
}
close $out if $outrecno > 1;    # close a final, partially filled file
close $in;

BTW: This better not be school work!


Trojan.
 
Thank you very much for helping. Your understanding of my WinTel situation (Sucks) is greatly appreciated.

BTW: It is not school work. :)
 
Why don't you get a copy of Knoppix?
It has most of the Unix tools you could ever wish for, and it will give you access to your Windows filesystems. You don't even have to install it: it boots straight from the CD.



Trojan.

 
There's no need to pre-create a file suffix cache. You can use Perl's string increment to create the suffixes on the fly. This also removes the limit of 26 suffixes imposed by the array cache. Additionally, the current line number of the last-read file handle is stored in '$.', so there is no need to track it manually.
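As a quick illustration of the string increment mentioned above, here is a minimal sketch. The helper name `suffixes` is made up for this example and is not part of either script in the thread. Note that the "magic" increment only applies to strings matching /^[a-zA-Z]*[0-9]*$/.

```perl
use strict;
use warnings;

# Return the first $count suffixes, starting from $start, using Perl's
# string increment. (Hypothetical helper, for illustration only.)
sub suffixes {
    my ($start, $count) = @_;
    my @out;
    my $sfx = $start;
    for (1 .. $count) {
        push @out, $sfx;
        $sfx++;    # 'A' -> 'B', ..., 'Z' -> 'AA', 'AA' -> 'AB'
    }
    return @out;
}

print join(' ', suffixes('Y', 4)), "\n";    # prints: Y Z AA AB
```

Because 'Z' rolls over to 'AA', the suffixes never run out, although the file names get longer after the 26th piece.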

Also, Trojan, you seem to be closing an FH filehandle which isn't in your snippet... perhaps you meant to close OUT instead. It isn't strictly necessary, though, to close a filehandle just to re-open it on another file, because Perl will close the old file for you when the new one is opened (this is how <> works).
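To see the implicit close in action, here is a small sketch that writes to two files through the same handle without calling close in between. The sub names `write_two` and `slurp` and the file names are assumptions made up for this demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write one line to each of two files through the SAME handle,
# never calling close between the two opens.
sub write_two {
    my ($first, $second) = @_;
    open(my $out, '>', $first) or die "$first: $!";
    print $out "one\n";
    # Re-opening $out implicitly closes (and flushes) the first file.
    open($out, '>', $second) or die "$second: $!";
    print $out "two\n";
    close $out or die "$second: $!";
}

# Read a whole file back as a single string.
sub slurp {
    my ($path) = @_;
    open(my $in, '<', $path) or die "$path: $!";
    local $/;    # undefine the record separator to read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

my $dir = tempdir(CLEANUP => 1);
write_two("$dir/a.txt", "$dir/b.txt");
print slurp("$dir/a.txt"), slurp("$dir/b.txt");    # prints "one" then "two"
```

One trade-off: an explicit close lets you check its return value for write errors (a full disk, for example), which the implicit close cannot report.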

with modifications:
Code:
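use strict;
use warnings;

my $rowlimit = 11;     # lines per output file
my $sfx = 'A';         # string increment: A..Z, then AA, AB, ...
my $out;

while (<>) {
    # $. is the current input line number; start a new file every $rowlimit lines
    unless ( ($. - 1) % $rowlimit ) {
        open($out, '>', "$ARGV$sfx") or die "$ARGV$sfx: $!";
        $sfx++;
    }
    print $out $_;
}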

regards

jaa
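One caveat with the <> approach: if several input files are given on the command line, '$.' keeps counting across them unless ARGV is closed explicitly at the end of each file. The following is only a sketch of one way to handle that, assuming you want the suffixes to restart at A for each input file. The sub name `split_args` is made up for this example.

```perl
use strict;
use warnings;

# Split every file named in @ARGV into pieces of $rowlimit lines,
# named <file>A, <file>B, ... Returns the names of the files created.
# (Hypothetical helper, for illustration only.)
sub split_args {
    my ($rowlimit) = @_;
    my ($out, $sfx, @created);
    while (my $line = <>) {
        $sfx = 'A' if $. == 1;    # new input file: restart the suffixes
        unless (($. - 1) % $rowlimit) {
            my $name = "$ARGV$sfx";
            open($out, '>', $name) or die "$name: $!";
            push @created, $name;
            $sfx++;
        }
        print $out $line;
        # Explicitly closing ARGV at the end of each file resets $.,
        # so line numbers restart for the next input file.
        close ARGV if eof;
    }
    if ($out) { close $out or die "close: $!" }
    return @created;
}

# Only run when file names were given, to avoid waiting on STDIN.
if (@ARGV) {
    print "$_\n" for split_args(4000);
}
```

Saved as, say, splitfiles.pl (the name is arbitrary), it could be run as: perl splitfiles.pl big1.txt big2.txt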
 