
Help with getting data into MySQL with DBI


cmarchman (Programmer)
Jun 18, 2004
Here's the situation:

I've got a file that has been arranged in the following manner:

Session| (this is on a single line)
Issue|3508|IP Disclosure|20|<summary of issue>|<execution of issue>|<implication of issue>|<solution to issue>|<references> (this is all on one long line)
Issue|45|Directory found (<directory>)|25||<summary of issue>|<execution of issue>|<implication of issue>|<solution to issue>|<references> (and this is also on one long line)

Session|Issue|<etc.>
Issue|<etc.>
Issue|<etc.>

Session| (I think you get the picture)

I am trying to write a Perl DBI script that places the data from each Session line into a table "Sessions", puts the data from the Issue lines into a table "Issues", and sets up an association between the Sessions and Issues tables keyed on the session, which is the URL that was tested.
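Just to make the layout clear, what I have in mind is roughly the following (the column names here are only placeholders, not a final schema):
Code:
# hypothetical schema sketch -- one row per Session, many Issues per Session
use DBI;

my $dbh = DBI->connect("dbi:mysql:database=mydb", "user", "password",
                       { RaiseError => 1 });

$dbh->do("CREATE TABLE Sessions (
              session_id  INT AUTO_INCREMENT PRIMARY KEY,
              url         VARCHAR(255)        -- the URL that was tested
          )");

$dbh->do("CREATE TABLE Issues (
              issue_id    INT AUTO_INCREMENT PRIMARY KEY,
              session_id  INT,                -- ties each issue back to its session
              issue_num   INT,
              name        VARCHAR(255),
              summary     TEXT,
              execution   TEXT,
              implication TEXT,
              solution    TEXT,
              reference   TEXT
          )");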

The problem I'm wrestling with is whether to use nested 'if' statements or 'while' loops to work through the file.

Any ideas are welcomed.

Thanks,

cmarchman
 
I would start by creating an array based on splitting on each of the 'session' lines i.e.
Code:
open (FILE, $file) || die "ERROR: Opening $file\n$!\n";
@lines = <FILE>;
close FILE;

# join the whole file into a single string
$string = join '', @lines;

# split on the Session lines so each element holds one session and its issues
@sessions = split /Session\|/i, $string;
Then you can cycle through each of the sessions:
Code:
foreach $session(@sessions) {
    # split the issues up and put contents into database
}
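If it helps, here is a rough sketch of what that loop could look like once the issues are split out (the field order is taken from the sample lines above, and the statement handle $sth_issue is only a placeholder):
Code:
foreach $session (@sessions) {
    next unless $session =~ /\S/;           # skip the empty chunk before the first Session line
    my @sess_lines   = split /\n/, $session;
    my $session_line = shift @sess_lines;   # whatever followed "Session|"
    foreach my $line (@sess_lines) {
        next unless $line =~ /^Issue\|/;
        my (undef, $id, $name, $num, $summary, $execution,
            $implication, $solution, $references) = split /\|/, $line;
        # insert into the Issues table here, e.g.
        # $sth_issue->execute($session_line, $id, $name, $summary, ...);
    }
}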

Rob Waite
 
Code:
$s = -1;     # index of the current session
$i = 0;      # index of the current issue within that session

open(IN, "< $file") || die "Message: $!\n";
while (<IN>) {
     chomp;
     if (/^Session\|/) {
          # new session: advance the session index and reset the issue counter
          $s++;
          $i = 0;
          $sessions[$s] = $_;
     } elsif (/^Issue\|/) {
          # attach this issue to the current session
          $issue[$s][$i] = $_;
          $i++;
     } else {
          next;
     }
}
close(IN);

foreach $s (0 .. $#sessions) {
     # Load each session record using $s as the primary key.
     foreach $i (0 .. $#{$issue[$s]}) {
          # Load each issue record using $s and $i as the primary key where $s ties to the session record in your session table.
     }
}
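To fill in those comments, the load itself might look something like this with DBI (the connect string, table names and column lists are placeholders for whatever your schema ends up being):
Code:
use DBI;

my $dbh = DBI->connect("dbi:mysql:database=mydb", "user", "password",
                       { RaiseError => 1 });

# placeholder statements -- adjust the table/column lists to your real schema
my $sth_sess  = $dbh->prepare("INSERT INTO Sessions (session_id, session_line) VALUES (?, ?)");
my $sth_issue = $dbh->prepare("INSERT INTO Issues (session_id, issue_no, issue_line) VALUES (?, ?, ?)");

foreach my $s (0 .. $#sessions) {
     # $s is the key that ties the two tables together
     $sth_sess->execute($s, $sessions[$s]);
     foreach my $i (0 .. $#{$issue[$s]}) {
          # split $issue[$s][$i] on '|' here if you want one column per field
          $sth_issue->execute($s, $i, $issue[$s][$i]);
     }
}

$dbh->disconnect;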


Michael Libeson
 
Thanks guys, I'll try them out and let you know which one I went with.

cmarchman
 
I'd assume that this is a 'large' logfile? Holding the whole thing in arrays could be a serious memory overhead on the system.

Based on Michael's code:
Code:
&prepare_DBHandles;
open(IN, "< $file") || die "Message: $!\n";
while (<IN>) {
     chomp;
     if (/^Session\|/) {
        ($session, $session_id)=split (/\|/, $_);
        $db_sess_id=&create_session_rec($session_id);
     } elsif (/^Issue\|/) {
        (@fields)=split (/\|/, $_);
        $rc=&create_session_values_rec($db_sess_id, $session_id, @fields);
     } else {
          next;
     }
}
close(IN);
sub prepare_DBHandles {
  $dbh1 = ... ;
  $dbh2 = ... ;
  $sth1 = $dbh1->prepare('insert into session_id
     ( session_id, time, date )
        values
     ( ?, ?, ? )');
  # NB: "desc" is a reserved word in MySQL -- quote it or rename that column
  $sth2 = $dbh1->prepare('insert into sessions
     ( session_id, time, date, type, desc, anothernum, summary,
       execution, implication, solution, reference )
        values
     ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )');
}
sub create_session_rec {
  $sess_id = $_[0];
  # NB: $sth1 has three placeholders (session_id, time, date), so the
  # time and date values need to be passed here as well
  $sth1->execute($sess_id);
  #might want an auto id returned here in case you hit dupes
  #return $db_sess_id -> you'll have to get this first
}
sub create_session_values_rec {
  ($base_id, $sess_id, @data) = @_;
  # you can also pass the whole @data list if the field count matches the placeholders
  $rc = $sth2->execute($base_id, $sess_id, $data[0], $data[1], $data[2] .... $data[n])
        or die "DBI Update Failure: " . $DBI::errstr;
  return $rc;
}

Hope this at least forms a framework

Regards
--Paul

Nancy Griffith - songstress extraordinaire,
and composer of the sniper's anthem "From a distance ..."
 
PaulTEG is correct, but if you have the resources you are better off processing first and then loading, so that the load can be done more efficiently with fewer calls to the database. The code I posted was only meant to point you in the direction you were asking about. Instead of storing the records and then processing them for the load, you can build your SQL on the fly: keep accumulating statements until you have processed the entire file, then load the database in one pass. Processing some data, loading some, processing more, loading more will take longer; processing all the data and then loading all the data is more efficient. You can even write the SQL out to flat files, so that if something goes wrong during the load you can redo it or pick up where you left off.
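For example, something along these lines builds the SQL while reading, keeps a copy in a flat file, and only then hits the database (assuming the input file is already open on IN as in the earlier snippets, and using $dbh->quote to escape the values; file and table names are placeholders):
Code:
use DBI;

my $dbh = DBI->connect("dbi:mysql:database=mydb", "user", "password",
                       { RaiseError => 1 });

# first pass: build the SQL and keep a copy on disk in case the load fails
open(SQL, "> load.sql") or die "Cannot write load.sql: $!\n";
my @statements;
while (<IN>) {
     chomp;
     next unless /^Issue\|/;
     my @fields = split /\|/, $_;
     my $values = join ', ', map { $dbh->quote($_) } @fields;
     my $sql    = "INSERT INTO Issues VALUES ($values)";
     push @statements, $sql;
     print SQL "$sql;\n";
}
close(SQL);

# second pass: load everything in one go
$dbh->do($_) for @statements;
$dbh->disconnect;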



Michael Libeson
 