Regex multiline

stevio · Feb 1, 2009

All, I tried performing a regex over multi-line text like so:

Code:

while (<DATA>){

print "$1\t$2" if /Name=(.+)Work=(.+)/s;

}


__DATA__
      Name=Paul
      Phone=555-555-555
      DOB=09/09/08
      License=456332
      Work=Doctor
      .....loop through list

Data has spaces at the beginning

Basically I'm trying to match across multiple lines, but the regex returns nothing. It works fine if it is just /Name=(.+)/

Also tried the /m modifier with /^\s+Name=(.+)^\s+Work=(.+)/

Am I understanding the /s and /m modifiers correctly?

prex1 · Feb 1, 2009

With [tt]while(<DATA>){[/tt] you read a single line at a time, so a pattern that spans several lines can never match; for that match to work you need to read multiple lines into one string instead.
In that case the first version of the regex (the one with [tt]/s[/tt]) is correct, but note that the [tt]\n[/tt] chars will be included in your [tt]$1[/tt] and [tt]$2[/tt] vars.
How to read multiple lines depends on how your data is organized: you'll need some kind of end-of-record separator line to recognize the end of each group of lines. You can also do what you are trying to do while reading a line at a time, if you know that every [tt]Name=[/tt] line is followed by a [tt]Work=[/tt] line.
Come back with the structure of your data and a description of your goal if you need more help.
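
A minimal sketch of the multi-line read described above, assuming the whole input fits in memory and that every record has a Name= line followed later by a Work= line. The sample data and the parse_records name are made up for illustration. Here /s lets . match newlines, /m makes ^ match at the start of every line, and the non-greedy .*? stops at the first Work= line after each Name= line.

```perl
use strict;
use warnings;

# Hypothetical sample data, indented like the data in the original post.
my $text = <<'END';
      Name=Paul
      Phone=555-555-555
      DOB=09/09/08
      License=456332
      Work=Doctor
      Name=Nick
      Phone=456-667
      Work=Driver
END

# Returns a list of [name, work] pairs found in the given string.
sub parse_records {
    my ($input) = @_;
    my @records;
    # /m: ^ matches at each line start; /s: . may cross newlines;
    # /g: continue scanning for the next record after each match.
    while ($input =~ /^\s*Name=([^\n]*)\n.*?^\s*Work=([^\n]*)/msg) {
        push @records, [$1, $2];
    }
    return @records;
}

print join("\t", @$_), "\n" for parse_records($text);
```

With the sample above this prints one tab-separated line for Paul/Doctor and one for Nick/Driver. To use a file instead, the whole content can be read into one string with something like my $text = do { local $/; <$fh> }; on an already opened filehandle. If a record had no Work= line, the lazy match would run on into the next record, so this only suits data where both lines are always present.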

Franco

http://www.xcalcs.com : Online engineering calculations
http://www.megamag.it : Magnetic brakes for fun rides
http://www.levitans.com : Air bearing pads

stevio · Feb 2, 2009

The structure of the data is as follows

Name=
Phone=
DOB=
License=
Work=
some_more_fields=
some_more_fields=
Name=
Phone=
DOB=
License=
Work=

So basically it repeats. Note the spaces at the beginning before the fields. OK, so here is the revised code, using a file and a foreach loop:

Code:

my $data_file = "data.txt";

   # Open the file for reading.
open DATA, "$data_file" or die "can't open $data_file $!";
my @array_of_data = <DATA>;

foreach my $line(@array_of_data){
    if ($line=~ /Name=(.*)Work=(.*)/s) {
    print "$1\t$2\n";
    } 
}

I'm having trouble matching the second and subsequent fields.

So, the goal is to obtain something like

Name Phone Work
Paul 555-555 Doctor
Nick 456-667 Driver

PaulTEG · Feb 2, 2009

Code:

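while (<DATA>) {
   chomp;                                   # drop the trailing newline
   s/^\s+//;                                # the data lines are indented
   ($setting, $value)=split(/=/, $_, 2);    # split on the first = only
   next unless defined $value;              # skip lines that are not key=value
   $hash{$setting}.="$value||";
}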

This will build one "||"-separated string per field name in the hash, which you can subsequently split, or simply turn into table cells, a la

Code:

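print "<table>\n<tr>\n\t<td>";
# s/// is needed here: tr/// only maps single characters, and the
# slashes in </td> would clash with a / delimiter
$hash{$setting}=~s{\|\|}{</td><td>}g;
print $hash{$setting};
print "</td>\n</tr>\n</table>\n";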
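
As a rough sketch of the "subsequently split" option, the per-field strings can be split back into columns and printed row by row. This assumes every record contains every field, so that the positions line up; the sample lines and the rows_from_hash name are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input lines, indented like the original data.
my @lines = (
    "      Name=Paul\n", "      Phone=555-555\n", "      Work=Doctor\n",
    "      Name=Nick\n", "      Phone=456-667\n", "      Work=Driver\n",
);

# Build one "||"-separated string per field name.
my %hash;
for my $line (@lines) {
    (my $clean = $line) =~ s/^\s+|\s+$//g;   # trim indentation and newline
    my ($setting, $value) = split /=/, $clean, 2;
    next unless defined $value;              # skip lines that are not key=value
    $hash{$setting} .= "$value||";
}

# Turns the per-field strings back into tab-separated rows,
# one row per position in the "||" lists.
sub rows_from_hash {
    my ($h, @fields) = @_;
    my @columns = map { [ split /\|\|/, $h->{$_} // '' ] } @fields;
    my $count   = @{ $columns[0] };          # number of records, taken from the first field
    return map {
        my $i = $_;
        join "\t", map { $_->[$i] // '' } @columns;
    } 0 .. $count - 1;
}

print join("\t", qw(Name Phone Work)), "\n";
print "$_\n" for rows_from_hash(\%hash, qw(Name Phone Work));
```

If a field is missing from some records, the columns drift out of step, which is the main weakness of storing each field as one long string.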

HTH

Paul
------------------------------------
Spend an hour a week on CPAN, helps cure all known programming ailments ;-)

prex1 · Feb 2, 2009

As I already told you, you need a record separator to recognize each new record.
Assuming, as an example, that the [tt]Name=[/tt] line is always the first line of each new record, so that it can work as the separator, you can do it like this:

Code:

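for (@array_of_data) {
  s/^\s+//;                                   # strip the leading spaces
  s/\s+$//;                                   # strip trailing whitespace and the newline
  my ($field, $content) = split /=/, $_, 2;   # split on the first = only
  if ($field eq 'Name') {
    print "\n$content\t";                     # a Name= line starts a new output row
  } elsif ($field eq 'Work') {
    print $content;
  }
}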

This will work even if a [tt]Work=[/tt] line does not follow every [tt]Name=[/tt] line, at the cost of one additional newline printed before the first record.
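
To get the Name / Phone / Work table asked for above, the same separator idea can be extended by collecting each record into a hash and starting a new one at every Name= line. This is only a sketch under that assumption; the collect_records name and the sample lines are invented for the example.

```perl
use strict;
use warnings;

# Collects key=value lines into one hash per record.
# A new record starts at every Name= line (the assumed separator).
sub collect_records {
    my @lines = @_;
    my (@records, $current);
    for my $line (@lines) {
        (my $clean = $line) =~ s/^\s+|\s+$//g;     # trim indentation and newline
        my ($field, $content) = split /=/, $clean, 2;
        next unless defined $content;              # skip lines that are not key=value
        if ($field eq 'Name') {
            push @records, $current if $current;   # flush the previous record
            $current = {};
        }
        $current->{$field} = $content if $current; # ignore fields before the first Name=
    }
    push @records, $current if $current;           # flush the last record
    return @records;
}

# Hypothetical sample lines, indented like the original data.
my @sample = (
    "      Name=Paul\n", "      Phone=555-555\n", "      DOB=09/09/08\n",
    "      Work=Doctor\n", "      some_more_fields=x\n",
    "      Name=Nick\n", "      Phone=456-667\n", "      Work=Driver\n",
);

print join("\t", qw(Name Phone Work)), "\n";
for my $rec (collect_records(@sample)) {
    print join("\t", map { $rec->{$_} // '' } qw(Name Phone Work)), "\n";
}
```

A record with no Work= or Phone= line simply gets an empty column, and the extra fields are kept in the hash but not printed.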

Franco

