
Matching Pattern to string of letters 2

Status
Not open for further replies.

sdslrn123

Technical User
Jan 17, 2006
78
GB
Hi there. If you can help you are a star.

I want user to:
Enter text into box A
Enter a pattern into box B

Box B is matched against the text in Box A.

Believe it or not, I know how to do the above! But I am getting stuck on one part of this CGI program.

I need to handle the asterisk so that if someone enters the text: abcdefg
and the pattern: c*f

my program will find the first match (in this case cdef), as the asterisk needs to represent one or more letters.

I only come to this forum as a last-ditch hope. Please do not be offended, but I have learnt that if I am going to learn any kind of programming I have to keep trying by myself! It is just that I can get the asterisk to be replaced by one letter, but not by more than one.

Any help is much appreciated.


$newtext is just what was taken from the parameter
---------------

my $patternit = $pattern;

# Split the pattern on the asterisk: "c*f" gives ("c", "f")
my @values = split(/\*/, $patternit);

print $query->p("(8)$values[0]");
print $query->p("(9)$values[1]");

# [a-z] stands for exactly one letter, so only one letter is matched
$finalpattern = join '', $values[0], "[a-z]", $values[1];
print $finalpattern;

if ($newtext =~ /$finalpattern/) {
    print $query->p("An Important Match Was Found.");
}
else {
    print $query->p("No match was found");
}

$position2 = index($newtext, $finalpattern);
print "The match was found at position $position2<br>";
 
I haven't completely thought this through, but maybe something like this?

$pattern = "a*f*g";
$text = "abcdefg";

($regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;

print "match" if ($text =~ /$regex/);
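The original question asked for the asterisk to stand for one or more letters, while [^$1]* also accepts zero. A minimal sketch of that variant, assuming the only change needed is + in place of * after the character class (plain print is used here instead of the CGI $query object):

```perl
use strict;
use warnings;

my $text    = "abcdefg";
my $pattern = "c*f";

# Same substitution, but with + after the character class, so the
# wildcard has to stand for at least one character.
(my $regex = $pattern) =~ s/\*(.)/[^$1]+$1/g;

my $found = $text =~ /($regex)/ ? $1 : undef;

print "regex: $regex\n";                              # regex: c[^f]+f
print "found: ", (defined $found ? $found : "nothing"), "\n";   # found: cdef
```

With this variant the text "cf" no longer matches the pattern c*f, because nothing sits between the c and the f.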
 
Hi

Thanks for the speedy reply.

It does seem to work very well.
The only problem is printing out the position:

I am currently using the following code which works well for pattern sequences without a star but for those with a star it always prints -1 even when there is definitely a match.

$position = index($newtext,$regex,$start);
print $query ->p("The match was found at position $position");

Once again, thanks for any help.
 
Hi

#index STR,SUBSTR,POSITION

The Perl function index doesn't accept a regular expression as SUBSTR. SUBSTR is always treated as literal text.

In other words, index searches 'abcdefg' for the literal characters 'a[^f]*f[^g]*g', and that text does not occur in it.

Following is a possible solution:
Code:
$pattern = "a*f*g";
$text = "abcdefg";

($regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;

if ($text =~ /($regex)/)   # parentheses added to capture the match
{
   $substr = $1;
   $position = index($text,$substr,$start);
   print $query ->p("The match was found at position $position");
}
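
The difference between the two lookups can be seen side by side. This is only a sketch: the variable names $literal_pos and $matched_pos are made up for illustration, and plain print stands in for the CGI $query object.

```perl
use strict;
use warnings;

my $text  = "abcdefg";
my $regex = 'a[^f]*f[^g]*g';   # what the substitution produces for "a*f*g"

# index() searches for the literal characters of $regex, brackets and all.
my $literal_pos = index($text, $regex);

# Capture what the regex actually matched, then look that text up instead.
my $matched_pos = -1;
if ($text =~ /($regex)/) {
    $matched_pos = index($text, $1);
}

print "literal search: $literal_pos\n";   # literal search: -1
print "matched text:   $matched_pos\n";   # matched text:   0
```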
 
Thank you so much it works.

I love receiving solutions but sometimes I do not feel I am learning unless I understand what exactly I have just done.

1) Declaration of Variables
$pattern = "a*f*g";
$text = "abcdefg";

2) I am replacing pattern variable with regex variable so that any * are replaced by...
apologies I just don't understand the significance of [$1]
($regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;

3) If I can find $regex within text then do the following
if ($text =~ /($regex)/)
{

4) I am a bit confused, I am converting $1 to $substr?
$substr = $1;
$position = index($text,$substr,$start);
print $query ->p("The match was found at position $position");
}

Plus, do I have to make significant number of changes to this program to prevent greedy matching such as
text = abcdefgrtdfrg
pattern = f*g

How would I prevent position being read as fgrtdfrg, if I want it to be read as frg.

Any help is again much appreciated

 
Hi, sdslrn123


2) I am replacing pattern variable with regex variable so that any * are replaced by...
apologies I just don't understand the significance of [$1]
($regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;


It is a simple concept, but hard to explain here. It is better to look for some documentation or a tutorial.

Anyway, I will try to give a simple explanation.
Applying
s/\*(.)/[^$1]*$1/g to 'a*d', we have
1) $1 = 'd', the first character right after the '*';
2) [^$1] = [^d] = any character other than 'd'.

Then,
'*d' is matched and is replaced with '[^d]*d':

'a*d' ===> 'a[^d]*d';
which means 'a' followed by zero or more characters other than 'd', followed by a 'd'.
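
Printing the generated regex makes the translation easy to check. The helper name wildcard_to_regex below is hypothetical; it only wraps the substitution from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution discussed above.
sub wildcard_to_regex {
    my ($pattern) = @_;
    # Each "*X" becomes "[^X]*X": any run of non-X characters, then X.
    (my $regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;
    return $regex;
}

print wildcard_to_regex("a*d"),   "\n";   # a[^d]*d
print wildcard_to_regex("a*f*g"), "\n";   # a[^f]*f[^g]*g
```

Note that the substitution needs a character after each asterisk, so a pattern ending in * is left untranslated, and characters that are special inside a regex are not escaped in this sketch.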


3) If I can find $regex within text then do the following
if ($text =~ /($regex)/)

4) I am a bit confused, I am converting $1 to $substr?
$substr = $1;
$position = index($text,$substr,$start);
print $query ->p("The match was found at position $position");
}


When a regex matches, the text matched by the first pair of parentheses is saved in $1, the second in $2, and so on.

'ABC123DEF456' =~ /(\d+)([A-Z]+)(\d+)/;
then $1 = 123; $2 = DEF; $3 = 456; :) hope you get what I mean

When $text =~ /($regex)/ matches, the characters that matched $regex are stored in $1.

So now $1 holds the exact characters from the string rather than the regex itself.
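
A small sketch of the same idea on the text from the original question. The extra inner pair of parentheses is an addition for illustration: it shows that groups are numbered by their opening parenthesis, so $2 ends up holding the part the asterisk stood for.

```perl
use strict;
use warnings;

my $text = "abcdefg";

my ($whole, $inner);
if ($text =~ /(c([^f]*)f)/) {
    $whole = $1;   # first opening parenthesis: the whole match
    $inner = $2;   # second opening parenthesis: what the * stood for
}

print "whole: $whole, inner: $inner\n";   # whole: cdef, inner: de
```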

To prevent greedy matching :
Change the
($regex = $pattern) =~ s/\*(.)/[^$1]*$1/g;
to
($regex = $pattern) =~ s/\*(.)/[^$1]*?$1/g;



 
There's no need to use the index() function to find the position the match occurred in. That's already held in the special variable $-[0], i.e.:
Code:
if ($text =~ /($regex)/)
{
   print $query->p("The match was found at position $-[0]");
}
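
For reference, a short sketch of the two offset arrays Perl fills in after a successful match: $-[0] is the 0-based offset where the match starts and $+[0] is the offset just past its end. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "abcdefg";

my ($start, $end, $length);
if ($text =~ /c[^f]*f/) {
    $start  = $-[0];          # offset where the match begins (0-based)
    $end    = $+[0];          # offset just past the end of the match
    $length = $end - $start;
}

print "start=$start end=$end length=$length\n";   # start=2 end=6 length=4
```

Because the offsets are 0-based, add 1 if the position shown to the user should count from 1.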
 
I must be the most annoying person on this thread because I just don't go away... sorry!

I can't say how grateful I am for that amazing and patient answer eewah, I understood all of it. And, thank you Ishnid for that example it works.

Unfortunately, I did try the greedy matching code and it did not work.
For instance with the following example:
Text = abcgccgtaft
Pattern = a*t
It is still picking up position 1 (greedy) as opposed to position 9.

Could it be something to do with making the part for * in text represent everything other than the letter before the * (thinking about it that problem should have been solved). Grrr, who would want to be a programmer!

Well, if you have any suggestions, I would be very grateful, otherwise another late night for me then and some light O'Reilly reading!! :eek:)
 
Hi,

About the greedy matching.

I misunderstood your definition of greedy. Mine is:
Code:
Greedy: 'AuuuuuuuYnnnnnnnnY' =~ /(A\w+Y)/
        matched : 'AuuuuuuuYnnnnnnnnY';
 
Non Greedy: 'AuuuuuuuYnnnnnnnnY' =~ /(A\w+?Y)/
        matched : 'AuuuuuuuY';

But you mean
Code:
Greedy: 'B++++++YB------Y' 
        matched : 'B++++++Y';
 
Non Greedy: 'B++++++YB------Y' 
        matched : 'B------Y';
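
The two readings can be checked with a short sketch. It shows that a lazy quantifier only changes where a match ends; the engine still reports the leftmost place where a match can start, which is why the second reading needs something more than +? or *?.

```perl
use strict;
use warnings;

my $first = "AuuuuuuuYnnnnnnnnY";

my ($greedy) = $first =~ /(A\w+Y)/;    # + takes as much as it can
my ($lazy)   = $first =~ /(A\w+?Y)/;   # +? takes as little as it can

print "greedy: $greedy\n";   # greedy: AuuuuuuuYnnnnnnnnY
print "lazy:   $lazy\n";     # lazy:   AuuuuuuuY

# Even a lazy match starts at the leftmost B, not at the later one.
my $second = "B++++++YB------Y";
my ($leftmost) = $second =~ /(B.+?Y)/;

print "leftmost: $leftmost\n";   # leftmost: B++++++Y
```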

For your case you can try,
Code:
if ($text =~ /($regex)(?!.*?$regex)/)
{
   print $query->p("The match was found at position $-[0]");
}

The (?!.*?$regex) makes sure there isn't any occurrence of $regex after the matched $regex.
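
A quick sketch of the lookahead on the B...Y example above, assuming the regex B[^Y]*?Y (what the translation would produce for the pattern B*Y):

```perl
use strict;
use warnings;

my $text  = "B++++++YB------Y";
my $regex = 'B[^Y]*?Y';

my ($pos, $match);
# The negative lookahead rejects the first B...Y because another
# match of $regex still follows it.
if ($text =~ /($regex)(?!.*?$regex)/) {
    ($pos, $match) = ($-[0], $1);
}

print "position $pos: $match\n";   # position 8: B------Y
```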
 
You are one patient guy, eewah, thank you.

Unfortunately, I have tried that against the following:
Text: agctaaaacag
Pattern: c*g
(Is there a way to make variable $1 only consist of certain letters such as vowels/numbers?)

It still matches the greedy ctaaaacag as opposed to the smaller cag.

Plus, is my understanding of greedy matching incorrect (a less commonly used definition)? I just don't want to get confused early on in my programming career!

Thanks again I really appreciate it
 
Hi,
Code:
$text = '???+c_gc_g_a_ac_g_hc_g_hc_c_c_c_g_g_g_g_h-ZZZ';
$pattern = 'c*g*h';

print "          11111111112222222222333333333344444444\n";
print "012345678901234567890123456789012345678901234567\n";
print "$text\n\n";

($regex = $pattern) =~ s/\*(.)/[^$1]*?$1/g;

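# Require that no further match of $regex follows this one
if ($text =~ /($regex)(?!.*?$regex)/)
{
    my $index = $-[0];

    $text = $1;

    # Look for a shorter match nested inside the current one; the leading
    # "." skips its first character, hence the + 1 on the offset
    while ($text =~ /.($regex)(?!.*?$regex)/)
    {
        $index += $-[0] + 1;
        $text = $1;
    }

    print "The match was found at position $index: ($text)\n";

    # Strip everything except digits and vowels
    $text =~ s/[^\daeiou]//ig;

    print "Show only vowels and numbers: ($text)\n";
}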

The code works, but I think there should be a better way than this.
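
One possible alternative, offered only as a sketch and not taken from this thread: try the regex at every starting offset and keep the shortest match found. The helper name shortest_match is hypothetical, and the sketch assumes the regex was built with the translation shown earlier, so that each starting offset has at most one candidate match.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: try the regex at every
# starting offset and keep the shortest match found.
sub shortest_match {
    my ($text, $regex) = @_;
    my ($best_pos, $best_str);
    for my $pos (0 .. length($text) - 1) {
        # Anchor with ^ so the match has to begin exactly at $pos.
        next unless substr($text, $pos) =~ /^($regex)/;
        if (!defined $best_str || length($1) < length($best_str)) {
            ($best_pos, $best_str) = ($pos, $1);
        }
    }
    return ($best_pos, $best_str);
}

my ($pos, $str) = shortest_match("agctaaaacag", 'c[^g]*?g');
print "position $pos: $str\n";   # position 8: cag
```

On ties the earliest match wins, and positions are 0-based. This trades speed for simplicity, since the regex is tried once per character of the text.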
 
eewah deserves more than a star but it's all I can give :)
 