Regex Question - non-greedy truncation 2

Status
Not open for further replies.

sloppyhack

Technical User
Apr 17, 2001
111
US
I need to truncate at least a specific number of characters from a string so that it meets a character length requirement. But I have to truncate on a delimiter (a pipe in this instance) so the string still makes sense.

Example:

I need to remove at least 30 characters from the end of this string, back to a pipe. The regex I would think would work is "[|].{30,}?$". According to the documentation, the "?" after the "{30,}" should make it "non-greedy" and match at least 30 characters until it reaches the first pipe. It doesn't seem to be working. It's being greedy and matching all the way back to the first pipe in the string instead of where I would expect.

TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST
    ^ Matching here
                 ^ Want to match here

Any help would be greatly appreciated.

Cheers,

Sloppyhack
 
maybe something like this would be better?

Code:
my $string = 'TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST';
$string = substr($string,0,-30);
$string .= '|' if ($string !~ /\|$/);
 
You could use index to find the first and second occurrences of the pipe symbol, and then substr the bit you need.
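A rough sketch of that idea follows. Note that it swaps index for rindex, so it searches backwards for the last pipe that still has at least 30 characters after it, on the assumption that this is the cut point wanted. The name $min_cut is made up for illustration.

```perl
use strict;
use warnings;

my $string  = 'TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST';
my $min_cut = 30;   # hypothetical name: minimum number of characters to drop

# Last pipe that still has at least $min_cut characters after it.
my $pos = rindex($string, '|', length($string) - $min_cut - 1);

# rindex returns -1 when there is no such pipe; leave the string alone then.
my $result = $pos >= 0 ? substr($string, 0, $pos) : $string;

print "$result\n";   # TEST|TESTTESTTEST
```

With the sample string the pipe is found at offset 17, so everything from that pipe onwards is dropped.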

It's not an answer to your regex, but TIMTOWTDI
--Paul

cigless ...
 
This will do what you want, with your variable
Code:
my $aaa= "TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST";
$aaa =~ s/\|?.{30}$//;
print $aaa;

You get this:

Code:
TEST|TESTTESTTEST


``The wise man doesn't give the right answers,
he poses the right questions.''
TIMTOWTDI
 
Actually you don't need the '?', so it is much better this way
Code:
$aaa =~ s/\|.{30}$//;
and if you don't want to take off the last '|' then do it like this
Code:
$aaa =~ s/.{30}$//;
and if your variable is from a file, and it contains a '\n' character at the end and you don't use 'chomp($aaa)', then do it like this
Code:
$aaa =~ s/.{30}\n$//;


``The wise man doesn't give the right answers,
he poses the right questions.''
TIMTOWTDI
 
The reason the original regexp doesn't work is a common misunderstanding. Non-greediness doesn't mean that the regexp engine will search everywhere in the string for the shortest possible match of `30-or-more-characters'. The way it works is that, firstly, the engine tries to match the pipe `|' symbol, which it finds at position 4 in the string (I'm starting at 0). It then sees if it can find `at least 30 characters of any type, followed by the end of the line'. Starting from the pipe it has matched, it *can* find what it's looking for, so it never tries a later position in the string. The `?' only affects how far the `.{30,}' part extends, not where the match begins.
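A quick way to see this is to ask Perl where the match starts. This is only a minimal sketch using the string from the original post; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $str = 'TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST';

# The original pattern: a pipe, then a non-greedy run of 30 or more
# characters, then end of string.
my ($start, $length);
if ($str =~ /[|].{30,}?$/) {
    $start  = $-[0];            # offset where the match begins
    $length = $+[0] - $-[0];    # number of characters matched
}

print "match starts at offset $start\n";    # match starts at offset 4
print "matched $length characters\n";       # matched 44 characters
```

The match begins at the first pipe and runs to the end of the string, even though a shorter match starting at the second pipe exists.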

Pengo - your regexp is looking for *exactly* 30 characters, which happens to work with the particular string that's given as an example but breaks if you add one more character to the string.

This should do what you want:
Code:
$aaa =~ s/\|[^|]*.{30}$//;
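To check the point about exact-length matching, here is a small comparison of the two substitutions. It is only a sketch: the trailing 'X' added to the sample string and the variable names are made up for the test.

```perl
use strict;
use warnings;

# Sample string from the thread, with one extra character on the end
# (the trailing 'X' is made up for this test).
my $orig = 'TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTESTX';

(my $exact   = $orig) =~ s/\|?.{30}$//;        # optional pipe, then exactly 30 characters
(my $on_pipe = $orig) =~ s/\|[^|]*.{30}$//;    # pipe, rest of that field, then 30 characters

print "$exact\n";     # TEST|TESTTESTTEST|T
print "$on_pipe\n";   # TEST|TESTTESTTEST
```

The first version cuts in the middle of a field once the string grows by a character, while the second still cuts at a pipe because `[^|]*` absorbs the extra characters of the partly removed field.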
 
... or another way, split up the original string, and add the elements back while the accumulated length is less than your character length requirement.

Code:
my $max_len = 75;
my $str = 'TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST';
$str .= '|TEST|TESTTESTTEST|TEST|TESTTESTTEST|TESTTESTTEST';
my $count = 0;
my @temp;

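foreach (split /\|/, $str) {
    # Joined length would be $count + length($_) plus ($#temp + 1) pipes;
    # keep the field only if that stays within $max_len.
    if (($count + length($_) + $#temp) < $max_len) {
        push @temp, $_;
        $count += length($_);
    }
    else {
        last;   # stop at the first field that doesn't fit, so no field is skipped
    }
}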
print join("\|", @temp), "\n";
 
Thanks everyone for all of the great posts! This site is such a great resource. I ended up using ishnid's suggested regex...

"\|[^|]*.{30}$"

I did a bit of testing and it seems to do exactly what I needed. It's simple, elegant, and fast. Consider this thread closed.

Thanks again to all that posted and to all that continue to offer help at this site. It's a beautiful thing!!!

Cheers,

Sloppyhack
 