hmmm, regexes don't really work well in this situation. what you could do is use the technique lisp programmers use to check that their parentheses all match up: start a counter at zero and scan the text linearly from the beginning, adding one for every '(' and subtracting one for every ')'. label each '(' with the counter's value after the addition and each ')' with its value before the subtraction. then, to find a matching pair, take a '(' and look for the first ')' after it that carries the same number. this shouldn't be too hard:[tt]
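use strict;
use warnings;

my $text = "junk{hi{how{are{you}}}}junk";
my $bc = 0;   # current brace depth
my @ba;       # $ba[$n] collects the text of the group opened at depth $n
# each match is one brace plus the text up to (not including) the next brace
while ($text =~ /(\{|\}).*?(?=(\{|\}))/g)
{
    my $chunk = $&;
    if ($1 eq '{') {$bc += 1} else {$bc -= 1}
    last if ($bc <= 0);
    # append the chunk to every group that is still open
    $ba[$_] .= $chunk for 1..$bc;
}
# each group ends up missing its own closing '}' (the depth drops before
# the chunk is appended); add it back and drop the unused slot 0
@ba = map {$_ . '}'} @ba[1..$#ba];
print join("\n", @ba);[/tt]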
this may not be the absolute best way to do it, in either efficiency or logic, but for brackets nested one inside the other, like the example, it will do what you're looking for. after the final map, @ba is an array where $ba[0] is the outermost set of brackets with its data, $ba[1] is the next set in, &c.
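for comparison, here's a minimal sketch of the counting idea done with a stack of '{' offsets instead of appended chunks. it assumes braces are the only delimiters and that nothing is quoted or escaped; match_braces is a made-up helper name, not something from a module.

```perl
use strict;
use warnings;

# Return one [depth, substring] pair per balanced {...} group, in the
# order the groups close. match_braces is a hypothetical helper name.
sub match_braces {
    my ($text) = @_;
    my @open;      # stack of offsets of '{' that are not closed yet
    my @groups;
    while ($text =~ /([{}])/g) {
        my $pos = pos($text) - 1;          # offset of the brace just matched
        if ($1 eq '{') {
            push @open, $pos;
        }
        elsif (@open) {
            my $depth = scalar @open;      # the counter value for this pair
            my $start = pop @open;
            push @groups, [$depth, substr($text, $start, $pos - $start + 1)];
        }
        # a '}' with nothing open is unbalanced and is simply skipped here
    }
    return @groups;
}

for my $g (match_braces("junk{hi{how{are{you}}}}junk")) {
    print "$g->[0]: $g->[1]\n";
}
```

the groups come back innermost first, because a group is only recorded once its closing brace is seen. the stack depth at that moment is the number the counting technique assigns to the pair.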
if this were cleaned up a bit, it would make a nice addition to the perl cookbook.
stillflame out. "If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."
[tt]
# or, you can escape all your braces with '\' or use hex values for your braces
# this is a little ugly, but it works.
my $str = '{{{thisFunciton{{{}{}}}}}}';
$str =~ s/(\x7B{3}thisFunciton\x7B)(\x7B{2}\x7D\x7B\x7D{2})(\x7D{4})/$1start$2end$3/;
print "\n$str\n";
[/tt]
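the hex version is the one shown; the backslash option mentioned in the comment would look something like this. it's only a sketch of the same substitution with the same three captures, assuming the input string is exactly the one above.

```perl
use strict;
use warnings;

my $str = '{{{thisFunciton{{{}{}}}}}}';
# \{ and \} are literal braces; an unescaped {3} after them is a repeat count
$str =~ s/(\{{3}thisFunciton\{)(\{{2}\}\{\}{2})(\}{4})/$1start$2end$3/;
print "$str\n";
```

either spelling matches the same text, so the choice is only about which one you find easier to read.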
hmmm. maybe i should have actually read your question, rather than just imagining it, before i answered.
first of all, i'm assuming that the text you're dealing with won't be known in detail before you search through it: you'll only have the location within the brackets you're looking for, not the text that already exists in them. if my assumption is off, goBoating's code will work perfectly.
my code doesn't really do what you want yet, but it's a start. one thing it's failing to do is properly split up the data - an array is not the structure that should be used here. probably better would be a tree structure of some sort - root node being the full string, each subsequent level being the next outermost set of brackets, with a node for each set (as in '{{1}{2}}' becoming:[tt]
root:    {{1}{2}}
          /    \
first:  {1}    {2}[/tt]
(if that makes sense)). this is going to be similar to *ML (as in HTML and XML) parsing, so i'm going to do a bit more research before i commit to a method.
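as a rough sketch of that tree idea (not a settled method), each node below is a hash with the group's text and a list of child nodes. parse_braces, dump_tree and the node layout are names made up for illustration, and the code assumes braces are never quoted or escaped.

```perl
use strict;
use warnings;

# Build a tree of hash nodes: { text => '{...}', kids => [ child nodes ] }.
# parse_braces and the node layout are hypothetical, not from any module.
sub parse_braces {
    my ($text) = @_;
    my $root  = { text => $text, kids => [] };   # root holds the full string
    my @stack = ($root);     # nodes that are still open, root at the bottom
    my @start;               # offset of the '{' for each open node above root
    while ($text =~ /([{}])/g) {
        my $pos = pos($text) - 1;
        if ($1 eq '{') {
            my $node = { text => '', kids => [] };
            push @{ $stack[-1]{kids} }, $node;   # attach to the current parent
            push @stack, $node;
            push @start, $pos;
        }
        elsif (@start) {
            my $from = pop @start;
            my $node = pop @stack;
            $node->{text} = substr($text, $from, $pos - $from + 1);
        }
    }
    return $root;
}

# Print each node indented by its level in the tree.
sub dump_tree {
    my ($node, $level) = @_;
    print '  ' x $level, $node->{text}, "\n";
    dump_tree($_, $level + 1) for @{ $node->{kids} };
}

dump_tree(parse_braces('{{1}{2}}'), 0);
```

since the root is always the whole string, for '{{1}{2}}' the outermost bracket set shows up as the root's only child, with {1} and {2} one level below it. a group that is never closed keeps an empty text in this sketch.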
stillflame "If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."