Perl regex --need help 2

riya0707 · Oct 27, 2009

I am a Perl beginner and need some help with a regex.
The input for this code is:
aatgatgataaggtaaggtatgatgatgatgatgatagtagannnnnnnnnatgcatga'/atgca.atgactagca/atgactagcaaggtaaggtaaggtaaggtaaggtatgatgatgannnn./atgatgactagactgacaaggtaaggtaaggtatgatgatgatcgatgacgat... and so on

Here I am trying to read the input file into a scalar variable and find a match of "aaggtaaggt", then skip exactly 100 characters, whether they are letters, symbols or any other characters. After skipping those 100 characters, I want the code to start searching for the next match, and every time it finds a match I count it and print the count.

As far as I know how, I tried taking the position of the match with substr as pos1, setting an offset of 100, and using that as the starting position for the second match, but failed to get the correct output. I then tried the post-match variable $' here, but I doubt whether it is correct.

Code:

#!/usr/bin/perl 
$count1 = 0; 
open (FILE, "INPUT") || die "cannot open $!\n"; 
while ($line = <FILE>){ 

if(($line=~m/(aaggt){2}/ig)&&($'=~m/([atgcn]{100,})/i)) { 
$count1 ++ ; 
print " " ,$1, "\t"; 
print "count1 \t " ,$count1 , "\n" ; 
} 

}

Please help me figure out this task. Thanks.

parkers · Oct 28, 2009

Hi,

Try implementing the following in your code:

Code:

my $line = "aatgatgataaggtaaggtatgatgatgatgatgatagtagannnnnnnnnatgcatga'/atgca.atgactagca/atgactagcaaggtaaggtaaggtaaggtaaggtatgatgatgannnn./atgatgactagactgacaaggtaaggtaaggtatgatgatgatcgatgacgat";

my $count1 = 0;

while ($line =~ s/((aaggt){2})(.{100})?//) {
	++$count1;
}

To summarise, this matches your target string and, optionally, the next 100 characters... if it makes a match then it will remove it from the target string and attempt the match again.
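One caveat with the destructive loop above: because s/// deletes text and then searches again from the start of the string, the text before and after a deleted span can join up and form a match that was never in the input. The optional (.{100})? group also skips nothing when fewer than 100 characters remain. A possible non-destructive variant, offered only as a sketch, uses m//g so each search resumes where the previous one ended. The helper name count_hits is made up for illustration; the /i flag is taken from the original question, and {0,100} and /s are assumptions about the intended behaviour (skip whatever is left if the tail is short, and count newlines as characters).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count "aaggtaaggt" hits,
# skipping up to 100 characters after each one.
sub count_hits {
    my ($text) = @_;
    my $count = 0;
    # /g  resumes each search where the previous match ended,
    #     so the string itself is never modified.
    # /i  ignores case, as in the original question.
    # /s  lets "." match a newline as well.
    # {0,100} also consumes a tail shorter than 100 characters.
    while ($text =~ /(?:aaggt){2}.{0,100}/gis) {
        ++$count;
    }
    return $count;
}

# Two hits separated by exactly 100 filler characters.
my $far  = "aaggtaaggt" . ("x" x 100) . "aaggtaaggt";
# The second hit lies inside the 100 skipped characters.
my $near = "aaggtaaggt" . ("x" x 50) . "aaggtaaggt";

print "far:  ", count_hits($far),  "\n";
print "near: ", count_hits($near), "\n";
```

With the s/// loop, an input such as "aagg" followed by a hit, 100 filler characters and "taaggt" would be counted twice, because the leftover pieces join into "aaggtaaggt" after the deletion; the m//g version counts it once.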

Hope this helps.

Thanks,

Steven Parker

http://www.stevenp1974.co.uk

parkers · Oct 28, 2009

or as implemented within your code:

Code:

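#!/usr/bin/perl
$count1 = 0;
open (FILE, "INPUT") || die "cannot open $!\n";
while ($line = <FILE>){

    # Remove each match plus, optionally, the 100 characters after it,
    # counting one hit per removal.
    while ($line =~ s/((aaggt){2})(.{100})?//) {
        ++$count1;
    }

    # Running total after each input line.
    print "count1 \t " ,$count1 , "\n" ;
}
close (FILE);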

Steven Parker

http://www.stevenp1974.co.uk

riya0707 · Oct 29, 2009

Hi Parkers

Thanks for the reply; the solution was helpful in figuring out this task. But I have a question about this line in your code:
while ($line =~ s/((aaggt){2})(.{100})?//)
Why did you use two slashes at the end, after the 100? Usually a match is written between / /, so I didn't get a clear idea about this. Can you help me out?
Anyway, thanks once again.

Annihilannic · Oct 29, 2009

s/// takes two "parameters" to do a search-and-replace: the pattern to look for and the text to replace it with. It is based on a command which uses a similar syntax in the Unix ed, ex and vi editors. For example, s/water/wine/ would replace the first occurrence of "water" in the string with "wine". In parkers' version the replacement part is empty, so it replaces the matched expression with nothing, i.e. removes it from the original string.

parkers said:
... if it makes a match then it will remove it from the target string and attempt the match again.

See perldoc perlop for details (search for "REPLACEMENT" to find the relevant section).
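To make the two-part form concrete, here is a small sketch of s/// with a replacement, with an empty replacement, and with /g. The sample string and variable names are made up for illustration. It also shows that s/// returns the number of substitutions made, which is why it can serve as the condition of a while loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "water into water";

# s/PATTERN/REPLACEMENT/ replaces the first occurrence only.
(my $first = $text) =~ s/water/wine/;      # "wine into water"

# An empty replacement deletes the matched text.
(my $removed = $text) =~ s/water//;        # " into water"

# With /g every occurrence is replaced; the return value is the
# number of substitutions (false when nothing matched).
my $all = $text;
my $n   = ($all =~ s/water/wine/g);        # $all is "wine into wine", $n is 2

print "$first\n$removed\n$all\n$n\n";
```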

Annihilannic.
