Perl regex --need help 2

Status
Not open for further replies.

riya0707

Technical User
Oct 27, 2009
2
US
I am a Perl beginner and need some help with a regex.
The input for this code is:
aatgatgataaggtaaggtatgatgatgatgatgatagtagannnnnnnnnatgcatga'/atgca.atgactagca/atgactagcaaggtaaggtaaggtaaggtaaggtatgatgatgannnn./atgatgactagactgacaaggtaaggtaaggtatgatgatgatcgatgacgat... and so on

Here I am reading the input file into a scalar variable and trying to find a match of "aaggtaaggt". After each match I want to skip exactly 100 characters, whether they are letters, symbols, or any other characters, and then start searching for the next match. Every time the code finds a match, I want to count it and print the count.

With what I know so far, I tried using substr with the match position as pos1 and an offset of 100, then used that as the starting position for finding the second match, but I failed to get the correct output. I then tried the post-match variable $' here, but I doubt whether that is correct.
Code:
#!/usr/bin/perl 
$count1 = 0; 
open (FILE, "INPUT") || die "cannot open $!\n"; 
while ($line = <FILE>){ 

if(($line=~m/(aaggt){2}/ig)&&($'=~m/([atgcn]{100,})/i)) { 
$count1 ++ ; 
print " " ,$1, "\t"; 
print "count1 \t " ,$count1 , "\n" ; 
} 

}


Please help me figure out this task. Thanks.
 
Hi,

Try implementing the following in your code:

Code:
my $line = "aatgatgataaggtaaggtatgatgatgatgatgatagtagannnnnnnnnatgcatga'/atgca.atgactagca/atgactagcaaggtaaggtaaggtaaggtaaggtatgatgatgannnn./atgatgactagactgacaaggtaaggtaaggtatgatgatgatcgatgacgat";

my $count1 = 0;

while ($line =~ s/((aaggt){2})(.{100})?//) {
	++$count1;
}

To summarise this basically matches your target string and optionally the next 100 characters... if it makes a match then it will remove it from the target string and attempt the match again.
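If you would rather not modify $line while counting, the same idea can be sketched with a global match instead of a substitution. This is only a rough sketch under my own assumptions: count_motif is a made-up helper name, the sequence below is invented for the demonstration, and I have used .{0,$skip} so that a tail shorter than 100 characters is also treated as skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_motif is a hypothetical helper name, not from the posts above.
# It counts "aaggtaaggt" hits (case-insensitively) and skips up to $skip
# characters after each hit before the search resumes.
sub count_motif {
    my ($seq, $skip) = @_;
    my $count = 0;
    # In scalar context, /g resumes at pos($seq) on each iteration, so $seq
    # itself is never modified. /s lets "." match newlines as well.
    while ($seq =~ /(?:aaggt){2}.{0,$skip}/gis) {
        ++$count;
    }
    return $count;
}

# Made-up test sequence: motif, 100 filler characters, then two more motifs.
my $seq = "aaggtaaggt" . ("n" x 100) . "AAGGTAAGGT" . "aaggtaaggt";
print "count1\t", count_motif($seq, 100), "\n";   # prints count1, a tab, then 2
```

The last motif in the made-up sequence is not counted because it falls inside the 100 characters skipped after the second hit. With (.{100})? the skip is all-or-nothing, so a hit sitting in a tail shorter than 100 characters would still be counted; which behaviour you want depends on your data.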

Hope this helps.

Thanks,


Steven Parker
 
or as implemented within your code:

Code:
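#!/usr/bin/perl
$count1 = 0;
open (FILE, "INPUT") || die "cannot open $!\n";
while ($line = <FILE>){

    # Remove each match plus (optionally) the next 100 characters,
    # counting one hit per removal, until no match is left in the line.
    while ($line =~ s/((aaggt){2})(.{100})?//) {
        ++$count1;
    }

    # Running total after this line.
    print "count1 \t " ,$count1 , "\n" ;
}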

Steven Parker
 
Hi Parkers

Thanks for the reply; the solution was helpful in figuring out this task. But I have a question about this line in your code:
while ($line =~ s/((aaggt){2})(.{100})?//)
Why did you use two slashes at the end, after the 100? Usually a match goes between / /, so I don't have a clear idea about this. Can you help me out?
Anyway, thanks once again.
 
s/// takes two "parameters" to do a search-and-replace: the pattern to search for and the replacement text. It is based on a command which uses a similar syntax in the Unix ed, ex and vi editors. For example, s/water/wine/ would replace the first occurrence of "water" in the string with "wine". In parkers' version the replacement part is empty, so the matched expression is replaced with nothing, i.e. removed from the original string.

parkers said:
... if it makes a match then it will remove it from the target string and attempt the match again.

See perldoc perlop for details (search for "REPLACEMENT" to find the relevant section).
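A small illustration of the points above, using a made-up string. The variable names are only for this example; the return value of s/// is the number of substitutions made, which is what lets it drive a while loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "water into water";

# Replace only the first occurrence.
(my $first = $text) =~ s/water/wine/;

# Empty replacement: the first occurrence is simply deleted.
(my $removed = $text) =~ s/water//;

# s/// returns the number of substitutions made (an empty string if none),
# which is why it can serve as a while-loop condition.
my $copy  = $text;
my $count = ($copy =~ s/water/wine/g);

print "$first\n";     # wine into water
print "$removed\n";   # " into water" (note the leading space)
print "$count\n";     # 2
```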

Annihilannic.
 