Code:
use strict;
#open the directory and then read all the files ending in .pdb
opendir(DIR,"/home/shafiq/Desktop/parser/max-cluster-prepration/") or die "$!";
my @all_pdb_files = grep {/\.pdb$/} readdir DIR;
closedir DIR;
#open a new output file
#retrieve desired data from the PDB file
foreach (@all_pdb_files){
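# readdir returns bare file names, so prepend the directory and check for errors
open (SP, "<", "/home/shafiq/Desktop/parser/max-cluster-prepration/$_") or die "Cannot open $_: $!";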
my $test;
while (<SP>){
#match PDB Id in the file header
if ($_=~/^HEADER[\s\S]+(....)..............$/)
{
$test=$1;
#chomp $test;
#print "$test";
$test= "$test" . ".pdb";
open (OUTPUT,">./result/$test") or die "Cannot write ./result/$test: $!";
}
# open new output files
# remove chain B-Z
unless (($_=~/^ATOM\s+\S+\s+\S+\s+\S+\s+[B-Z]/) ||($_=~/^TER\s+\S+\s+\S+\s+[B-Z]/) ||($_=~/^HETATM\s+\S+\s+\S+\s+\S+\s+[B-Z]/))
{
my $test2=$_;
print(OUTPUT "$test2");
#print " This is resolution $test2"
}
}
}
The code above works fine for the many files that have an A chain, where I want to remove all the other chains (B-Z).
I need only one chain at a time, and that is the first chain in the file. It could be A, B, C, D, etc.
But in some of my files there is no A chain and the first chain is B; there I want to remove chains C-Z. Similarly, where there is no A or B, I want to keep C in my results and remove D-Z.
Likewise, if there is no A, B, C, D, E, or F, I want to keep only chain G and delete H-Z.
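
One possible way to get this behaviour is to remember the chain ID of the first coordinate record and drop every later record whose chain ID differs. The sketch below is only an illustration, not a tested drop-in replacement: the sub name keep_first_chain is hypothetical, and it assumes the files follow the standard fixed-column PDB layout, where the chain ID of ATOM, HETATM, TER and ANISOU records is in column 22.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the input lines with every coordinate
# record removed that does not belong to the first chain encountered.
# Assumption: standard fixed-column PDB records (chain ID in column 22).
sub keep_first_chain {
    my @lines = @_;
    my $first_chain;    # chain ID of the first coordinate record seen
    my @kept;
    for my $line (@lines) {
        # Only these record types carry a chain ID in column 22;
        # records too short to have that column are passed through.
        if ($line =~ /^(?:ATOM|HETATM|TER|ANISOU)/ && length($line) >= 22) {
            my $chain = substr($line, 21, 1);    # column 22, zero-based 21
            $first_chain = $chain unless defined $first_chain;
            next if $chain ne $first_chain;      # skip all other chains
        }
        push @kept, $line;    # HEADER, REMARK, END etc. are kept unchanged
    }
    return @kept;
}
```

Inside the per-file loop, the lines could be read with `my @lines = <SP>;` and the result of `keep_first_chain(@lines)` printed to OUTPUT. Reading the chain ID by column position rather than by splitting on whitespace also avoids miscounting fields when two columns run together, for example with four-character atom names.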