Print duplicates from hash

tonykent · Jan 28, 2010

Hello again, I have several lines of data that I am reading into a hash, as per the code below. I would like to print out only those lines where the key is duplicated. The key, in this case, is the data in the first TWO columns on each line (for example '310715 W'):

Code:

#!/usr/bin/perl
use strict;
my %file_hash;  

while (<DATA>) { 
	chomp;
	next unless $_ =~ m/^\d/; 
	my ($key, $status, $value) = split(/\s+/,$_,3); 
	$file_hash{"$key\t$status"} = $value; 
	} 

foreach my $key ( keys %file_hash ) {        
	print "$key -- $file_hash{$key}\n";
}    

__DATA__ 
301625 W 322500 1 07/31/2009 
305671 C 155900 1 07/31/2009 
306526 W 69900 1 07/31/2009 
308064 W 895000 1 07/31/2009 
308548 H 89000 0 07/31/2009 
309245 H 88000 1 07/31/2009 
310708 W 199900 1 07/31/2009 
310715 W 199900 0 07/31/2009 
311018 H 142500 0 07/31/2009 
311018 W 137900 0 07/31/2009 
312911 H 53900 0 07/31/2009 
313984 W 554900 1 07/31/2009 
314303 X 47000 0 07/31/2009 
314303 X 69300 0 07/31/2009 
314303 X 69900 0 09/21/2009 
314306 W 146000 0 07/31/2009 
314389 H 90000 0 07/31/2009 
315309 W 149900 0 07/31/2009 
315671 W 150000 1 07/31/2009

What I would like to see printed is just the following lines, which all have the same key (i.e. 314303 X):

Code:

314303 X 47000 0 07/31/2009 
314303 X 69300 0 07/31/2009 
314303 X 69900 0 09/21/2009

Can anyone suggest a way of achieving this?

I have a working script that does this using an array (lines are compared in pairs), but I suspect a hash solution would be more efficient?

feherke · Jan 28, 2010

Hi

Hash keys are unique, so your %file_hash will not contain the required information: the 2nd and 3rd occurrences of the same key overwrite the previous value. I would write it like this:

Perl:

#!/usr/bin/perl
use strict;
my %file_hash;

while (<DATA>) {
  chomp;
  next unless $_ =~ m/^\d/;
  my ($key, $status, $value) = split(/\s+/,$_,3);
  push @{$file_hash{"$key\t$status"}},$value;
}

foreach my $key ( keys %file_hash ) {
  if (scalar @{$file_hash{$key}}>1) {
    foreach my $value ( @{$file_hash{$key}} ) {
      print "$key -- $value\n";
    }
  }
}
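Note that keys returns the hash keys in no particular order, and the output above uses a "key -- value" layout rather than the original line. If the duplicated lines should come out unchanged and in input order, one possible variation is sketched below. It counts each key and remembers every line, then filters afterwards. The helper name duplicate_lines and the small @sample list are hypothetical, made up for illustration only; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the input lines whose
# first two columns occur more than once, in their original order.
sub duplicate_lines {
    my @lines = @_;
    my (%count, @kept);

    foreach my $line (@lines) {
        next unless $line =~ m/^\d/;              # skip non-data lines
        my ($key, $status) = split /\s+/, $line;  # first two columns
        $count{"$key\t$status"}++;                # how often each key occurs
        push @kept, [ "$key\t$status", $line ];   # remember line with its key
    }

    # Keep only the lines whose key was seen more than once.
    return map { $_->[1] } grep { $count{ $_->[0] } > 1 } @kept;
}

# Illustrative subset of the data posted in the question.
my @sample = (
    '311018 H 142500 0 07/31/2009',
    '311018 W 137900 0 07/31/2009',
    '314303 X 47000 0 07/31/2009',
    '314303 X 69300 0 07/31/2009',
    '314303 X 69900 0 09/21/2009',
    '314306 W 146000 0 07/31/2009',
);

print "$_\n" for duplicate_lines(@sample);
```

With this sample only the three 314303 X lines are printed, because 311018 H and 311018 W differ in the second column. The trade-off is that every line is held in memory until the counts are complete.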

Feherke.

http://free.rootshell.be/~feherke/

tonykent · Jan 28, 2010

Excellent Feherke. Thank you very much, that works perfectly.
