
Perl equivalent of uniq -c by key

Status
Not open for further replies.

pp56825

Programmer
Feb 20, 2004
2
PL
I have a file with some data, using ~ as the delimiter.
Like this:
291207~12:13:12~abcd 112233~aaaaa
291207~12:14:12~abcd 112233~aaaaa
291207~12:14:12~abdf 112233~aaaaa
291207~12:24:12~abdf 112233~bbbb

I want the output to show a line count for each unique value in column 3. The other columns do not matter for the count, but for each value I want to display all columns of the latest line, based on sorting by column 2 (the time).

Something like this:
2 291207~12:24:12~abdf 112233~bbbb
2 291207~12:14:12~abcd 112233~aaaaa

I used some of your code, but it is not what I need:
while(<INLOG>) {
$lines{$_}++;
}
foreach $line (sort keys %lines) {
printf OUTLOG "%7d %s", $lines{$line}, $line;
}
 
This assumes the time field is in 24-hour format, with a leading zero for hours less than 10. It is not well tested and may produce warnings. The "chomp" may not be necessary.

Code:
use strict;
use warnings;
#use Data::Dumper;
my %lines = ();
while (<DATA>) {
    chomp;
    my @fields = split('~');
    # Count every line seen for this column-3 key
    $lines{$fields[2]}{count}++;
    # Keep the line with the latest time for this key
    # (the defined check avoids an "uninitialized value" warning on first sight)
    if ( !defined $lines{$fields[2]}{'time'} || $fields[1] gt $lines{$fields[2]}{'time'} ) {
        $lines{$fields[2]}{'time'} = $fields[1];
        $lines{$fields[2]}{line} = $_;
    }
}
#print Dumper \%lines;
# Print the kept lines, latest time first
foreach my $d (sort { $lines{$b}{'time'} cmp $lines{$a}{'time'} } keys %lines) {
    print "$lines{$d}{line}\n";
}
__DATA__
291207~12:13:12~abcd 112233~aaaaa
291207~12:24:12~abdf 112233~bbbb
291207~12:14:12~abcd 112233~aaaaa
291207~12:14:12~abdf 112233~aaaaa
291207~13:13:12~abcd 112233~aaaaa
291207~00:24:12~abdf 112233~bbbb
291207~23:14:12~abcd 112233~aaaaa
291207~12:24:13~abdf 112233~aaaaa
------------------------------------------------------------
Pragmas (perl 5.8.8) used:
  • strict - Perl pragma to restrict unsafe constructs
  • warnings - Perl pragma to control optional warnings

You can add the formatting you desire when printing to file.
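For example, here is a minimal sketch of one way to print the count in front of each kept line, in the style of uniq -c. The sub name count_latest and the %7d width are my own choices for illustration (the width is borrowed from the printf in the original question), and it assumes the same zero-padded 24-hour times as above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): counts records per
# column-3 key and keeps the record with the latest time for each key.
# Assumes '~'-delimited records and zero-padded HH:MM:SS in column 2.
sub count_latest {
    my @records = @_;
    my %seen;
    for my $rec (@records) {
        my @fields = split /~/, $rec;
        my $entry = $seen{ $fields[2] } ||= { count => 0 };
        $entry->{count}++;
        # String comparison is enough because the times are zero-padded
        if ( !defined $entry->{time} || $fields[1] gt $entry->{time} ) {
            $entry->{time} = $fields[1];
            $entry->{line} = $rec;
        }
    }
    # Latest time first; each result is "count line", like uniq -c
    return map  { sprintf '%7d %s', $seen{$_}{count}, $seen{$_}{line} }
           sort { $seen{$b}{time} cmp $seen{$a}{time} } keys %seen;
}

# The four sample lines from the original question
my @sample = (
    '291207~12:13:12~abcd 112233~aaaaa',
    '291207~12:14:12~abcd 112233~aaaaa',
    '291207~12:14:12~abdf 112233~aaaaa',
    '291207~12:24:12~abdf 112233~bbbb',
);
print "$_\n" for count_latest(@sample);
```

With the four sample lines this should give a count of 2 for each key, with the abdf line (12:24:12) listed first. To write to a file instead, print to your OUTLOG filehandle.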

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
If this is a really big file you may need to try more efficient methods of hashing and sorting the data.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
If you are new to Perl, there are a few techniques that you need to learn to be able to solve this problem. Nevertheless, the code below should work, or at least get you further along your path.

Code:
use strict;

my $file = 'data.txt';

# Simple array of parsed lines (in order of arrival)
my @data = ();

# Marking first seen place for each third key
my %first_seen = ();

# Three-argument open, so the filename is never parsed as a mode
open(my $fh, '<', $file) or die "Can't open $file: $!";
while (<$fh>) {
	chomp;
	push @data, [split '~'];
	
	# Format time (assume military time, but not necessarily
	# with hours as padded 2 digits).  This is important in
	# order to sort using simple string comparison
	$data[-1][1] = sprintf '%08s', $data[-1][1];
	
	# @data in scalar context is the current record count
	$first_seen{$data[-1][2]} ||= @data;
}
close($fh);

# Sort data: by first appearance of the key, then latest time first
@data =
	sort {$first_seen{$a->[2]} <=> $first_seen{$b->[2]} || $b->[1] cmp $a->[1]}
	@data;

# Filter Unique: keep only the first (latest) record for each key
my %unique;
@data = grep {! $unique{$_->[2]}++} @data;

foreach my $record (@data) {
	print join '~', @$record;
	print "\n";
}

1;

__END__
------------------------------------------------------------
Pragmas (perl 5.8.8) used:
  • strict - Perl pragma to restrict unsafe constructs

- Miller
 
And it just shows how confusing Perl can be: two very different approaches to the same problem.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 