
Finding and deleting entries from a huge text file

Status
Not open for further replies.

jcon11

ISP
Jul 12, 2007
Hello there,

I am trying to write a Perl script that deletes duplicate records from a text file. The problem is that the duplicate records are not completely identical: they match only in specific fields of the record...

Sample DATA of text file:

__DATA__
2131123 677778 152707011 9293821001011 8171719 1002
8272911 729191 173501010 617111231510101 2381719 0002
8137718 677778 152707011 9928382002933 8171719 1005

In this example, I want to delete either the third or the first row, because the fields at @data[8..13] ("677778") and @data[40..47] ("8171719") are the same in both.

If I use split //, I can select a specific position. The problem is: how do I compare it with all the other lines?
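For clarity, this is what I mean by selecting a position with split // (using the first sample row; character offsets 8..13 hold the second field):

```perl
use strict;
use warnings;

# split // breaks the line into one character per array element,
# so an array slice picks out a fixed-width field by offset.
my $line  = '2131123 677778 152707011 9293821001011 8171719 1002';
my @data  = split //, $line;
my $field = join '', @data[8..13];    # "677778"
print "$field\n";
```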


Thanks.
Any help would be greatly appreciated.
 
Put these two FAQ answers together:

perlfaq5 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?

perlfaq4 How can I remove duplicate elements from a list or array?
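Put together, a rough sketch might look like this. It assumes the duplicate-defining fields are the second and fifth whitespace-separated columns (adjust the indices to your data), and inlines the sample records rather than opening a real file:

```perl
use strict;
use warnings;

# Sample records, inlined for the sketch; in real use, open your text file.
my $data = <<'END';
2131123 677778 152707011 9293821001011 8171719 1002
8272911 729191 173501010 617111231510101 2381719 0002
8137718 677778 152707011 9928382002933 8171719 1005
END

# perlfaq4 idea: a %seen hash remembers keys already encountered.
# The key is built only from the fields that define a "duplicate" --
# here the 2nd and 5th whitespace-separated columns (indices 1 and 4).
my %seen;
my @keep;
open my $fh, '<', \$data or die "open: $!";
while (my $line = <$fh>) {
    my @fields = split ' ', $line;
    my $key    = join '|', @fields[1, 4];
    push @keep, $line unless $seen{$key}++;
}
close $fh;
print @keep;    # rows 1 and 2 survive; row 3 has the same key as row 1
```

To update the real file, write the kept lines to a temporary file and rename it over the original, or run the same loop as a filter with perl -i.bak, as perlfaq5 describes.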

- Miller
 
jcon11,

What have you tried so far? Also, your explanation is confusing:

In this example, I want to delete the third or first row, because the position @data[8..13] "677778" and @data[40..47] "8171719" are the same.



------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 