I have a huge (500 MB and growing) text file of fixed-length fields with no delimiters.
I need to determine whether characters 28 through 77 of a record are ever repeated in a subsequent row/record and, if so, write each repeating record in its entirety out to another file, leaving behind only the first record in which those characters occur.
It seems like a very tedious process to grab the string and then search the entire file for another occurrence of a record containing it, but maybe this is the only way?
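For reference, one way to avoid rescanning the file for every record would be a single pass that remembers each key already seen. The following is only a rough sketch in Python, not a tested solution for this file. It assumes one record per line, that "28 through 77" means 1-based inclusive positions (a 50-character key, i.e. the slice 27:77), and that the set of distinct keys fits in memory. The function name split_duplicates and the file names are made up for illustration.

```python
def split_duplicates(in_path, keep_path, dup_path, start=27, end=77):
    """Single pass: first-seen records go to keep_path, later repeats to dup_path.

    start/end are 0-based slice bounds; 27:77 is an assumption that the key
    occupies 1-based character positions 28 through 77 inclusive.
    """
    seen = set()          # keys encountered so far (memory grows with distinct keys)
    kept = dups = 0
    # Binary mode avoids any decoding cost or encoding surprises on a large file.
    with open(in_path, "rb") as src, \
         open(keep_path, "wb") as keep, \
         open(dup_path, "wb") as dup:
        for record in src:            # assumes one record per line
            key = record[start:end]
            if key in seen:
                dup.write(record)     # a later record repeating an earlier key
                dups += 1
            else:
                seen.add(key)
                keep.write(record)    # first record carrying this key
                kept += 1
    return kept, dups


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(split_duplicates("records.txt", "unique.txt", "duplicates.txt"))
```

If the distinct keys do not fit in memory, sorting the file on that field first and comparing adjacent records would be another option, at the cost of losing the original record order.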
Basically, this is data cleanup prior to use in a data warehouse application.
I had been told I could do all kinds of magical tricks with SQL temp files and such, but all of it has left a bad taste. I was hoping someone over here who works with text files could point me toward a cleaner, better solution.
Thanks.