Using an offset within an ASCII text file instead of reading from the start

rambleon · Sep 19, 2005

Hi
I have an ASCII text file of monthly payslips. The 1st line of each payslip starts with an '*' followed by 131 spaces and CRLF, and the 3rd line of the payslip contains a 9-digit ID number.
I need to build a file of the ID numbers and the start address (offset) within the file of each payslip, and then
use this file to retrieve the payslip for any given ID number
according to the offset within the file.
I'm new to VB.NET, so if anyone can give me some pointers, or even better some basic code for achieving these 2 tasks, or
the address of any tutorials on the web, it would be much appreciated.

Thanks

ThatRickGuy · Sep 19, 2005

If the data you want is always on the 3rd line, but in different positions, you could track the specific character position of the value on the 3rd line for each file. Then when you retrieve that value, use 3 readline commands. Store the 3rd value in a string. Then use the .substring(Position,9) method to grab the value.

-Rick


I believe in killer coding ninja monkeys. [monkey]

Ruffnekk · Sep 19, 2005

As far as I know there's no way to jump to a certain position within a textfile using a textreader/streamreader. You will have to browse through the file starting at position 0, until you reach whatever value you require and read from there on.

Regards, Ruffnekk
---
Is it my imagination or do buffalo wings taste just like chicken?

rambleon · Sep 19, 2005

Hi Ruffnekk
It seems strange that you can't open a file, give an offset of n bytes or n lines, and arrive directly at byte/line n + 1 as your starting point for reading. There must be a function for doing this.

ThatRickGuy · Sep 19, 2005

You could open the file as a binary stream and grab the binary data based on byte position. But I think it would be overkill when compared to:

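ts.ReadLine()                              ' skip line 1
ts.ReadLine()                              ' skip line 2
mystring = ts.ReadLine()                   ' line 3 holds the ID number
myvalue = mystring.Substring(StartLoc, 9)  ' 9 characters starting at StartLoc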

-Rick
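
For comparison, the binary-stream route mentioned above can be sketched as follows. This is only a minimal sketch: `FileStream.Seek` is a standard .NET call, but the function name `ReadPayslipAt` is made up for illustration. It assumes the file is plain ASCII, that the stored offset points at the '*' header line, and that no other line inside a payslip starts with '*'.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module PayslipReader
    ' Hypothetical helper: returns the text of the payslip that starts
    ' at the given byte offset (assumed to point at its '*' header line).
    Function ReadPayslipAt(ByVal path As String, ByVal offset As Long) As String
        Dim result As New StringBuilder()
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            ' Jump straight to the payslip without reading what comes before it.
            fs.Seek(offset, SeekOrigin.Begin)
            Using reader As New StreamReader(fs, Encoding.ASCII)
                Dim line As String = reader.ReadLine()   ' the '*' header line
                Do While line IsNot Nothing
                    result.AppendLine(line)
                    line = reader.ReadLine()
                    ' Assumption: the next line starting with '*' begins the next payslip.
                    If line IsNot Nothing AndAlso line.StartsWith("*", StringComparison.Ordinal) Then
                        Exit Do
                    End If
                Loop
            End Using
        End Using
        Return result.ToString()
    End Function
End Module
```

The seek happens on the underlying stream before the reader is created, so the reader's first `ReadLine` starts at the requested byte. Whether this is worth it over a plain `ReadLine` loop depends on how large the monthly files are.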


Ruffnekk · Sep 19, 2005

Like Rick suggests, you would really have to jump to the line by reading the other lines first and discarding those (skipping them). There really is no function to jump to line n and read from there on. You could of course easily write a function that skips a certain number of lines by looping a series of readline statements, but there's no standard function available to jump to the nth byte immediately.

Regards, Ruffnekk
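
A line-skipping function of the kind described above might look roughly like this. The name `SkipLines` is invented for the sketch; it simply calls `ReadLine` in a loop and throws the results away.

```vbnet
Imports System.IO

Module LineSkipper
    ' Hypothetical helper: discards up to 'count' lines from the reader
    ' and returns how many lines were actually skipped (fewer if the
    ' end of the file is reached first).
    Function SkipLines(ByVal reader As TextReader, ByVal count As Integer) As Integer
        Dim skipped As Integer = 0
        Do While skipped < count AndAlso reader.ReadLine() IsNot Nothing
            skipped += 1
        Loop
        Return skipped
    End Function
End Module
```

After `SkipLines(ts, 2)`, the next `ts.ReadLine()` returns the 3rd line, which is where the ID number sits in each payslip.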

rambleon · Sep 20, 2005

Thanks for the help.
I have a couple of choices. I've done this sort of thing before, but then I put the whole print file into an mdb file,
which is OK if I have 50-60 payslips per month. This time I have around 5000 per month, which is 250,000 lines; in a couple of years that's 6,000,000 records. That is why I was thinking of keeping it as a text file. Any suggestions on what method to use to access the payslips directly?

chrissie1 · Sep 20, 2005

only 6 million.

Christiaan Baes
Belgium

I just like this --> [Wiggle] [Wiggle]

ThatRickGuy · Sep 20, 2005

Put a solid database engine under it. Either SQL Server, Oracle, or whatever Microsoft's desktop DB software is (I can't remember the acronym). Not sure how MySQL scales up, but it's another (free) option.

And remember proper design. There is no reason for a user to ever see all 6 million records at once. By only showing the user the record they are looking for, you can greatly reduce the overhead on the client. And using a good DB engine can reduce the stress and improve performance when querying the data for one specific element.

-Rick


rambleon · Sep 20, 2005

Rick, thanks for your input. My question was which way to go: put the whole print file into a database and retrieve the lines as needed, or put only the ID number, month, year and offset of the record within the print file, and then, like you suggest, use a readline loop until I arrive at the offset for the ID number?
I'd store each month's print file separately so I wouldn't end up looping through 6,000,000 records.
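
The second option (storing only the ID number and offset) could be built with a single pass over the month's file, roughly as below. This is a sketch under several assumptions that the thread does not confirm: the file is pure ASCII (one byte per character), every line ends in CRLF, no line inside a payslip other than the first starts with '*', and the ID is the only run of exactly 9 digits on the 3rd line. The name `BuildIndex` is made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions

Module PayslipIndexer
    ' Hypothetical helper: maps each 9-digit ID to the byte offset of the
    ' '*' line that starts its payslip.
    Function BuildIndex(ByVal path As String) As Dictionary(Of String, Long)
        Dim index As New Dictionary(Of String, Long)()
        Dim offset As Long = 0         ' byte offset of the line just read
        Dim slipStart As Long = -1     ' offset of the current payslip's '*' line
        Dim lineInSlip As Integer = 0  ' 1-based line number within the payslip

        Using reader As New StreamReader(path, Encoding.ASCII)
            Dim line As String = reader.ReadLine()
            Do While line IsNot Nothing
                If line.StartsWith("*", StringComparison.Ordinal) Then
                    slipStart = offset
                    lineInSlip = 1
                ElseIf slipStart >= 0 Then
                    lineInSlip += 1
                    If lineInSlip = 3 Then
                        ' Assumption: the ID is a run of exactly 9 digits on line 3.
                        Dim m As Match = Regex.Match(line, "(?<!\d)\d{9}(?!\d)")
                        If m.Success Then index(m.Value) = slipStart
                    End If
                End If
                ' Assumption: ASCII text and CRLF endings, so each line
                ' occupies its character count plus 2 bytes.
                offset += line.Length + 2
                line = reader.ReadLine()
            Loop
        End Using
        Return index
    End Function
End Module
```

The resulting pairs could be written out one per line (for example ID, then offset) or stored in a small table together with month and year. With the offset known, the payslip can be reached by seeking the file stream instead of looping through earlier lines.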

ThatRickGuy · Sep 20, 2005

I'd pull just the needed data from the provided files. I'm assuming you receive a payslip text file every month, and that you need to track the ID of each payslip and some information about it (amount, hours, employee, etc).

Personally, I would loop through the file, parse out each pay slip, and the information they contained, and store it all in a database.

That way, instead of looking at it as 250k lines or 6M records over time, you can look at it as 60k records per year, which can be stored in a much smaller file size.

-Rick


chrissie1 · Sep 20, 2005

Store them in a DataSet and then write them out as XML if you can't use a database. Not that XML will be any faster; it just makes you look good.

"We store our data in xml"


Ruffnekk · Sep 20, 2005

You could also use a directory structure with text files for each month/year, thus reducing the size of each file. If you name the directories and files appropriately, it should be quite simple to determine which file is needed at a given time and you wouldn't have to loop through 6 million+ lines.

Regards, Ruffnekk
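
As an illustration of the directory idea, the file for a given month could be located like this. The layout (one folder per year, one file per month named `payslips_MM.txt`) and the name `GetMonthFile` are purely hypothetical; any consistent naming scheme would do.

```vbnet
Imports System.IO

Module PayslipPaths
    ' Hypothetical layout: <root>\<yyyy>\payslips_<MM>.txt
    Function GetMonthFile(ByVal root As String, ByVal year As Integer, ByVal month As Integer) As String
        Dim folder As String = Path.Combine(root, year.ToString("0000"))
        Return Path.Combine(folder, "payslips_" & month.ToString("00") & ".txt")
    End Function
End Module
```

Under that assumed layout, `GetMonthFile("C:\payslips", 2005, 9)` would point at the September 2005 file, so only that month's lines ever need to be read.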

rambleon · Sep 21, 2005

Hi, I appreciate your help.
The system is for employees to be able to retrieve their past payslips. I have the print file for each month, and I just wrap the requested lines, without any changes to the format (so I don't want to parse the records and rebuild them), in the appropriate HTML and CSS and show them in a browser using a payslip GIF image as background.
If I can open the files and loop through 200K records per month within a tea-break (say 3-5 seconds), that's OK. If not, I'll just have to put the whole print file into a database, which seems a bit of an overkill.
If I load it into a database, does it make a difference in performance whether I store each month in a separate table or combine all the months together in one table?
