Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

advice on string handling

Status
Not open for further replies.

CADTenchy

Technical User
Dec 12, 2007
237
GB
I'm looking for advice on strategy and methods, hence selection of data type for a new project.

I'm starting an app that has to deal with two similar lists.

List 1 is small (of the order 100-1000 lines).
List 2 is large in the order of 10,000 lines and up.

All list data is strings (alphanumerics).

List 1 is generated by data entry into the app, and list 1 needs to be saved after each new line is entered, for data loss prevention.
Also, list 1 needs to be loaded from disk in such cases, to pick up where last were.

List 2 is a database essentially and only needs to be loaded at startup and referred to.

Typical list 1 lines could look like so:

13/03/08 A1AAA 10 234 10 021 cabbage
14/03/08 A1BB 11 234 10 329 sprouts
14/03/08 Z1ZAA 10 534 10 321 milk
15/03/08 A2ZAA 11 234 10 123 bread
18/03/08 C9AAA 23 994 10 371 pork

Typical list 2 lines could look like so:

A1AAA cabbage
A1BB sprouts
A2ZAA bread
C9AAA pork
Z1ZAA milk

The 'key' field is the A1AAA style of data.

Note list 2 is sorted by 'key' field, list 1 is sorted by entry date.


The user enters in the data to list 1 manually, except for the date which is inserted from the PC clock.
Loaded and entered data is displayed all the time, in a scrollable list.

As the user enters the 'key' field I want two similtaneous things to occur:

a. In a 'box' will appear matches that the entered data might be, filtering more accurately as more data is entered:

Entry Edit 'box'
A 13/03/08 A1AAA 10 234 10 021 cabbage
14/03/08 A1BB 11 234 10 329 sprouts
15/03/08 A2ZAA 11 234 10 123 bread

A1 13/03/08 A1AAA 10 234 10 021 cabbage
14/03/08 A1BB 11 234 10 329 sprouts

A1A 13/03/08 A1AAA 10 234 10 021 cabbage

A1AC no match

A1ACV no match


b. In another 'box' for each key press in 1 the box would show:

A1AAA cabbage
A1BB sprouts
A2ZAA bread

A1AAA cabbage
A1BB sprouts

A1AAA cabbage

no match


Currently I am loading the data from files, using loadfromfile to a stringlist, then populating a stringgrid and displaying the entered data there.
The striggrid is used to keep the fixed column headings as the list scrolls up.
New entered data complete lines are added to the stringgrid and stringlist, then the stringlist used to do a savetofile.

List 2 I plan to load into a stringlist.

The two 'boxes' in a. and b. I plan to use listboxes.

My questions are:

1. Am I using the correct string containers (the stringlists, stringgrid, and listboxes) or should I use/do something more suitable? (What?)
2. What is the best search routines and method for b. ? (I can see only a sequential 'for beginning to end' working for a.)
3. Is the shell short the most suitable algorithm for sorting list 2 data?

(apologies for long post!)

Steve (Delphi 2007 & XP)
 
1. This should be in a database. Using TStringLists will work of course, and is possibly easier to work with, but you're more likely to have issues with scaling, robustness and multi-user access.

2. B-tree, or the QuickFind algorithm. The list needs to be sorted on your key field, you essentially interrogate records starting at the midway point in your list. If that record is 'above' the key you're searching for, ignore the top half of the list (and vice-versa), and interrogate the new midway point in the remainder. Carry on until you find a match, or you only have one element left that doesn't match (not found).

3. QuickSort is the fastest sorting algorithm I know of for large lists. 10,000 elements qualifies.
 
Thanks for the reply Griffyn.

There won't be any multi user access in this application.

What do you mean exactly by scaling?

Regarding quicksort, this comment on delphi.about.com does concern me:
Note: in practice, the QuickSort becomes very slow when the array passed to it is already close to being sorted.

I think I need to better explain what will happen to list 2 data.
It will be set up with say 5-10K lines, and sorted.

Subsequently it will have blocks of say 100-1000 unsorted lines added to it.
It then needs resorting.

Bearing in mind the 90% of the list will already be sorted, do you think quicksort is still best choice?


Steve (Delphi 2007 & XP)
 
Hi again Griffyn.

Forget my question about quicksort and partially sorted files, as I don't think it's gonna be a problem.

I've implemented a quicksort function now (from here for those interested: )
and although I am on a CAD workstation with plenty of processing power admittedly, I'm more than happy with a 16mS sort time for 3500 test records.

Thanks for putting me straight on the sort tho...


Steve (Delphi 2007 & XP)
 
Regarding quicksort, this comment on delphi.about.com does concern me: Note: in practice, the QuickSort becomes very slow when the array passed to it is already close to being sorted.

This is true. Primarily, the sort is doing a lot of thrashing to get to the point where it will do any useful work if the data are almost sorted. It won't necessarily be of concern, though, for what you are doing.

By scaling, I think was meant the effectiveness of what you are doing if you increase the amounts of data involved.

I will make a suggestion regarding your sort and your application, though. As was stated, quicksort isn't the best thing for almost sorted data. Generally, insert sort is the champ for that one. And the approach can potentially be much better when it comes to a scenario like yours.

Think of it like if you're in a card game. You get dealt your hand, and then you sort it so you can easily work out what you have in your hand. Now let's say you get dealt another card. You don't drop your hand and start all over in sorting it. You search the cards you already sorted out (for suit or face value or whatever the rules of the game call for), and then you insert the card in its proper place. This is the general approach when it comes to insert sort.

While it seems you have a solution already at hand, it might be a worthwhile exercise to try for learning's sake to sort the records you have to add with your quicksort algorithm and then apply what I described to place them in your main group of records. You might end up with something faster. You might not. But always worth trying if you don't know.
 
This is best suited to a database application since you can have the database and components do the lookup/incremental search tasks for you. If you want to get into the mechanics of sorting by all means do it. You'll learn something valuable. I consider myself an 'application' developer and not a 'nuts and bolts' programmer. I basically work on the business logic and let other components, function libraries take care of what I don't know. I let faster and faster hardware pick up the slack. :)

Even single user applications can use databases. Your data entry sounds like a DB grid would do the trick and your data could be saved to the hard drive after each record is added by the user.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top