Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations wOOdy-Soft on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

I'm writing a little search engine.

Status
Not open for further replies.

transparent

Programmer
Sep 15, 2001
333
GB
I'm writing a little search engine.

I have a collection of names (an arraylist) which I want to compare with user input.

So, for example my list may contain:

Sarah
Susie
Rick
Rob
Frank
Ted

So a user types in a name, and the system compares the user input with the list. So if the user types in Ric or Rik or Rick, I want it to find Rick

This will not be sql based, but .net code based.

Are there any general methodologies that I should be looking into?
 
Sort the list first Alphabetically. Then use some sort of search/sort on the list to find the records closest to your match.

For instance:

Sarah
Susie
Rick
Rob
Frank
Ted

Sort Alphabetically

Frank
Rick
Rob
Sarah
Susie
Ted

Perform a sort/search. My preference is the UP/Down Bubble(?) Sort.

Search For "Sar"

Frank
Rick
Rob <- Before or after this record
Sarah
Susie
Ted

After

Frank
Rick
Rob -----------------
Sarah
Susie <- Before or after
Ted

Before

Frank
Rick
Rob -----------------
Sarah <- Found it
Susie -----------------
Ted

So take the length of the whole list and divide it by 2 and make that your start position. If the record comes before, then divide the first half by 2 again. If it's after, divide the first half by 2 again and add the original half value.

Hope that gets you started.
 
Store the list alphabetically in hidden list box and on keyup event in the textbox where the user inputs get the character you want to search and when he clicks the search button retrieve the name from the hidden list box
 
why would you put it in a list box? That's just added memory to deal with. Stick with your arraylist.
 
Use two lists. One with all possible names plus their common misspellings that points to an item in the 2nd list, which contains the real data.

Chip H.


____________________________________________________________________
If you want to get the best response to a question, please read FAQ222-2244 first
 
Have implemented the Levenshtein Distance fuzzy logic method.

Works quite well.

Thanks for everyone input.

Rick
 
I just looked up that logic. It personally would not have been my choice but I'm glad it's working for you.
 
Any particular reason?

Anyway, I've moved on to a Jaro Winkler distance metric

Much better.
 
That one looks a little better.

I didn't like the amount of work that was happening with the Levenshtein Distance method. It just seemed like a waste. The idea of looping through an entire list to find results is generally a bad idea. Could you imagine if a database with 2 million records went through every record to try and find results? Ebay would take an hour to do a single search.
 
True, but thankfully I know that my record set will be limited to about 10,000 records.

Because I'm coding with direct memory access (a bit unsafe but what the hey), I managed to get execution time for 10,000 records to 0.016 seconds.

Which isn't bad :)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top