Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Loops That Kill

Status
Not open for further replies.

TheRambler

Programmer
Jan 23, 2003
523
BO
Searching a String for the occurance of specific words...
thread184-961186

In that thread tjc240e had a list of 14 words he wanted to check for their occurrance in a multiple word field. Many suggestions were given like: use GETWORDCOUNT() and GETWORDNUM(), INLIST(), $ operator, AT(), or ATC(). In the end tjc240e got his solution but a dangling thought remained in my mind, should I compare the list of words against the words in the field? or the other way around?
Stella said:
...it may be not wise to look up your words in the list, much better would be to check if the words from the list are contained in your Last Name string.
That made sense to me, so I decided to test it. I created a reference table with the list of words we are looking for, and a clients table with cliend_id n(7), client_nam c(45), party_type c(1) fields. All the names (words) are in uppercase and don't contain periods, commas, quotation marks or multiple spaces between words. I'm using VFP7.
Code:
[COLOR=green]* using $ operator[/color]
SET TALK OFF
USE reference
cStrOrg = "#"
SCAN
   cStrOrg = cStrOrg + TRIM(word) + "#"
ENDSCAN
USE test
SCAN
   = ALINES(aWords, TRIM(client_nam), .t., " ")
   FOR i = 1 TO ALEN(aWords,1)
      IF "#"+aWords(i)+"#"$cStrOrg
         REPLACE party_type WITH "O"
         EXIT
      ENDIF
   ENDFOR
ENDSCAN
The $ operator seems to be the best method, is accurate and fast.

Code:
[COLOR=green]* using AT() function[/color]
SET TALK OFF
SELECT word FROM reference INTO ARRAY aWords
nWords = ALEN(aWords,1)
USE test
SCAN
   FOR i = 1 TO nWords
      IF AT(TRIM(aWords(i)),client_nam) > 0
         REPLACE party_type WITH "O"
         EXIT
      ENDIF
   ENDFOR
ENDSCAN
AT() function is also fast but produces incorrect results, as dbMark advised, checking for "CHURCH" would return "WINSTON CHURCHILL" as an organization.

Code:
[COLOR=green]* using OCCURS() function[/color]
SET TALK OFF
SET EXACT ON && must be set ON
SELECT word FROM reference INTO ARRAY aWords
nWords = ALEN(aWords,1)
USE test
SCAN
   FOR i = 1 TO nWords
      IF OCCURS(TRIM(aWords(i)),client_nam) > 0
         REPLACE party_type WITH "O"
         EXIT
      ENDIF
   ENDFOR
ENDSCAN
Similar to using AT(), takes longer and yields wrong results for the same reason.

Code:
[COLOR=green]* using INLIST() function[/color]
SET TALK OFF
SET EXACT ON && must be set ON
USE reference
cStrOrg = '"'
SCAN
   cStrOrg = cStrOrg + TRIM(word) + '","'
ENDSCAN
cStrOrg = LEFT(cStrOrg, LEN(cStrOrg)-2)
USE test
SCAN
   = ALINES(aWords, TRIM(client_nam), .t., " ")
   FOR i = 1 TO ALEN(aWords,1)
      IF INLIST(aWords(i), &cStrOrg)
         REPLACE party_type WITH "O"
         EXIT
      ENDIF
   ENDFOR
ENDSCAN
Takes twice as much time than OCCURS() and is limited to 24 words.

Code:
[COLOR=green]* using GETWORDCOUNT() and GETWORDNUM functions[/color]
SET TALK OFF
SELECT word FROM reference INTO ARRAY aWords
nWords = ALEN(aWords,1)
USE test
SCAN
   FOR i = 1 TO nWords
      FOR j = 1 TO GETWORDCOUNT(client_nam)
         IF GETWORDNUM(client_nam,j) == TRIM(aWords(i))
            REPLACE party_type WITH "O"
            EXIT
         ENDIF
      ENDFOR
   ENDFOR
ENDSCAN
This is the slowest of all, I have tried with 10, 100 and 1000 reference words and it gets really slow.

So my conclusion is that Ramani's suggestion was the best after all, unless you can come up with a better option, maybe I didn't implement all suggestions properly. Thanks for sharing your thoughts and sorry for a long post.
 
Now I realize I forgot about ASCAN().
Code:
[COLOR=green]* using GETWORDCOUNT() and GETWORDNUM() functions[/color]
SELECT word FROM reference INTO ARRAY aWords
nWords = ALEN(aWords,1)
USE test
SCAN
  FOR i = 1 TO GETWORDCOUNT(client_nam)
    IF ASCAN(aWords,GETWORDNUM(client_nam,i)) != 0
      REPLACE party_type WITH "O"
      EXIT
    ENDIF
  ENDFOR
ENDSCAN
Using ASCAN() solves the double loop problem, but still the $ operand is much faster.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top