Name Parsing

Status
Not open for further replies.

DChalom

Programmer
Jun 8, 2002
59
US
I need a name-parsing routine to pull the last name out of a name string.
The name could be as simple as "Joe Smith".
Or as long as "Mr. Joe R. Smith Jr. and Mrs. Sue L. Allen".
In both of these cases I need at least "Smith", and it would be nice if I could get "SmithJR".


Dorian C. Chalom
 
Yes, I did, and in the one thread I found, someone suggested I start a new thread on this. The thread you gave me is this thread....

Dorian C. Chalom
 
Hi DChalom

These are purely at random via the old keyword search (I notice there are links within the posts on the threads)

Hope it helps you


Good luck
Lee......

VisFox Version 6 User / Windows ME
 
Dorian,

Obviously, there will be no 100 percent reliable way of doing this. However, a simple strategy would go something like this:

lnCount = GETWORDCOUNT(lcFullName)         && Gets number of words
lcLast = GETWORDNUM(lcFullName, lnCount)   && Gets the last word
IF INLIST(lcLast, "Jnr", "Sr", "I", "II", "III", "PhD")
    lcLast = GETWORDNUM(lcFullName, lnCount - 1)
ENDIF

Will need a lot of fine-tuning, but it might be a good start.

Mike


Mike Lewis
Edinburgh, Scotland

 
I am using VFP 6.0, so I do not know if GetWordCount() is available in it. But I understand what it does, so it can't be that hard to create. It mainly counts the number of spaces in a string and adds 1.
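
One caveat with "count the spaces and add 1": doubled, leading, or trailing spaces inflate the count. Here is a minimal sketch of the two word helpers, written in Python purely for illustration. The names get_word_count and get_word_num are hypothetical stand-ins, not the VFP functions themselves. As far as I know, GETWORDCOUNT() and GETWORDNUM() only became native in VFP 7, and for VFP 6 the FoxTools library's WORDS() and WORDNUM() are the usual substitutes, but that is worth checking against your own install.

```python
def get_word_count(text):
    """Count whitespace-separated words.

    str.split() with no argument collapses runs of whitespace and ignores
    leading/trailing whitespace, so extra spaces do not inflate the count.
    """
    return len(text.split())


def get_word_num(text, index):
    """Return the 1-based index-th word, or "" if there is no such word."""
    words = text.split()
    if 1 <= index <= len(words):
        return words[index - 1]
    return ""
```

With these, the last word of a name is get_word_num(name, get_word_count(name)), which mirrors the GETWORDNUM(lcFullName, lnCount) call in Mike's snippet.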

Dorian C. Chalom
 
Check out ParseWare at the link below. ParseWare is a parsing library written in native Delphi code. New with version 1.5 are the ParseWare API and the FoxPro wrapper to the API. Parsing of name, address, and city/state/zip, as well as case conversion, are all supported.

Good Luck

 
Perhaps this is just my simple mind thinking out loud again, but here is a solution that may be pretty simple.

Assume that the only suffix of a name will be Jr., III, IV, V, VI, VII, VIII, IX, or X (and I don't know very many people who are named John Doe IX!). You can count the characters from the right to the first space. You can then parse out the characters to the right of that space and check whether they match any of the aforementioned suffixes. If they do, count to the second space from the right and parse out all characters to the right of that space. Strip the space and the period from that value and you will have DoeJR. If the first parse doesn't match any of the suffixes, it is safe to assume that it is the last name.
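
The steps above can be sketched roughly as follows. This is a Python illustration only, not the code Kevin offered to post. The function name last_name_with_suffix and the SUFFIXES set are hypothetical, and "SR" and "II" are my additions to the suffix list given above.

```python
# Suffixes from the post above, plus "SR" and "II" (added as an assumption).
SUFFIXES = {"JR", "SR", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"}


def last_name_with_suffix(full_name):
    """Return the last name, with any recognised suffix appended in upper case."""
    # Treat commas as separators so "Doe, Jr." splits the same as "Doe Jr."
    words = full_name.replace(",", " ").split()
    if not words:
        return ""
    # Normalise the final word: drop a trailing period and ignore case.
    tail = words[-1].rstrip(".").upper()
    if tail in SUFFIXES and len(words) >= 2:
        # The real last name is the word before the suffix.
        return words[-2] + tail
    return words[-1]
```

For example, last_name_with_suffix("Mr. Joe R. Smith Jr.") gives "SmithJR" and last_name_with_suffix("Joe Smith") gives "Smith". A string holding two people, such as "Mr. Joe R. Smith Jr. and Mrs. Sue L. Allen", would need to be split into its two names first; this sketch does not attempt that.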

Is this making any sense?

If you still need help let me know and I'll post the code.

-Kevin
 
Dorian,

Here's a routine that I've used for name parsing. Depending upon the content of your files you may need to do some tweaking of the Prefix and Suffix sections.

Steve

** FILE: PARSE_IT

** Split out the pieces of a name into title, first, middle, last, suffix
LPARAMETERS tcfullname

LOCAL lcThisName, lcAmpersand, nSpaceCount, nSpacePos, nLastSpacePos
** Presumes that 5 public variables have been previously defined as follows:
*- gcPrefix (Mr., Mrs., etc.), gcSuffix (Jr., Sr., etc.), gcFirst, gcMiddle, gcLast

*- Initialize the variables
STORE "" TO gcPrefix, gcSuffix, gcFirst, gcMiddle, gcLast
STORE 0 TO nSpaceCount, nSpacePos, nLastSpacePos
lcAmpersand = CHR(38)
lcThisName = ALLTRIM(tcfullname)

** Isolate any suffixes first.
DO CASE
    ** Trap for some professional/business suffixes.
    CASE RIGHT(UPPER(lcThisName),4) = 'M.D.'
        gcSuffix = 'M.D.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-4)
    CASE RIGHT(UPPER(lcThisName),2) = 'MD'
        gcSuffix = 'M.D.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-2)
    CASE RIGHT(UPPER(lcThisName),4) = 'D.O.'
        gcSuffix = 'D.O.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-4)
    CASE RIGHT(UPPER(lcThisName),3) = 'CPA'
        gcSuffix = 'CPA'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),6) = 'C.P.A.'
        gcSuffix = 'CPA'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-6)
    CASE RIGHT(UPPER(lcThisName),4) = 'ESQ.'
        gcSuffix = 'Esq.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-4)
    CASE RIGHT(UPPER(lcThisName),3) = 'ESQ'
        gcSuffix = 'Esq.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),3) = 'DDS'
        gcSuffix = 'DDS'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)

    ** Trap for some common familial continuation suffixes.
    CASE RIGHT(UPPER(lcThisName),3) = 'SR.'
        gcSuffix = 'Sr.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),3) = ' SR'
        gcSuffix = 'Sr.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    ** Compare against upper case, since the name has been passed through UPPER()
    CASE RIGHT(UPPER(lcThisName),3) = 'JR.'
        gcSuffix = 'Jr.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),3) = ' JR'
        gcSuffix = 'Jr.'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),3) = ' II'
        gcSuffix = 'II'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)
    CASE RIGHT(UPPER(lcThisName),4) = ' III'
        gcSuffix = 'III'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-4)
    CASE RIGHT(UPPER(lcThisName),3) = ' IV'
        gcSuffix = 'IV'
        lcThisName = LEFT(lcThisName,LEN(lcThisName)-3)

    OTHERWISE
        gcSuffix = ''
ENDCASE

lcThisName = ALLTRIM(lcThisName)
** Trim off any trailing comma
IF RIGHT(lcThisName,1) = ','
    m.name_len = LEN(lcThisName)
    lcThisName = LEFT(lcThisName,m.name_len-1)
ENDIF

** Now, isolate any prefixes
DO CASE
    ** A joint prefix such as "Mr. & Mrs." runs up to the third space
    CASE OCCURS(lcAmpersand,lcThisName)#0
        nSpacePos = AT(" ",lcThisName,3)
        gcPrefix = ALLTRIM(LEFT(lcThisName,nSpacePos))
        lcThisName = ALLTRIM(SUBSTR(lcThisName,nSpacePos))
    CASE LEFT(UPPER(lcThisName),3) = 'DR.' OR LEFT(UPPER(lcThisName),3) = 'DR '
        gcPrefix = 'Dr.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,4))
    ** Compare against upper case, since the name has been passed through UPPER()
    CASE LEFT(UPPER(lcThisName),4) = 'REV.' OR LEFT(UPPER(lcThisName),4) = 'REV '
        gcPrefix = 'Rev.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,5))
    CASE LEFT(UPPER(lcThisName),3) = 'FR.' OR LEFT(UPPER(lcThisName),3) = 'FR '
        gcPrefix = 'Fr.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,4))
    CASE LEFT(UPPER(lcThisName),3) = 'MR.' OR LEFT(UPPER(lcThisName),3) = 'MR '
        gcPrefix = 'Mr.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,4))
    CASE LEFT(UPPER(lcThisName),3) = 'MS.' OR LEFT(UPPER(lcThisName),3) = 'MS '
        gcPrefix = 'Ms.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,4))
    CASE LEFT(UPPER(lcThisName),4) = 'MRS.' OR LEFT(UPPER(lcThisName),4) = 'MRS '
        gcPrefix = 'Mrs.'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,5))
    CASE LEFT(UPPER(lcThisName),5) = 'MISS '
        gcPrefix = 'Miss'
        lcThisName = ALLTRIM(SUBSTR(lcThisName,6))
ENDCASE

** Isolate the first name
nSpacePos = AT(' ',lcThisName)
IF nSpacePos#0
    gcFirst = LEFT(lcThisName,nSpacePos-1)
    lcThisName = ALLTRIM(SUBSTR(lcThisName,nSpacePos+1))
ENDIF

** Now separate the last name from any middle name(s)
nSpaceCount = OCCURS(' ',lcThisName)
DO CASE
    CASE nSpaceCount = 0
        ** A single last name, if that
        gcLast = lcThisName
    CASE nSpaceCount = 1
        ** Just a middle and last name
        nSpacePos = AT(' ',lcThisName)
        gcMiddle = LEFT(lcThisName,nSpacePos-1)
        gcLast = SUBSTR(lcThisName,nSpacePos+1)
    OTHERWISE
        ** Presume multiple middle names
        nLastSpacePos = AT(' ',lcThisName,nSpaceCount)
        gcMiddle = ALLTRIM(LEFT(lcThisName,nLastSpacePos))
        gcLast = ALLTRIM(SUBSTR(lcThisName,nLastSpacePos))
ENDCASE

** If the value passed in was in all upper case, convert any prefix or suffix values to upper case.
IF AT(UPPER(gcFirst),tcFullName) # 0
    gcPrefix = UPPER(gcPrefix)
    gcSuffix = UPPER(gcSuffix)
ENDIF
 