Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Split combined Name column 5

Status
Not open for further replies.

sparkbyte

Technical User
Sep 20, 2002
879
US
I need to split a name field.

Contacts
ContactID ContactName ContactPhone ContactFax ContactCellPhone ContactEmail OfficeCode Program
5041 Smith, Jane M 866-555-1212 866-555-1313 ADM USR
7389 John.Jones@somewhere.com 866-555-2345 866-555-5678 ITS PGP


basicaly something like split [Smith, Jane M] into
lastname = Smith ; firstname = Jane M
or if
Name like *@* then split Jane.smith@somewhere.com
into ContactLastName = Jones ; ContactFirstName = John
AND
ContactEmail = John.Jones@somewhere.com

Can someone assist??

I can do this quite easily in VBS but just need a start on how to do this in SQL.

Code:
Declare @Temp Table(ContactID  int, ContactName VarChar(50))
Insert Into @Temp Values(5041,'Alfred, Leslie M')
Insert Into @Temp Values(7389,'Jane.smith@somewhere.com')

Select  ContactId,
        ContactName,
        Case When ContactName Like '%@%' 
                  Then Left(ContactName, CharIndex('.', ContactName)-1)
             When ContactName Like '%,%' Then Left(ContactName, CharIndex(',', ContactName)-1)
             End As FirstName,
        Case When ContactName Like '%@%'
                  Then SubString(ContactName, CharIndex('.', ContactName)+1, CharIndex('@', ContactName)-CharIndex('.', ContactName)-1)
             When ContactName Like '%,%'
                  Then Right(ContactName,Len(ContactName)-CharIndex(',', ContactName)-1)
             End As LastName,
        Case When ContactName Like '%@%' Then ContactName End As EmailAddress
From    @Temp

This worked Great.

Then found email addresses without periods.

Fix

Code:
Select  
        ContactName,
        Case When ContactName Like '%@%' and ContactName Like '%.%'
                  Then Left(ContactName, CharIndex('.', ContactName)-1)
             When ContactName Like '%,%' Then Left(ContactName, CharIndex(',', ContactName)-1)
             End As ContactFirstName,

        Case When CharIndex('@', ContactName) > CharIndex('.', ContactName)
                  Then SubString(ContactName, CharIndex('.', ContactName)+1, CharIndex('@', ContactName)-CharIndex('.', ContactName)-1)
             When ContactName Like '%,%'
                  Then Right(ContactName,Len(ContactName)-CharIndex(',', ContactName)-1)
             End As ContactLastName,

        Case When ContactName Like '%@%' Then ContactName End As ContactEmail

from dbo.Contacts

Then needed a way to update the table with the split data.


suggested updating one column at a time as so.

Code:
--  Extract eMaiil Addresses from ContactName
Update  dbo.Contacts
Set     ContactEmail = Case When ContactName Like '%@%' Then ContactName End
                       Where   ContactEmail is NULL or ContactEmail = '






Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Now I seem to missing how to properly convert the ContactFirstName and ContactLastName case statements into an Update.

Code:
-- Split ContactName and get the ContactFirstName
Update  dbo.Contacts
	Set ContactFirstName =	Case When ContactName Like '%@%' and ContactName Like '%.%'
							Then Left(ContactName, CharIndex('.', ContactName)-1) 
							When ContactName Like '%,%' Then Left(ContactName, CharIndex(',', ContactName)-1) 
							End

-- Split ContactName and get the ContactLastName
Update  dbo.Contacts
	Set ContactLastName =	Case When CharIndex('@', ContactName) > CharIndex('.', ContactName)
								Then SubString(ContactName, CharIndex('.', ContactName)+1, 
								CharIndex('@', ContactName)-CharIndex('.', ContactName)-1)
							When ContactName Like '%,%'
								Then Right(ContactName,Len(ContactName)-CharIndex(',', ContactName)-1)
							End

Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Seems ok to me. Are you getting an error, or is it just not updating anything?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
1st - Give you back your star...


I get this..

Code:
Command(s) completed successfully.


[red]Msg 8152, Level 16, State 6, Line 2
String or binary data would be truncated.[/red]
The statement has been terminated.

(0 row(s) affected)



Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Over looking the obvious.

needed to verify the column width.



Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
How can I pull a middle initial when I have data like this?

John J (287G)
Douglas (287G)
John Theodore "Teddy"
John
Janie NMN


This gets close.
Code:
Select  ContactName, ContactFirstName,
        substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
		len(ContactFirstName)) as ContactMiddleName
from dbo.Contacts
where ContactFirstName <> substring(ContactFirstName,
			  charindex(' ',ContactFirstName)+1,
			  len(ContactFirstName))
and	substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
	len(ContactFirstName))	Not Like '(%)'  
and	substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
	len(ContactFirstName))	Not Like 'sr'
and	substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
	len(ContactFirstName))	Not Like 'jr'
and	substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
	len(ContactFirstName))	Not Like 'iii'
and	substring(ContactFirstName,charindex(' ',ContactFirstName)+1,
	len(ContactFirstName))	Not Like 'waiver'

but is still have entries like this.
Code:
ContactFirstName	ContactMiddleName
Devetta E (287G)	E (287G)
John W.			W.

How can I strip the entries like (287G) and the trailing period from the ContactMiddleName?


Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Time to learn about PatIndex.

PatIndex is similar to charindex in some ways, but different in others.

Both functions will return the position of the search term within a string.

The differences...

CharIndex has an optional 3 parameter to identify where (within the string) to start looking for a match.

PatIndex allows you to perform a 'fuzzy' search where rudimentary pattern matching is used. This pattern matching is similar to the LIKE clause. In fact, if you want to learn what type of matching is possible, look up LIKE in books on line.

PatIndex is useful here because it allows us to find the first character in the string that is not alphabetic (a through z). We can use this to our advantage by returning data to the left of the first non-alphabetic character.

Ex:

Select PatIndex('%[^a-z]%', 'E (287G)')

The query above returns 2 because it is the 2nd character in 'E (287G)' that is not alphabetic. Now, combining this with the LEFT function...

[tt][blue]
Select [!]Left('E (287G)', [/!]PatIndex('%[^a-z]%', 'E (287G)')[!])[/!]
[/blue][/tt]
The query above actually returns E<space>. To remove the space, we need to subtract 1.

[tt][blue]
Select Left('E (287G)', PatIndex('%[^a-z]%', 'E (287G)')[!]-1[/!])
[/blue][/tt]

Looks pretty cool, huh? Unfortunately there is a problem with this code, too. If the string does not contain any non-alphabetic characters, PatIndex returns 0.

[tt][blue]
Select PatIndex('%[^a-z]%', 'John')
[/blue][/tt]

Now, when we use it with the LEFT function...

[tt][blue]
Select [!]Left('John', [/!]PatIndex('%[^a-z]%', 'John')[!]-1)[/!]
[/blue][/tt]

We get an error because PatIndex evaluates to 0 and you end up with... Left(String, 0-1). We can fix this problem by forcing the PatIndex to "find" something. Like this...

[tt][blue]
Select Left('John', PatIndex('%[^a-z]%', 'John' [!]+ '.'[/!])-1)
[/blue][/tt]

So... putting it all together...

Code:
Declare @Temp Table(ContactMiddleName VarChar(20))

Insert Into @Temp Values('E (287G)')
Insert Into @Temp Values('W.')
Insert Into @Temp Values('John')
Insert Into @Temp Values(' George')


Select Left(ContactMiddleName, PatIndex('%[^a-z]%', ContactMiddleName + '.')-1)
From   @Temp

I encourage you to copy/paste the code above in to a query window. There is still a minor problem with it. See where I added a row for [tt]' George'[/tt] Since the data starts with a non-alpha character, the code above actually returns an empty string. Hopefully your data doesn't contain anything like that. But if it does....


I wrote the blog above to extract numbers from a string, even if the numbers are buried in the middle of it. Your task is the opposite, but if you understand the principles of how this works, you should be able to modify the code to suit your needs.


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
How about this.

Use left and right trim to remove any spaces pre or aft.

Code:
Declare @Temp Table(ContactMiddleName VarChar(20))

Insert Into @Temp Values('E (287G)')
Insert Into @Temp Values('W.')
Insert Into @Temp Values('John')
Insert Into @Temp Values(' George')
--Update @Temp 
--set ContactMiddleName = dbo.trim(ContactMiddleName)

Select Left(LTRIM(RTRIM(ContactMiddleName)), PatIndex('%[^a-z]%', LTRIM(RTRIM(ContactMiddleName)) + '.')-1)
From   @Temp



Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Well sure... spaces are easy. And, actually, the RTrims are not necessary.

Insert Into @Temp Values('/George')

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Seems to work for everything now. I will have to test it against the dev cantacts table now..

Code:
Declare @Temp Table(ContactMiddleName VarChar(20))

Insert Into @Temp Values('E (287G)')
Insert Into @Temp Values('W.')
Insert Into @Temp Values('John')
Insert Into @Temp Values(' George')
Insert Into @Temp Values('Theadore"Ted"')
Insert Into @Temp Values('Theadore "Ted"')
Insert Into @Temp Values('John ')

Select Left(LTRIM(RTRIM(ContactMiddleName)), PatIndex('%[^a-z]%', LTRIM(RTRIM(ContactMiddleName)) + '.')-1)
From   @Temp

Code:
E
W
John
George
Theadore
Theadore
John


Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Figures?!?

Found ContactName

John.Q.Public@domain.com

so 1st Name was entered as Q. and LastName as John

Bummer.

Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
I am not offering help here but I wanted to LOL at how hard we all have to work, and how creative we have to be in querying/reporting because of poor data submission standards....


-- Jason
"It's Just Ones and Zeros
 
@jdemmi,

I completely agree. Often times the reality is... we have no choice in the matter. Sometimes this data comes from a 3rd party that we have no control over, and sometimes we inherit crappy databases. It sounds like sparkbyte is attempting to correct this problem. Hopefully he will also be able to "fix" the source of the problem too.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
gmmastros...Oh yes...that's exactly what made me LOL...I support a couple of applications who have some great business logic and super robust functionality, but then drop the ball at perhaps the most inopportune time....data entry.

Thank goodness for forums/sites like this one!

-- Jason
"It's Just Ones and Zeros
 
George, this is EXACTLY what has happened!!!!

Inherited really awfull Acess Database that has been outgrown. I am attempting to upsize it to SQL.

The darn thing has EVERYTHING in 2 tables.

Activity and Contacts.

I have to get the data workable then normalize the tables as much as I can.

I have 'found' another puzzle.

I would like to fix the capitalization issues also.

I found a function that does a good job of correcting most of the data entry laziness.

ie. JOHN Q PUBLIC, John Q PUBLIC, and many other variations.

Problem is this. (Leave it to the women out there to trip me up) – No offence intended!!

The script caps the first letter and after a space. This works for everything except…
JOHANSON-SMITH
After the script it becomes, Johanson-smith when it should be Johanson-Smith

Here is the Code for the Function

Code:
--**************************************
--     
-- Name: Capitalize First Letter
-- Description:This is a User Defined Fu
--     nction (UDF) that will capitalize the fi
--     rst letter of a character after a space.
--      It will also lowercase all capitalized 
--     words. This will only capitalize the fir
--     st letter after the any space. Example: 
--     'A CAT RAN DOWN' --> 'A Cat Ran Down'
--      or 'a cat ran down' --> 'A Cat Ran D
--     own'. Usage Select CapFirst('String')
-- By: Adonis Villanueva
--
-- Inputs:String input
--
-- Returns:Formatted String
--
--This code is copyrighted and has
-- limited warranties.Please see http://
--     [URL unfurl="true"]www.Planet-Source-Code.com/vb/scripts/Sh[/URL]
--     owCode.asp?txtCodeId=663&lngWId=5
--for details.
--**************************************
--     

CREATE function dbo.CapFirst (@String varchar(255))
returns varchar(255)
as


    BEGIN --begin function PROCEDURE
    DECLARE @StringCount int
    SET @string = lower(@string)
    SET @string = stuff(@string,1,1,left(upper(@string),1)) --Capitalize the first letter 
    SET @StringCount = 0
    WHILE @StringCount < len(@string)


        BEGIN --begin WHILE
         IF substring(@string,charindex(space(1),@string,@StringCount),1) = space(1)


             BEGIN --begin IF	
            SET @string = stuff(@string,charindex(space(1),@string,@StringCount)+1,1,substring(upper(@string),charindex(' ',@string,@StringCount)+1,1)) 
         END --end IF

        SET @StringCount = @StringCount + 1
    END --end WHILE

    RETURN @string --return the formatted string
END --end function PROCEDURE

Any suggestions to update this so that it can handle sir names with dashes??


Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
OK George!!

Insert Into @Temp Values('/George')

I'll have to ponder on the complete bunk entry possibility.... [medal]

Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
I have a user defined function that does a really good job of returning a proper case. I have this at the office, and I am currently sitting in a restaurant waiting for steak fajitas. If nobody else posts one b tommorow, I will post mine.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Mark cool functions for SQL 2005 string manipulations.

I will certainly make use of them.

George I would love to see the string UDF you use.

Thanks

John Fuhrman
faq329-6766
faq329-7301
thread329-1334328
thread329-1424438
 
Well... I used to have a function that I got from the internet somewhere, but then a friend told me that the version from Brad Schulz was "probably" faster. Not to be outdone, I wrote my own. [bigsmile]

Code:
Create Function dbo.Proper(@Data VarChar(8000))
Returns VarChar(8000)
As
Begin
  Declare @Position Int

  Select @Data = Stuff(Lower(@Data), 1, 1, Upper(Left(@Data, 1))),
         @Position = PatIndex('%[^a-zA-Z][a-z]%', @Data COLLATE Latin1_General_Bin)

  While @Position > 0
    Select @Data = Stuff(@Data, @Position, 2, Upper(SubString(@Data, @Position, 2))),
           @Position = PatIndex('%[^a-zA-Z][a-z]%', @Data COLLATE Latin1_General_Bin)

  Return Coalesce(@Data, '')
End

This code looks pretty ugly, and it may be difficult to understand. I'm going to write a blog about this function sometime within the next day or two. In the blog I will explain how this works. After I write the blog, I will post a link here.

To use this function, copy/paste the code above in to a query window and run it. This will create the function in your database.

To use the function....

[tt][blue]
Select dbo.Proper(YourColumnNameHere) From YourTableNameHere
[/blue][/tt]

To update data...

[tt][blue]
Update YourTableName
Set YourColumnName = dbo.Proper(YourColumnName)
[/blue][/tt]

While I was at it, I tested performance. I have a database that contains map data for the entire united states. Actually, there are 2 copies of the map data, one from TIGER and another from a "Not-Free" source. I tested this against the parks table, which has 239,487 rows. There is a name column in this table. This column is varchar(100).

I created a properize function copied directly from Brad Shulz's site. I copied the function that you supplied showing the author as Adonis Villanueva. I then ran this test...

Code:
Declare @Trash Char(1)
Declare @Start DateTime

Set @Start = GetDate()
Select @Trash =dbo.Properize(Name, 1) From parks
Select 'Brad Shulz', DateDiff(Millisecond, @Start, GetDate())

Set @Start = GetDate()
Select @Trash =dbo.CapFirst(Name) From parks
Select 'Adonis Villanueva', DateDiff(Millisecond, @Start, GetDate())

Set @Start = GetDate()
Select @Trash =dbo.Proper(Name) From parks
Select 'George Mastros', DateDiff(Millisecond, @Start, GetDate())

The results were...

[tt][blue]
Author Time
----------------- -----
Brad Shulz 21573
Adonis Villanueva 7270
George Mastros 1796
[/blue][/tt]

Note that the values shown are times in milliseconds.


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top