Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Field Un-Concatenate in MS Access (SQL)

Status
Not open for further replies.

NadCan

Programmer
Joined
Feb 19, 2007
Messages
5
Location
US
Hi Guys,

I need your help.

I have a table (in MS Access) with a couple of hundred thousand records in a field 'Address' as below:

Address
----------------------------
111 - 2 AVE RD LETHBRIDGE AB
123 WILSON DR WHITEHORSE YT
333 SIMPSON AVE MOOSE JAW SK


I need to split this field as follow:

Address | |City |Province
---------------------------------------------------------

111 - 2 AVE LETHBRIDGE AB
123 WILSON DR WHITEHORSE YT
333 SIMPSON AVE MOOSE JAW SK


Can anyone please help ?

Many thanks,
Nadcan.
 
OUCH!!!!

I'd suggest the split function, but you don't have any consistant delimeters there.

For starters though, it looks like you can always grab the last 2 characters, so there's a column.

As for the balance, shove it into excel, sort by the first column, and manually scrub the data. I've used the text to column function (set width, bot delimited) before on similiar projects. Then, kust make another pass and use the trim function to remove leading and trailing spaces.


Tyrone Lumley
SoCalAccessPro
 
I used too work at a place that gathered addresses from websites all over the world and then tried to parse out the individual parts, just like what you want to do here.

I'm afraid to say that the algorithms can get incredibly complex, and there's not one that will work for 100% of the records.

The one approach I would suggest is to try to figure out the easiest things first. If your sample data holds true for all records, the one thing that seems unambigious is the province. I would strip that out first, then probably work backwards.

One thing we used to do is we had a table that listed every possible word or abbreviation for "street", examples:

St.
Ave.
Avenue
Boulevard
Blvd.
Crescent
Crs.

If we found any of those, we assumed everything before and up to that was the street address.

Another thing would be to have a table of Cities. If you find a match for one of the cities in the string, you can fairly reasonably assume that that part of the string is the city (and it is likely that everything before it is the street address).

The thing to remember is that you will not be able to write anything that gets it right 100% of the time. You will need human eyes to review and correct what your automated process churns out.

 
How are ya NadCan . . .

There's no way you can accomplish this without code, and with 200k records plus, you'll need as much speed as you can get. Looking to query for speed and using a common function for address & city, I present the following:
[ol][li]In a module in the modules window, copy/paste the following function:
Code:
[blue]Public Function AdrPart(Dat As String, flg As Boolean)
   [green]'flg = True = return Address
   'flg = False = return City[/green]
   Dim Ary, x As Integer, y As Integer, Detect As String, Pack As String
   
   Ary = Split(Dat, " ")
   Detect = "StRdDrAvePlace"
   
   For x = UBound(Ary) - 1 To 0 Step -1
      If InStr(1, Detect, Ary(x)) > 0 Then
         If flg Then [green]'Address[/green]
            For y = 0 To x
               Pack = Pack & Ary(y) & " "
            Next
            
            Exit For
         Else [green]'City[/green]
            For y = x + 1 To UBound(Ary) - 1
               Pack = Pack & Ary(y) & " "
            Next
            
            Exit For
         End If
      End If
   Next
   
   AdrPart = Pack
         
End Function[/blue]
[/li]
[li]In a query based on the table, add the following three custom fields:
Code:
[blue]Addr:AdrPart([Address],True)
City:AdrPart([Address],False)
Province:Right([Address],2)[/blue]
[/li]
[li]Peform your testing![/li][/ol]
Be aware: functions have a tendency to slow queries down, and this is certainly not the fastest function I've ever made. However it is as fast as I could make it.

With 200k records [purple]I'd like you to post back just how much time it takes for the query to complete complete.[/purple]

As a short insight into how it works, I split the address into an variable array, using a space [blue]" "[/blue] as the delimiter. This would make the address [blue]333 SIMPSON AVE MOOSE JAW SK[/blue] look like

[tt][blue] Index Data
***** *******

0 333
1 SIMPSON
[purple]> 2 AVE[/purple]
3 MOOSE
4 JAW
5 SK[/blue][/tt]

Addresses usually end in [blue]St, Rd, Dr, Ave, Place, and others[/blue]. I ping against these to determine their index in the Ary. Form there is pretty straight forward to get address & city.

In the code [blue]notice the Detect string[/blue]. You may have to add others to aquire a full complement. For instance I added [blue]Place[/blue].

Anyway, I hope this can get you started.

[blue]Your Thoughts? . . .[/blue]

Calvin.gif
See Ya! . . . . . .

Be sure to see thread181-473997
Also faq181-2886
 
Howdy JoeAtWork . . .

I sure hope [blue]NadCan[/blue] is not getting addresses from all over the world! [ponder] . . . Yikes!

Calvin.gif
See Ya! . . . . . .

Be sure to see thread181-473997
Also faq181-2886
 
Actually, I think at the time we were concentrating on North America.

 
JoeAtWork is correct. This is not as simple as some people think. I work in the mailing industry and even the canned software we use to hygiene the data is not always 100% although the data coming in is far from 100% as well.

You may want to look into some pre-built software that you can use if you need an immediate solution. I would suggest looking at Melissa Data's products. This product works directly with a good amount of database formats.


They used to sell API's when they were under another company called PeopleSmith Software so I do not know for sure if they still sell the API's.

Swi
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top