Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Regex question

Status
Not open for further replies.

Gooser

Technical User
Joined
Jun 28, 2006
Messages
803
Location
US
HELP! I am new to regex and have a question that I need a quick answer to.

I've got a table with a varchar(512) /*NOT designed by me*/ that holds the number part of an address, (i.e. the '200' of '200 1st Ave') I need to select from that column only the values that ARE NOT numbers to find all the rows that I have to (somehow) fix.

and is it Regg-X or Reej-ects?

Thanks,

v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Something like this...

Code:
Select	*
From	Table
Where	Left(FieldName, CharIndex(' ', FieldName)) Like '%[0-9]%'

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
RiverGuy

IsNumeric doesn't work real well for address house numbers.

The problem is that IsNumeric supports scientific notation with e and d, so 10e3 is a valid number and so is 5d2.

Code:
Select IsNumeric('10e3'), IsNumeric('5d2')

Using the like search is better because it won't return those 2 values.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
George, I don't get it. The results of what you gave me are super-duper weird. One way gives me all the ones that have '100/150', '200 & 202', etc. But, if I add a NOT to it, I get all the real address numbers, but I also get the offensive rows that have 'Stevens Pass Summit' in them, too many rows to be sure these are the only ones by scanning the result set.

RiverGuy, what you gave me didn't work either, I think, unless I did it wrong.

I entered

Code:
select isnumeric(column_name) from table_name

Is that correct usage? and Shouldn't that have a NOT as well?

(Frustrated by a poorly designed database.)


v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Gooser,

I suppose I'm not really sure which records you are trying to get/filter out. Can you post about a dozen records and indicate which ones you want returned?

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
I'm not really developing here, just trying to get a query in QA to show me what rows have bad data (not integers). The guy who 'designed' the database decided just about every damn column should be varchar(512) so I have rows that have things like 'Stevens Pass Summit' for the number part of the address. When, realistically, it should only be numbers in there.

I want a list of bad rows.

Thanks,


v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Try this...

Code:
Select *
From   Table
Where  Field Like '%[^0-9]%'

It should return all records that has any character, anywhere in it, that is not 0 through 9.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
George,

Much closer, but I am still getting a bunch of numerics,

701
1506
2401
800
7114

to name just a few.

Any idea why?

--this just illustrates and underscores the NEED for data constraints

Thanks,

v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Gooser,

I just did a little testing on a table I have here. I think I should modify my advice a little because you could possibly get more data than you were expecting. In my database, I got house numbers that had a leading and/or trailing space even though the house number itself was numeric. I also got records that didn't have any house number at all. By modifying the query, I was able to return just the addresses that had a non-numeric house number.

Code:
Select HouseNumber 
From   Address 
Where  [!]LTrim(RTrim(HouseNumber)) like '%[^0-9]%'[/!]

In my database, I ended up with...
[tt][blue]
HouseNumber
-----------
74 A
39A
4A
#34
#34
116 1/2
518 1/2
644 1/2

(8 row(s) affected)
[/blue][/tt]



-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Sorry George,

My dumb-a__ wasn't paying attention to the other column!

Code:
Select contact_mechanism_type_item_cd_value
, contact_mechanism_type_nm_value
, from_primary_number
, to_primary_number
, street_nm
, city 
, state
, from_secondary_number
, to_secondary_number
From   situs_address
Where  from_primary_number NOT Like '%[0-9]%'
OR to_primary_number NOT Like '%[0-9]%'
ORDER BY from_primary_number desc

--get a load of these column names!  Sweet, right?

A lot of our addresses are a range, like for an apartment building, since we are concerned with the WHOLE building. I kept looking at from_primary_number, the problem was in to_primary_number for many of the rows. Rather than the second number in a range, as it is supposed to hold. People put the apartment letter, range of letters 'A-D', set of letters 'B C D', pair of letters 'A&B', (or some other value as in the rows that have: '.', '&10100', '8300 Rear', etc.--gotta love the user.)

Thanks a TON for your patience and help on this and all my other problems lately.

v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
George-

After further trial-and-error, I find that as written here, it returns 495 rows, while if I swap out the commented WHERE for the existing WHERE, I get 522

Code:
SELECT contact_mechanism_type_item_cd_value
, contact_mechanism_type_nm_value
, from_primary_number
, to_primary_number
, street_nm
, city 
, state
, from_secondary_number
, to_secondary_number
FROM   situs_address_w_item_cd_value_v
--WHERE  from_primary_number LIKE '%[^0-9]%'
WHERE LTrim(RTrim(from_primary_number)) like '%[^0-9]%'
--OR to_primary_number LIKE '%[^0-9]%'
OR LTrim(RTrim(to_primary_number)) like '%[^0-9]%'
ORDER BY from_primary_number DESC

<insert spaces here>Gotta love the user!<or here>

Thanks again!

v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Okay, now I am trying to add this to a WHERE clause of a stored procedure. (a test stored procedure) I want to see if the rest of my stuff would work if the data was clean. I keep running into problems with the special characters and I can't get the regex sorted out. I've been reading a Regular Expressions Tutorial, but it's just not clicking.

Here's what I've got:
Code:
WHERE  LTrim(RTrim(from_primary_number)) like '%[0-9^A-ZE/E-]%' ESCAPE 'E'
   AND LTrim(RTrim(to_primary_number))   like '%[0-9^A-ZE/E-]%' ESCAPE 'E'

I also tried:
Code:
WHERE  LTrim(RTrim(from_primary_number)) like '%[0-9^A-Z^E/a^E-]%' ESCAPE 'E'
   AND LTrim(RTrim(to_primary_number))   like '%[0-9^A-Z^E/a^E-]%' ESCAPE 'E'

But both of them give me bad results:

Code:
Msg 245, Level 16, State 1, Procedure tspAddressSelect, Line 112
Syntax error converting the varchar value '2205 1/2' to a column of data type int.

Line 112 is the line AFTER the to_primary_number column in the SELECT statement.

I had this:

Code:
WHERE  LTrim(RTrim(from_primary_number)) like '%[0-9]%'
   AND LTrim(RTrim(to_primary_number))   like '%[0-9]%'

AND from_primary_number NOT like '%[A-Z]%'
AND   to_primary_number NOT like '%[A-Z]%'

AND from_primary_number NOT like '%[E/]%' ESCAPE 'E'
AND   to_primary_number NOT like '%[E/]%' ESCAPE 'E'

AND from_primary_number NOT like '%[E-]%' ESCAPE 'E'
AND   to_primary_number NOT like '%[E-]%' ESCAPE 'E'

It worked but took 3:21 first run and 3:19 second run to go through about 52,000 rows. That is way too slow.

So, am I doing something way-wrong here, or what?

Thanks,

v/r

Gooser

Why do today
that which may not need to be done tomorrow [ponder] --me
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top