Using RegExp to Remove Extra Quotes

Status
Not open for further replies.

BuckBMD

MIS
Mar 15, 2001
13
US
I need some assistance in constructing a RegExp pattern for removing/changing stray quotes within a text file.

My script reads a line from a comma-delimited text file (with quotes as text qualifiers). I then want to construct a RegExp pattern to remove or change any quote that is not a qualifier, i.e. one that is not at the beginning of the line, at the end of the line, followed by a comma, or preceded by a comma.

The resulting modified string is then written to another text file, leaving the original text file intact.
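
For context, the read-fix-write loop described above might look roughly like the sketch below. The file paths and the `FixLine` function name are hypothetical placeholders, not taken from the actual script; `FixLine` simply returns its input until a real pattern is plugged in.

```vbscript
Option Explicit

Const ForReading = 1

Dim fso, inFile, outFile, strLine
Set fso = CreateObject("Scripting.FileSystemObject")

' Hypothetical paths: substitute the real source and destination files.
Set inFile = fso.OpenTextFile("C:\data\input.csv", ForReading)
Set outFile = fso.CreateTextFile("C:\data\output.csv", True) ' True = overwrite

Do Until inFile.AtEndOfStream
    strLine = inFile.ReadLine
    ' The original file is only read, never modified.
    outFile.WriteLine FixLine(strLine)
Loop

inFile.Close
outFile.Close

' Hypothetical placeholder: the quote-cleaning logic would go here.
Function FixLine(ByVal s)
    FixLine = s
End Function
```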

It all works except for the appropriate 'objRegExp.Pattern'.

In essence I want to leave all of the text qualifiers, and remove the other stray quotes.

Any help would be greatly appreciated.

Thank you.
 
I neglected to mention that I'm using VBScript.

Thanks!
 
I don't know that you could get a pattern to be 100% effective. Consider this potential line:

"Last, "First Name"", Second Field, "Third, and hopefully "last", field"

I think it would take some really, and I mean really, tricky parsing to turn that line into:

"Last, First Name", Second Field, "Third, and hopefully last, field"

 
The major problem, as I see it, is that the current VBScript RegExp engine does not support lookbehind, although it handles lookahead well.
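
To illustrate the point, here is a small sketch (the sample string is made up) that uses a negative lookahead to drop every quote that is not followed by optional whitespace and then a comma or the end of the line. It keeps the closing qualifiers, but with no lookbehind to check what precedes a quote, it strips the opening qualifiers too.

```vbscript
Option Explicit

Dim re, s
Set re = New RegExp
re.Global = True

' Sample line (made up for illustration): "abc","d"ef"
s = """abc"",""d""ef"""

' Match a quote NOT followed by optional whitespace plus a comma or end of line.
re.Pattern = """(?!\s*(?:,|$))"

' Closing qualifiers survive; opening qualifiers and the stray quote are removed.
WScript.Echo re.Replace(s, "")   ' echoes: abc",def"
```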

I would work around this with multiple match-and-replace passes.
[tt]
set re=new regexp
re.global=true

's is the line read from the file; a sample is used below
s="""""abc"",""def"",""g""hi"", ""jkl"",""mno"" ,""pqr"""""

[blue]'the placeholder symbols chosen here are % and #
'you may have to choose other symbols that are unlikely to occur in s[/blue]
re.pattern="^""|""$"
t=re.replace(s,"%")
re.pattern=",\s*"""
u=re.replace(t,",#")
re.pattern="""\s*,"
v=re.replace(u,"#,")
re.pattern=""""
w=re.replace(v,"")
re.pattern="^%|%$|#"
x=re.replace(w,"""")
'x is the desired string to be written to the new file
'in practice you could reuse s throughout to save memory; t, u, v, w, x are used here for clarity and checking
wscript.echo s & vbcrlf & x
[/tt]
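
If it helps, the same passes could be wrapped in a function so they can be called once per line. This is only a sketch: the name `CleanQuotes` is hypothetical, and it swaps the % and # placeholders for a single control character, Chr(1), on the assumption that this character never occurs in the data.

```vbscript
Option Explicit

' Hypothetical helper wrapping the multi-pass approach.
' Assumption: Chr(1) never appears in the input line.
Function CleanQuotes(ByVal s)
    Dim re, marker
    marker = Chr(1)
    Set re = New RegExp
    re.Global = True

    ' Pass 1: protect the quotes at the start and end of the line.
    re.Pattern = "^""|""$"
    s = re.Replace(s, marker)
    ' Pass 2: protect a quote that opens a field (comma, optional space, quote).
    re.Pattern = ",\s*"""
    s = re.Replace(s, "," & marker)
    ' Pass 3: protect a quote that closes a field (quote, optional space, comma).
    re.Pattern = """\s*,"
    s = re.Replace(s, marker & ",")
    ' Pass 4: whatever quotes remain are stray, so remove them.
    re.Pattern = """"
    s = re.Replace(s, "")
    ' Pass 5: turn the protected positions back into quotes.
    re.Pattern = marker
    CleanQuotes = re.Replace(s, """")
End Function

' Sample input: "g"hi", "jkl"
WScript.Echo CleanQuotes("""g""hi"", ""jkl""")   ' echoes: "ghi","jkl"
```

As with the original passes, whitespace between a comma and a qualifier is dropped along the way.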
 
Thanks to both 'TomThumbKP' and 'tsuji' for your help. I was thinking about using multiple passes to accomplish this. That method works great when cleaning address lines, etc.

I certainly appreciate the efforts on both your parts.

Thanks again.
 
Another way:
Function StripQuotes(buf)
    Dim x, i, j
    x = Split(buf, ",")
    For i = 0 To UBound(x)
        x(i) = Trim(x(i))
        For j = Len(x(i)) - 1 To 2 Step -1
            If Mid(x(i), j, 1) = Chr(34) Then
                If Mid(x(i), j - 1, 1) <> "," And Mid(x(i), j + 1, 1) <> "," Then
                    x(i) = Left(x(i), j - 1) & Mid(x(i), j + 1)
                End If
            End If
        Next
    Next
    StripQuotes = Join(x, ",")
End Function

Hope This Helps, PH.
 
Or even

Function StripQuotes(strSource, Optional vbQuoteMarker = """")
    Set re = New RegExp
    re.Global = True

    re.Pattern = "(" & vbQuoteMarker & ")" & "\1*"
    StripQuotes = re.Replace(strSource, vbQuoteMarker)
    re.Pattern = "(\w)" & vbQuoteMarker & "(\w)"
    StripQuotes = re.Replace(StripQuotes, "$1$2")
End Function
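
One caveat for VBScript users: the `Optional` keyword in the function above is VB6/VBA syntax, and VBScript does not support optional parameters. A possible VBScript adaptation is sketched below; the name `StripQuotesVbs` is hypothetical, the marker is passed explicitly, and it is assumed to be a plain character with no special meaning in a regular expression or replacement string.

```vbscript
Option Explicit

' Hypothetical VBScript adaptation: the marker must always be supplied.
' Assumption: quoteMarker is not a regex metacharacter.
Function StripQuotesVbs(ByVal strSource, ByVal quoteMarker)
    Dim re
    Set re = New RegExp
    re.Global = True

    ' Collapse each run of consecutive markers into a single marker.
    re.Pattern = "(" & quoteMarker & ")\1*"
    strSource = re.Replace(strSource, quoteMarker)
    ' Drop a marker sandwiched directly between two word characters.
    re.Pattern = "(\w)" & quoteMarker & "(\w)"
    StripQuotesVbs = re.Replace(strSource, "$1$2")
End Function

' Sample input: ""abc","g"hi""
WScript.Echo StripQuotesVbs("""""abc"",""g""hi""""", """")   ' echoes: "abc","ghi"
```

Note that this approach only removes a stray quote when it has word characters on both sides, so a quote next to a space inside a field would be left in place.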
 