Counting Words in a String

Status
Not open for further replies.

KirtSawaya

Technical User
Mar 7, 2005
7
US
I'm trying to count the number of words in a string. Note: A word is a sequence of characters separated by a space.

I know it will not give me the exact number of words, but those are the guidelines I have to follow. Here is the code I use to total the words in a given string. For some reason, the program freezes when I run it. Could somebody please tell me what I'm doing wrong?

'Declares string to count words, sentences, and paragraphs
Dim strCounter As String

'checks to see if there is a space in the string
If mTextString.IndexOf(" ") <> -1 Then
  strCounter = mTextString.Substring(mTextString.IndexOf(" "))
  numWords += 1
End If

'Counts the total number of spaces in the string
While strCounter.Length > 0
  If strCounter.IndexOf(".") <> -1 Then
    strCounter = strCounter.Substring(strCounter.IndexOf(" "))
    numWords += 1
  End If
End While

Thank you for your help,
Kirt
 
VBakias,

Thanks for pointing out that typo, but the program still doesn't work.

Kirt
 
Do not create new strings. You will have objects all over the place.
Try this:
Dim l As Integer = strCounter.Length
Dim i As Integer = strCounter.IndexOf(" "c)
While (i > 0 AndAlso i < l)
  numWords += 1
  i += 1
  If i < l Then
    i = strCounter.IndexOf(" "c, i)
  End If
End While

' Each consecutive blank counts as a word
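
A minimal sketch of the same index-scanning idea as a complete function, assuming the word rule from the original question (words are separated by single spaces). The name CountWordsBySpaces is hypothetical. Passing a start index to IndexOf keeps the search moving forward, so the loop cannot stall. The Substring version in the question stalls because Substring(IndexOf(" ")) returns a string that still begins with that space, so the next IndexOf(" ") is 0 and the string never gets shorter.

```vbnet
Imports System

Module WordCountSketch
    ' Hypothetical helper: counts the spaces in text and adds one,
    ' so the final word is included. Assumes single spaces between words.
    Function CountWordsBySpaces(ByVal text As String) As Integer
        If String.IsNullOrEmpty(text) Then Return 0

        Dim count As Integer = 1
        Dim i As Integer = text.IndexOf(" "c)
        ' Compare against -1 rather than 0 so a space at index 0 is still seen.
        While i <> -1
            count += 1
            ' Restart the search one character past the space just found.
            ' i + 1 may equal text.Length; IndexOf accepts that and returns -1.
            i = text.IndexOf(" "c, i + 1)
        End While
        Return count
    End Function
End Module
```

Like the loop above, this treats every space as a word boundary, so leading, trailing, or repeated spaces inflate the count.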

 
Maybe .Length instead of .Count

strCounter.split(" ").Length

Easier but slower because it will create a separate string for each word. All those strings have to be garbage collected.
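
A hedged sketch of the Split approach, assuming .NET Framework 2.0 or later, where String.Split has an overload that takes StringSplitOptions. The name CountWordsBySplit is hypothetical. Dropping empty entries means leading, trailing, and repeated spaces do not produce phantom words, although the allocation cost mentioned above still applies.

```vbnet
Imports System

Module SplitCountSketch
    ' Hypothetical helper: splits on spaces and discards empty entries.
    Function CountWordsBySplit(ByVal text As String) As Integer
        If text Is Nothing Then Return 0
        Dim separators As Char() = {" "c}
        ' RemoveEmptyEntries skips the empty strings that consecutive,
        ' leading, or trailing spaces would otherwise produce.
        Return text.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length
    End Function
End Module
```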

 
JohnYingling and TheRickGuy,

Thank you guys for your help. I always tend to make things harder than they need to be.

Kirt
 
A few points come to mind:

Both JohnYingling's and Rick's solutions -
Leading / Trailing space(s) would cause incorrect values to be returned.
Multiple spaces between words would cause incorrect values to be returned.
What if no space(s) followed a punctuation mark?

Whichever solution you use, I would suggest at a minimum trimming the string before you start, and probably also stripping out any repeated spaces.

Hope this helps.
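
One way to get the effect of trimming and collapsing repeated spaces without building any intermediate strings is to count word starts in a single pass. This is only a sketch; the name CountWordStarts is hypothetical, and it treats the space character as the sole separator, so it does not address punctuation that has no space after it.

```vbnet
Imports System

Module ScanCountSketch
    ' Hypothetical helper: counts each position where a non-space character
    ' begins the string or follows a space.
    Function CountWordStarts(ByVal text As String) As Integer
        If text Is Nothing Then Return 0

        Dim count As Integer = 0
        Dim inWord As Boolean = False
        For Each ch As Char In text
            If ch = " "c Then
                ' Any run of spaces simply ends the current word.
                inWord = False
            ElseIf Not inWord Then
                ' First non-space character after a space: a new word begins.
                inWord = True
                count += 1
            End If
        Next
        Return count
    End Function
End Module
```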
 
EarthandFire,

Thanks for the trimming solution. I'll apply it to my program.

Kirt
 
I've been giving this problem a little more thought.

JohnYingling's solution will not count the final word. I've put together a little function that I think resolves most of the issues:

Code:
  Private Function CountWords(ByVal SourceString As String, ByVal Method As String) As Long

    'Add any additional punctuation marks that you need
    'The three .Replace("  ", " ") calls should pick up most spaces generated by punctuation marks ...
    'but just in case, the Do While loop picks up any that were missed
    'This approach keeps the number of string assignments low
    Dim s As String = _
      SourceString.Replace(".", " ").Replace(",", " ").Replace("!", " ").Replace("?", " ").Replace("  ", " ").Replace("  ", " ").Replace("  ", " ")

    Do While s.IndexOf("  ") > -1
      s = s.Replace("  ", " ")
    Loop

    'Trim last, so a space left behind by leading or trailing punctuation is not counted as a word
    s = s.Trim()
    If s.Length = 0 Then Return 0

    Select Case Method
      Case "JY"
        Dim c As Long = 0
        Dim l As Integer = s.Length
        Dim i As Integer = s.IndexOf(" ")
        Do While (i > 0 AndAlso i < l)
          c += 1
          i += 1
          If i < l Then
            i = s.IndexOf(" ", i)
          End If
        Loop
        'Add one for the final word, which has no space after it
        Return c + 1
      Case "RICK"
        Return s.Split(" ".ToCharArray).Length
      Case Else
        Throw New ArgumentException("Unknown method: " & Method)
    End Select

  End Function

called with a rather ridiculous example for demonstration purposes:

Code:
  Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

    MessageBox.Show(CountWords("  A question? and an exclamation!This.,,.    is.. a. string  ", "JY").ToString)
    MessageBox.Show(CountWords("  A question? and an exclamation!This.,,.    is.. a. string  ", "RICK").ToString)

  End Sub
 
I believe you are improving on the question, which was:

A word is a sequence of characters separated by a space.

While we do that, we could include tabs or other non-alphameric characters, except perhaps the hyphen.
 
PC888, I agree; however, I did say:
'Add any additional punctuation marks that you need

JohnYingling, I also agree. I had just noticed that your solution missed the final word and was just going to post to that effect - but then I started thinking ...

As a Regular Expression extension:

Case "REGEX"
  Dim rx As New System.Text.RegularExpressions.Regex("[\w\'\-\d]+")
  Return rx.Matches(SourceString).Count

although I'm aware that this doesn't allow for times, floating point numbers, email addresses, web addresses, etc.
 