Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations wOOdy-Soft on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Parsing text from paragraph into Excel 2

Status
Not open for further replies.

LScharf

Technical User
Aug 9, 2002
42
US
I'm trying to take a text document that has data in paragraphs (job descriptions) and put it into a spreadsheet, shifting various pieces of text into Excel columns. Each paragraph has certain characteristics that I THINK would allow me to identify the different elements (Title, Salary, Description, Hours, etc.). For example, Title is always underlined.

So I started writing a subroutine to find the underlined words, but it craps out:

Sub SearchCellFormat()

' Establish search criteria.
With Application.FindFormat.Font
.Underline = xlUnderlineStyleSingle
End With

' Find the cells based on the search criteria.
Cells.Find(What:="", After:=ActiveCell, _
LookIn:=xlFormulas, LookAt:= _
xlPart, SearchOrder:=xlByColumns, _
SearchDirection:=xlNext, MatchCase:=False _
, SearchFormat:=True).Activate

End Sub


Even if only a crude start, is this the way to go about parsing text?

Sample text:

START SAMPLE

Admin Dir Cardiology Services Starting Salary: Admin

Description: Responsible for managing...

Department: Cardiology. Location: Medical Center Campus. Hrs per week: 40. Shift: Day. Hours: 8:00am-4:30pm.

END SAMPLE

Though you can't see it here, "Admin Dir Cardiology Services" is underlined. Nothing else is. All the other elements are preceded by their names, which should be handy.

Any thoughts appreciated!

Lauri
 
Lauri,

Here's an example of "extracting" the underlined string. Select the cell(s) containing your paragraph text and run...
Code:
Sub ParseIt()
    For Each c In Selection
        With c
            s = ""
            For i = 1 To Len(.Value)
                If .Characters(Start:=i, Length:=1).Font.Underline = xlUnderlineStyleSingle Then
                    s = s & Mid(.Value, i, 1)
                End If
            Next
            MsgBox s
        End With
    Next
End Sub
:)

Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884

Skip,
 
Skip,

Many thanks! That is a big help.

One last request: Can you give a suggestion of how to paste the value into another worksheet, into cell A2, then continue looping through my text, and paste all the other job titles into A3, A4, etc. There are hundreds of jobs.

Just so you know, my next step will be to also paste the Salary into B2, B3, etc.; Description into C2, C3; and so on. But I can probably figure that out if you can help me get started with Job Title.

Lauri
 
Code:
Sub ParseIt()
    'this assumes that _
        all source data is contiguous on Sheet1 _
        Sheet2 has headings in row 1 like "Title", "Salary" etc
    Dim r As Long, r1 As Long, r2 As Long, rOut As Integer
    Dim c As Integer, c1 As Integer, c2 As Integer, cOut As Integer
    Dim sTitle As String, nSalary As Currency
    With Sheet1.UsedRange
        r1 = .Row
        r2 = r1 + .Rows.Count - 1
        c1 = .Column
        c2 = c1 + .Columns.Count - 1
    End With
    For r = r1 To r2
        For c = c1 To c2
            With Sheet1.Cells(r, c)
                'title
                sTitle = ""
                For i = 1 To Len(.Value)
                    If .Characters(Start:=i, Length:=1).Font.Underline = xlUnderlineStyleSingle Then
                        sTitle = sTitle & Mid(.Value, i, 1)
                    End If
                Next
                'salary
                nSalary = 0
                'other
            End With
        Next
        With Sheet2
            With .Cells(1, 1).CurrentRegion
                rOut = .Rows.Count + 1
                cOut = .Columns.Count
            End With
            For c = 1 To cOut
                Select Case .Cells(1, c).Value
                    Case "Title"
                        .Cells(rOut, 1).Value = sTitle
                    Case "Salary"
                    
                End Select
            Next
        End With
    Next
End Sub
:)


Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884

Skip,
 
Skip,

Thanks again. I adapted your code and got it to work as intended, but I still have a problem. There are hundreds of these jobs in the range, and I need to paste the Title, Salary, etc. for each job onto one row in Sheet2.

I'm trying to figure out how to tell Excel that when it finds the next underlined title, it should start pasting to a new row. So everything between two titles goes onto one line.

Do I want to get a count of jobs, then use that as a counter to loop through the document? I'm not having luck with that!
 
In this part of the code you do EVERYTHING for all the data that you want to extract
Code:
            With Sheet1.Cells(r, c)
                'title
                sTitle = ""
                For i = 1 To Len(.Value)
                    If .Characters(Start:=i, Length:=1).Font.Underline = xlUnderlineStyleSingle Then
                        sTitle = sTitle & Mid(.Value, i, 1)
                    End If
                Next
                'salary
                nSalary = 0
                'other
            End With
In this part of the code, the extracted data is placed in the NEXT ROW
Code:
        With Sheet2
            With .Cells(1, 1).CurrentRegion
                rOut = .Rows.Count + 1
                cOut = .Columns.Count
            End With
            For c = 1 To cOut
                Select Case .Cells(1, c).Value
                    Case "Title"
                        .Cells(rOut, 1).Value = sTitle
                    Case "Salary"
                    
                End Select
            Next
        End With
:)


Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884

Skip,
 
Hi Again,

Still problems: As I understand your code, it assumes that all the data for one job is on the same row. In fact, they're on separate rows, all in column A.

So I'm still trying to figure out how to make it copy hundreds of jobs from A2:A1200, using just the fact that Title is underlined as the trigger for the loop to start over.

Thanks!
 
Not a whole lot differentiates one job record from the next, unfortunately!!

Basically it looks like this:

Analyst (underlined) Salary: $18.00 - $23.53

Description: Turn around the economy in 30 days...
Requirements: Intelligence
Hours: Many. Location: Windswept tundra. Shift: All 3.

IT Specialist (underlined) Salary: $90

Description: Tweak system and troubleshoot
Requirements: AS400 5+ years
Hours: 35 per week. Location: Basement. Shift: 8 - 5.

And so on.

I wrote a routine to copy the Salary to column A, so now all the data appear in column A.
I also wrote a routine to copy the title into another column next to every row related to that job, so I tried to use that column as the trigger for the Sheet2 statement to go to the next line, but I just can't figure out how to LOOP through all the instances of Title...

STYMIED!

Thanks!
 
So, does the string, "Analyst " in the first 8 positions of the text in a cell ALWAYS mean a NEW RECORD?

Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884

Skip,
 
No, the job title could be anything. Its distinguishing feature is that it's always underlined, and it's the only text that is underlined. That's the only thing that tells us that it's a new record.

(The data originally came from a Word document.)

Thanks for sticking with me!
 
JL,

It's essentially the same code except whenever we encounter UNDERLINED text, it means the start of a new record, so go write the data that has been accumulated...
Code:
Sub ParseIt()
    'this assumes that _
        all source data is contiguous on Sheet1 _
        Sheet2 has headings in row 1 like "Title", "Salary" etc
    Dim r As Long, r1 As Long, r2 As Long, rOut As Integer
    Dim c As Integer, c1 As Integer, c2 As Integer, cOut As Integer
    Dim sTitle As String, nSalary As Currency
    With Sheet1.UsedRange
        r1 = .Row
        r2 = r1 + .Rows.Count - 1
        c1 = .Column
        c2 = c1 + .Columns.Count - 1
    End With
    sTitle = ""
    For r = r1 To r2
        For c = c1 To c2
            With Sheet1.Cells(r, c)
                'title
                For i = 1 To Len(.Value)
                    If .Characters(Start:=i, Length:=1).Font.Underline = xlUnderlineStyleSingle Then
                    'we have hit UNDERLINED text other than the first time _
                        so write the data to sheet2
                        If sTitle <> &quot;&quot; Then
                            GoSub WriteRecord
                        End If
                        'then initialize all OTHER record variables
                        nSalary = 0
                        
                        sTitle = sTitle & Mid(.Value, i, 1)
                    End If
                Next
                'salary
                'other
            End With
        Next
    Next
    'now write the last record
    GoSub WriteRecord
    Exit Sub
WriteRecord:
    With Sheet2
        With .Cells(1, 1).CurrentRegion
            rOut = .Rows.Count + 1
            cOut = .Columns.Count
        End With
        For c = 1 To cOut
            Select Case .Cells(1, c).Value
                Case &quot;Title&quot;
                    .Cells(rOut, 1).Value = sTitle
                Case &quot;Salary&quot;
                
            End Select
        Next
    End With
    Return
End Sub


Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884

Skip,
 
Skip,

YOU DID IT!!! Man, what a bear. As a hack VBA programmer, this type of logic has me running around in circles (literally).

Many, many thanks. I really appreciate your multiple attempts at solving this, despite my lack of clarity.

J. Lauri Scharf
 
Hi,

I'm using an adapted version of a macro to parse names into two columns, a forename & surname. Now it works, but every time I run the macro, I get a dialog box asking if I want to overwrite data in the destination cells. But the destination cells are blank, but would have had data in them previuosly (the data moves around the spreadsheet). This is frustrating, as I want a seamless transistion, without user interaction.

Is there something I'm not doing?

The code I'm using is:

ActiveCell.TextToColumns Destination:=ActiveCell.Offset(0, 4), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=True, _
Semicolon:=False, Comma:=False, Space:=True, Other:=False, FieldInfo _
:=Array(Array(1, 1), Array(2, 1))

Please help!!!!
 
avandale, what about a &quot;Application.DisplayArlerts=False&quot; ?

Hope This Help
PH.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top