Smart questions
Smart answers
Smart people
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Member Login




Remember Me
Forgot Password?
Join Us!

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips now!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

Join Tek-Tips
*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Donate Today!

Do you enjoy these
technical forums?
Donate Today! Click Here

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.
Jobs from Indeed

Link To This Forum!

Partner Button
Add Stickiness To Your Site By Linking To This Professionally Managed Technical Forum.
Just copy and paste the
code below into your site.

idbr (MIS) (OP)
22 Dec 09 4:58
Hi,

I'm trying to process a csv export file but have a bit of a problem. It's been exported in Excel csv format. When I view the file in Excel, it splits into its constituent columns just fine. However, when I look at the file using notepad there dont appear to any row terminators at all.

Reformatting the files as tab delimited makes the problem go away - but I've got many files > 65536 rows (the Excel maximum for 2003, which I'm working on).

Excel must have some way of determining the file structure, does anyone know hat it is?

Thanks, Iain

Example code:

CODE

    ' open the error file
    Open strFolderPath & fl.Name For Input As #1
    
    ' reset the counter
    i = 0
    
    ' loop to the end of the file counting rows
    Do Until EOF(1)
        
        If i = 0 Then
        
            ' output the header row
            this fails because the entire file is read in due to lack of row terminators
            Line Input #1, strTest
            Debug.Print fl.Name & ": " & strTest
            
        Else
            
            ' just count the row
            Input #1, strTest
            
        End If
        
        i = i + 1
        
    Loop
    
    ' output the final result
    Debug.Print fl.Name & " = " & i
    
    Close #1
 
idbr (MIS) (OP)
22 Dec 09 4:59
Actually, also worth pointing out that because of the lack of row terminators the file row count is hopelessly out too.

Ta, Iain
idbr (MIS) (OP)
22 Dec 09 6:23
Fixed it with some help from this thread.

Slooooow running code below, suggestions for improvements welcome.

CODE

intInFileID = FreeFile
Open strFile For Binary As #intInFileID

intOutFileID = FreeFile
Open strFile & ".txt" For Binary As #intOutFileID

Do Until EOF(intInFileID)

    Get #intInFileID, , strChar
    
    If strChar = vbLf Then
    
        Put #intOutFileID, , vbCrLf
        
    Else
    
        Put #intOutFileID, , strChar
        
    End If
    
Loop

Close #intOutFileID
Close #intInFileID
HarleyQuinn (Programmer)
22 Dec 09 7:34
Not massively tested the output and no error handling included but this should be faster than sloooow for you (depending on file size, tested against 3Mb csv and runs in less than half a second on my machine):

CODE

Sub ChangeFileData(startfile As String, outfile As String)
Dim indata() As Byte
Dim flen As Long
Dim f As Integer
Dim strTarget As String

Open startfile For Binary Access Read As #1
flen = FileLen(startfile)
ReDim indata(flen)
Get #1, , indata
Close #1

strTarget = Replace(StrConv(indata, vbUnicode), vbLf, vbCrLf)

Debug.Print Len(strTarget)

Open outfile For Output As #1

Print #1, strTarget

Close #1
End Sub
Hope this helps

Andy
---------------------------------
Zebracorn: 50% Zebra, 50% Unicorn = 100% Real.

http://lessthandot.com  

macropod (TechnicalUser)
22 Dec 09 20:34
Hi Iain'

Quote:

When I view the file in Excel, it splits into its constituent columns just fine. However, when I look at the file using notepad there dont appear to any row terminators at all.
Regardless of what Notepad shows, the rows must have something that Excel recognises as a row terminator - most likely ASCII 13, which is a carriage return character Notepad won't process. If so, opening the file in Word will reveal ¶ characters at the end of each line but, although these look like paragraph breaks, they aren't; a paragraph break in Word consists of an ASCII 13 & 10 pair (carriage-return & line feed).


Cheers
[MS MVP - Word]

CautionMP (Programmer)
25 Dec 09 19:38
Following the thread I agree with Andy.

Couple of thoughts for speed.
  1. I beleive that Scripting.TextStream.ReadLine understands Carriage Return (Cr), Line Feed (Lf), and CrLf terminators for rows and works much the same as VBA.FileSystem operations.
  2. Use ADO. You can query a text file directly and it does understand the three different types of line terminators (plus you have a little more contol since ADO will parse the lines (row) into fields.
Hope this makes sense and helps,
CMP

For the best results do what I'm thinking, not what I'm saying.
(GMT-07:00) Mountain Time (US & Canada)

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members!

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close