INTELLIGENT WORK FORUMS FOR COMPUTER PROFESSIONALS
Come Join Us!
Are you a Computer / IT professional? Join Tek-Tips now!
- Talk With Other Members
- Be Notified Of Responses
To Your Posts
- Keyword Search
- One-Click Access To Your
Favorite Forums
- Automated Signatures
On Your Posts
- Best Of All, It's Free!
*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.
Partner With Us!
"Best Of Breed" Forums Add Stickiness To Your Site

(Download This Button Today!)
Feedback
"...I love the structure of the site. You start at the top, and drill down to what you want. Maybe I've been using Unix too long... :-) "
Geography
Where in the world do Tek-Tips members come from?
|
Evaluating CSV file (3)
|
|
|
EdRev (Programmer) |
9 Jan 04 10:26 |
I would like to evaluate the cloumns and rows of a csv file before I load it into my Oracle database.
How can I do this in vbscript?
|
|
PHV (MIS) |
9 Jan 04 10:55 |
What have you so far ? You can do a keyword search in this forum for fso split to get some ideas. You can read your csv file line by line (ie row by row) and then split each line into an array (of columns) for evaluating the datas. Hope This Help PH. |
|
I'm not sure what sort of evaluation you need to do, but you might be interested in this: There is also the approach described in FAQ329-3323 which might fit your needs and can be faster than parsing the data into arrays via Split(). It also handles DateTime, Currency, and other data types without writing a lot of code. The result of the process is an ADO Recordset view of your CSV data or even a new Text Driver "table" (text file in any of several formats: CSV, fixed-field, etc.). Keep in mind that the Jet SQL you have here is not as powerful as something like full Oracle SQL. You also can't use all of Jet SQL's features and capabilities with the simple Text Driver (though even JOINs are possible if you have multiple CSVs). But it will allow you to query and requery the data using SQL syntax to perform some of your "evaluation." This might provide an Oracle guy/gal a lot more leverage than writing a whole lot of VBScript to process a full-table recordset or data parsed "by hand" out of the raw text lines. I've used some spiffy SQL to automate such things as boiling down CSV files: Summarize.wsf <job> <reference object="ADODB.Connection"/> <object id="cn" progid="ADODB.Connection"/> <script language="VBScript"> Option Explicit Dim strSQL
strSQL = "SELECT" _ & " First(Date) AS [Date]," _ & " First(Filename) AS [Filename]," _ & " First(VirusName) AS [VirusName]," _ & " First(VirusType) AS [VirusType]," _ & " First(ActionTaken) AS [ActionTaken]," _ & " Computer," _ & " First(User) AS [User]" _ & " INTO navlog_summ.csv" _ & " FROM navexp.csv GROUP BY Computer" _ & " ORDER BY First(Date)"
cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _ & "Data Source='Virus Data';" _ & "Extended Properties='Text'"
cn.Execute strSQL, , adExecuteNoRecords Or adCmdText cn.Close MsgBox "Done" </script> </job> Schema.ini [navexp.csv] ColNameHeader=False CharacterSet=1252 Format=CSVDelimited DateTimeFormat=m/d/yyyy h:nn Col1=Date DateTime Col2=Filename Char Col3=VirusName Char Col4=VirusType Char Col5=ActionTaken Char Col6=Computer Char Col7=User Char [navlog_summ.csv] ColNameHeader=False CharacterSet=1252 Format=CSVDelimited DateTimeFormat=m/d/yyyy h:nn Col1=Date DateTime Col2=Computer Char Col3=User Char Col4=Filename Char Col5=VirusName Char Col6=VirusType Char Col7=ActionTaken Char Schema.ini is what tells the Text Driver what the input CSV looks like and what the output CSV should look like. In this case Schema.ini and navexp.csv are placed into a directory (folder) called Virus Data that is in the same directory as the script file itself. The navlog_summ.csv output file gets created within that same subfolder: <some folder> | |_<Summarize.wsf> | |_<Virus Data> | |_<Schema.ini> | |_<navexp.csv> | |_<navlog_summ.csv>
If the data/schema folder is someplace else you just supply the full path in the Data Source property of the connection string. Jet SQL documentation can be found in MS Access Help and other places. There are three articles on the subject starting at: http://msdn.microsoft.com/library/default.asp?url=/libr...The online articles are offered there as well via links to 3 self-extracting EXEs (one in each article). I think these are Word documents with the same article text. This lets you grab a copy of the info for offline reading. See the end of the 1st article for links to the other two and a few other resources. VBScript doesn't have to reduce one to stone knives and bearskins.  What you choose to use depends on what you need to accomplish and what tool subsets you are most comfortable with. |
|
|
EdRev (Programmer) |
11 Jan 04 19:29 |
Thanks for the replies!
Dilettante, I tried your code and it worked fine. I had one problem though, I added a where clause on the SQL "where Col2='this' and it bombed - No value given for one or more required parameters. Am I coding it correctly?
|
|
Hi EdRev,
Glad it worked so far. Did you try something like:
WHERE field-name = "this"
The Col2 syntax is only needed in the .INI file to assign a field name and a type to the column. In the SQL you use the names you provided instead of the ColX names.
I think the ColX names will be valid in SQL if you don't provide names in the .INI file. Otherwise they probably aren't valid in SQL expressions and you must use the names you provide instead. |
|
|
EdRev (Programmer) |
13 Jan 04 13:07 |
Hi, dilettante.
So far it's working fine. One, hopefully(??) last question. My schema.ini looks like this:
[source.csv] ColNameHeader=False CharacterSet=1252 Format=CSVDelimited Col1=Area Char Col2=Acct Char Col3=Jan Integer Col4=Feb Integer Col5=Mar Integer Col6=Ver Char Col7=Loc Char [target.csv] ColNameHeader=False CharacterSet=1252 Format=CSVDelimited Col1=Yr char col2=Ver char Col3=Loc Char Col4=Acct Char Col5=Jan Integer Col6=Feb Integer Col7=Mar Integer
I want to insert the literal values "2003" and "Actual" to the first and second columns resp. of the target file, and map col 7 of source to the col 3 of target, but I need to append a prefix "L" to col 3 (i.e source is 9999 - target should read "L9999"). Any idea on how I can code my select..into.
Thanks in advance.
|
|
Hi again EdRev, This sort of thing isn't too tough as it turns out. Take my example for instance. I went in to my Schema.ini and changed the type for navlog_summ.csv's Date field (Col1) to Char. Could have been a Number or something too I suppose, or maybe even left as a Date. I was focusing on the SQL here though. Then I used: <reference object="Scripting.FileSystemObject"/> <reference object="ADODB.Connection"/> <object id="fso" progid="Scripting.FileSystemObject"/> <object id="cn" progid="ADODB.Connection"/> <script language="VBScript"> Option Explicit Dim strSQL
If fso.FileExists("Virus Data\navlog_array.csv") Then fso.DeleteFile "Virus Data\navlog_array.csv" End If
strSQL = "SELECT" _ & " '2003' AS [Date]," _ & " First(Filename) AS [Filename]," _ & " First(VirusName) AS [VirusName]," _ & " First(VirusType) AS [VirusType]," _ & " 'L' & First(ActionTaken) AS [ActionTaken]," _ & " Computer," _ & " First(User) AS [User]" _ & " INTO navlog_array.csv" _ & " FROM navexp.csv GROUP BY Computer" _ & " ORDER BY First(Date)"
The new stuff prior to setting up the SQL is just to get rid of any existing output file when I rerun the script job. I was getting tired of doing it manually.  The real change is to the SQL string itself here. I believe you can use a good portion of what's allowed in VBA's expression syntax (including built-in functions) in forming these subexpressions for the result values. Hope that helps! |
|
Also,
If you have MS Access 2000 or newer, the Help there might be useful in figuring out what sorts of things you can do in Jet SQL. |
|
|
EdRev (Programmer) |
14 Jan 04 20:58 |
Hi, dilettante. Thank you very much for all your help. Everything is working great. |
|
Glad to be of assistance.
I missed your point about rearranging the output field orders, but you probably picked up that you can rearrange your output file schema to reflect the layout (field order) you need. |
|
i am having an interesting day..... if we focus on three data elements in my database i can make my cunudrum knowable. there are two tables we'll focus on. among other fields, one collects height and weight as ####.#. the other collects dosage as ################.### (although it'd be rare for height/weight and dosage to require more than 3 and 5 digits TO THE LEFT of the decimal point, respectively). i am creting an exported csv text file via a query and finding that despite my current best efforts that these three data elements are being exported with two significant digits to the right of the decimal point. i should add that the export file is to be uploaded onto a client's mainframe and that their blueprinted specifications would have the values of the three fields occur in the export file as numbers (and not as text, i.e. surrounded by quotation marks). it this were indeed the case, then my problems would be easily solved, since expr1: Format([Height],#.#) and expr2: Format([Dosage],#.###) would result in values like "96.7" and "258.000" respectively. the coercive requirements however compel me to employ the likes of expr1: Round(Val(Format([Height],#.#))) and expr2: Round(Val(Format([Dosage],#.###))) which as i said earlier despite my best efforts have no apparent effect on the outcome of the exported values, namely, what you'll see in the exported file are, e.g. 96.70 and 258.00 respectively. i read something about a default Schema.ini file and the documetnation alluded to generating a custom Schema.ini file which would be deposited in the target directory of the file (in this case) being exported. i guess my question's got to do with a) what's behind this and b) would Schema.ini generating solve it? “The philosophy of the school room in one generation will be the philosophy of government in the next." --- Abraham Lincoln
|
|
This ought to be a separate thread, but let's give it a go here anyway. First of all, your basic problem is that your registry settings for the Text Driver must specify a default of two decimal places. This can indeed be overridden via Schema.ini and here is a short example: CODE<job> <!-- This script puts some data into an empty CSV text file called "data.csv" in the directory "datadir" located in the same directory as the script itself. Using this technique Jet cannot create the file by itself, so an empty one is needed. This script could create one via FSO, but this is merely an illustration.
Within "datadir" there should be a "Schema.ini" file that contains:
[data.csv] ColNameHeader=False CharacterSet=1252 Format=CSVDelimited TextDelimiter=none NumberLeadingZeros=True NumberDigits=1 Col1=Name Char Col2=SomeVal Single Col3=OtherVal Single
Note that there is no way to specify various "numbers of digits after the decimal point" for various Single or Double format fields. All such fields use the same "NumberDigits" value. The same is true for the format parameter "NumberLeadingZeros" which is True if you want a zero (0.3) or False if you do not (.3). --> <reference object="ADODB.Connection"/> <object id="objRS" progid="ADODB.Recordset"/> <script language="VBScript"> Option Explicit Dim strSQL, strConn, n, Names
Names = Array("Bill", "Fritz", "Mary", "Sandy", "Venkat")
strSQL = "data.csv" strConn = "Provider=Microsoft.Jet.OLEDB.4.0;" _ & "Data Source='datadir';" _ & "Extended Properties='Text'"
objRS.Open strSQL, strConn, adOpenForwardOnly, _ adLockOptimistic, adCmdTable For n = 0 To 4 objRS.AddNew objRS("Name").Value = Names(n) objRS("SomeVal") = Rnd * 10 objRS("OtherVal") = Rnd * 10 objRS.Update Next objRS.Update objRS.Close MsgBox "Done!" </script> </job> Yes, I said short. I guess I went crazy on the comments instead of placing some of those remarks here in the text of my reply. The results: Bill,7.0,5.3 Fritz,5.7,2.8 Mary,3.0,7.7 Sandy,0.1,7.6 Venkat,8.1,7.0If you read my comments above though you'll see that this might not get you too far. There is only one setting per file described in Schema.ini for the number of decimal places. Ouch. Well, you seem to have hit upon the right idea by looking at Format( ), but there are a couple of problems with this idea. First, Format( ) doesn't exist in VBScript. But maybe you are using VBA or VB, or maybe you can live with using VBScript's FormatNumber( ) instead. In any case you need to do your Round( )-ing before you format the value. Round(Val(Format(xxx))) is sort of "roundabout." As it is I don't recall which rounding approach is used by FormatNumber( ) either, though different techniques/function seem to do it different ways, oddly enough. But your second problem is those nasty quotes. Well in my example script I demonstrate how to suppress quoting of string values in the CSV output. If you still need the quotes in your CSV output file for data that is intended to be strings, I suppose you can cram them in around the data yourself. Not much uglier than doing that formatting for numbers I suppose. Just use something on the order of: CODEstrStringValOut = """" & strStringVal & """" strNum1ValOut = FormatNumber(dblNum1Val, 1, -1) strNum2ValOut = FormatNumber(dblNum2Val, 3, -1) Ugly yes, but it would work. And maybe you can live with FormatNumber( )'s rounding as is. Now, if you were actually exporting dblNum1Val to one CSV file and dblNum2Val to a second one, each CSV file could have its own setting in Schema.ini for the number of digits after the decimal point. Then you wouldn't need to mess around with formatting and strings at all. |
|
whoa  i'm not quite sure what i'm looking at, but it seems to be 'what the doctor ordered'.... thanks! “The philosophy of the school room in one generation will be the philosophy of government in the next." --- Abraham Lincoln
|
|
|
 |
|