Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

calculate the frequency of the type of value from a recordset

Status
Not open for further replies.

tansc81

Technical User
Mar 2, 2003
15
GB
would like to calculate the frequency of the type of value from a recordset. i want to check if the values of a columns from a table is if correct type. For example a column in a table might be defined as of type string, but in reality the data in that columns are all integer.. so i will return a result tat indicate integer is of a more suitable type than wat is defined.

so the way i'm doing now is to loop thru the recordset and check for the type, then i will have a counter, which is an array of int, to store how many times it occur. Then if all the columns return the same type


Public Function suggestType(ByVal relation As String, ByVal column As String)
Dim sqlString As String
Dim counter(1 To 13) As Integer
'Dim booleanCounter, byteCounter, dataCounter, nullCounter, decimalCounter As String
'Dim doubleCounter, intCounter, objectCounter, longCounter, shortCounter As String
'Dim singleCounter, stringCounter As String

sqlString = ""
sqlString = "SELECT distinct " & column & " From " & relation

If rec.State = adStateOpen Then rec.Close

rec.Open sqlString, conn
If rec.State = adStateOpen Then

While Not rec.EOF
Select Case TypeName(rec.Fields(0).value)
Case "Boolean"
counter(1) = counter(1) + 1
Case "Byte"
counter(2) = counter(2) + 1
Case "Char"
counter(3) = counter(3) + 1
Case "Date"
counter(4) = counter(4) + 1
Case "DBNull"
counter(5) = counter(5) + 1
Case "Decimal"
counter(6) = counter(6) + 1
Case "Double"
counter(7) = counter(7) + 1
Case "Integer"
counter(8) = counter(8) + 1
Case "Object"
counter(9) = counter(9) + 1
Case "Long"
counter(10) = counter(10) + 1
Case "Short"
counter(11) = counter(11) + 1
Case "Single"
counter(12) = counter(12) + 1
Case "String"
counter(13) = counter(13) + 1
End Select
Wend
End If

For i = 0 To 12
If counter(i) = rec.RecordCount Or rec.RecordCount = counter(i) + counter(5) Then
suggestType = i
End If
Next

rec.Close
Set rec = Nothing

End Function



is there a better,fast n more effient way to do this?
 
You may have to do a little more than this. Typename only supplies the type of variable, not it's contents, so reading 23 in a variable doesn't make it an integer
________________________________________________________________
If you want to get the best response to a question, please check out FAQ222-2244 first

'People who live in windowed environments shouldn't cast pointers.'
 
hmmmmmmmmmmmm,


An Orwellian conundrum? Less is more?

First, why really bother? Every value in a recordset will evaluate to either numeric or string. It is generally not pratical to determine the difference in the various numeric types, as date (or date / time) will not generally be distinguisable from a double, single, currency or - in some instances from a long (data ONLY), so your quest (at least as shown above will generally only count the first of the often several 'types' which might be the true type of the actual field value. On the other side, the what is stored in a field IS (somewhat) limited by the type assignment given by the 'design' and thus can only reflect the vagaries of the data entry process generating an 'automatic' type coercion to the field's "defined" type, so the actual effect is (broadly speaking) to simply find that the actual data is 'like' the defined data in most respects. Placing an "integer" value in a Long field does not change anything except the storage requirement of the value. Placing a "single" value in a long field simply truncates the decimal part (or yields an error) so (assuming no error) reading the actual type will not reveal the 'placement' of the single type - but dutifully return the long (or integer) type - depending on wheatrher the value will 'fit' into an integer field and which order the tests are within your procedure.

So, with a few caveats (BLOB?, Object?) you are really just exercising the cpu on the Orwellian conundrum


MichaelRed
m.red@att.net

Searching for employment in all the wrong places
 
Beacause i'm trying to find the best suitable type for the data in the database and not just chuncking everything in string n numeric
 
Hi, i've found the solution to my question, thanks to those who had help me!

Public Function suggestType(ByVal relation As String, ByVal column As String) As String
Dim recType As New ADODB.Recordset
Dim sqlString, suggestString, s As String
Dim counter(0 To 17), i, recCount, EmptyValue As Integer

Dim dataValue As Variant
For i = 0 To 17
counter(i) = 0
Next

sqlString = "SELECT distinct " & column & " From " & relation
recType.CursorLocation = adUseClient
recType.Open sqlString, conn

If recType.State = adStateOpen Then
recCount = recType.RecordCount

'0 = vbEmpty - Indicates Empty (uninitialized)
'1 = vbNull - Indicates Null (no valid data)
'2 = vbInteger - Indicates an integer
'3 = vbLong - Indicates a long integer
'4 = vbSingle - Indicates a single-precision floating-point number
'5 = vbDouble - Indicates a double-precision floating-point number
'6 = vbCurrency - Indicates a currency
'7 = vbDate - Indicates a date
'8 = vbString - Indicates a string
'9 = vbObject - Indicates an automation object
'10 = vbError - Indicates an error
'11 = vbBoolean - Indicates a boolean
'12 = vbVariant - Indicates a variant (used only with arrays of Variants)
'13 = vbDataObject - Indicates a data-access object
'17 = vbByte - Indicates a byte
On Error Resume Next

suggestString = ""
i = -1
Do Until recType.EOF
dataValue = recType.Fields(0).Value
i = VarType(dataValue)
If i <> 11 Then
If IsNumeric(Trim(dataValue)) Then
i = numericType(dataValue)

ElseIf i = vbString Then
If IsDate(dataValue) Then
i = 7
End If
End If
End If
counter(i) = counter(i) + 1
i = -1
recType.MoveNext
Loop

EmptyValue = counter(0) + counter(1)
If counter(8) > 0 Then
suggestString = &quot;String&quot;
ElseIf EmptyValue = recCount Then
suggestString = &quot;Empty&quot;
ElseIf counter(7) + EmptyValue = recCount Then
suggestString = &quot;DateTime&quot;
ElseIf counter(11) + EmptyValue = recCount Then
suggestString = &quot;Boolean&quot;
ElseIf counter(2) + EmptyValue = recCount Then
suggestString = &quot;Integer&quot;
ElseIf counter(3) + EmptyValue = recCount Then
suggestString = &quot;Long&quot;
ElseIf counter(4) + EmptyValue = recCount Then
suggestString = &quot;Single&quot;
ElseIf counter(5) + EmptyValue = recCount Then
suggestString = &quot;Double&quot;
ElseIf counter(6) + EmptyValue = recCount Then
suggestString = &quot;Curreny&quot;
ElseIf counter(17) + EmptyValue = recCount Then
suggestString = &quot;Byte&quot;
Else
If counter(5) > 0 Then
suggestString = &quot;Double&quot;
ElseIf counter(6) > 0 Then
suggestString = &quot;Currency&quot;
ElseIf counter(4) > 0 Then
suggestString = &quot;Single&quot;
ElseIf counter(3) > 0 Then
suggestString = &quot;Long&quot;
ElseIf counter(2) > 0 Then
suggestString = &quot;Integer&quot;
Else
suggestString = &quot;Byte&quot;
End If
End If
suggestType = suggestString
recType.Close
Set recType = Nothing
End If

End Function

Public Function numericType(dataValue As Variant) As Integer
Dim numType As Integer
Err.Clear

On Error Resume Next

If InStr(1, CStr(dataValue), &quot;.&quot;) = 0 Then 'Without decimal
numType = vbInteger
dataValue = CByte(dataValue)
numType = vbByte
If Err.Number = 6 Then
Err.Clear
dataValue = CInt(dataValue)
' numType = vbInteger
If Err.Number = 6 Then
Err.Clear
dataValue = CLng(dataValue)
' numType = vbLong
If Err.Number = 6 Then
Err.Clear
dataValue = CSng(dataValue)
' numType = vbSingle
If Err.Number = 6 Then
Err.Clear
dataValue = CDbl(dataValue)
' numType = vbDouble
If Err.Number = 6 Then
Err.Clear
dataValue = CStr(dataValue)
' numericType = vbString
End If
End If
End If
End If
End If
Else 'With decimal
dataValue = CSng(dataValue)
' numType = vbSingle
If Err.Number = 6 Then
Err.Clear
dataValue = CDbl(dataValue)
' numType = vbDouble
If Err.Number = 6 Then
Err.Clear
dataValue = CStr(dataValue)
' numType = vbString
End If
End If
End If
numericType = CInt(VarType(dataValue))
End Function
 
MichaelRed, are you saying that its not worthwhile to determine the smallest Type that can be used to store your data? Is there no merit to searching for fields set to store Longs that are, in fact only storing integer-sized numbers and then changing their Type to integer?
I've read/heard that, at least with Text fields, Access only uses as much 'space' as is necessary to store the value rather than the entire amount allocated to each record. But outside of that instance, it seems a useful endeaver to try to minimize field sizes.

Agreed that if a field was set to integer and someone tried to store a long, and the database didn't complain, it was probably truncated and now appears to be an integer but that seems to be beside the point.

Not trying to be argumentative. Just trying to save myself time should I ever find myself in a situation similar to that of tansc81's.
 
In my opinion i believe it best to use the smallest data type.. it's ok for small Database.. but how abt big database? there will be too much unused space...
 
Schroeder,

Approximatly (qualified) yes.


In my opinion, attempting to 'adjust' the types of the various Numeric fields is, at best, useless and possibly harmful.

Adjusting the size of test fields can, with some careful judgement, be a mechanisim to save quite a bit of 'storage' -but more importantly- it can improve performance (particularly for miultiuser apps).


Adjustment of &quot;numeric&quot; types will save at most six bytes per field per record, and probably result in much less than this.

Adjustment of text fields can be considerably more effective. Consider the (purely imaginary?) situation where the original 'designer' left most or all of the text fields atthe default (50 or 255 BYTES) and used [phone number | zip code | State (abbreviation) | Social Security Number | ... ] as typical fields. A reasonable review (actual storage required for these fields) will easily point out the potential savings of ~ 40 to ~~ 240 bytes per field pre record). Adjusting the field properties (width) to REASONABL reflect the actual usage saves ~~ *10 to *100 the storage of the adjustment from the numeric types.

I DO NOT recommend that such an adjustment be done &quot;automatically&quot; (via code or without careful analysis), but with a comparision of the disparate 'actual' use to the INTENDED function. For ONE example, consider the Phine # situation. Assume it was left to a default of 50 bytes. Further assume that the field in set up to include the full phone number, including punctuation (e.g. &quot;123-555-6789&quot;, so you get the 12 char string. Without any further 'checking' you automatically reduce the field width to 12 characters. Hooray for you? WRONG. The original design -along with documentation reflects the need to include foregin phone numbers, so the entire process has somehow ignored the 'country code', or you have been 'lucky' and -at least so far, the app has 'assumed' that all calls are FROM U.S. phones so U.S. Country code is un-necessary and all the remaining entries use fewer characters / digite in the internal phone numbers and (so far) no phone number exceeds the 12 char of the U.S. system. You (conveniently) also just assume the lack of consistiency (not all are 12) is simply that the area code is not required for 'local' calls. Today's hero. Tomorrow's goat (working late to un-fix) the problem?

Hopefully, you have the opportunity of other perspecitv(s), as (obviously?) tansc81 has.

MichaelRed
m.red@att.net

Searching for employment in all the wrong places
 
hi MichaelRed,

u got ur point too... Thanks very much!
 
If you are using this as a one off to look at exsisting data in one system to determine what types would be appropriate in a new system I agree otherwise I say you are doomed to have a problem later on.

A data type should be determined by a requirement not what happens to be in the dataset at that time. You might have a CHAR(2) field that only has numbers in it right now but if this column is used to hold information that could be alphanumeric but just happens to not have had no alpha character yet then changing the data type is doomed to failure.

You can build tools to add in the design process but tools can't know requirements themselve (lets not get into self describing systems).

Ie you could use the tool to goto the client and say.

This piece of data seems to only hold numeric data but has been defined as allowing alphanumeric data.
Can you think of any situation Alpha data would be present?
If all data is numeric then what is the range you need from this data?

Just be careful
 
Actually, I can now recall a situation like this. An Access database needed to be created on a daily basis and uploaded to a server via FTP. The database was read-only after it was uploaded. The guy at the other end who programmed the website that used my database demanded that I reduce each of the fields to their minimum size. They were almost all text fields so I didn't have to worry about accidental implicit type coercion. Later, I learned that I was wasting my time because Access only uses as much space as is necessary to store text fields. The size property just sets a maximum limit.

Now, if its not an Access database...
 
If it is a normal SQL database like oracle or SQL server then you use VARCHAR. But i'm still unsure why it is being done.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top