Anyone create text and similarity functions in vfp

(OP)
Hey vfp community,

I have been using Jaro–Winkler, Levenshtein, & Max Similarity in Excel to clean up data and find matching pairs.

Has anyone created functions for these 3 text and similarity functions in VFP?

Any help is appreciated!

RE: Anyone create text and similarity functions in vfp

New to me, I know what Jaro–Winkler is, having looked it up in Google, but never needed it.

Regards

Griff
Keep Smileing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.

RE: Anyone create text and similarity functions in vfp

Yes, I wrote a Levenshtein function in VFP. I published it in FoxPro Advisor. But unfortunately that was many years ago. I no longer have the article or the code, and it is no longer available online (as far as I know).

The function turned out to be quite slow, mainly because it relied on recursion.

In addition, while Levenshtein is good for comparing two strings and evaluating their proximity, it is not suitable for searching a large table in VFP. That's because you can't index a string on its Levenshtein value. To do that, you would need to know what string you want to compare it with, which of course you don't know in advance.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: Anyone create text and similarity functions in vfp

By the way, do you know that VFP has a DIFFERENCE() function, which is supposed to evaluate similarity between strings? If I understand it right, it works on a Soundex principle, so it might be useful for searching for names that you might have misheard over the phone, but less useful for finding typing errors, for example.

Mike


RE: Anyone create text and similarity functions in vfp

Not come across difference() either.
One of those weekends I guess

Regards

Griff
Keep Smileing


RE: Anyone create text and similarity functions in vfp

I'm wondering if there is something like it in the spell checker I put into VFP apps.

** update **
The spell checker I'm using is FoxSpell, and it uses soundex()


Regards

Griff
Keep Smileing


RE: Anyone create text and similarity functions in vfp

It's also a good case to use C++ code, build a DLL from it and use that from VFP.

There are some minor hurdles, and ideally you'd go one step further and build an FLL, but a DLL is good enough. A .NET assembly works too, used with West Wind's wwDotnetBridge as discussed at length in thread184-1803903: How to use QR barcodes in an easy way

The only downside is the additional dependencies. With a C++ DLL you introduce the need for a C++ runtime such as msvcr120.dll (depending on which VS version you use), or with an assembly the .NET Framework.
With VS .NET 2003 you target msvcr71.dll, the C++ runtime VFP9 itself needs anyway, which means no new dependency. Some code will also need msvcp71.dll or a newer version; you can find out with Dependency Walker (depends.exe). The advantage of C++ DLL or FLL solutions is that they don't need registering; you just ship the DLL or DLLs.

And then you can use many more resources of implementations, even simple ones like Levenshtein from Rosetta Code: https://rosettacode.org/wiki/Levenshtein_distance#...

Another minor hurdle is that VFP does not support every C++ type, for example the std::string type used there. There's a simple fix: change the parameter data type to char*, which is what VFP passes in when you specify STRING as the data type in DECLARE calls, and set internal variables of type std::string in the function body. Conversions from simpler to more complex data types are not hard to find. In this case you could pass in char* c1 and then simply declare the s1 variable with std::string s1 = c1; since std::string has a constructor that converts from char*. This is also a reason to take the C++ code and compile a DLL project yourself: you can make such modifications to allow easier usage from VFP.
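
As a rough illustration of the char* approach described above, the sketch below exports one function that accepts two C strings, converts them to std::string, and returns the Levenshtein distance, computed iteratively with two rows instead of recursion. The names SIM_API and LevDistance are made up for this example (the VS template generates its own export macro), and the distance code is a generic textbook version, not the Rosetta Code listing.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical export macro for this sketch; the VS "DLL with exports"
// template generates its own (e.g. YOURDLL_API).
#ifdef _WIN32
#define SIM_API __declspec(dllexport)
#else
#define SIM_API
#endif

// Iterative Levenshtein distance that keeps only two rows of the matrix.
// Compares byte by byte, so it is not aware of multi-byte encodings.
static int levenshtein(const std::string& s1, const std::string& s2) {
    std::vector<int> prev(s2.size() + 1), curr(s2.size() + 1);
    for (std::size_t j = 0; j <= s2.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= s1.size(); ++i) {
        curr[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= s2.size(); ++j) {
            const int cost = (s1[i - 1] == s2[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution
        }
        std::swap(prev, curr);
    }
    return prev[s2.size()];
}

// extern "C" keeps the exported name undecorated so a caller can find it.
// The parameters are plain C strings; std::string is used only internally.
extern "C" SIM_API int LevDistance(const char* c1, const char* c2) {
    if (c1 == nullptr || c2 == nullptr) return -1;  // assumed error value
    std::string s1 = c1;  // std::string has a converting constructor
    std::string s2 = c2;
    return levenshtein(s1, s2);
}
```

Assuming this were built as a 32-bit DLL named YourDll.dll, the matching VFP declaration would presumably be DECLARE INTEGER LevDistance IN YourDll.dll STRING, STRING.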

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: Anyone create text and similarity functions in vfp

Another hurdle I just needed to fix to try this:

Create a new VS solution with the project template "DLL (Dynamic Link Library) with exports". The difference from the normal DLL project template is that "with exports" creates a header file "YourDLL.h" containing the declarations of a sample function for external usage. The actual code implementations of the sample function and the rest are in a separate "YourDLL.cpp" code file.

You can almost go straight to building this solution for a test and get a DLL working for VFP, too. You need to change the target platform to x86, not x64, because VFP is a 32-bit application and cannot load a 64-bit DLL. Maybe also change from Debug to Release, but that is not so important. What's far less obvious, and thus most important to tell here, is that with the declarations written the way they are in the sample header file, VFP won't find the entry point to a function, for example the sample DLL function fnYourDll() of the project template.

Only a slight fix is necessary. Change this line in the header file

CODE --> VS DLL with exports template header file

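YOURDLL_API int fnYourDll(void);

and prefix it with extern "C", which turns off C++ name mangling so the function is exported under its plain name: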

CODE --> fixed for DLL entry point visibility to VFP

extern "C" YOURDLL_API int fnYourDll(void); 

If you now build you can use the DLL in VFP with

CODE --> VFP

*CD into the Visual Studio output path to Debug or Release folder with YourDll.DLL
DECLARE INTEGER fnYourDll in YourDll.dll
? fnYourDll() && prints 0

Watch out: in general, not just for VS C++ DLLs, function names in DECLARE are case sensitive.

To extend this, start with such a declaration in the header. A light bulb icon will appear as VS detects that this declaration has no definition yet and offers to copy it over into the cpp file to match the declaration. Why have both at all? Ask the C++ inventors. I guess all these declarations in a header file give a short overview of what functions are available; nowadays you could just collapse a code file to show only the head of each definition and get that same overview. Anyway, that's the C++ world, and you will also find this in code you come across as an implementation of string function X or encryption function Y or whatever you'd like to have in VFP. It's also an extensibility vector of VFP.

Bye, Olaf.


RE: Anyone create text and similarity functions in vfp

(OP)
Thanks for all the responses!

I never knew VFP had a difference() or soundex() function. I just played with the functions but I feel they aren't what I need.

Koen, thanks for the link. I have tested those examples. I am surprised the results are whole numbers and not percentages like they show up in Excel. The whole numbers are throwing me off since I was expecting a 0-1 number (percentage).

Nigel, I really like the link you gave, thank you for that. Unfortunately the Jaro-Winkler function is written in MariaDB; it would have been great if it was written in both VFP and MariaDB.

Thank you Olaf for the detailed info. I might have to give that a try in the future if I can't find exactly what I am looking for. That looks like a big jump for me but I will tackle that if nothing pans out.

I am more interested in the Jaro–Winkler function.

The work where this really helped was with names. I had an HR file with employees' proper names spelled out. They needed to attend a training and would not always write their name down the proper way, usually using a nickname, shorthand, missing words, etc. The Jaro-Winkler and Levenshtein functions helped figure out who attended the training even with their names all messed up. I noticed during this exercise that the Jaro-Winkler function returned more accurate results, and we ended up using that more often to find our matches. I want to use these functions in my future projects.

RE: Anyone create text and similarity functions in vfp

Levenshtein in its base definition is not a number between 0 and 1, it's an editing distance. You can get to a 0-1 value with a small calculation: as the maximum distance is the length of the longer string, you can divide by that length, and 1 minus that quotient is a similarity where 1 means identical. Damerau-Levenshtein does so, too, besides other small differences to the original algorithm.

Sure, using another IDE and a language you are not at all used to, and merely copying a function declaration/definition into a template trusting that it works, is a big jump. But you also rely on third-party VFP code without looking at all of it, just at the description of its usage. It's obviously easier to deal with problems in VFP code, should there be errors.

It's something usable for many more cases than string functions, so it pays to dive into the Visual Studio IDE. The Community edition will be sufficient for such DLLs.

But talking of Jaro-Winkler, it's also not that much more C++ code:
https://rosettacode.org/wiki/Jaro-Winkler_Distance...

See? And there are many more resources besides Rosetta Code, especially for C++: all kinds of GitHub repositories and other open source project platforms.

Bye, Olaf.


RE: Anyone create text and similarity functions in vfp

TinyNinja,

scroll down further... there's a vfp class to download.

n

RE: Anyone create text and similarity functions in vfp

(OP)
Hey Nigel, thanks for that call out, I found it. I see it has the prg that matches the fox.wikis site, but it is nice that a testing prg came with it.

Olaf, you are right, and I will dabble in the Visual Studio IDE and see what I can create with your recommendations.

Thank you all for the help!
