Separate name from different columns
Separate name from different columns
(OP)
Hi!
I have a set of names. Is it possible to separate it from different columns?

Thank you!
I have a set of names. Is it possible to separate it from different columns?

Thank you!
RE: Separate name from different columns
CODE
Simpler, less readable perhaps...
CODE
Regards
ing
Griff
Keep
I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
There is no place like G28 X0 Y0 Z0
RE: Separate name from different columns
Somebody wrote a great article about this some years ago that included a set of tables that helped with American names. I can't remember who wrote it or where it appeared, but maybe someone else here will have their memory jogged by this.
Tamar
RE: Separate name from different columns
I vaguely remember an article like that, Tamar. But I thought it was one that you wrote yourself. Or am I thinking of one you wrote about generating realistic test data? I seem to recall that you had a list of the top so-many girls' and boys' names from the US Census or something similar. Or maybe not.
Mike
__________________________________
Mike Lewis (Edinburgh, Scotland)
Visual FoxPro articles, tips and downloads
RE: Separate name from different columns
using the shortness (2 char) nature?
But the ROEL VERGEL type pair for the forename... that is a poser.
Could a combination of word count and shortness be used?
I worry that there probably are asian names like Oh that would spoil this
approach.
Regards
ing
Griff
Keep
I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
There is no place like G28 X0 Y0 Z0
RE: Separate name from different columns
Are you suggesting the use of look up table(s) for the possible name pairs - based on a census or some such - would enable this to be
processed more easily?
So a table with (say) three entries with ROEL, and separate records for PHILIP, VERGEL, MANUEL and then scan each of the OPs
records looking for those pairs and accepting that as a forename (for example)?
Another table of GENNAME as well, JNR, SNR, II, III, IV etc...?
Would it be possible trigger that based on a word count > 4?
What would you do with our ex-PM though?
Regards
Griff
Keep
I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
There is no place like G28 X0 Y0 Z0
RE: Separate name from different columns
If you can assume the full name data you have always starts with the last name, you still have the cases like DE LA PAZ. The only sure thing is GETWORDNUM(NAME,1) is either that last name or part of the last name. That's too little information to continue. Usually, if full names are stored as Last, First with a comma but your data doesn't even have that as a definite split position.
The largest set of name data I googled is here: https://github.com/philipperemy/name-dataset
The downside is that it comes from a Facebook data leak and using that data might not be ethical to do. The section "Composition" points to a newspaper article about this data breach.
Deciding whether or not to use this or other data sets, you have to invent some way to get to the name parts as you like to store them and it's not as easy as just getting the single parts separated by whitespace alone.
That said, if you have less than 100 names you surely have it easier to do this manually without getting into intricate details of determining name parts by comparing them with a database of names that are at least already split by lastname, firstname. But even the Facebook data leak won't have the middle names separated. I don't think. I know Facebook has a real name policy for the longest time, you could still join Facebook with a fantasy name, so even that aspect is still in this data: That it has non-valid names.
What you could start with is sorting out all the names that only have two parts to be lastname,firstname. And I guess that would cover more than 80% of the data and then it only matters what remains to be done manually.
Chriss
RE: Separate name from different columns
Mike
__________________________________
Mike Lewis (Edinburgh, Scotland)
Visual FoxPro articles, tips and downloads
RE: Separate name from different columns
I don't see how you could split these examples up without some kind of look up for common double word names though.
If you take Mr Musk's childrens naming scheme into account it gets even worse!
Regards
ing
Griff
Keep
I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
There is no place like G28 X0 Y0 Z0
RE: Separate name from different columns
I have a vague sense that the name splitter might have come from Sue Cunningham, but I'm also thinking of Whil when I remember this, and it may have been someone else entirely. But I remember that it could handle more complex names (though now I'm wondering if that was only about formatting them right, not figuring out first, middle, last).
Tamar
RE: Separate name from different columns
One of my first clients - this was in the early days of PCs when MailMerge was a novelty - wanted to send a personalised letter to each of his customers. So he wrote a little program in MBASIC to parse out the elements of each customer's name.
If the name consisted of exactly one word, he would treat that word as the last name, and address the letter as "Dear <title> + space + <last name>". For all other cases, he devised a scheme which he thought would identify the customer's first name, so that the letter could be addresses simply as "Dear + space + < first name>". He thought he was so clever.
One of his customer's was an official at the Saudi embassy in London. I don't remember that person's actual name, but it was something like Adel bin Salman Al Saud. Unfortunately, it was only after my client had mailed the letters that he realised that the person in question was addressed as "Dear Al".
The moral: Don't rely on an algorithm when dealing with something as sensitive as people's names.
Mike
__________________________________
Mike Lewis (Edinburgh, Scotland)
Visual FoxPro articles, tips and downloads
RE: Separate name from different columns
This is even more true when you can't determine the Last- and Firstname e.g. French names "Jean Michel" or "Ronald Pierre" or "Michel Jeannette" ...
MarK
RE: Separate name from different columns
CODE
Chriss
RE: Separate name from different columns
The reason I ask is that, if it is a one-off and the volume is not huge, I would favour doing it manually. As you can see by the replies in this thread, it is unlikely that an automatic solution would give you a correct result. At the very least, use the code given by Griff above, and then manually review the result and make any necessary adjustments.
Mike
__________________________________
Mike Lewis (Edinburgh, Scotland)
Visual FoxPro articles, tips and downloads
RE: Separate name from different columns
Mike has one point: No matter what volume the current data has, if it's a regularly occurring requirement and the new names you get are usually new names and not ones you already have split up before, then it would be good to program something able to identify lastnames, titles or "gennames", like roman numerals or JR. And there are tools and services online, you could use. Just one result from a google search:
https://workspace.google.com/marketplace/app/split...
It's a google spreadsheet add on to split a name column into parts. No idea how well it does so for your purpose, but you could give it a try or search for others.
Chriss
RE: Separate name from different columns
Steve
RE: Separate name from different columns
In fact JOHNNY BASACA seems the only normal 2 part name. But I guess the sample data is handpicked to showcase these hard cases.
I'd really be interested in the statistics my query would show, because I guess my 80% estimate of 2 part names is on the low end - just thinking of the not always correct 80/20 rule. Most people have only two names - first and last. Maybe more than one thinks have a middle name, they just almost never use it.
It's also easy to detect and split off a standalone JR or I, II, III, but an I alone may also just be an initial of a name with a missing dot. And that's just the beginning of many problems you can have. Most commonly that first names can also be last names and vice versa. Well, that would be solved if all the full names start with the last name, but I'd not bet on it.
You could still split the 2 part names easily and cross-checking with a database of first and last names to find the most likely order of them unless you have a JAMES DEAN, DEAN MARTIN, or MARTIN JAMES and don't know whether they are DEAN JAMES, MARTIN DEAN, or JAMES MARTIN instead.
Chriss
RE: Separate name from different columns
Thank you for all the replies. I really had a hard time on this, it will only be used when the .dbf file was given to the user maybe monthly? I already have a solution and it doesn't need to be separate anymore because the database has a full name column. Before I found out that it has a full name column, I tried to separate it first in Excel by using text-to-column but it didn't work out as I wanted.
:)