×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Contact US

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

More English dictionary fun

More English dictionary fun

More English dictionary fun

(OP)
Based on a file that contains 194,000 English words, Mike Lewis mentioned in thread1551-1811770: English dictionary fun

I mentioned a spin off question to his question about all words in the graph

L - A - R
| x | x |
C - O - I
| x | x |
T - A - F
 

What letters in which arrangement allow to build the most words?

I think the sheer number of 26^9 combinations (as letters can occur more than once and their position matters) cannot easily be brute forced.
So I thought of a milder problem of which 9 letter word inn the dictionary yields most words. There still is the issue of needing to test all paths in all the different ways the graph nodes can be populated.
So I thought I at least probe by choosing random paths of length 9 and random 9 letter words from the dictionary.

There are 784 full-length paths and there are 28608 9-letter-words, which may be reduced by palindromes and removing mirrored or rotated paths, but even without such optimizations means analysis of a magnitude of 10 million problems of that type. Could be tested in about 8 hours. Lets say within a day, when assuming 27 ms for a single problem setting is the average.

It could be fun to output the best solution found so far on the way. It will not include situations where no 9-letter word is found, but there might be more shorter words worth doing without the 9-letter-word category. Another idea is combining the letters of all 4 and 5 letter words to have statistics on which letter composition is most used. You can of course get an orientation from letter frequency analysis, but the number of repeated letters and letter combinations can shift this a bit. Many vowels are good, I guess repeated Es would work better but surely the optimum isn't all vowels.

Chriss

RE: More English dictionary fun

(OP)
Oh, that said I don't have done this yet nor will I start right now. But perhaps you bring in your own ideas or puzzles based on the dictionary.

Chriss

RE: More English dictionary fun

Chriss, you mentioned 9-letter palindromes. So I knocked up a quick program to find all the palindromes in the dictionary (ignoring words with three or fewer lettes). The longest are each seven letters.

CODE -->

* Looks for all single-word palindromes with more than three letters.

SELECT UPPER(cWord), nLen FROM Words82K ORDER BY nLen DESC WHERE IsPalindrome(cWord)

FUNCTION IsPalindrome
* Returns .T. if the specified word is a palindrome

LPARAMETERS tcWord

LOCAL lcWord, lcRight, lcReversed, lnI, lnHalfLen

lcWord = ALLTRIM(UPPER(tcWord))

IF LEN(lcWord) <= 3
  * Reject words of three letters or fewer.
  RETURN .F.
ENDIF

* Get the length of half the word, ignoring the middle letter if
* the word has an odd number of letters.
lnHalfLen = INT(LEN(lcWord) / 2)

* Get the right-most half-word and reverse its letters
lcRight = RIGHT(lcWord, lnHalfLen)
lcReversed = ""
FOR lnI = lnHalfLen TO 1 STEP -1
  lcReversed = lcReversed + SUBSTR(lcRight, lnI, 1)
ENDFOR

* Return .T. if the left-hand half matches the reversed
* right-hand half.
RETURN (lcReversed = LEFT(lcWord, lnHalfLen))

ENDFUNC 

Running time (excluding printing the results): 0.73 seconds.

Three longest palindromes:

Spoiler:


DEIFIED, REVIVER, ROTATOR

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

(OP)
Sorry, I meant anagrams, not just palindromes.

Edit: You see? hen you swap letters and apply all arrangements on the 3x3 nodes graph it's sufficient to test one of the anagram words. palindromes obviously are a special case of that.

Chriss

RE: More English dictionary fun

Quote:

Sorry, I meant anagrams, not just palindromes.

Not to worry. It was an interesting exercise to find the longest single-word palindromes.

What I would really like is some automatic way of generating multi-word palindromes, but that would be asking too much. (I'm thinking of things like being able before seeing Panama, or something vaguely like that <g>.)

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

Hi

Quote (Mike Lewis)

The longest are each seven letters.
Maybe I did not understood the challenge, but the longest palindromes in the english3.txt file are 9 letters long.

Spoiler (one-liner)

rev english3.txt | diff --old-line-format= --new-line-format= english3.txt - 

Spoiler (longest palindrome)

malayalam rotavator


Feherke.
feherke.github.io

RE: More English dictionary fun

(OP)
No, my general idea was, what if you don't have given letters. Maybe illustrated this way:

? - ? - ?
| x | x |
? - ? - ?
| x | x |
? - ? - ? 

So which letters do you need to put in to get most words out from traversing legal paths in this graph (the rules were visiting neighbor by neighbor, neither skipping over a node (ie the graph does not connect all letters to each other, only the center letter sees all others as its neighbors) nor visiting nodes twice overall and even not to use an o as oo in a word. And more strict rules someone suggested was not allowing paths to cross, eg not going right, down+left ,right, up+left, as the last step crosses the line of the second step. I won't follow that rule, so I will allow crossing, but you can always modify a task as you like to.

Since you can populate the 9 ? nodes with 26 letters you have 26^9 combinations, some that are mirrored or rotated could be removed, but the magnitude is rather that for the arrangement of any letter alone, and then each can be traversed in all the different allowed paths. The arrangement makes a difference, if you have two Os, but they are not neighbor nodes, you can't build words with oo.

So my thought to make that a bit easier is populate the nodes with 9-letter-words and find the optimal one in its optimal path to allow most other words, down to 1-letter-words and find out which 9-letter word in which arrangement yields the most words. And there anagrams make no difference, nor palindromes, sorry. But that's actually the least important about the task, just a possible way to shrink down the number of cases to process.

(Another edit: I think even anagrams make a difference, because you can have new direct neighbors you didn't have before, so that is even a red herring).

Chriss

RE: More English dictionary fun

Quote (Mike Lewis)

What I would really like is some automatic way of generating multi-word palindromes, but that would be asking too much. (I'm thinking of things like being able before seeing Panama, or something vaguely like that <g>.)

Not sure if you were being circumspect or not but my favorite Palindrome involves Panama.

A Man, A Plan, A Canal, Panama!

RE: More English dictionary fun

Feherke,

Yes, you did understand the challenge. The two words that you identified are indeed palindromes, and are nine letters long. However, they are both proper nouns, Rightly or wrongly, I am now using a slightly shorter word list (downloaded from the same source) that excludes many proper nouns. Maybe I'll go back to the longer one.

kwbMitel,

Yes, of course I was referring to the famous Panama palindrom, and also to the supposed quote re Napoleon's exile: "Able was I ere saw Elba."

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

I was researching Shakespeare's play, Love's Labor's Lost, to see if I might be interested in attending a performance next year. I read that it contains the word 'honorificabilitudinitatibus', which happens to be the longest single word in all of Shakespeare's works. According to Wikipedia, it also happens to be the second longest English word that consists entirely of alternating consonants and vowels.

That suggests a programming challenge to find the longest alternating consonant/vowel or vowel/consonant word in the dictionary used in this thread. It looks to me to be an easy programming assignment, except that I can't think of a simple way to handle the letter 'y'. One would need some artificial intelligence built in to distinguish 'y' as a consonant from 'y' as a vowel. It might be easiest to have a manual step at the end to examine the output and eliminate 'y' words that don't meet the requirements.

RE: More English dictionary fun

Quote:

One would need some artificial intelligence built in to distinguish 'y' as a consonant from 'y' as a vowel.

Alternatively, you could simply establish a rule that says that Y is always a consonant.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

OK. I've just written a quick program to find all words with alternating vowels / consonants (taking Y as a consonant). Ignoring words with fewer than six letters, it find 3,211 words, of which the longest contains 15 letters.

I'll do a bit more testing and tidy up the code before I post it. In the meantime, here is the 15-letter word:

Spoiler:

VERISIMILITUDES

Unless anyone can find a longer one?

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

So here is my code for finding the longest word with alternating vowel/consonants. It's in Visual FoxPro, but should be possible to translate to other languages fairly easily:

CODE -->

* Determines the longest words that contains alternating consontants
* and vowels.

* Create a tempory table to hold the results
CREATE CURSOR csrSolution (cWord c(24), nLen i)
    && assume longest word has 24 letters

* Open the dictionary table
IF NOT USED("Words82K")
    USE Words82K IN 0 ALIAS Words
ENDIF
SELECT Words

* Loop through the table ignoring words with fewer than 6 letters
SCAN FOR LEN(ALLTRIM(cWord)) >= 6

    lcWord = ALLTRIM(UPPER(Words.cWord))

    * replace all vowels with "A" and all consonants with "B"
    lcTest = CHRTRAN(lcWord, "AEIOU", REPLICATE("A", 24))
    lcTest = CHRTRAN(lcTest, "BCDFGHJKLMNPQRSTVWXYZ", REPLICATE("B", 24))

    * Create test pattern: either ABABAB... or BABABA... depending on
    * whether word being tested starts with a vowel or a consonant
    lcPattern = IIF(INLIST(LEFT(lcWord, 1), "A", "E", "I", "O", "U"), ;
        REPLICATE("AB", 12), REPLICATE("BA", 12))

    * Test the word against the test pattern
    IF lcTest == LEFT(lcPattern, LEN(lcTest))
        * We have a hit; write it to the cursoe
        INSERT INTO csrSolution (cWord, nLen) VALUES (lcWord, LEN(ALLTRIM(lcWord)))
    ENDIF

ENDSCAN

* Put into descending order of length
SELECT cWord, nLen FROM csrSolution ORDER BY nLen DESC INTO CURSOR csrSolution

* Display the results
BROWSE NOWAIT 

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

Incidentally, I just modified the program such that Y is treated as a vowel rather than a consontant. I get excactly the same result. So the issue of Y being a vowel or a consonant doesn't arise (at least, not for words with six or more letters).

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

Quote (Mike Lewis)

Incidentally, I just modified the program such that Y is treated as a vowel rather than a consontant. I get excactly the same result. So the issue of Y being a vowel or a consonant doesn't arise (at least, not for words with six or more letters).

In all probability you are right that none of the longer words contain a 'y'. However, I suspect your program doesn't conclusively demonstrate this. I don't see any code that allows for the possibility of multiple 'y's in the same word with at least one 'y' used as a consonant and another 'y' used as a vowel.

RE: More English dictionary fun

Quote:

I don't see any code that allows for the possibility of multiple 'y's in the same word with at least one 'y' used as a consonant and another 'y' used as a vowel.

Agreed. I didn't allow for that possibility. I would need to think about that.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

For what it's worth, I found 313 words from my dictionary that contain more than one "Y". Of course, not many of those will follow the "alternative vowel/consonant" rule, but it should be easy to work out which ones do. I'll do that in due course. But that won't answer the question about words where the "Y" occurs both as a vowel and a consonant.

And in case anyone is interested, the longest word that contains multiple Ys is: encyclopaedically (which my spell-checker has just rejected).

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

I have written my own program to find alternating consonant/vowel words. I suspect that Mike Lewis and I are using different dictionaries, since I found multiple words of 15 or more letters but not the 15 letter word Mike provided earlier. Here is my list:

Hidden:


15 letters
cytomegalovirus
monocotyledones
paramyxoviruses
unimaginatively

16 letters
hypocotyledonary
volatilisability
volatilizability

22 letters
honorificabilitudinity

RE: More English dictionary fun

By the way, there are some words that meet the alternating consonant/vowel criteria that have one 'y' used as a consonant and another 'y' used as a vowel. Examples are 'yakety' and 'yarely'. I happened to notice them towards the tail end of my output. I have no idea if there are longer examples, since the only way I have to detect them is by visually scanning my output.

My program treats every 'y' as a wildcard that can be either a consonant or vowel. Because of this I can find words like 'yakety', but I also get a lot of spurious hits that don't meet the alternating consonant/vowel criteria at all. An example is 'waylays', The first 'y' passes the a/y test as a consonant and the y/l test as a vowel. Similarly for the second 'y'. A more sophisticated program could doubtlessly eliminate these spurious hits within the progamming logic, but I'm not sure I'm sufficiently interested to make the effort.

RE: More English dictionary fun

I guess I was sufficiently interested to make the effort. I found a fairly simple way to make the fix and no longer seem to be getting spurious matches. This also allowed me to produce a list of 107 words of six or more letters that contain more than one 'y'. Visually scanning this list, I found the following words that use 'y' as both a consonant and vowel.

Hidden:

words of six or more letters with 'y' used as both a consonant and vowel
anyway
everyday
everyway
gayety
yakety
yarely

RE: More English dictionary fun

Quote (Karluk)

I suspect that Mike Lewis and I are using different dictionaries, since I found multiple words of 15 or more letters but not the 15 letter word Mike provided earlier.

When I started all this (with my first efforts at a Trackword), I was using a 194,000-word dictionary (http://www.gwicks.net/textlists/english3.zip), which I expect is the one that you are using, Karl. But then I switched to a smaller one (84,000 words: http://www.gwicks.net/textlists/engmix.zip). I can't remember exactly why I switched. It might have been because the bigger one had too many ridiculously obscure words.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

And using that smaller dictionary, her is my list of vowel/consonant alternating words:

CODE -->

15 letters
VERISIMILITUDES

14 letters
RECAPITULATIVE
REHABILITATIVE
VERISIMILITUDE

13 letters
DEMILITARISED 
DEMILITARISES 
DENATURALISED 
DENATURALISES 
DEPOLITICISED 
DEPOLITICISES 
MINERALOGICAL 
NUMEROLOGICAL 
RECAPITALISED 
RECAPITALISES 
RECAPITULATED 
RECAPITULATES 
REHABILITATED 
REHABILITATES 
REHABILITATOR 
TARAMASALATAS 
TOXICOLOGICAL 
UNIMAGINATIVE 


Not all of these are in my "real" dictionary, by the way.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: More English dictionary fun

By comparison, I get the following 21 words of 14 letters. It seems clear that your program not finding any long 'y' words is due in part to your dictionary not bothering to list words formed by adding 'ly' to a shorter word. Thus, you found 'unimaginative' but not 'unimaginatively'

aminosalicylic
deliberatively
deliverability
denominatively
hereditability
heterozygosity
manipulatively
pararosaniline
recapitulative
recapitulatory
recoverability
regeneratively
rehabilitative
relocatability
supererogative
supererogatory
tumorigenicity
vaporisability
vaporizability
verisimilitude
vituperatively

RE: More English dictionary fun

@Mike Lewis
I'm afraid my results using the shorter engmix.zip dictionary don't match yours. Your program doesn't seem to find a bunch of legitimate alternating consonant/vowel words that contain a 'y'. (You also seem to have manually edited out words that are minor spelling variants of another.)

My list using engmix.zip:
15 letters
unimaginatively
verisimilitudes

14 letters
manipulatively
recapitulative
recoverability
rehabilitative
supererogatory
verisimilitude
vituperatively

13 letters
demilitarised
demilitarises
demilitarized
demilitarizes
denaturalised
denaturalises
denaturalized
denaturalizes
depoliticised
depoliticises
depoliticized
depoliticizes
gynecological
imaginatively
ineligibility
inevitability
inimitability
mineralogical
monocotyledon
numerological
recapitalised
recapitalises
recapitalized
recapitalizes
recapitulated
recapitulates
rehabilitated
rehabilitates
rehabilitator
taramasalatas
toxicological
unimaginative

RE: More English dictionary fun

Thanks for letting me know. I'll check my program, and also check which dictionary it was using. I certainly didn't manually edit my list.

I noticed earlier that my list only shows British spellings, not American ones (at a quick glance). Yet your list has both American and British variations, such as "demilitarised" and "demilitarized", which is strange. I'll investigate.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close