×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Contact US

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

English dictionary fun

English dictionary fun

English dictionary fun

(OP)
I don't know if anyone is still lurking in this forum. If so, here's a little exercise for you:

What are the longest words in English that can be made up from each row of a standard English typewriter keyboard?

So, each word can only contain letters from a single row, but each letter can be used multiple times.

I recently downloaded a file that contains 194,000 English words, and I hope to use it to generate various word games and puzzles. I started out by writing a program to find the answer to the above question. The results are quite interesting.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Hi

This kind of problems I prefer to solve without writing programs.

CODE --> command line

master # grep -xE '[qwertyuiop]+' english3.txt | wc -L
11

master # grep -xE '[qwertyuiop]{11}' english3.txt
proprietory
rupturewort

master # grep -xE '[asdfghjkl]+' english3.txt | wc -L
8

master # grep -xE '[asdfghjkl]{8}' english3.txt
alfalfas
falashas
galahads
haggadah

master # grep -xE '[zxcvbnm]+' english3.txt | wc -L
3

master # grep -xE '[zxcvbnm]{3}' english3.txt
bbc
bmx
mcc
xxv
xxx 

Feherke.
feherke.github.io

RE: English dictionary fun

Probably the best way to solve NYTimes xword puzzles is to get PHDs in literature and history. Personally, I'm too old and too not interested.

Anyway, FWIW, here's my code in the Valid method of the text box:

CODE -->

* Search word list to match input template
* Input characters: a-z or A-Z (case insensitive) are hardwired to their places
*   in the input string
* Leading and trailing spaces are trimmed.
* All other characters (including spaces if embedded) are considered blanks
*   in their respective places except trailing asterisk(s)
* Trailing asterisks are used to add additional word lengths to search.

skeleton = LOWER(ALLTRIM(THISFORM.Text1.Value))
skelLen = LEN(m.skeleton)

* Calc length(s) of words to search
IF RIGHT(m.skeleton,1) = '*'
 minLen = m.skelLen - 1
 FOR ii = m.skelLen-1 TO 1 STEP -1
  IF SUBSTR(m.skeleton,m.ii) = '*'
   minLen = m.minLen - 1
  ELSE
   EXIT
  ENDIF
 ENDFOR
 whereCl = 'LEN(TRIM(Word))>=' + LTRIM(STR(m.minLen)) + ' AND ' ;
         + 'LEN(TRIM(Word))<=' + LTRIM(STR(m.skelLen))
 skelLen = m.minLen
ELSE
 whereCl = 'LEN(TRIM(Word))=' + LTRIM(STR(m.skelLen))
ENDIF

* Form where clause
FOR ii = 1 TO m.skelLen
 cChar = SUBSTR(m.skeleton,m.ii,1)
 IF BETWEEN(m.cChar,'a','z')
  whereCl = m.whereCl + ' AND SUBSTR(Word,'+LTRIM(STR(m.ii))+',1)="' + m.cChar + '"'
 ENDIF
ENDFOR

IF USED('Temp')
 USE IN Temp
ENDIF

* Select words into Temp.dbf
sfx = IIF(THISFORM.Op1.Value=1, '', 'Fr')
SELECT Word ;
  FROM Words&sfx ;
  INTO TABLE Temp ;
  WHERE &whereCl

*WordList.Grid1.RecordSource = 'Temp.dbf'
*WordList.Grid1.Column1.ControlSource = 'Temp.Word'
THISFORM.LbResults.Caption = LTRIM(TRANS(RECCOUNT(),'99,999,999'))
WordList.Grid1.Refresh()

*USE IN Temp

*WITH THISFORM
*.CmdSearch.Enabled = .F.
*.Text1.Value = ''   && Reset
*.Text1.SetFocus()
*ENDWITH 

RE: English dictionary fun

Hi Mike,

Out of curiosity, I'd be interested in your "other" approach.

Steve

RE: English dictionary fun

VB6/VBA version (essentially feherke's solution, but as a program smile )

CODE

Option Explicit

Public Sub example()
    Dim re As RegExp
    Dim mymatches As MatchCollection
    Dim fso As New FileSystemObject
    Dim lp As Long
    Dim max As Long
    Dim pattern As Variant
    
    Set re = New RegExp
    For Each pattern In Array("qwertyuiop", "asdfghjkl", "zxcvbnm")
        max = 0
        With re
            .pattern = "^([" & pattern & "]+)$"
            .MultiLine = True
            .Global = True
            Set mymatches = .Execute(fso.OpenTextFile("d:\downloads\english3.txt", ForReading).ReadAll)
         End With
         For lp = 0 To mymatches.Count - 1
            If Len(mymatches(lp)) > max Then max = Len(mymatches(lp))
         Next
         For lp = 0 To mymatches.Count - 1
            If Len(mymatches(lp)) = max Then Debug.Print pattern & "(" & max & "): " & mymatches(lp)
         Next
     Next
End Sub 

RE: English dictionary fun

(OP)
Feherke,

A good idea to use grep. I didn't think of that. Your results are the same as mine, which is encouraging.

Feherke and Steve,

Here is the highly simplified pseudo-code for my method (I wrote the program in Visual FoxPro, but there is nothing language-specific about it):

CODE -->

Index the table (the word list) on length of word.

Loop through the table in descending order (so, start with the longest word in the table).

For each row on the keyboard:

  From the current word, remove those letters that are also present in the current keyboard row
  (there is a built-in VFP function which does that)

   If the result of the above operation is an empty string

      Record the current word as a hit

   If we have now found a hit for the current row AND the current word is shorter than
   the hit

      We've finished with this row

   End of loop on each row of the keyboard

End of loop on current word 

I don't claim that this is the most efficient method, but it only takes a couple of thousand milliseconds to run (on a local machine).

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

I'm disappointed by what appears to be the answer to the top row of keys.

I've always understood this to be a different answer based on a puzzle question from back in my childhood.

The answer I understood to be correct was somewhat ironic which is why it has stuck with me.

How sure are we of the "correct" answer? It seems to simply be a alternate spelling of another word.

RE: English dictionary fun

(OP)
Considering only words that are in common use in English, here is a summary of the results:

Top row

PROPIETORY

But I am inclined to dispute this. According to Chambers, the preferred spelling is PROPIETARY.

Middle row

ALFALFAS

Since this row only has one vowel, I was surprised to get any interesting result at all (although I am not convinced that ALFALFA has a plural).

Bottom row

Given there are no vowels in this row, it's not surprising not to get any actual words - just abbreviations and Roman numbers.

The longest of the above words has 11 letters. I've now tweaked the program to give all words with ten or more letters. These are as follows (all from the top row):

REPERTOIRE

PERPETUITY

TYPEWRITER

I especially like the last of the above. It's nice that one of the longest words that you can type on one row of a typewriter is TYPEWRITER.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Before I program something I know and like that you can write the word TYPEWRITER just using the upper letter row of a keyboard. I also guess that's not by chance.

For the search at hand you can easily go through all words, to get there most efficiently, I'd look into algorithms for finding anagrams. It's not exactly the task, but I guess you could modify something to not just allow words with all letters but only some of them and allowing repeats.

It doesn't really pay since the word list is quite small and the brute force method already is fast enough.

Chriss

RE: English dictionary fun

@Mike Lewis, you sussed out my ironic answer. I am also noting for future reference that it is not the only word of that length that is possible using the top row. Makes it less special in my mind and I may have to retire this tidbit of worthless trivia or at least not sell it so hard.

RE: English dictionary fun

The question of the longest word using the only the top row of the keyboard came up in an old thread thread1229-1712464: Words with special letter qualities At the time I used an anagram solver to come up with "preprototype" as a possible 12 letter solution. The same anagram solver generated the following ten letter words:

perpetuity
pirrouette
proprietor
repertoire
typewriter

RE: English dictionary fun

(OP)
Thanks for that, Karluk. The thread that you mentioned looks interesting. I'll read the whole thing when I can grab a moment.

PREPROTOTYPE is not in my list of 194,000 words. Maybe it should be. Then again, what does it mean? A prototype is something that exists before the thing that is being prototyped. So is a preprototype something that exists before the prototype?

Ah, I have just googled the word. There are lots of hits, mainly to do with medical research and usability testing. So,yes, let's accept it.

Good to see this thread has generated so much interest.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

(OP)
Moving on from typewriter keys, my next exercise will be to produce "trackword" puzzles. (The name "trackword" might be copyright, so I might have to call it something else.)

The puzzle is based on a 3 x 3 grid of letters - see example below. The aim is:

(a) Find the hidden nine-letter word. You start from any letter, and move through the grid one cell at a time, horizontally, vertically or diagonally, visiting each cell only once.

(b) Using the above navigation rule, find as many words as possible of three letters or more.

Here is an example:

L A R
C O I
T A F


At a quick glance, I can see 16 contained words. But the aim is to write a program that would find as many as possible. I haven't yet thought how to do that, but I imagine it would be quite a bit harder than finding the longest word for each keyboard row. (Feel free to make suggestions.)

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

In fact - or I also could be wrong - I don't get a 9-letter word from this.

Chriss

RE: English dictionary fun

(OP)
Chris, I can assure you that there is (at least) one 9-letter word there, and it is a word that most of us here will be familiar with.

Anyone else managed to find it?

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

(OP)
I said earlier that I had found 16 words. But the more you look at it, the more words jump out at you. My score is now up to 21.

These were all found "manually", as it were. I haven't yet tried to automate the process.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Hi

After all this discussion decided to solve it programmatically too :

CODE --> longest-word.rb

result = []
ARGF.each_line chomp: true do |word|
    ['qwertyuiop', 'asdfghjkl', 'zxcvbnm'].each_with_index do |row, nr|
        next if 0 > longer = word.length <=> result[nr][0].length rescue 1

        if word.chars.all? {|char| row.include? char } then
            result[nr] = [] if longer > 0
            result[nr].push word
        end
    end
end

puts result 

CODE --> command line

master # ruby ruby/longest-word.rb english3.txt 
proprietory
rupturewort
alfalfas
falashas
galahads
haggadah
bbc
bmx
mcc
xxv
xxx 

Feherke.
feherke.github.io

RE: English dictionary fun

>Anyone else managed to find it?

yep. With a program, but not one that properly follows the rules of the game. Yet.

RE: English dictionary fun

Hi

Also found it. Also with a program, also not following the rules. Yet.

For my curiosity, strongm, you also sorted the letters or found other simple alternative ?

Feherke.
feherke.github.io

RE: English dictionary fun

(OP)
What do you both mean by "not following the rules"? Do you mean that you found a word that contains exactly the specified nine letters, and that word just happened to fit the rules? If so, that's perfectly legitimate.

But now try to find all the words (three letters or more) that follow the rules. A bit harder, I think.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Hi

Spoiler (I mean not following the rules like this)

master # ruby -nle 'puts $_ if "aacfilort" == $_.chars.sort * ""' english3.txt 
factorial 

Feherke.
feherke.github.io

RE: English dictionary fun

(OP)
Here is some (simplified) pseudo-code for my first shot at a program to find all the contained words:

CODE -->

Start with a 3 x 3 array containing the puzzle

Initialise the list of hits

Flag all the cells in the array as not yet visited

Loop through the array, looking at each of the nine cells in turn

  Set a global variable (call it gcWord) to an empty string

  Call a subsidiary function, passing the current cell's coordinates

End of the loop through the array

We now have the complete list of words; it remains only to eliminate
any duplicates

-----

Here is the subsidiary function:

If the cell (passed as the parameter) is out of range OR it has 
already been visited

  Exit the function

Flag the current cell as visited

Search the table for words that start with gcWord plus the letter in 
the current cell

If found

  If this is an exact match AND it has more than three letters

    Add the word to the list of hits

  Add the letter in the current cell to the end of gcWord

Else (not found)

  Clear gcWord

  Exit the function

Recursively call the same function for each of the eight adjacent
cells (never mind if some of those cells lie outside the array;
these will be ignored) 

I haven't tried to code this yet. It looks quite complicated, and probably has logical errors. Perhaps someone can suggest a simpler approach?

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

>ou also sorted the letters

Yes, there is some letter sorting involved smile

RE: English dictionary fun

(OP)
Feherke, I'm impressed with the conciseness of your Ruby code.I can't imagine writing it in VFP in such a small amount of code.

That said, I feel it must be relatively easy to get the nine-letter word, if you "don't follow the rules". You would find all the 9-letter words that have exactly the same letters as the puzzle, and then examine them by hand (by eye?) to see which ones fit the rules. (In practice, I would guess most puzzles would only produce one possible word.)

What is more interesting is finding all the possible words (three letters or more) that follow the rules.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Mike,

hehe, you failed to see my spoiler.

Spoiler:

In fact - or I also

Chriss

RE: English dictionary fun

>What is more interesting is finding all the possible words (three letters or more) that follow the rules.

Working on it ...

RE: English dictionary fun

Hi

Mike, I wrote a proper solver in Ruby for the trackword puzzle. I would suggest to start a thread dedicated to that one. Discussing multiple puzzles in the same thread will become hard to read.

Feherke.
feherke.github.io

RE: English dictionary fun

Another idea for the 3x3 grid: Find which letters (and their positioning) yield the most words.
Presumabley by symmetry there will be 8 solutions you can boil down to 1 by excluding rotated and mirrored solutions.

Chriss

RE: English dictionary fun

(OP)

Quote (Chriss)

hehe, you failed to see my spoiler

Oh,no. How did I miss that? A neat answer, Chris. You'll be writing cryptic crossword clues next.

Quote (Feherke)

I would suggest to start a thread dedicated to that one [Trackword]. Discussing multiple puzzles in the same thread will become hard to read.

I take your point, Feherke. I think I'll just finish this one by posting the complete solution to the above puzzle (once I've finished my program to solve it - unless anyone else gets there first). I might perhaps start a new thread after that, if anyone is interested.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Mike,

yes, I'd also agree on more new threads. The puzzles forum isn't very popping so it doesn't hurt to have multiple threads on the same theme of word puzzles, does it?

Chriss

RE: English dictionary fun

(OP)
I've temporarily given up trying to write a fully-automatic solver (using recursion, based on the pseudo-code I posted above). It's proving more difficult than I thought.

So I've now down a semi-automatic brute force method. This produces all the words all of whose letters appear in the puzzle, but not necessarily following the navigation rules. I then manually inspected them and eliminated those that don't follow the rules.

Here then is the solution. There are 38 words in the solution, compared to 16 that I found by trying to solve the puzzle by myself. I've checked that all of the 38 are present in Chambers.

Spoiler:


ACT, ACTOR, AIR, ARIA, CAR, CAROL, CLOT, COAL, COAT, COIR, COR, CORAL, COT, FACT, FACTOR, FACTORIAL, FAIR, FAT, FIAT, FOAL, FOCAL, IOTA, LAC, LAIR, LOAF, LOIR, LOT, LOTA, OAF, OAR, OAT, ORAL, RIOT, ROC, ROT, ROTA, TACO, TAFIA

Thanks for all your feedback on this. I'll try to improve my program when I can grab some more time.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Hi

My solution found 90 words. ( Their sorted lowercase list, one word per line, has MD5 d15bf996dbea633df799d048355dfa2a. ) Checked only one randomly : you missed "cairo".

Feherke.
feherke.github.io

RE: English dictionary fun

(OP)
I manually eliminated CAIRO because I decided to disallow proper nouns. This is the advantage of creating a puzzle. It gives you complete power over the rules.

Were your 90 words all ones that followed the navigation rules?

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Mike, I haven't started to program, yet. But my idea differs and led me to the variation of the question I suggested earlier:
Populate a structure with 1-9, create all valid numeric combinations, replace the digits with letters and find whether these are valid words.

This is just the brute force version of it, though, and likely can be optimized further. You only need to start from 1,2, and 5, because starting from anywhere else is just creating a path that's a rotation or mirror of the paths starting at these 3 digits.

One other idea I had would create all subpaths with just 2 or 3 digits (or letters) and find these combinations in words. You could also start to replace letters with digits and exclude any words which don't fully translate into the 9 digits have any digit twice.

I think a combination of these ideas could take the problem on from both sides of a list of allowed words (the dictionary file minus proper nouns per your rule) and from the geometry (the graph). The general idea is any method roughly creating a set of solutions can be combined to find the intersection or union.

Chriss

RE: English dictionary fun

>This produces all the words all of whose letters appear in the puzzle, but not necessarily following the navigation rules

Yep, that's where I started ...

And I also gave up (perhaps temporaril;y) with the recursion I was then going to use to verify them (it is, as you say, not quite as simple as recursing down a directory tree). Considering alternatives currently. But sounds like feherke is way ahead of the game smile


RE: English dictionary fun

Hi

Quote (Chriss)

Populate a structure with 1-9, create all valid numeric combinations, replace the digits with letters and find whether these are valid words.
That is how I also did it. With great help from Ruby at generating the combinations :

Spoiler (trackword.rb)

word_maze = ARGV[0]
word_list = File.readlines 'english3.txt', chomp: true

3.upto 9 do |length|
    [*0...9].permutation length do |order|
        previous = nil
        next unless order.all? {|i| next if previous and ((previous / 3 - i / 3).abs > 1 or (previous % 3 - i % 3).abs > 1); previous = i }

        word = order.map {|i| word_maze[i] }.join

        puts word if word_list.include? word
    end
end 
You pass the letter sequence as parameter, concatenated in a single word :

CODE --> command line

ruby trackword.rb larcoitaf 
For this output I changed it abit to avoid posting 90 lines :

CODE --> output

3 letters : lar lac lao lor lot act air aia rac rai roc rot ria rio ria car col cor coi cot cat oar ora oca oca oat oaf ira tol tor tac tao tai act aia air for foi fir fat
4 letters : lair loir lota loaf aria acta raca rota rial riot clot caro cola coal cora coir coif coat cato oral octa iota tola tori taco atoc foal fora fiar fiat fact fair
5 letters : ariot actor claro clair calor carol cairo coral cairo taira tafia acari actor focal facto
6 letters : lariat factor
7 letters :
8 letters :
9 letters : factorial 
Note that words that can be tracked on 2 different paths are included twice : act, actor, aia, air, cairo, oca, ria.

Quote (Mike)

I manually eliminated CAIRO because I decided to disallow proper nouns.
Well, I guess the "cairo" on line 21504 of english3.txt refers to the cairo graphics library :

CODE --> command line

master # grep -xn cairo english3.txt 
21504:cairo 


Feherke.
feherke.github.io

RE: English dictionary fun

(OP)

Quote (Strongm)

sounds like feherke is way ahead of the game

That's right. And so in Chriss. I need to get my head round their approach. It's not there yet.

Quote (Feherke)

I guess the "cairo" on line 21504 of english3.txt refers to the cairo graphics library

Silly of me. I should have thought of that. (I could argue that the name of a software product is a proper noun. But I won't)

Quote (Feherke)

Note that words that can be tracked on 2 different paths are included twice

That would happen with all the methods discussed. You would have to do a de-dupe:

CODE -->

SELECT DISTINCT cWord FROM Solution 


which also puts them into alphabetical order.

Beofre looking again at the above suggestions, I'm going to have one more shot at a brute-force method.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

(OP)
So here's what I've got in mind now.

I start with my existing semi-automatic brute force method which produces all the words all of whose letters appear in the puzzle, but not necessarily following the navigation rules. In the case of the FACTORIAL puzzle, that produces 165 words.

For each result, I then apply a test for adjacent letters. From the dictionary word, I take each pair of consecutive letters in turn. I search the puzzle word for each occurrence of the first letter of the pair (the puzzle word consists of the nine letters in the grid, in the order in which they appear in the grid; so in this case it would be LARCOITAF).

I then do a CASE structure (or a series of IFs, depending on the language). It would look something like this:

Position of first        Test the letters at these
letter of the pair       positions in the puzzle
within the puzzle        word against the second
word (1-based)           letter in the pair

   1                       2, 4, 5
   2                       1, 3, 4, 5, 6
   3                       2, 5, 6
   4                       1, 2, 5

    and so on
 

If the second letter of the pair is NOT at one of the positions indicated, then the test fails. Otherwise, continue with the next pair. If you get to the end and the test has not failed, then you have a hit.

I haven't coded this yet. I might have to spend some token time today on some real work, but will have a shot at this when I can grab some time. In the meantime, please keep up with all your other efforts.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Hi

Got a bad idea to make the puzzle more difficult : avoid the path crossing itself. Thus making one of the "cairo" invalid solution :

accepted :     rejected :
L  A  R        L  A  R
     /|          / \/|
    / |         /  /\|
C  O  I        C  O  I
 \   /
  \ /
T  A  F        T  A  F
 
No idea for now how could this be handled efficiently.

Feherke.
feherke.github.io

RE: English dictionary fun

>If you get to the end and the test has not failed, then you have a hit.

Also need to consider that if the test fails there may still be a different starting point that provides a hit (e.g. consider ATOC for current puzzle - fails if we use the first A in the grid, succeeds if we use the 2nd A)

RE: English dictionary fun

strongm,

good point, that's why I thought about working with the 9 digits. You then can create the allowed paths as a digit srting and turn these to letters or you can convert words to digits. In that direction you can take into account that letters may appear in more than one place.

Chriss

RE: English dictionary fun

(OP)

Quote:

if the test fails there may still be a different starting point that provides a hit

Yes, that's why I said "I search the puzzle word for each occurrence of the first letter of the pair." I envisage an outer loop for each consecutive pair; then an inner loop for each occurrence of the first letter of the pair within the puzzle word.

Quote:

Got a bad idea to make the puzzle more difficult : avoid the path crossing itself.

That would make it mind-bogglingly difficult (at least, for me).

I wondered about establishing a rule to say that every solution must go through the centre cell. I think that would make it easier. In any semi-automatic solution, you could greatly reduce the number of words to manually inspect, by discarding all words that don't contain the middle letter.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

(OP)

Quote:

Got a bad idea to make the puzzle more difficult : avoid the path crossing itself.

It occurs to me that that would make it much too easy to solve the first part of the puzzle, that is, finding the nine-letter word. You would have to arrange the puzzle so that the word in question always starts in of the corners, and always follows a horizontal or vertical path (row by row or column by column). In that case, the word would probably jump out at you. And if it didn't, there would only be eight possible paths to check (two directions for each of the four corners).

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

If you work with a graph implementation you can store which node edges cross each other and deny path that have that. Another way extending the idea to always include the middle node: Create all possible graphs with non-crossing edges, this may even be feasible to do manually, any diagonal edge can be slash or backslash, connecting either the middle node to a corner node or an edge node (2,4,6,8) with each other. That means you always have 4 slashes for the diagonals that can be slash or backslash and thus have 2^4= 16 graphs instead of the one where all nodes are connected to all neighbors.

To illustrate it.

That's the graph with all neighbor nodes connected:

CODE

1-2-3
!X!X!
4-5-6
!X!X!
7-8-9 

That's a graph avoiding the crossings, ie avoiding the X and only using slash or backslash connections:

CODE

1-2-3
!/!/!
4-5-6
!\!\!
7-8-9 
It's one of 16 graphs you can traverse in any way without any crossing.


RE: English dictionary fun

Hi

Found the solution for my bad idea posted at 17 Sep 21 09:11.
Actually is pretty simple, just have to record the midpoints between the checked coordinate pairs :
    0   1   2
     0.5 1.5
0   L   A   R
     \ / \ /
 0.5  X   X
     / \ / \
1   C   O   I
     \ / \ /
 1.5  X   X
     / \ / \
2   T   A   F
 

Spoiler (trackword-no-cross.rb)

word_maze = ARGV[0]
word_list = File.readlines 'english3.txt', chomp: true

3.upto 9 do |length|
    [*0...9].permutation length do |order|
        previous = nil
        path = []
        next unless order.all? do |i|
            if previous then
                next if (previous / 3 - i / 3).abs > 1 or (previous % 3 - i % 3).abs > 1
                path.push [previous / 3 + i / 3, previous % 3 + i % 3]
            end
            previous = i
        end
        next if path.uniq!

        word = order.map {|i| word_maze[i] }.join

        puts word if word_list.include? word
    end
end 
This results only 82 words ( again, output changed a bit to avoid posting 82 lines ) :

CODE --> output

3 letters : lar lac lao lor lot act air aia rac rai roc rot ria rio ria car col cor coi cot cat oar ora oca oca oat oaf ira tol tor tac tao tai act aia air for foi fir fat
4 letters : lair loir lota loaf aria acta raca rota rial riot clot caro cola coal cora coir coif coat oral octa iota tola tori taco atoc foal fora fiar fiat fact fair
5 letters : ariot actor claro clair coral cairo taira tafia acari focal
6 letters : lariat
7 letters :
8 letters :
9 letters : 
This way the number od words that can be tracked on 2 different paths gets reduced to the 3 letter words ( which are not affected by my bad idea ) : act, aia, air, oca, ria.

Feherke.
feherke.github.io

RE: English dictionary fun

(OP)
I haven't tried to follow up Feherke's soltuion yet - I may get to it later - but I've been refining my brute force method.

In my previous attempt, I found all words whose letters were all present in the puzzle, but without regard to the navigation rules. In the case of the FACTORIAL puzzle, that produced 165 words. I've now inserted a test for "adjacency", which means that a word will be rejected if no consecutive pair of letters in the word matches a pair of letters in adjacent cells of the puzzle. This brought the word count down to 76.

But I'm not quite there yet. As Strongm pointed out, this won't work if any letter occurs more than once in the puzzle, as is the case with the A in FACTORIAL. In that case, we will get false positives. For example, it finds RAT, even though this does not follow the navigation rule. That's because R is adjacent to A, and A is adjacent to T - but it's a different A.

However, I reckon on always doing a manual review of the results, to eliminate proper nouns, etc. as well as any ultra-obscure words. So I can manually eliminate any words that fail the navigation test at the same time. In this case, that brought the score down from 76 to 38, which matches the figure I got from my initial attempt (as per the solution in post dated 16 Sep 21 15:21 above).

I'm just tidying up my code and adding comments. If anyone is interested, I'll post it here.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

(OP)
So here is the code for my latest solution, as described in my previous post. The code is in Visual FoxPro. For the benefit of those not intimate with the language, I have added some explanatory comments, which I hope you will find helpful.

First, here is the code for my initial brute-force method. This finds all words that contain the correct letters, but fails to take account of the navigation rules. For the moment, disregard the text flagged as TEMP COMMENTED-OUT. This is code that wasn't present in the initial version.

CODE --> VFP

* Trackword solver. Written in Visual FoxPro 9.0.

LOCAL lcWord, lcPuzzle, lcPuzzleMain, lnI, llFound

CREATE CURSOR csrSolution (cWord c(9))
    && In VFP, a "cursor" is a temporary table.

* The Words table contains two columns; cWord is the dictionary word;
* nLen is the length of the word. Both columns are indexed.
IF NOT USED("Words")
    USE Words IN 0
ENDIF
SELECT Words
SET ORDER TO nLen

lcPuzzleMain = "LARCOITAF"
    && Here, we are hard-coding the puzzle word (FACTORIAL); in practice
    && we would pass it in as a parameter or read it in from an
    && external source.
LOCATE FOR nLen = 3
SCAN WHILE nLen <= 9
    * Do a pass or the Words table, looking at all rows whose word length
    * is between 3 and 9.
    lcWord = UPPER(ALLTRIM(Words.cWord))
    lcPuzzle = lcPuzzleMain

    llFound = .F.
    && In VFP, .F. means False and .T. means True

    * For each letter in the puzzle word, remove the corresponding letter
    * (if any) in the dictionary word.
    FOR lnI = 1 TO LEN(lcWord)
        lcLetter = SUBSTR(lcWord, lnI, 1)
        IF AT(lcLetter, lcPuzzle, 1) > 0
            lcPuzzle = STRTRAN(lcPuzzle, lcLetter, "", 1, 1)
            llFound = .T.
        ELSE
            * Letter not present in puzzle word, so skip remainder of this word
            llFound = .F.
            EXIT
        ENDIF
    ENDFOR

    IF llFound
        * All letters in dictionary word are also present in the puzzle word.
    ***    * Now test for adjacency    TEMP COMMENTED-OUT
    ***    IF IsAdjacent(lcPuzzleMain, Words.cWord)   TEMP COMMENTED-OUT
            INSERT INTO csrSolution (cWord) VALUES (Words.cWord)
    ***    ENDIF   TEMP COMMENTED-OUT
    ENDIF

ENDSCAN

* Remove duplicates from the solution
SELECT distinct * FROM csrSolution INTO CURSOR csrSolution

* We now have all the valid results in a cursor, which we can copy
* to permanent table or a text file, or use as the source
* for a report. 

In order to take account of the navigation rules, I then added a function to test pairs of letters for adjacency. If a word from the result set fails this test, it is discarded. So, now un-comment the text flagged as TEMP COMMENTED-OUT, then add this function:

CODE -->

FUNCTION IsAdjacent
LPARAMETERS tcPuzzle, tcTestWord
* Returns true if each pair of letters in the test word corresponds
* to an adjacent pair of letters in the puzzle word.

LOCAL lnI, lnJ, lnTestLen, llMatch, lcLeft, lcRight

tcPuzzle = UPPER(ALLTRIM(tcPuzzle))
tcTestWord = UPPER(ALLTRIM(tcTestWord))

* Copy letters of puzzle word to an array; this is done purely to
* simplify the syntax of the code that follows.
LOCAL ARRAY laP(9)
FOR lnI = 1 TO 9
    laP(lnI) = SUBSTR(tcPuzzle, lnI, 1)
ENDFOR

lnTestLen = LEN(tcTestWord)

llMatch = .T.
FOR lnI = 1 TO lnTestLen - 1
    * For each consecutive pair of letters in the test word
    lcLeft = SUBSTR(tcTestWord, lnI, 1)	&& first letter of pair
    lcRight = SUBSTR(tcTestWord, lnI+1, 1)	 && second letter of pair
    FOR lnJ = 1 TO OCCURS(lcLeft, tcPuzzle)
        * For each occurrence of the first letter of the pair within the
        * puzzle, compare the second letter of the pair with each of
        * the adjacent cells. To understand this, it helps to think of the
        * puzzle as a 3 x 3 grid, with the cells numbered row by row. So
        * the letter in cell 1, for example, is being compared with those in
        * cells 2, 4 and 5.
        lnPos = AT(lcLeft, tcPuzzle, lnJ)
        llMatch = ICASE( ;
            lnPos = 1, INLIST(lcRight, laP(2), laP(4), laP(5)), ;
            lnPos = 2, INLIST(lcRight, laP(1), laP(3), laP(4), laP(5), laP(6)), ;
            lnPos = 3, INLIST(lcRight, laP(2), laP(5), laP(6)), ;
            lnPos = 4, INLIST(lcRight, laP(1), laP(2), laP(5), laP(7), laP(8)), ;
            lnPos = 5, INLIST(lcRight, laP(1), laP(2), laP(3), laP(4), laP(6), laP(7), laP(8), laP(9)), ;
            lnPos = 6, INLIST(lcRight, laP(2), laP(3), laP(5), laP(8), laP(9)), ;
            lnPos = 7, INLIST(lcRight, laP(4), laP(5), laP(8)), ;
            lnPos = 8, INLIST(lcRight, laP(4), laP(5), laP(6), laP(7), laP(9)), ;
            lnPos = 9, INLIST(lcRight, laP(5), laP(6), laP(8)))
        IF llMatch
            * The second letter of the pair matches an adjacent cell, so skip any
            * further occurrences of the first letter within the puzzle.
            EXIT
        ENDIF
    ENDFOR

    IF NOT llMatch
        * No match in the current pair of letters, so skip any further pairs.
        EXIT
    ENDIF

ENDFOR

RETURN llMatch

ENDFUNC 


It's not the most elegant code I have ever written, but I hope it makes sense.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

>But I'm not quite there yet.

Nor me. I am at about the same place as you. I have a fairly simple funbction called Walkable that verifies if a word can be walked, but currently does not handle multiple letter occurrences. But only because I am being dumb ...

RE: English dictionary fun

Ok, got it! Now just need to clean up the code and optimise ...

RE: English dictionary fun

(OP)

Quote:

currently does not handle multiple letter occurrences. But only because I am being dumb ...

No you're not. It turns out to be harder than you probably thought.

Quote:

Ok, got it! Now just need to clean up the code and optimise ...

Looking forward to seeing the results. I've also got an idea how to solve the duplicate-letter problem. What I lack right now is the time to do anything about it.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Since the letter position in the 3x3 grid plays a role, I think my suggestion to first work with numbers 1 to 9 and then replace them with the letter at that position would help to figure out whether a letter that's twice in the grid works at the place it is.

I based my try on an idea for a python graph class, which includes a "find_all_paths" method:

https://www.python-course.eu/graphs_python.php

And used that in a Jypiter notebook online:
https://hub.gke2.mybinder.org/user/ipython-ipython...

Which gives about 10000 paths.

Now it becomes a query finding the numbers translated to the words in the word list.

Chriss

RE: English dictionary fun

Sorry, sharing that Jypiter notebook cannot be shared that way, but here's the python code outputting the ~10000 paths:

CODE --> 3

g = { "1" : {"2", "4", "5"},
      "2" : {"1", "3", "4", "5", "6"},
      "3" : {"2", "5", "6"},
      "4" : {"1", "2", "5", "7", "8"},
      "5" : {"1", "2", "3", "4", "6", "7", "8", "9"},
      "6" : {"2", "3", "5", "8", "9"},
      "7" : {"4", "5", "8"},
      "8" : {"4", "5", "6", "7", "9"},
      "9" : {"8", "5", "6"}
    }


graph = Graph(g)

allnodes = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
for startnode in allnodes:
    for endnode in allnodes:
        paths = graph.find_all_paths(startnode, endnode)
        for path in paths:
            print (path) 

Once I had the path list I created data with path and path (digits) translated to wordcandidates. Included english3.txt as table of word and then let SQL do it:

CODE

Select wordcandidates.* From wordcandidates Inner Join words On wordcandidates.Word = words.Word 

wordcandidates has path, word (well, candidate word).

Chriss

RE: English dictionary fun

And from that path list you can deduce the words (including their path on the numpad):

Spoiler:

84236 acari
247 act
847 act
2478 acta
84753 actor
24753 actor
86 ai
26 ai
268 aia
862 aia
863 air
263 air
21 al
25 ao
85 ao
23 ar
2368 aria
23657 ariot
87 at
8754 atoc
4 c
48 ca
42 ca
42635 cairo
48635 cairo
42153 calor
423 car
4235 caro
42351 carol
487 cat
4875 cato
41263 clair
41235 claro
4157 clot
45 co
4521 coal
4587 coat
456 coi
4569 coif
4563 coir
451 col
4512 cola
453 cor
4532 cora
45321 coral
457 cot
9 f
98 fa
9847 fact
98475 facto
984753 factor
984753621factorial
9863 fair
987 fat
96 fi
9623 fiar
9687 fiat
963 fir
95 fo
9521 foal
95421 focal
956 foi
953 for
9532 fora
6 i
69 if
65 io
6578 iota
632 ira
12 la
124 lac
1263 lair
125 lao
123 lar
123687 lariat
14 lc
15 lo
1589 loaf
1563 loir
153 lor
157 lot
1578 lota
5 o
589 oaf
523 oar
587 oat
548 oca
542 oca
5478 octa
59 of
56 oi
53 or
532 ora
5321 oral
3 r
32 ra
324 rac
3248 raca
326 rai
36 ri
368 ria
362 ria
3621 rial
365 rio
3657 riot
35 ro
354 roc
357 rot
3578 rota
7 t
78 ta
784 tac
7845 taco
78962 tafia
786 tai
78632 taira
785 tao
75 to
751 tol
7512 tola
753 tor
7536 tori

Notice some paths result in the same word.

Chriss

RE: English dictionary fun

Found a teeny bug in mine. Just working on the fix. My code has some similarities to Chriss's in that it usilises numeric paths. The version I have got working deals with repeated letters (as that is not an actual issue with the number ptahs), and as a side effect also doesn't provide duplicate words. In essence:

1. Generate all walkable paths, and represent as numeric strings
2. Use regexp similar to initial puzzle to find all unique 1 - 9 letter words that only use letters from the puzzles word
3. Don't worry that these words are not anagrams of the puzzle word* (e.g. aaa), just check each one against the walkable paths. If there is a match, then we have a legit word


*This means that the letter sorting I mentioned in a previous post is no longer required

RE: English dictionary fun

So, nowhere near as succint as feherke's solution:

CODE

Option Explicit

Public CandidatePath() As String
Dim p As Long

Public Const legitsteps = "12 15 14 21 23 24 25 26 32 35 36 41 42 45 47 48 51 52 53 54 56 57 58 59 62 63 65 68 69 74 75 78 84 85 86 87 89 96 95 98"

' Kick it all off ...
Public Sub DoIt()
    Puzzle "LARCOITAF"
End Sub

Public Sub Puzzle(source As String)
    Dim re As RegExp
    Dim mymatches As MatchCollection
    Dim fso As New FileSystemObject
    Dim lp As Long
    Dim item As Variant
    Dim LegitResults  As New Collection
    
    source = LCase(source)
    
    Permutations ' Generate array containing all legitimate candidate paths (permutationss) of the puzzle

    ' Regexp brute force to get all words that only contain letters from the puzzle word
    Set re = New RegExp
    With re
        .pattern = "^([" & source & "]+)$"
        .MultiLine = True
        .Global = True
        Set mymatches = .Execute(fso.OpenTextFile("d:\downloads\english3.txt", ForReading).ReadAll)
    End With

    ' Originally had a routine hear to prune the word list so that it only contained anagramns and puzzle string (and puzzle substrings).
    ' However this is effectively dealt with by looking for legitimate paths, so we can subsume this functionality into our
    ' path walking verification

    
    ' Check word  fits parameters of puzzle (3 to 9 chars) and is walkable, and add to legitresults if so
    For lp = 0 To mymatches.Count - 1
        If Len(mymatches(lp)) > 2 And Len(mymatches(lp)) <= Len(source) Then
            If NewWalkable(mymatches(lp), source) Then
                LegitResults.Add mymatches(lp), mymatches(lp)
            End If
        End If
    Next

    ' Show results in immediate window
    For lp = 3 To 9
    Debug.Print
        Debug.Print lp; " ";
        For Each item In LegitResults
            If Len(item) = lp Then Debug.Print item; " ";
        Next
    Next
End Sub


' Initiate building walkable permutations of the puzzle grid
' hacked together recursion. Not optimal ...
Sub Permutations()
    ReDim CandidatePath(0)
    p = 0
    RecursePerms "", "123456789"
End Sub

' This does the recursion heavy lifting
Sub RecursePerms(c00, c01)
    Dim pathlen As Long
    Dim lp As Long
    
    If Len(c01) = 0 Then
        pathlen = Verify(c00 & c01)
        If pathlen Then
            ReDim Preserve CandidatePath(p) As String
            If p <> 0 Then
                If CandidatePath(p - 1) = Left(c00 & c01, pathlen) Then p = p - 1 'Try to avoid replicating subpaths
            End If
            CandidatePath(p) = Left(c00 & c01, pathlen) 'Add a verified path to our candidate path list
            p = p + 1
        End If
    Else
        For lp = 1 To Len(c01)
            RecursePerms c00 & Mid(c01, lp, 1), Left(c01, lp - 1) & Right(c01, Len(c01) - lp)
        Next
    End If
End Sub

' Is the proposed solution a viable path?
Public Function Verify(Solution) As Long
    Dim lp As Long
    Dim path As String

    ' Build a path to test
    For lp = 1 To Len(Solution) - 1
        path = path & Mid(Solution, lp, 1) & Mid(Solution, lp + 1, 1) & " "
    Next
    path = Trim(path)
    
    Verify = Len(Solution)
    For lp = 1 To Len(path) Step 3
        If InStr(legitsteps, Mid(path, lp, 2)) = 0 Then
            Verify = (lp \ 3) + 1 ' This is the point the walk fails, capture how far we were successful
            Exit For
        End If
    Next
End Function

Public Function NewWalkable(testword As String, Puzzle As String) As Boolean
    Dim lp As Long
    Dim lp2 As Long
    Dim testee As String
    'Dim source As String
     

    For lp = 1 To UBound(CandidatePath)
        'source = CandidatePath(lp)
        testee = ""
        If Len(CandidatePath(lp)) >= Len(testword) Then
            For lp2 = 1 To Len(testword)
                 testee = testee & LCase(Mid(Puzzle, Mid(CandidatePath(lp), lp2, 1), 1))
                 If InStr(testword, testee) <> 1 Then Exit For
            Next
            If testee = testword Then
               NewWalkable = True
               Exit For
            End If
        End If
    Next
End Function 

RE: English dictionary fun

Reawakaniung this thread temporarily - just interested in the efficiency of Ruby in feherke's solution to the core wordmaze problem. Just curious as to how long it takes to generate all the permutations . For comparison, my recursive VBA routine takes just under 2 seconds which I'd imagine is dramatically slower (but is admittedly far from optimised; could probably be speeded up somewhat if I simply switched to byte arrays rather than using strings)- but just interested to see the scale of the difference.

RE: English dictionary fun

(OP)
I can't speak for Feherke's Ruby solution, but my brute-force Visual FoxPro method takes 0.30 seconds to generate all the valid solutions, including saving the results to a cursor but excluding the time taken to display them on the screen.

A quick update. My program is now fully working, in that it correctly identifies all the solutions. But it still has one fault: it gives false positives in cases where any letter appears more than once in the puzzle. I still have to weed these out by hand. I think I found a way of avoiding that, and have started to amend the code accordingly, but I got side-stepped by some real work (very annoying) and have had to put it on one side for the moment. I want to get back to it as soon as I can.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: English dictionary fun

Why just Ruby?

The Jypiter Python binder I used earlier is already lost and a new binder session I used also already expired. But you can still see it measured ca. 27 ms for finding all paths (without printing them).


I don't know the specs of the server running online, I guess it's having more resources than a usual desktop and without that a timing is less informative. I found it a bit irritating that Python still has no graph implementation, it has dictionaries, sets and the nice concept or arbitrary long integers. But you see you can easily find code for anything.

Before I hijack this as a thread about favorite or most useful programming languages: If your question is driven from whether to step up from VBA to anything else: Yes, pretty much anything is faster. But you see from my profile you can be pretty stuck on legacy languages, too, and it's still lucrative, even becomes more lucrative the more developers turn their back and move to other languages and tools.

I would welcome a discussion and though this forum was meant for logic puzzles more than for puzzling questions about what to learn and use, I think this forum could bring together all kinds of programmers, if pointing to a new thread within your usual forums. The only other mentionable forum under the category "Development Methodologies and Architecture" is forum678: Object-Oriented Analysis & Design. Perhaps there should be a new one more general, as OOP also isn't the only standard paradigm anymore.

Maybe also in forum656: Tech Industry Trends. It's more general than just about programming, but there's nothing wrong about that, too.


Chriss

RE: English dictionary fun

Hi

Well, the earlier posted solution is painfully slow :

CODE --> line

master # time ruby trackword-all-in-maze.rb larcoitaf > /dev/null
0m23.690s 
That because Array#include? is quite slow.

But if we store the word list in a Set instead of Array, becomes faster :

Spoiler:

require 'set'

word_maze = ARGV[0]
word_list = File.readlines('english3.txt', chomp: true).to_set

3.upto 9 do |length|
    [*0...9].permutation length do |order|
        previous = nil
        next unless order.all? {|i| next if previous and ((previous / 3 - i / 3).abs > 1 or (previous % 3 - i % 3).abs > 1); previous = i }

        word = order.map {|i| word_maze[i] }.join

        puts word if word_list.include? word
    end
end 

CODE --> command line

master # time ruby trackword-all-in-maze.rb larcoitaf > /dev/null 
0m0.810s 

Also possible to speed it up by using a single Array#intersection instead of many Array#include?, but that way the words with multiple paths are listed only once :

Spoiler:

word_maze = ARGV[0]
word_list = File.readlines 'english3.txt', chomp: true

candidate_list = []

3.upto 9 do |length|
    [*0...9].permutation length do |order|
        previous = nil
        next unless order.all? {|i| next if previous and ((previous / 3 - i / 3).abs > 1 or (previous % 3 - i % 3).abs > 1); previous = i }

        candidate_list << order.map {|i| word_maze[i] }.join
    end
end

puts word_list.intersection candidate_list 

CODE --> command line

master # time ruby trackword-all-in-maze.rb larcoitaf > /dev/null
0m0.685s 

Feherke.
feherke.github.io

RE: English dictionary fun

Another reason you can't compare the running time is I only measured finding all possible paths. The rest is just an SQL query on the data, I didn't integrate the parts into something running standalone, partly because once you have the data there is no need to not store and reuse it.

Chriss

RE: English dictionary fun

>If your question is driven from whether to step up from VBA to anything else:

Nope. Been programming for over 30 years (although these days more as a dabbler than as a profession). Well aware of strengths and weaknesses of different languages. Was just vaguely curious

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close