suggestions for a spellchecker
(OP)
I've written a small spellchecker.
The script works like this:
after reading each line of the text, it makes corrections
by comparing the text against a dictionary.
When it finds a word that isn't in the dictionary, it generates
one or more suggested corrections and pushes them onto an array.
Here is my problem:
I would like to let the user choose the correct word
from the suggestions. Something like this:
We found the word "wlak" in your text which isn't correct.
The suggested possibilities are:
1. walk
2. work
Type the number associated with the word, or 0 if none of the suggestions is correct.
Then I would like to substitute the chosen word into the original text (writing the result to a new .txt file).
How can I do this?
RE: suggestions for a spellchecker
How can I do this?
http://www.xcalcs.com : Online engineering calculations
http://www.megamag.it : Magnetic brakes for fun rides
http://www.levitans.com : Air bearing pads
RE: suggestions for a spellchecker
I know it's impossible to help without seeing the code, so here it is.
CODE --> perl
TEXT
French Dictionary
RE: suggestions for a spellchecker
-the word as is
-the word with each pair of neighboring characters swapped
-the word with each single character deleted in turn
-the word with one character added at every possible place (but parer changed to parler is a bad example, as parer exists in French!)
-the word with each single character replaced by another one.
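The list of variants above can be sketched roughly as follows. This is only an illustration, not the code from the thread: the helper name candidates, the plain a-z alphabet and the three-word dictionary are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the word itself plus every distinct string
# that is one swap, deletion, insertion or replacement away from it.
sub candidates {
    my ($word) = @_;
    my @letters = ('a' .. 'z');   # assumption: add accented letters for French
    my %seen    = ($word => 1);   # the word as is
    my $len     = length $word;

    for my $i (0 .. $len) {
        my $head = substr($word, 0, $i);
        my $tail = substr($word, $i);

        if ($i < $len) {
            # one character deleted
            $seen{ $head . substr($tail, 1) } = 1;
            # one character replaced by another one
            $seen{ $head . $_ . substr($tail, 1) } = 1 for @letters;
        }
        if ($i < $len - 1) {
            # one character swapped with its neighbor
            $seen{ $head . substr($tail, 1, 1) . substr($tail, 0, 1) . substr($tail, 2) } = 1;
        }
        # one character added at this position
        $seen{ $head . $_ . $tail } = 1 for @letters;
    }
    return sort keys %seen;
}

# Keep only the candidates that exist in the (made-up) dictionary.
my %dictionary  = map { $_ => 1 } qw(walk work talk);
my @suggestions = grep { $dictionary{$_} } candidates('wlak');
print "@suggestions\n";
```

With this word list only "walk" comes out for "wlak", since "work" is two replacements away. Offering it as well would need a second round of edits or a different similarity measure.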
A few notes on your code:
-the dictionary is created twice
-you should close the files you open once you have finished reading them
-grep(/^$filetext[$i]$/, @dictionary) is better written (faster) as $filetext[$i] ~~ @dictionary
-this code
CODE -->
CODE -->
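On the dictionary test mentioned in the notes: both grep and ~~ walk through the whole array for every word, and smartmatch has been flagged as experimental since Perl 5.18. One common alternative, shown here only as a sketch, is to load the dictionary into a hash once and test with exists. The word list and the helper name is_known are made up for the example.

```perl
use strict;
use warnings;

# Assumption: a tiny in-memory list stands in for the real dictionary file.
my @dictionary = qw(marcher parler parer travailler);

# Build the lookup table once; keys are the lowercased words.
my %in_dictionary = map { lc($_) => 1 } @dictionary;

# Hypothetical helper: true if the word is in the dictionary (case-insensitive).
sub is_known {
    my ($word) = @_;
    return exists $in_dictionary{ lc $word };
}

for my $word (qw(Parler parrer)) {
    printf "%s: %s\n", $word, is_known($word) ? 'ok' : 'unknown';
}
```

A hash lookup also avoids interpolating the word into a regex, so characters that are special in a pattern cannot change the result.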
In essence I think that what you are trying to do is a gigantic task (not like going to the moon, but...), unless of course this is a divertissement.
Concerning your question, I think that a possible strategy would be to retain the punctuation (but what about guillemets, apostrophes and ...?) in your text array (possibly using split /\b/, though this will also retain the blanks) and then skip those elements during the analysis. At the end your text is rebuilt with a join of the text array.
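A minimal sketch of that strategy, combined with the numbered prompt from the original question, might look like this. The helper names choose_word and correct_line, the tiny dictionary and the fixed suggestion list are assumptions for the demo, and the answer is read from an in-memory handle so the example runs without typing anything.

```perl
use strict;
use warnings;

# Hypothetical helper: list the suggestions and return the chosen word,
# or the original word if the answer is 0 or not a valid number.
sub choose_word {
    my ($word, $suggestions, $in) = @_;
    print qq{We found the word "$word" in your text which isn't correct.\n};
    print "The suggested possibilities are:\n";
    printf "%d. %s\n", $_ + 1, $suggestions->[$_] for 0 .. $#$suggestions;
    print "Type the number of the word, or 0 to keep the original: ";
    my $answer = <$in>;
    return $word unless defined $answer && $answer =~ /^\s*(\d+)\s*$/;
    return $word if $1 < 1 || $1 > @$suggestions;
    return $suggestions->[ $1 - 1 ];
}

# Hypothetical helper: correct one line while keeping punctuation and blanks.
sub correct_line {
    my ($line, $dictionary, $suggest, $in) = @_;
    my @tokens = split /\b/, $line;           # words AND the separators between them
    for my $token (@tokens) {
        next unless $token =~ /^\w+$/;        # skip punctuation and blanks
        next if $dictionary->{ lc $token };   # known word, nothing to do
        my @suggestions = $suggest->($token);
        next unless @suggestions;             # nothing to offer, leave as is
        $token = choose_word($token, \@suggestions, $in);   # alias: updates @tokens
    }
    return join '', @tokens;                  # rebuild the line
}

# Demo with made-up data; the answer "1" is preloaded instead of typed.
my %dictionary = map { $_ => 1 } qw(i like to walk work);
my $suggest    = sub { $_[0] eq 'wlak' ? qw(walk work) : () };
my $answers    = "1\n";
open my $in, '<', \$answers or die "Cannot open in-memory input: $!";
print correct_line("I like to wlak, really!\n", \%dictionary, $suggest, $in);
```

In a real script $in would be \*STDIN, and each corrected line would be printed to a filehandle opened for writing on the new .txt file. Two caveats: split /\b/ cuts l'eau into l, ' and eau, and accented letters are only treated as word characters if the text has been decoded as characters (for example by reading it through an :encoding(UTF-8) layer), so French text needs more care than this ASCII demo.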
Good luck