Can't find unicode character property definition via main-> 1

Status
Not open for further replies.

richdthomas

Programmer
Sep 1, 2002
19
GB
Hi,

I have a problem in one of my Perl scripts, which I have reproduced with the following short script:

#!/usr/local/ActivePerl-5.6/bin/perl

my $ARGUMENT=$ARGV[0];

printf("Argument is %s\n",$ARGUMENT);

my @OUTPUT=`dir 2>&1`;

my @GREPPED=grep(/$ARGUMENT/i,@OUTPUT);


Here is the result of running the script with slightly different input arguments:


D:\pc\perl>unicode.pl "\program files"
Argument is \program files
Can't find unicode character property definition via main->r or r.pl at unicode/Is/r.pl line 0

D:\pc\perl>unicode.pl "\p rogram files"
Argument is \p rogram files
Can't find unicode character property definition via main-> or .pl at unicode/Is/ .pl line 0

D:\pc\perl>unicode.pl "\pt rogram files"
Argument is \pt rogram files
Can't find unicode character property definition via main->t or t.pl at unicode/Is/t.pl line 0

D:\pc\perl>unicode.pl "program files"
Argument is program files



As you can see, the problem occurs when the argument has "\p" in it. Also, the character that follows "main->" in the output is the character that follows the "\p" in the input argument.

I have looked at the Perl documentation and I cannot find a special meaning for "\p". Other letters do have a special meaning when placed after the \, e.g. "\n" is a newline. I have ActivePerl installed, and the page where I found all of the \ codes is c:\perl\html\lib\pod\perlop.html.

As I want to keep my script simple, I do not want the user to have to remember that arguments containing \p will not work. It may be that \\p would work, or I could use / as a directory separator instead, but either is something the user would have to remember, and I want to avoid that.

Please could anybody who can help give me some assistance.

Thanks in advance.
 
This is because '\p' means something special inside a regex (just like '\w' and '\d').
From the perlunicode perldoc:
[tt]Named Unicode properties, scripts, and block ranges may be used like character classes via the \p{} "matches property" construct and the \P{} negation, "doesn't match property".[/tt]

You need to 'protect' the variable, and tell perl that everything contained in it is to be taken literally:
[tt]
my @GREPPED=grep(/\Q$ARGUMENT\E/i,@OUTPUT);
[/tt]
jaa
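
For anyone who wants to try this without a Windows `dir` listing, here is a minimal sketch of the same fix. The sample lines in @output and the value of $argument are made up for illustration; quotemeta() is the built-in function form of \Q...\E. On current Perl versions the unescaped match is expected to die because "\pr" is read as a Unicode property named "r", although the exact wording of the error differs between versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the output of `dir`.
my @output = (
    " Directory of D:\\program files\n",
    "01/09/2002  10:00    <DIR>          perl\n",
);

# Hypothetical user input containing a backslash followed by "p".
my $argument = '\program files';

# Escaped: \Q...\E makes every character in $argument literal.
my @literal = grep { /\Q$argument\E/i } @output;

# quotemeta() applies the same escaping ahead of time.
my $escaped = quotemeta $argument;
my @same = grep { /$escaped/i } @output;

# Unescaped: "\pr" is parsed as a Unicode property lookup, so the
# match is wrapped in eval to catch the error instead of aborting.
my @raw = eval { grep { /$argument/i } @output };
my $raw_failed = $@ ? 1 : 0;

printf "literal matches: %d\n",   scalar @literal;
printf "quotemeta matches: %d\n", scalar @same;
printf "unescaped match failed: %s\n", $raw_failed ? "yes" : "no";
```

Escaping at the point of use keeps the command line simple: the user can type the path exactly as it appears, and the script takes care of the regex metacharacters.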
 
Hi Jaa,

Thanks very much for this reply. The protection works in my small reproducible script and also in the original script that I had the problem with.

That one really gave me a headache and it's great to have an answer.

Thanks again,

Richard Thomas.
 