A bit of a regex problem

(OP)
I am using a regex to substitute dates appearing in strings to superscript.
The standard format of the strings is a name followed by a period and a date.
ie.
John Smith.1967
Joe Doe.1945
Alan Evans.1943

This works fine although I am sure some of you guys can achieve the same results with a one liner.

CODE

1>	if(($Valid == 0) && ($results[$x] =~ m/\.1/)){
2>		for($k=1800; $k<2030; ++$k){
3>			$DotDate=".$k";
4>			if($results[$x] =~ m/\.$k/){
5>				$results[$x] =~ s/$DotDate/<sup>($k)<\/sup>/g;
6>			}
7>		}
8>	} 

The problem I have is that when line 5 converts a date, if the same date exists in the string, without the period, that is converted too.
I may be missing something, but the routine does not appear to be converting $DotDate (i.e. .1950) but just 1950.
I wish I had used a colon instead of a period but I am sure there is a solution to this.
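
For what it's worth, the symptom described above is consistent with the period in $DotDate being interpolated into the pattern on line 5 unescaped, where it acts as a wildcard that matches any character, including the space before a bare date. A minimal sketch, using a made-up sample string, that compares the unescaped pattern with one wrapped in \Q...\E:

```perl
use strict;
use warnings;

my $k       = 1950;
my $DotDate = ".$k";
my $text    = "Born 1950, see Joe Doe.1950";   # made-up sample string

# Unescaped: the "." in $DotDate is a regex wildcard, so " 1950" matches too.
(my $loose = $text) =~ s/$DotDate/<sup>($k)<\/sup>/g;

# \Q...\E quotes regex metacharacters, so only a literal ".1950" matches.
(my $strict = $text) =~ s/\Q$DotDate\E/<sup>($k)<\/sup>/g;

print "$loose\n";    # Born<sup>(1950)</sup>, see Joe Doe<sup>(1950)</sup>
print "$strict\n";   # Born 1950, see Joe Doe<sup>(1950)</sup>
```

Note that the test on line 4 of the original code already escapes the period (\.$k); only the substitution on line 5 does not.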

Any help greatly appreciated

Keith
www.studiosoft.co.uk

RE: A bit of a regex problem

Hi

I only skimmed your code, but if you are performing 230 regular expression matches to replace every year in the 1800..2030 range, then I would strongly suggest not doing that.

Personally I prefer a single, more generic regular expression which may match false positives too, and then check whether the number found is in range:

CODE --> Perl

$str='
John Smith.1967 something without dot 2014
Joe Doe.1945 something with dot.0666 but too small
Alan Evans.1943 something else
';

$str =~ s!\.(\d{4})!$1~~[1800..2030]?".<sup>$1</sup>":$&!ge;

print $str; 

CODE --> output

John Smith.<sup>1967</sup> something without dot 2014
Joe Doe.<sup>1945</sup> something with dot.0666 but too small
Alan Evans.<sup>1943</sup> something else 

Of course, if I misunderstood your goal, please clarify with more examples.

Feherke.
feherke.ga

RE: A bit of a regex problem

As feherke has already stated, you should simplify your logic by taking advantage of the fact that you can execute code in the replacement side of a substitution (the /e modifier) to selectively replace things. This is pretty close to what he's provided, with just a couple of small differences:

CODE

use strict;
use warnings;

my $data = do {local $/; <DATA>};

$data =~ s{\.(\d{4})\b}{
	($1 >= 1800 && $1 < 2030)
	? "<sup>$1</sup>"
	: $&
}ge;

print $data;

__END__
John Smith.1967
Joe Doe.1945
Alan Evans.1943
Too Much.2040
Too few.1776
no dot 2010
just right.2014 

Outputs

CODE

John Smith<sup>1967</sup>
Joe Doe<sup>1945</sup>
Alan Evans<sup>1943</sup>
Too Much.2040
Too few.1776
no dot 2010
just right<sup>2014</sup> 

- Miller

RE: A bit of a regex problem

IMHO this is not a task for regexes. Split is much faster, cleaner and more readable for such a simple task.

CODE -->

if ($Valid == 0) {
  my ($name, $year) = split /\./, $results[$x];
  $results[$x] = $name . '<sup>' . $year . '</sup>' if $year >= 1800 && $year < 2030;
}
If the string is not in the expected format (e.g. there is no period, or the second period-separated field is not numerically a year in range), the resulting string is unchanged.

http://www.xcalcs.com : Online engineering calculations
http://www.megamag.it : Magnetic brakes for fun rides
http://www.levitans.com : Air bearing pads

RE: A bit of a regex problem

(OP)
WOW! Thanks very much, that is most impressive.
I have used regex for many years but obviously I have only been scratching the surface with regards to their capabilities.
This is the first time I have needed to do this kind of thing, so until now I saw no reason to learn such complexities.

Am I right in my view of how it works?

CODE

s

Everything between the !'s is the search term, in this case a period and 4 decimal digits
!\.(\d{4})!

for $1 = 1800 to 2030
$1~~[1800..2030]

If match print the substituted string (I removed the period from the result)
?"<sup>$1</sup>"

This bit - not sure of
:$&!ge; 
I know global and evaluate right side but what does :$&! do?

Keith
www.studiosoft.co.uk

RE: A bit of a regex problem

(OP)
I couldn't use split as many records contain more than one person's name, associated date and other information.
ie.
Three generations of the family, Joe Doe.1946 with his Father John.Doe.1924 and young Jill Doe.1968.
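
A record like this is where a global substitution has an edge over split, since every ".year" in the line is handled in one pass. A minimal sketch, reusing the pattern from the earlier reply; the helper name superscript_years is made up for illustration, and the 1800 to 2029 range is simply carried over from that reply:

```perl
use strict;
use warnings;

# Hypothetical helper: superscript every ".YYYY" whose year is in range,
# dropping the period, and leave everything else untouched.
sub superscript_years {
    my ($text) = @_;
    $text =~ s{\.(\d{4})\b}{
        ($1 >= 1800 && $1 < 2030) ? "<sup>$1</sup>" : $&
    }ge;
    return $text;
}

my $record = 'Three generations of the family, Joe Doe.1946 with his '
           . 'Father John.Doe.1924 and young Jill Doe.1968.';

print superscript_years($record), "\n";
```

The period between "John" and "Doe" is left alone because it is not followed by four digits.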

Keith
www.studiosoft.co.uk

RE: A bit of a regex problem

Hi

Quote (Keith)

I know global and evaluate right side but what does :$&! do?
Well, the colon ( : ) is part of the ternary operator ( ?: ), the $& ( match variable ) holds the entire substring matched by the last regular expression matching ( regardless whether groups were captured or not ) and the exclamation mark ( ! ) is part of the substitution operator's ( s/// ) delimiters.
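
To make the difference between $& and $1 visible, here is a small sketch that records both for each match before choosing a replacement. The sample string and the @seen array are made up for illustration:

```perl
use strict;
use warnings;

my $text = 'Old.0666 New.1967';   # made-up sample string
my @seen;

# With /e the replacement is Perl code; its last value becomes the new text.
# $& is the whole match (period included), $1 is only the captured digits.
# The ':' branch hands back $&, i.e. the match is put back unchanged.
$text =~ s!\.(\d{4})!
    push @seen, "$& / $1";
    $1 >= 1800 ? "<sup>$1</sup>" : $&;
!ge;

print "$_\n" for @seen;   # .0666 / 0666  then  .1967 / 1967
print "$text\n";          # Old.0666 New<sup>1967</sup>
```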

By the way,

Quote (Keith)


for $1 = 1800 to 2030
$1~~[1800..2030]
Actually that usage of the smartmatch operator ( ~~ ) is more like a grep, not just a for.
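
Spelled out without smartmatch, the test is roughly the grep below: is the number equal to any element of the list? The helper names in_list and in_range are made up for illustration; for whole-number years a plain comparison gives the same answer without building the list.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $year is numerically equal to any list element,
# which is roughly what $year ~~ [1800..2030] tests.
sub in_list {
    my ($year, @list) = @_;
    return scalar grep { $_ == $year } @list;
}

# Hypothetical helper: the same test for whole numbers, as a range comparison.
sub in_range {
    my ($year) = @_;
    return $year >= 1800 && $year <= 2030;
}

for my $year (1799, 1800, 1967, 2030, 2031) {
    printf "%d grep=%s range=%s\n", $year,
        in_list($year, 1800 .. 2030) ? 'yes' : 'no',
        in_range($year)              ? 'yes' : 'no';
}
```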

Feherke.
feherke.ga

RE: A bit of a regex problem

(OP)
Thanks
I will have to do a bit of study of those useful operators.

Keith
www.studiosoft.co.uk

RE: A bit of a regex problem

As of Perl 5.18, smart match is experimental: "It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended."
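
For anyone who wants to keep the one-line form without relying on smartmatch, a sketch of the same substitution with plain numeric comparisons. It keeps the period in the output and the inclusive 1800..2030 range, as in the smartmatch version; the sample data is shortened from that post:

```perl
use strict;
use warnings;

my $str = "John Smith.1967 something without dot 2014\n"
        . "Joe Doe.1945 something with dot.0666 but too small\n";

# Same substitution, with numeric comparisons in place of ~~ [1800..2030].
$str =~ s!\.(\d{4})!($1 >= 1800 && $1 <= 2030) ? ".<sup>$1</sup>" : $&!ge;

print $str;
```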

- Miller
