I started looking at Perl for the first time this week. I'm originally a SAS programmer, so I'm trying to translate working SAS code into Perl for data management. This is fairly complicated, so I'm wondering how I might fix the errors.
Thanks!
The code below does not work. These are the steps I am attempting on raw data that is arranged like this:
PersonID Releasedate Daysmissed
1289 -433 2
1289 -323 .
1289 200 62
1291 -199 .
1291 -100 .
1299 300 10
1300 20 5
1. Read in the CSV file and separate the variables into arrays. (This portion worked.)
2. Create max and min functions that return the maximum/minimum of a list, which the algorithm needs.
3. Write the conditional logic that creates two variables (arrays), intdaymiss and daysobs, holding the days in each observation (row) that are relevant to the 182-day time window we're looking at.
PersonID Releasedate Daysmissed IntDayMiss Daysobs
1289 -433 2
1289 -323 .
1289 200 62
1291 -199 .
1291 -100 .
1299 300 10
1300 20 5
4. Finally, create output that looks like this:
PersonID CalculationPerPersonID
1289
1291
1299
1300
Here's my code so far:
## open raw data file
open(FH,"lisinopril3.txt") || die "couldn't open file!";
## open file to output clean data - note the > is essential for windows
open(OUTFILE,">lisinopril3out.txt") || die "Couldn't open the outfile!";
@releasedate = ();
@daysmissed = ();
@patientid = ();
@observation = ();
while($observation = <FH>){
    ($pid, $site, $drug, $daysupply, $qty, $releasedt, $dod, $mgtab, $mgday, $dosechange, $prefill, $postfill, $daysmiss) = split(/\,/,$observation);
    push(@patientid, $pid);
    push(@releasedate, $releasedt);
    push(@daysmissed, $daysmiss);
}
#foreach $item (@daysmissed) {
# print $item."\n"; }
sub max{
    my($max_so_far) = shift @_;
    foreach (@_) {
        if ($_ > $max_so_far) {
            $max_so_far = $_;
        }
    }
    $max_so_far;
}
sub min{
    my($min_so_far) = shift @_;
    foreach (@_) {
        if ($_ < $min_so_far) {
            $min_so_far = $_;
        }
    }
    $min_so_far;
}
@intdaymiss = ();
@daysobs = ();
my @observations = <FH>;
for ($i=1, $i<length(@patientid), $i++){
    #Creating daymissed variables for each current period "interval"
    if (($releasedate[$i-1] < 0) && ($releasedate[$i]< 0)) {
        $intdaymiss[$i]=();
    }
    if (($releasedate[$i-1] < 0) && ($releasedate[$i]>0)) {
        $intdaymiss[$i] = &min($releasedate[$i],$daymissed[$i]);
    }
    if (($releasedate[$i-1]) ge 0) && (($releasedate[$i]) le 182) {
        $intdaymiss[$i] = $daymissed[$i];
    }
    if (($releasedate[$i-1]) < 182) && (($releasedate[$i]) > 182) {
        $intdaymiss[$i] = &max(0,($daymissed[$i]-($releasedate[$i]-182)));
    }
    #Creating daysobserved variables for each current period "interval"
    if (($releasedate[$i-1] < 0) && ($releasedate[$i]< 0)) {
        $daysobs[$i]=();
    }
    if (($releasedate[$i-1] < 0) && ($releasedate[$i]>0)) {
        $daysobs[$i] = $releasedate[$i];
    }
    if (($releasedate[$i-1]) ge 0) && (($releasedate[$i]) le 182) {
        $daysobs[$i] = releasedate[$i]-$releasedate[$i-1];
    }
    if (($releasedate[$i-1] < 182) && ($releasedate[$i]>182)) {
        $daysobs[$i] = 182 - $releasedate[$i-1];
    }
}
foreach $item (@intdaymiss) {
    print $item."\n";
}
close (FH);
close (outfile);
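For comparison, here is a minimal sketch of what the interval loop might look like once the syntax problems are removed. It is not a verified translation of the SAS logic: the four conditions are carried over unchanged and kept as independent ifs (so a later match overrides an earlier one), and, like the code above, it does not reset at PersonID boundaries. The helper name window_days, the $WINDOW constant, and the in-memory @rows sample are made up for illustration, and treating "." as a missing value is an assumption. List::Util is a core module; the // operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;
use List::Util qw(min max);

my $WINDOW = 182;    # length of the observation window, in days

# Hypothetical helper (not from the original code): takes the previous row's
# release date plus the current row's release date and days missed, and
# returns (intdaymiss, daysobs). Both stay undef when no condition applies,
# which covers the "both dates negative" case.
sub window_days {
    my ($prev, $curr, $missed) = @_;

    # Assumption: "." or an empty field means "missing", as in SAS.
    my $has_missed = defined $missed && $missed =~ /^\d+$/;

    my ($int, $obs);

    if ($prev < 0 && $curr > 0) {
        $int = min($curr, $missed) if $has_missed;
        $obs = $curr;
    }
    if ($prev >= 0 && $curr <= $WINDOW) {    # numeric >= and <=, not ge/le
        $int = $missed if $has_missed;
        $obs = $curr - $prev;
    }
    if ($prev < $WINDOW && $curr > $WINDOW) {
        $int = max(0, $missed - ($curr - $WINDOW)) if $has_missed;
        $obs = $WINDOW - $prev;
    }
    return ($int, $obs);
}

# Sample rows from the post: [PersonID, Releasedate, Daysmissed].
my @rows = (
    [1289, -433, 2],   [1289, -323, '.'], [1289, 200, 62],
    [1291, -199, '.'], [1291, -100, '.'], [1299, 300, 10],
    [1300, 20, 5],
);

my (@intdaymiss, @daysobs);

# A C-style for loop separates its three parts with semicolons, and
# scalar(@rows) gives the row count (length() measures a string instead).
for (my $i = 1; $i < scalar(@rows); $i++) {
    ($intdaymiss[$i], $daysobs[$i]) =
        window_days($rows[$i - 1][1], $rows[$i][1], $rows[$i][2]);
}

# Print each row with the two new columns, using "." for missing values.
for my $i (0 .. $#rows) {
    printf "%s %s %s %s %s\n", @{ $rows[$i] },
        $intdaymiss[$i] // '.', $daysobs[$i] // '.';
}
```

The main differences from the code above are the semicolons in the for header, scalar() instead of length(), numeric comparison operators throughout, a consistent variable name for days missed, and the $ sigil on every array element.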