I'm attempting to write a script to take Prologue SundayPlus media files and convert them into plain text format, keeping track of the different slides in the file.
SP media files are a type of Rich Text (RTF) file, with backslash codes everywhere to define the font styles and everything. From dissecting these files in Notepad, I found that \par marks a newline and \qc is the code that tells SP a new slide is beginning (media files work kinda like PowerPoint in that they're all made up of slides).
Here's an example input file:
Code:
[#GLOBAL_RECT: rect(0, 0, 1024, 768), #opacity: 100, #SHADOW_ON: 0, #SHADOW_COLOR: rgb( 0, 0, 0 ), #SHADOW_OPACITY: 100, #SHADOW_POSITION: "RB", #SHADOW_OFFSET: [4, 4], #FILE_TYPE: "Song", #title: "", #Author: "", #Copyright: "", #CCLI: "", #Cell1: [#RTF: "{\rtf1\ansi\deff0 {\fonttbl{\f0\fswiss Arial;}{\f1\fmodern Monotype Corsiva;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0
\blue0;\red0\green0\blue224;\red224\green0\blue0;\red224\green0\blue224;\red102\green102\blue153;\red51\green153\blue102;\red0\green255
\blue0;\red255\green255\blue0;\red248\green248\blue0;}{\stylesheet{\s0\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Normal Text;}{\s2\fs24\ql\li0
\ri0\fi0\sb0\sa0\sl0 Normal;}{\s3\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Plain Text;}{\s4\fs130\cf4\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 1;}
{\s5\fs192\cf5\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 2;}{\s6\fs96\cf4 Body Text;}{\s7\b\fs96\cf6 Author;}{\s8\b\fs40 CCLI;}{\s9\b\fs40 Copyright;}
{\s10\fs120\cf7 Lyrics;}{\s11\b\f1\fs144 Title;}}\margl1800 \margr1800 \margt1440 \margb1440 \pard \f0\fs24{\pard \b\f2\fs192\cf8
\qc\par
American Dream\par
Casting Crowns\par
}}", #Align: #center], #CELL2: [#RTF: "{\rtf1\ansi\deff0 {\fonttbl{\f0\fswiss Arial;}{\f1\fnil EUROPA;}{\f2\fmodern Monotype Corsiva;}{\f3\froman Times New Roman;}}{\colortbl
\red0\green0\blue0;\red0\green0\blue224;\red224\green0\blue0;\red224\green0\blue224;\red40\green168\blue208;\red102\green102\blue153;
\red51\green153\blue102;\red0\green255\blue0;\red255\green255\blue0;\red248\green248\blue0;}{\stylesheet{\s0\fs24\ql\li0\ri0\fi0\sb0
\sa0\sl0 Normal Text;}{\s2\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Normal;}{\s3\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Plain Text;}{\s4\fs130\cf5
\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 1;}{\s5\fs192\cf6\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 2;}{\s6\fs96\cf5 Body Text;}{\s7\b\fs96\cf7 Author;}
{\s8\b\fs40 CCLI;}{\s9\b\fs40 Copyright;}{\s10\fs120\cf8 Lyrics;}{\s11\b\f2\fs144 Title;}}\margl1800 \margr1800 \margt1440 \margb1440
\pard \f0\fs24{\pard \b\f3\fs192\cf9\qc\par
All work no play\par
may have made Jack\par
a dull boy\par
}}", #Align: #center], #CELL3: [#RTF: "{\rtf1\ansi\deff0 {\fonttbl{\f0\fswiss Arial;}{\f1\fmodern Monotype Corsiva;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0
\blue0;\red0\green0\blue224;\red224\green0\blue0;\red224\green0\blue224;\red102\green102\blue153;\red51\green153\blue102;\red0\green255
\blue0;\red255\green255\blue0;\red248\green248\blue0;}{\stylesheet{\s0\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Normal Text;}{\s2\fs24\ql\li0
\ri0\fi0\sb0\sa0\sl0 Normal;}{\s3\fs24\ql\li0\ri0\fi0\sb0\sa0\sl0 Plain Text;}{\s4\fs130\cf4\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 1;}
{\s5\fs192\cf5\ql\li0\ri0\fi0\sb0\sa0\sl0 heading 2;}{\s6\fs96\cf4 Body Text;}{\s7\b\fs96\cf6 Author;}{\s8\b\fs40 CCLI;}{\s9\b\fs40 Copyright;}
{\s10\fs120\cf7 Lyrics;}{\s11\b\f1\fs144 Title;}}\margl1800 \margr1800 \margt1440 \margb1440 \pard \f0\fs24{\pard \b\f2\fs192\cf8
\qc\par
but all work no God\par
has left Jack with\par
a lost soul.\par
You can skim through that and easily find where the plaintext is in it. That bit of file data would display, in 96 pt yellow Times New Roman, three slides:
Code:
American Dream
Casting Crowns
--------
All work no play
may have made Jack
a dull boy
--------
but all work no God
has left Jack with
a lost soul
Now first off, here's my code:
Code:
use strict;
use warnings;
use Data::Dumper;
opendir (DIR, ".") or die "Can't open current directory: $!";
foreach my $ptf (sort(grep(/\.ptf$/i, readdir(DIR)))) {
    print "Reading $ptf...\n";
    open (FILE, '<', $ptf) or die "Can't read $ptf: $!";
    my @data = <FILE>;
    close (FILE);
    chomp @data;
    my $out = {}; # Save output in their slide blocks
    my @text = ();
    my $i = -1;
    # Parse out everything except for text
    my $c = 0;
    foreach my $line (@data) {
        $c++;
        print "\t$c\n";
        next unless $line =~ /\\par$/i; # \par tends to mean end-of-lines
        # Look for a keyword that we're starting a new slide.
        if ($line =~ /\\qc/) {
            $i++;
            # print "\a$c\n";
            $out->{$i} = [];
            # Debug output for the one file that isn't splitting correctly
            if ($ptf eq 'I Will Sing Of My Redeemer.ptf') {
                print "$c: $line\n";
            }
        }
        # Skip the -1 paragraph
        next if $i < 0;
        # Remove all unnecessary formatting.
        $line =~ s/\\(\w+)\s*//ig;
        # Change the formatting of special chars.
        $line =~ s~\\:~:~ig; # \: = :
        $line =~ s~\^\^~"~ig; # ^^ = "
        $line =~ s~\^~'~ig; # ^ = '
        $line =~ s~\{~~ig; # {}
        $line =~ s~}~~ig; # {}
        # print "\t$line\n";
        push (@{$out->{$i}}, $line);
    }
    print "$c lines\n";
    print Dumper($out);
    open (OUT, '>', "$ptf.txt") or die "Can't write $ptf.txt: $!";
    # Write each slide, ending it with a <para> marker
    for (my $j = 0; exists $out->{$j}; $j++) {
        print OUT join ("\n", @{$out->{$j}}) . "<para>\n\n";
    }
    close (OUT);
}
closedir (DIR);
On a lot of the files I've tested this against, it works fine. There is one file so far that doesn't split itself up into different slides.
This bit of code should check if \qc exists anywhere in the line, right?
Code:
if ($line =~ /\\qc/) {
I put a debug block inside that if statement to print out helpful details about each matching line, but it seems to skip line 49 in the file I'm working with, when I know that line 49 has a \qc in it:
Code:
{\s10\fs120\cf7 Lyrics;}{\s11\b\fs144 Title;}}\margl1800 \margr1800 \margt1440 \margb1440 \pard \f0\fs24{\pard \b\f2\fs204\cf8\qc I will tell the
It's the last escaped code in that line, right before "I will tell".
I've been trying to debug this for the longest time, but it doesn't seem to be doing much good. I've tried specifically printing out $data[48] to show what's on line 49 to the console window, and it shows exactly what IS on line 49. But for some reason, this IF statement just isn't matching here.
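To narrow it down, here is a minimal sketch that tests line 49 in isolation. It assumes the line is exactly what's pasted above (after chomp), and the names $line49, $ends_with_par and $has_qc are made up just for this test. It runs the same two checks the loop applies, in the same order:

```perl
use strict;
use warnings;

# Assumption: this is line 49 exactly as pasted above, after chomp.
# Single quotes keep every backslash literal.
my $line49 = '{\s10\fs120\cf7 Lyrics;}{\s11\b\fs144 Title;}}\margl1800 \margr1800 \margt1440 \margb1440 \pard \f0\fs24{\pard \b\f2\fs204\cf8\qc I will tell the';

# Same two tests as in the loop, in the same order.
my $ends_with_par = ($line49 =~ /\\par$/i) ? 1 : 0;   # the "next unless" guard
my $has_qc        = ($line49 =~ /\\qc/)    ? 1 : 0;   # the new-slide check

print "ends with \\par: $ends_with_par\n";
print "contains \\qc:   $has_qc\n";
```

If the first check comes out 0 while the second comes out 1, that would suggest the \qc regex itself is fine, and that the line is being dropped by the "next unless ... \par$" guard before the if statement ever sees it.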
The output file looks like this:
I will sing of my Redeemer and His wondrous love to me;<para>
Hymn #539
I Will Sing of
My Redeemer
<para>
Sing, O sing
of my Redeemer,
with His blood He
purchased me;<para>
on the cross He
sealed my pardon,
paid the debt and made me free.
wondrous story,
how, my lost
estate to save,
love and mercy,
He the ransom
freely gave.
of my Redeemer,
with His blood He purchased me;<para>
on the cross He
sealed my pardon,
paid the debt and made me free.<para>
I will praise my
dear Redeemer,
His triumphant
pow'r I'll tell,<para>
how the victory He giveth, over sin and death and hell.<para>
Sing, O sing
of my Redeemer,
with His blood He purchased me;<para>
on the cross He
sealed my pardon,
paid the debt and made me free.<para>
I will sing of my Redeemer and His heav'nly love to me;<para>
He from death to
life hath brought me, Son of God with
Him to be.<para>
Sing, O sing
of my Redeemer,
with His blood He purchased me;<para>
on the cross He
sealed my pardon,
paid the debt and made me free.<para>
The section in the middle with no <para> breaks (from "on the cross He" through "freely gave.") should've been broken into different slides with that <para> tag, but for some reason it's just not matching the \qc on these lines... specifically line 49, which is what I've been debugging with.
Does anybody have any idea what could be wrong?
Thanks in advance.