removing items from an array

Status
Not open for further replies.

carpeliam

Programmer
Mar 17, 2000
990
US
I have an array representing all the names of files in a certain directory. I'm extracting certain file names out of text files, and then removing these file names from the array.

for example, say I have an array like this:
Code:
@myarray = ("A File - My File1.txt", "A File - My File2.txt");

and I have a text file listing all of the files to be removed from the list, which includes
Code:
"File=A File - My File1.txt"

I go through the file like so:

Code:
open (FILENAME, "$textfile") or die "Can't open $textfile: $!";
@lines = <FILENAME>;
close FILENAME;
foreach $line (@lines) {
    chop ($substring = substr ($line,  index ($line, "=") + 1));
    @myarray = grep(!/$substring/, @myarray) if ($line =~ /^File/);
}

however, nothing happens... I assume that the 'grep' line would take everything that doesn't have the string and keep it in the array. Is there a better way? Could it be evaluating it like this:

Code:
    @myarray = grep(!/A File - My File1.txt/, @myarray) if ($line =~ /^File/);

instead of this:

Code:
    @myarray = grep(!/A File \- My File1\.txt/, @myarray) if ($line =~ /^File/);

? Could that be the problem?

If anybody has any suggestions, I'd be very grateful... What's the best way to remove individual items from an array in Perl?

I've also tried this:
Code:
    @myarray = grep(!/"$substring"/, @myarray) if ($line =~ /^File/);

and a few of the files are excluded, but not all of them... and I'm not sure why it picks these certain files either.

Thanks for any help you can offer. Liam Morley
lmorley@wpi.edu
"light the deep, and bring silence to the world.
light the world, and bring depth to the silence."
 
to be honest... i don't have the foggiest.. but when you find a good way of doing this I'll be very interested to read this thread.

I'm going to need to do exactly this in the not too distant future (editing dat files for a perl based message board - big eek)
thanks :)

Sib
Siberdude
siberdude@settlers.co.uk
 
Siberdude,

working on the code.. I've made it slightly shorter and hopefully more efficient (by taking the substring through $1 in a list context), but still with the same results. I'm quite sure that the grep line is the problem, as I've printed the value of $substring from inside the statement and all the values are correct (with no unnecessary leading/trailing spaces).

Thanks for the interest.

Liam Morley
 
Code:
@myarray = (
	"A File - My File1.txt",
	"A File - My File2.txt"
	);

$line = "File=A File - My File1.txt\n";

chop ($substring = substr ($line, index ($line, "=") + 1));
@myarray = grep (!/$substring/, @myarray) if ($line =~ /^File/);

foreach (@myarray) {
	print $_, "\n";
}

This works for me...

A few notes:

Perl is interpreting !/$substring/ like this:
@myarray = grep(!/A File - My File1.txt/, @myarray) if ($line =~ /^File/);
which shouldn't be a problem in this case, since the unescaped "." still matches a literal ".". If you want to escape the regular expression metacharacters, use !/\Q$substring\E/
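
As a minimal sketch of where escaping starts to matter (the file names below are made up for illustration and are not from the original problem), compare an unescaped pattern with one wrapped in \Q...\E:

```perl
use strict;
use warnings;

# Hypothetical file names, chosen so that the unescaped "." makes a difference.
my @files     = ("My File1.txt", "My File1_txt.bak", "My File2.txt");
my $substring = "My File1.txt";

# Unescaped: "." matches any character, so "My File1_txt.bak" is dropped as well.
my @loose = grep { !/$substring/ } @files;

# Escaped: only names containing the literal text "My File1.txt" are dropped.
my @strict = grep { !/\Q$substring\E/ } @files;

print "loose:  ", join(" | ", @loose),  "\n";   # My File2.txt
print "strict: ", join(" | ", @strict), "\n";   # My File1_txt.bak | My File2.txt
```

The same applies to characters such as "(", "+" or "[" in a file name, which can change the match or make the pattern fail to compile.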

It is probably better to use chomp rather than chop when you want to remove end-of-line stuff.
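
A small sketch of the difference, assuming one input line that ends in a newline and one that does not (for example, the last line of a file with no trailing newline); the variable names are only for illustration:

```perl
use strict;
use warnings;

my $with_newline    = "File=A File - My File1.txt\n";
my $without_newline = "File=A File - My File1.txt";

# chop always removes the last character, whatever it is.
chop(my $chopped_a = $with_newline);
chop(my $chopped_b = $without_newline);

# chomp removes only a trailing line ending (the current value of $/).
chomp(my $chomped_a = $with_newline);
chomp(my $chomped_b = $without_newline);

print "chop:  [$chopped_a] [$chopped_b]\n";   # the second one loses its final "t"
print "chomp: [$chomped_a] [$chomped_b]\n";   # both keep the full name
```

Note that chomp can be applied to an assignment in the same way as chop, so the one-line style still works.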

Since you want $substring to match the entire array element, it would work better to use:
@myarray = grep ($_ ne $substring, @myarray) if ...;
This would avoid problems when $substring is a substring of more than one element of @myarray.
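
To make that concrete, here is a rough sketch with invented song names where one name is contained in another:

```perl
use strict;
use warnings;

# Hypothetical names: "Song.mp3" is a substring of "Another Song.mp3".
my @files     = ("Song.mp3", "Another Song.mp3", "Third.mp3");
my $substring = "Song.mp3";

# Pattern match: removes every element that merely contains the text.
my @by_regex = grep { !/\Q$substring\E/ } @files;

# String comparison: removes only the element that is exactly equal to it.
my @by_ne = grep { $_ ne $substring } @files;

print join(", ", @by_regex), "\n";   # Third.mp3
print join(", ", @by_ne),    "\n";   # Another Song.mp3, Third.mp3
```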


 
Thanks, "$_ ne $substring" works perfectly. I was thinking that the first argument of grep had to be an expression as opposed to a boolean. That's what I get for using strongly typed languages too much.

As for chomp vs. chop, I like assigning the value in a list context.. it looks nicer to me. so this works fine.

The idea of the program was to find out which MP3s I have which aren't already in a playlist. I'm only concerned about certain playlists, so I renamed all the playlists I'm concerned with to "My Playlist Whatever (good).pls". Now I can figure out which songs aren't already in a playlist and add them to a playlist. The next task is to go through the ID3 tags and automatically create playlists containing songs that are of that style. I probably won't do that, though, as my ID3 tags aren't all filled in.

Here's the final version of the code. If you have any improvements or suggestions, please let me know.

Code:
$dir = "D:/My Music/";
$output_file = "output.txt";
# collect all MP3s and all playlists marked "(good)"
@mp3s = list_files ($dir, '\.mp3');
@playlists = list_files ($dir, '\(good\)\.pls');
foreach $playlist (@playlists) {
	open (PLAYLIST, "$dir$playlist") or die "Can't open $playlist: $!";
	@lines = <PLAYLIST>;
	close PLAYLIST;
	foreach $_ (@lines) {
		# playlist entries look like "File1=Some Song.mp3"
		if (($substring) = /^File\d+=(.+\.mp3)/i) {
			# drop the song that is already in a playlist
			@mp3s = grep($_ ne $substring, @mp3s);
		}
	}
}

# write the songs that are in no playlist, one quoted name per line
open (OUTPUT, ">$dir" . $output_file) or die "Can't open $output_file: $!";
foreach $mp3 (sort @mp3s) { print OUTPUT "\"$mp3\"\n"; }
close OUTPUT;
exit;


# return the names in $dir that end with the given (regex) extension
sub list_files {
	my @mp3list;
	my ($dir, $extension) = @_;
	opendir(SONGDIR, $dir) or die "Can't open $dir: $!";
	while ($file_name = readdir SONGDIR) {
		push (@mp3list, $file_name) if ($file_name =~ /.*$extension$/i);
	}
	closedir SONGDIR;
	return @mp3list;
}
Liam Morley
lmorley@wpi.edu
 
I think your routine will work and in a time acceptable to you.

However using a hash would be more efficient.
Your program is executing grep on your songlist array once for each playlist entry.
Try this algorithm:
- Initialize a hash using the .mp3 filenames as keys
foreach $songname (@mp3s) {
    $mp3s{$songname} = 1;
}
- Delete the hash key for each song found in a playlist
foreach playlist file
    foreach current playlist entry
        delete $mp3s{$songname};

- The remaining hash keys are the songs that are not in any playlist
print "Missing songs: ", join(", ", sort keys %mp3s), "\n";
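
The outline above could be turned into something like the following. This is only a sketch: the helper name songs_not_in_playlists is invented here, and it works on in-memory lists of song names and playlist lines instead of reading the directory and the .pls files.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to the list of song names and a
# reference to the list of playlist lines, returns the songs no playlist mentions.
sub songs_not_in_playlists {
    my ($songs, $playlist_lines) = @_;

    # Initialize the hash with every song name as a key.
    my %unlisted = map { $_ => 1 } @$songs;

    # Delete the key for each song that a playlist line refers to.
    foreach my $line (@$playlist_lines) {
        if (my ($name) = $line =~ /^File\d+=(.+\.mp3)/i) {
            delete $unlisted{$name};
        }
    }

    # Whatever is left was never seen in a playlist.
    return sort keys %unlisted;
}

# Made-up sample data in the same "FileN=name" format used in this thread.
my @mp3s  = ("a.mp3", "b.mp3", "c.mp3");
my @lines = ("[playlist]\n", "File1=b.mp3\n", "File2=a.mp3\n", "NumberOfEntries=2\n");

print "Missing songs: ", join(", ", songs_not_in_playlists(\@mp3s, \@lines)), "\n";
# prints "Missing songs: c.mp3"
```

Each playlist entry now costs one delete instead of one pass over the whole song list.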
 
Hmm... interesting. Wouldn't that be less efficient than using an array as you're only using half of the hash (using the key but with no value to correspond)? It seems to me that a hash is essentially two arrays smacked together as far as memory allocation goes (one for the keys and one for the values).. so I would be interested to know how Perl deals with it when only the keys are used, whether it does this efficiently or not.

You're right, it works, in a time acceptable to me... the concept from the beginning was, "hey, I want something to do something, I'll write something to do it." But this morning, after it was finished, I was thinking... "what would be the most efficient algorithm?" going over worst-case, best-case scenarios in my head. I realized that I was going through every MP3 twice (once to list and another time to see if it was in a playlist).. some time soon when I have time I'll go through the algorithm with a pencil and paper.

Liam Morley
 
Efficiency isn't defined by how much space you use. It's defined by how fast you can use that space. Hash tables are much faster than arrays for this kind of lookup: a hash lookup touches only a handful of entries, whereas finding a file in an array means scanning its elements one by one. If you haven't learned what a hash table is, you should study the difference between hash tables and arrays. I took a course on it last semester: on average you can find an entry in a hash table in roughly constant time, while finding an entry in an unsorted array takes time proportional to n. Since you're not putting anything in the values of the hash table, it's no faster to make an array of filenames than it is to make a hash of filenames. The hash table is essentially a fixed-size array with a hash function that maps each key to a slot, and typically a linked list holding the entries that share a slot.
Steve Kiehl
webmaster@nanovox.com
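
A rough way to see the difference in lookup cost is to count the comparisons a linear scan needs and set that against a single keyed lookup. The song names below are generated just for the sketch, and the count is the worst case where the target is the last element:

```perl
use strict;
use warnings;

my @songs  = map { "song$_.mp3" } 1 .. 1000;   # made-up names
my $target = "song1000.mp3";                   # last element: worst case for a scan

# Array: compare element by element until the name is found.
my $comparisons = 0;
foreach my $song (@songs) {
    $comparisons++;
    last if $song eq $target;
}

# Hash: build the index once, then each lookup is one keyed access.
my %index = map { $_ => 1 } @songs;
my $found = exists $index{$target} ? 1 : 0;

print "array scan needed $comparisons comparisons\n";   # 1000
print "hash lookup found it: $found\n";                 # 1
```

Building the hash is itself a pass over the list, so the gain shows up when many lookups are made against the same list, as in the playlist loop.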
 
I was talking about memory allocation.. and I know what a hash table is.

Liam Morley
 
Even if the hash table is implemented with two arrays, I believe that Perl's strings are dynamic, so the array containing the data is probably taking up little or no space anyway. The only reason you should really be worried about space is if you were doing this on your cell phone or maybe your PDA. For any other purpose on a regular computer, hash tables are just better.
Steve Kiehl
webmaster@nanovox.com
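
On the question of using only the keys: a common Perl idiom is to fill the hash through a slice so that the values are simply left undefined and membership is tested with exists. A minimal sketch, with made-up names, and making no claim about exactly how much memory Perl spends per entry:

```perl
use strict;
use warnings;

my @mp3s = ("a.mp3", "b.mp3", "c.mp3");   # made-up names

# Hash slice assignment: every name becomes a key, every value is undef.
my %is_unlisted;
@is_unlisted{@mp3s} = ();

# exists checks for the key only and never looks at the value.
delete $is_unlisted{"b.mp3"};

print join(", ", sort keys %is_unlisted), "\n";                  # a.mp3, c.mp3
print exists $is_unlisted{"a.mp3"} ? "present\n" : "absent\n";   # present
print exists $is_unlisted{"b.mp3"} ? "present\n" : "absent\n";   # absent
```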
 
I wasn't worried, it was just a theoretical question.. but thank you for your input.

Liam Morley
 