Change content of files

(OP)
Hi all, I have a question. I want to read an input file and write its content to a new output file.
The content of my input file is as follows (not limited to these items only; there can be up to hundreds of items):

Device <blank space> Type <blank space> Year <blank space> Status <blank space> Company
electronic/trend/latest/mp3 <blank space> ipod <blank space> 2012 <blank space> secondhand <blank space> apple
electronic/trend/latest/phone <blank space> Samsungs5<blank space> 2014 <blank space> secondhand <blank space> samsung
electronic/trend/latest/laptop <blank space> EliteBook<blank space> 2011 <blank space> secondhand <blank space> hp
electronic/trend/latest/iphone <blank space> iphone6 <blank space> 2014 <blank space> new <blank space> apple
electronic/trend/latest/phone <blank space> Samsungs5<blank space> 2014 <blank space> new <blank space> samsung
electronic/trend/latest/monitor <blank space> xpro <blank space> 2012 <blank space> secondhand <blank space> dell

First, I need to remove the duplicate path + device entries and keep only the last occurrence (in this example Samsungs5 has a duplicate path and device, so only its last occurrence is kept).

Second, I need to display the result in the following format:


Path <blank space> Device <blank space> Type <blank space> Status
electronic/trend/latest <blank space> mp3 <blank space> ipod <blank space> secondhand
electronic/trend/latest <blank space> laptop <blank space> EliteBook<blank space> secondhand
electronic/trend/latest <blank space> iphone <blank space> iphone6 <blank space> new
electronic/trend/latest <blank space> phone <blank space> Samsungs5<blank space> new
electronic/trend/latest <blank space> monitor <blank space> xpro <blank space> secondhand

The Year and Company columns and the duplicate item are removed, and only the last occurrence of Samsungs5 is displayed.

thanks in advance :)

RE: Change content of files

Hi merang,
I would use a hash (a Tcl array) to print only the unique device types. Here is an example:

merang.tcl

CODE

# input file
set fname "merang.txt"
set input_file [open $fname "r"]
# output file
set new_fname "merang_result.txt"
set output_file [open $new_fname "w"]

while { [gets $input_file line] != -1 } {
  # skip empty or whitespace-only lines
  if {[string trim $line] eq ""} { continue }
  # return a list with the substrings matched by the regex
  set line_list [regexp -all -inline {\S+} $line]
  # extract fields from list 
  foreach {path_device type year status company} $line_list {}
  # extract path and device from path_device
  set path_device_list [split $path_device "/"]
  set path [join [lrange $path_device_list 0 end-1] "/"]
  set device [lindex $path_device_list end]
  # create output line
  set out_line "$path $device $type $status"

  # create or overwrite hash entry
  set myhash($type) $out_line

  # write line to the screen
  puts "input_line:   '$line'"  
  puts "path_device = '$path_device'"
  puts "path        = '$path'"
  puts "device      = '$device'"
  puts "type        = '$type'"
  puts "year        = '$year'"
  puts "status      = '$status'"
  puts "company     = '$company'"
  puts "*"
  puts "output line:  '$out_line'"
  puts "***"
}

# print hash entries to the screen and in the file
puts "\nHash entries:"
foreach key [array names myhash] {
  # print all hash values key => list
  puts "$key => '$myhash($key)'"
  puts $output_file $myhash($key)
}

# close files
close $input_file
close $output_file 

Now when I have this input file
merang.txt

CODE

electro/trend/latest/mp3     ipod      2012 secondhand apple
electro/trend/latest/phone   Samsungs5 2014 secondhand samsung
electro/trend/latest/laptop  EliteBook 2011 secondhand hp
electro/trend/latest/iphone  iphone6   2014 new        apple 
electro/trend/latest/phone   Samsungs5 2014 new        samsung
electro/trend/latest/monitor xpro      2012 secondhand dell 

and run the script above, it prints some output to the screen and produces the output file merang_result.txt:

CODE

C:\_mikrom\Work>tclsh merang.tcl
input_line:   'electro/trend/latest/mp3     ipod      2012 secondhand apple'
path_device = 'electro/trend/latest/mp3'
path        = 'electro/trend/latest'
device      = 'mp3'
type        = 'ipod'
year        = '2012'
status      = 'secondhand'
company     = 'apple'
*
output line:  'electro/trend/latest mp3 ipod secondhand'
***
input_line:   'electro/trend/latest/phone   Samsungs5 2014 secondhand samsung'
path_device = 'electro/trend/latest/phone'
path        = 'electro/trend/latest'
device      = 'phone'
type        = 'Samsungs5'
year        = '2014'
status      = 'secondhand'
company     = 'samsung'
*
output line:  'electro/trend/latest phone Samsungs5 secondhand'
***
input_line:   'electro/trend/latest/laptop  EliteBook 2011 secondhand hp'
path_device = 'electro/trend/latest/laptop'
path        = 'electro/trend/latest'
device      = 'laptop'
type        = 'EliteBook'
year        = '2011'
status      = 'secondhand'
company     = 'hp'
*
output line:  'electro/trend/latest laptop EliteBook secondhand'
***
input_line:   'electro/trend/latest/iphone  iphone6   2014 new        apple '
path_device = 'electro/trend/latest/iphone'
path        = 'electro/trend/latest'
device      = 'iphone'
type        = 'iphone6'
year        = '2014'
status      = 'new'
company     = 'apple'
*
output line:  'electro/trend/latest iphone iphone6 new'
***
input_line:   'electro/trend/latest/phone   Samsungs5 2014 new        samsung'
path_device = 'electro/trend/latest/phone'
path        = 'electro/trend/latest'
device      = 'phone'
type        = 'Samsungs5'
year        = '2014'
status      = 'new'
company     = 'samsung'
*
output line:  'electro/trend/latest phone Samsungs5 new'
***
input_line:   'electro/trend/latest/monitor xpro      2012 secondhand dell'
path_device = 'electro/trend/latest/monitor'
path        = 'electro/trend/latest'
device      = 'monitor'
type        = 'xpro'
year        = '2012'
status      = 'secondhand'
company     = 'dell'
*
output line:  'electro/trend/latest monitor xpro secondhand'
***

Hash entries:
ipod => 'electro/trend/latest mp3 ipod secondhand'
xpro => 'electro/trend/latest monitor xpro secondhand'
iphone6 => 'electro/trend/latest iphone iphone6 new'
EliteBook => 'electro/trend/latest laptop EliteBook secondhand'
Samsungs5 => 'electro/trend/latest phone Samsungs5 new' 

merang_result.txt

CODE

electro/trend/latest mp3 ipod secondhand
electro/trend/latest monitor xpro secondhand
electro/trend/latest iphone iphone6 new
electro/trend/latest laptop EliteBook secondhand
electro/trend/latest phone Samsungs5 new 

RE: Change content of files

(OP)
Thank you for such a good explanation, sir. One question: if some of the paths are electro/trend/latest/phone and other paths are electro/trend/latest/tech/hardware/phone, can the code still be used, sir? I want to get only phone as the device.

Thanks in advance :)

RE: Change content of files

IMO, yes, it could be used, because $device is set to the last "/"-separated element of the path (e.g. electro/../../phone), i.e.:

CODE

set device [lindex $path_device_list end]

But you can easily try it yourself.
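
One way to try it without touching the input file is the minimal sketch below. It applies the same split, lrange and lindex steps to a short and a long path. The proc name split_path_device is only an illustration and is not part of the script above.

```tcl
# Hypothetical helper (not in the original script): split "a/b/c/device"
# into the path part and the trailing device name.
proc split_path_device {path_device} {
    set parts  [split $path_device "/"]
    set path   [join [lrange $parts 0 end-1] "/"]
    set device [lindex $parts end]
    return [list $path $device]
}

puts [split_path_device "electro/trend/latest/phone"]
# prints: electro/trend/latest phone
puts [split_path_device "electro/trend/latest/tech/hardware/phone"]
# prints: electro/trend/latest/tech/hardware phone
```

In both cases the device is the last element, no matter how many path levels come before it.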

For example, when I add a line with a longer path to the input file

CODE

electro/trend/latest/mp3     ipod      2012 secondhand apple
electro/trend/latest/phone   Samsungs5 2014 secondhand samsung
electro/trend/latest/laptop  EliteBook 2011 secondhand hp
electro/trend/latest/iphone  iphone6   2014 new        apple 
electro/trend/latest/phone   Samsungs5 2014 new        samsung
electro/trend/latest/monitor xpro      2012 secondhand dell
electro/trend/latest/tech/hardware/phone Samsungs5 2014 newest samsung 

I get this output:

CODE

electro/trend/latest mp3 ipod secondhand
electro/trend/latest monitor xpro secondhand
electro/trend/latest iphone iphone6 new
electro/trend/latest laptop EliteBook secondhand
electro/trend/latest/tech/hardware phone Samsungs5 newest 

RE: Change content of files

(OP)
I see. Sir, which part of the code detects the same path and device, removes the duplicates and takes only the last occurrence? FYI, the code should remove a duplicate if and only if both the path and the device are the same.

From the above example, we have 3 Samsung lines, right:

electro/trend/latest/phone Samsungs5 2014 secondhand samsung

electro/trend/latest/phone Samsungs5 2014 new samsung

electro/trend/latest/tech/hardware/phone Samsungs5 2014 newest samsung

The program should treat only the first two as duplicates and keep the last occurrence of them, while the line that you added just now should stay. Does the code do that, sir?

RE: Change content of files

If so, you have to modify the code.

RE: Change content of files

(OP)
Is set line_list [regexp -all -inline {\S+} $line] the code that detects the duplicates? Do I need to modify it to meet the purpose of this program?

Or do I need to come up with a new condition? Could you give some example, sir?

Thanks in advance.

RE: Change content of files

To get 2 phones with different paths, we need to modify the source.
So far we used $type as the hash key; now we will use $path_device.
So please modify this line:

CODE

# create or overwrite hash entry
set myhash($path_device) $out_line 


Then we get for this input

CODE

electro/trend/latest/mp3     ipod      2012 secondhand apple
electro/trend/latest/phone   Samsungs5 2014 secondhand samsung
electro/trend/latest/laptop  EliteBook 2011 secondhand hp
electro/trend/latest/iphone  iphone6   2014 new        apple 
electro/trend/latest/phone   Samsungs5 2014 new        samsung
electro/trend/latest/monitor xpro      2012 secondhand dell
electro/trend/latest/tech/hardware/phone Samsungs5 2014 newest samsung 

the following output:

CODE

electro/trend/latest iphone iphone6 new
electro/trend/latest monitor xpro secondhand
electro/trend/latest laptop EliteBook secondhand
electro/trend/latest mp3 ipod secondhand
electro/trend/latest/tech/hardware phone Samsungs5 newest
electro/trend/latest phone Samsungs5 new 

You see there are now two different phones in the output. Is this what you needed?
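
Note that array names returns the keys in no guaranteed order, which is why the output lines are not in input order. If the order matters, one possible variant is sketched below. It assumes Tcl 8.5 or newer and uses a dict, which remembers insertion order, in place of the array. The proc name dedupe_last is made up for this sketch.

```tcl
# Sketch: keep only the last occurrence of each path+device key, in the
# order in which those last occurrences appear (needs Tcl 8.5+ for dict).
proc dedupe_last {lines} {
    set seen [dict create]
    foreach line $lines {
        if {[string trim $line] eq ""} { continue }
        set fields      [regexp -all -inline {\S+} $line]
        set path_device [lindex $fields 0]
        set type        [lindex $fields 1]
        set status      [lindex $fields 3]
        set parts  [split $path_device "/"]
        set path   [join [lrange $parts 0 end-1] "/"]
        set device [lindex $parts end]
        # Drop any earlier entry first so the key moves to the end.
        dict unset seen $path_device
        dict set seen $path_device "$path $device $type $status"
    }
    return [dict values $seen]
}

set lines [list \
    "electro/trend/latest/mp3     ipod      2012 secondhand apple" \
    "electro/trend/latest/phone   Samsungs5 2014 secondhand samsung" \
    "electro/trend/latest/laptop  EliteBook 2011 secondhand hp" \
    "electro/trend/latest/iphone  iphone6   2014 new        apple" \
    "electro/trend/latest/phone   Samsungs5 2014 new        samsung" \
    "electro/trend/latest/monitor xpro      2012 secondhand dell" \
    "electro/trend/latest/tech/hardware/phone Samsungs5 2014 newest samsung"]

foreach out [dedupe_last $lines] { puts $out }
```

With this input the sketch prints six lines: mp3, laptop, iphone, the "new" phone, monitor, and finally the phone with the longer path.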

RE: Change content of files

(OP)
Yes, it answers my question, sir. I have modified the code with my own code to fulfill the purpose of my question.

For the header I use another command to set the header. Is it possible to take the header from the input file?

Device <blank space> Type <blank space> Year <blank space> Status <blank space> Company
electronic/trend/latest/mp3 <blank space> ipod <blank space> 2012 <blank space> secondhand <blank space> apple

the expected Output:

Path <blank space> Device <blank space>Type <blank space>Status
electro/trend/latest iphone iphone6 new

thanks :)

RE: Change content of files

Quote (merang)


is it possible to use the input file as the header?

Of course it's possible.
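
A rough sketch of one way to do it is below. It assumes the first line of the input file is the header, and that the first column name (here "Device") labels the combined path/device column. The proc name make_header and its keep argument are made up for this example.

```tcl
# Sketch: derive the output header from the input file's header line.
# Assumption: the first column holds path/device, so its name is
# expanded to "Path <name>"; only the names listed in keep are added.
proc make_header {header_line keep} {
    set names [regexp -all -inline {\S+} $header_line]
    set out   [list Path [lindex $names 0]]
    foreach name $keep {
        if {[lsearch -exact $names $name] >= 0} {
            lappend out $name
        }
    }
    return [join $out " "]
}

puts [make_header "Device Type Year Status Company" {Type Status}]
# prints: Path Device Type Status
```

In the script above, the header line could be read with one extra gets $input_file header_line before the while loop, and the result written to the output file before the data lines.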

RE: Change content of files

(OP)
Hi,
I have referred to your code and tried my own code, and it works well. I see that this code is hardcoded. I want it to become "softcoded": for example, if another input file has additional elements besides the previous ones (Device Type Year Status Company Price), how do I modify the previous code so that it becomes more robust or universal? That is, for any input file, regardless of added elements, I want to produce the same output without hardcoding.

I think I should use different arrays to access each element. I would like to know your opinion, with an example if possible. Thanks in advance.

RE: Change content of files

IMO, when you have more fields in the line, you only need to change the field extraction method, that is to replace this loop

CODE

# extract fields from list 
foreach {path_device type year status company} $line_list {} 

with another method.

RE: Change content of files

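I looked at it. With more than five fields per line, the foreach would iterate again and overwrite the variables with the extra fields, so use lindex and extract only the fields you need: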

CODE

...
# extract fields from list 
#foreach {path_device type year status company} $line_list {}
set path_device [lindex $line_list 0]
set type [lindex $line_list 1]
set year [lindex $line_list 2]
set status [lindex $line_list 3]
set company [lindex $line_list 4]
... 

now for the modified input file (with longer lines)

CODE

electro/trend/latest/mp3     ipod      2012 secondhand apple   
electro/trend/latest/phone   Samsungs5 2014 secondhand samsung 111  and other junk
electro/trend/latest/laptop  EliteBook 2011 secondhand hp 222 and other junk
electro/trend/latest/iphone  iphone6   2014 new        apple 333  and other junk
electro/trend/latest/phone   Samsungs5 2014 new        samsung 123 and other junk
electro/trend/latest/monitor xpro      2012 secondhand dell 223 and other junk
electro/trend/latest/tech/hardware/phone Samsungs5 2014 newest samsung 323 and other junk 

the program delivers the same result as before:

CODE

electro/trend/latest iphone iphone6 new
electro/trend/latest monitor xpro secondhand
electro/trend/latest laptop EliteBook secondhand
electro/trend/latest mp3 ipod secondhand
electro/trend/latest/tech/hardware phone Samsungs5 newest
electro/trend/latest phone Samsungs5 new 
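
To see why the original foreach line stops working on longer lines, here is a small sketch with a made-up seven-element list. It also shows lassign, which exists in Tcl 8.5 and newer, as another way to take only the leading fields. The variable names are for illustration only.

```tcl
# Sample list with seven elements but only five variable names.
set line_list {a b c d e f g}

# foreach runs twice: the second pass assigns f and g, then pads with "".
foreach {v1 v2 v3 v4 v5} $line_list {}
puts "foreach: v1='$v1' v2='$v2' v5='$v5'"
# prints: foreach: v1='f' v2='g' v5=''

# lassign (Tcl 8.5+) assigns once and returns the unassigned remainder.
set rest [lassign $line_list w1 w2 w3 w4 w5]
puts "lassign: w1='$w1' w5='$w5' rest='$rest'"
# prints: lassign: w1='a' w5='e' rest='f g'
```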

RE: Change content of files

(OP)
Yes, because the loop is hardcoded, it is fixed for those particular elements only. It is hard to change the elements in the loop every time the input file changes.

To make the code more robust, can you give me an example of the other method that you mentioned before?

RE: Change content of files

Look at the solution in my previous post.

RE: Change content of files

(OP)
Hmm, let's say the price is not added at the end but in any position. Then I need to change the example that you gave in the previous post, because it assigns <set year [lindex $line_list 2]>. I need to modify it to something like <set price [lindex $line_list 2]> so that I can print the price, right? Using this method I still need to change the code every time the input file changes (correct me if I am wrong).

If we use a multidimensional array such as <set myhash($path_device, $type, $year, $status, $company) $out_line>, is it possible to access any element easily that way?

RE: Change content of files

Hi merang,
I posted some examples to help you get started with solving your problem.
Now I hope you understand how the program works and will be able to make such small changes yourself.

Quote:


I still need to change the code everytime the input file is changed (correct me if I am wrong)

No, but you should first analyze exactly what type of input lines are possible and which fields are relevant for your output.
Then either modify the source I posted so that it will satisfy your needs, or write your own.
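
As a possible starting point, the sketch below looks up fields by header name, so a new column such as Price can appear at any position without changing the code. It assumes the first line of the file is a header, that the combined path/device column is named "Device", and that no field contains spaces. The proc names column_index and convert_line and the sample price are made up for this example.

```tcl
# Sketch: look up fields by header name instead of by fixed position.

# Build a dict that maps each header name to its column index.
proc column_index {header_line} {
    set idx [dict create]
    set i 0
    foreach name [regexp -all -inline {\S+} $header_line] {
        dict set idx $name $i
        incr i
    }
    return $idx
}

# Convert one data line to "path device type status" using the name map.
proc convert_line {idx line} {
    set fields      [regexp -all -inline {\S+} $line]
    set path_device [lindex $fields [dict get $idx Device]]
    set type        [lindex $fields [dict get $idx Type]]
    set status      [lindex $fields [dict get $idx Status]]
    set parts  [split $path_device "/"]
    set path   [join [lrange $parts 0 end-1] "/"]
    set device [lindex $parts end]
    return "$path $device $type $status"
}

# Made-up sample: a Price column inserted in the middle of the line.
set idx [column_index "Device Type Price Year Status Company"]
puts [convert_line $idx "electro/trend/latest/mp3 ipod 199 2012 secondhand apple"]
# prints: electro/trend/latest mp3 ipod secondhand
```

The dict needs Tcl 8.5 or newer. If a required header name is missing, dict get raises an error, which is probably better than silently printing the wrong column.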

RE: Change content of files

(OP)
Ok..thanks for your time sir :)

RE: Change content of files

You're welcome!
