EXTRACT DATA FROM PARENTHESIS


(OP)
Hi,
I want to retrieve the data between parentheses, but I am getting errors when I run awk. I have 2 files and want to look up each pattern from the 1st file in the 2nd file.

pattern_file.txt
--------------
ABCD
PQRS
XYZ

INPUT FILE.TXT
----------------

CREATE TABLE ABCD
(
A,
B,
C
);
CREATE TABLE PQRS
(
P,
R
);

So here the 1st pattern (ABCD) is read and looked up in the input file; if a match is found, it should return the data between the parentheses:
A,
B,
C
Next comes the pattern PQRS; it matches and gets the result:

P,
R

Next comes XYZ; there is no match, so nothing is returned.

MY CODE
-------
for i in `cat pattern_file.txt`
do
awk '/'{print "$i"}'/{flag=1;next}/);/{flag=0}flag' INPUT_FILE.txt
done

RE: EXTRACT DATA FROM PARENTHESIS

Hi

Try this :

CODE

while IFS='' read -r pattern; do
    echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
        /\(/, /\)/ { if (flag && ! /[()]/) print }
        /\)/ { flag = 0 }
    ' pattern="$pattern" INPUT_FILE.TXT
done < pattern_file.txt 
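
The value is handed over as a `pattern="$pattern"` operand here because a shell variable is not expanded inside a single-quoted awk program, which appears to be what broke the loop in the question. A minimal sketch of the two standard ways to pass a shell value to awk follows; the helper names `match_with_v` and `match_with_operand` and the sample data are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers: $1 = regex taken from the shell, $2 = file to scan.

# Variant 1: -v assigns the variable before the program starts
# (so it is also visible in a BEGIN block).
match_with_v() {
    awk -v pat="$1" '$0 ~ pat' "$2"
}

# Variant 2: a var=value operand is assigned when awk reaches it in the
# argument list, i.e. just before the file that follows it is read.
match_with_operand() {
    awk '$0 ~ pat' pat="$1" "$2"
}

# Illustrative sample data, not taken from a real schema.
sample=$(mktemp)
printf 'CREATE TABLE ABCD\nCREATE TABLE PQRS\n' > "$sample"

match_with_v       PQRS "$sample"
match_with_operand PQRS "$sample"

rm -f "$sample"
```

Either form keeps the awk program itself in single quotes, so the shell never touches it.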

Next time please use TGML tags to make the code ( see [code] .. [/code] ) and input data ( see [pre] .. [/pre] ) easier to read. I would also appreciate it if you avoided posting the same question in multiple forums.

Feherke.
feherke.ga

RE: EXTRACT DATA FROM PARENTHESIS

I flagged the other post and D&D will probably delete it. andrew_121, for future reference, do not post the same question in multiple forums.

==================================
advanced cognitive capabilities and other marketing buzzwords explained with sarcastic simplicity


RE: EXTRACT DATA FROM PARENTHESIS

(OP)
Sure feherke, but I am new to AWK and ran into issues with this scenario. People on the other forum closed the thread without listening to the actual problem.

I am also facing a problem here when running your code. With the given example it works fine, but with real data it fails. I have given sample files below (Input.txt and Pattern.txt); when I run the code I get only the 1st record of each table.

INPUT_FILE.Txt
 
CREATE TABLE XXX_ALUEUTRAN_DB.ABC
 
(
        ABC            DATE NOT NULL,
        ABC_123           VARCHAR(6) NOT NULL,
        ABC_1234          VARCHAR(32) NOT NULL,
        XYZ123          VARCHAR(32) NOT NULL,
        AVC_ID               VARCHAR(32) NOT NULL,
        LTECELL_ID           VARCHAR(32) NOT NULL,
        ID                   VARCHAR(32) NOT NULL
 
);
 
CREATE TABLE XXX_ALUEUTRAN_DB.XYZ
 
(
        START_DATE           DATE NOT NULL,
        START_TIME           VARCHAR(6) NOT NULL,
        SERVER_NAME          VARCHAR(32) NOT NULL,
        ENBEQUIPMENT_ID      VARCHAR(32) NOT NULL,
        ENB_ID               VARCHAR(32) NOT NULL
 
);  


PATTERN_FILE.txt
 
XXX_ALUEUTRAN_DB.ABC
XXX_ALUEUTRAN_DB.XYZ
 


CODE

while IFS='' read -r pattern; do
   # echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
        /\(/, /\)/ { if (flag && ! /[()]/) print }
        /\)/ { flag = 0 }
    ' pattern="$pattern" Input_File.txt > output.txt
done < PatternFile.txt 

Expected Output:

 
ABC,       
ABC_123,   
ABC_1234,  
XYZ123,    
AVC_ID,    
LTECELL_ID,
ID        

 
START_DATE,     
START_TIME,     
SERVER_NAME,    
ENBEQUIPMENT_ID,
ENB_ID         
        
 

Currently, running this code gives me only the 1st record of each table:

CODE

ABC            DATE NOT NULL,
        START_DATE           DATE NOT NULL, 

RE: EXTRACT DATA FROM PARENTHESIS

Hi

I see. The intermediary "(" and ")" are messing up the logic. The easiest fix is to handle "(" only when it is at the beginning of a line and ")" only when it is followed by a semicolon ( ; ) :

CODE --> Awk

$0 ~ pattern { flag = 1 }
/^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
/\);/ { flag = 0 } 

Feherke.
feherke.ga

RE: EXTRACT DATA FROM PARENTHESIS

(OP)
 

 Hi Feherke, thanks, this works fine for my requirement. One more small thing: based on the same pattern I now need to grep from the file below. There will be only 1 entry per pattern in the file, but it may come in different styles: on a single line, or split over 2 or 3 lines. The same statement is shown below in all 3 formats.

 

PRIME_FILE.TXT

CODE

ALTER TABLE XXX_ALUEUTRAN_DB.ABC ADD CONSTRAINT ABC_2_PK
	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);

ALTER TABLE XXX_ALUEUTRAN_DB.ABC ADD CONSTRAINT ABC_2_PK 	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);

ALTER TABLE XXX_ALUEUTRAN_DB.ABC 
    ADD CONSTRAINT ABC_2_PK
	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID); 

I want to include this in the existing code and write the result to a separate file.

CODE

Expected Output
-----------------
(ABC,ABC_123,ABC_1234,ID) 
I am not getting the appropriate result with grep.

CODE

Sample code

while IFS='' read -r pattern; do
   # echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
         /^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
         /\);/ { flag = 0 }
    ' pattern="$pattern" sa.txt >output.txt
##need to add a command which can get the output###
grep $pattern PRIME_FILE.TXT > prime_out.txt     ##single line command to get the output (ABC,ABC_123,ABC_1234,ID)  
done < pat.txt 

RE: EXTRACT DATA FROM PARENTHESIS

Hi

I would go with similar Awk code, using a range pattern :

CODE

awk '
    $0 ~ pattern, /;/ {
        if (match($0, /\([^()]+\)/, found)) print found[0]
    }
' pattern="$pattern" PRIME_FILE.TXT 
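
Note that the three-argument form of match() is a gawk extension. For other awk implementations, a sketch of the same idea using only RSTART and RLENGTH could look like this; the function name `primary_key_list` and the sample statement are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: $1 = table name (used as a regex), $2 = file with ALTER TABLE statements.
primary_key_list() {
    awk '
        $0 ~ pattern, /;/ {
            # Two-argument match() sets RSTART and RLENGTH;
            # substr() then cuts the parenthesised list out of the line.
            if (match($0, /\([^()]+\)/)) print substr($0, RSTART, RLENGTH)
        }
    ' pattern="$1" "$2"
}

# One statement split over three lines, in the style of PRIME_FILE.TXT.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ALTER TABLE XXX_ALUEUTRAN_DB.ABC
    ADD CONSTRAINT ABC_2_PK
    PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);
EOF

primary_key_list 'XXX_ALUEUTRAN_DB.ABC' "$sample"

rm -f "$sample"
```

The range runs from the line that names the table to the first line containing ";", so the one-line, two-line and three-line layouts are all covered.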

Feherke.
feherke.ga

RE: EXTRACT DATA FROM PARENTHESIS

(OP)
 
Hi feherke, thanks a lot. This is working and suits my requirement. The only problem is that AWK has to match the pattern exactly, which is failing now.
 

EXP:

CODE

I have 2 patterns in the pattern file:
XXX_ALUEUTRAN_DB.ABC
XXX_ALUEUTRAN_DB.ABC_1 
 
So when the 1st pattern (XXX_ALUEUTRAN_DB.ABC) comes, it should match only XXX_ALUEUTRAN_DB.ABC, but it matches both 
XXX_ALUEUTRAN_DB.ABC and XXX_ALUEUTRAN_DB.ABC_1, which produces duplicates. I need an exact pattern match.
 



RE: EXTRACT DATA FROM PARENTHESIS

Hi

Nothing the \< and \> anchors can not solve :

CODE --> Awk

$0 ~ "\\<" pattern "\\>" { flag = 1 }
/^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
/\);/ { flag = 0 } 

CODE --> Awk

$0 ~ "\\<" pattern "\\>", /;/ { if (match($0, /\([^()]+\)/, found)) print found[0] } 
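
The \< and \> anchors are GNU regexp operators, so they may not be available in every awk, and the "." inside the table name is still treated as "any character" by the regex. One possible alternative, sketched here under the assumption that the table name always appears as its own whitespace-separated word, is to compare fields as plain strings. The names `exact_body` and `has_name` are made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: $1 = exact table name (plain string), $2 = SQL file.
exact_body() {
    awk '
        # True if any whitespace-separated field equals the name exactly.
        function has_name(   i) {
            for (i = 1; i <= NF; i++)
                if ($i == name) return 1
            return 0
        }
        has_name()   { flag = 1 }
        /^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
        /\);/        { flag = 0 }
    ' name="$1" "$2"
}

# Two tables whose names differ only by a suffix (illustrative data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
CREATE TABLE XXX_ALUEUTRAN_DB.ABC
(
  A DATE NOT NULL
);
CREATE TABLE XXX_ALUEUTRAN_DB.ABC_1
(
  B DATE NOT NULL
);
EOF

# Only the column of the first table is printed, not the one from ABC_1.
exact_body 'XXX_ALUEUTRAN_DB.ABC' "$sample"

rm -f "$sample"
```

This would miss a name that is glued to other characters, for example a "(" written directly after it on the same line.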

Feherke.
feherke.ga

RE: EXTRACT DATA FROM PARENTHESIS

(OP)
Thanks...It is working now

RE: EXTRACT DATA FROM PARENTHESIS

(OP)
Hi feherke,

If I need a small change to the requirement, namely that in the 1st example I want the data including the two parentheses, what do I need to change? With your solution the parentheses are not printed. So I want everything between the pattern and the ; . I think it is a small change... I tried to remove the condition but got a syntax error.

CODE

CREATE TABLE ABCD
(
A,
B,
C
);
CREATE TABLE PQRS
(
P,
R
); 

CODE

earlier output
A,
B,
C 
Now Want

CODE

(
A,
B,
C
) 

Actual code

CODE

awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/) print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" CREATE_TABLE.txt 

RE: EXTRACT DATA FROM PARENTHESIS

Hi

Indeed a small change. The range pattern includes the delimiting lines too; I had explicitly added a condition to skip them, so it only needs to be removed.

CODE --> Awk

$0 ~ "\\<" pattern "\\>" { flag = 1 }
/^\(/, /\;/ { if (flag) print }
/\;/ { flag = 0 } 

Feherke.
feherke.ga

RE: EXTRACT DATA FROM PARENTHESIS

(OP)
Hi,
Thanks for your help. This code works in most scenarios, but in one case it gives different output and breaks.

In the scenario below the code works and gives correct output.
 
Example source data:

CODE

CREATE TABLE ALU.ABCD_1
(
  ABCD VARCHAR(64) NOT NULL ,
 WXYZ NUMERIC(37,10) NOT NULL 
); 

CODE

while IFS='' read -r pattern; do
 
    awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/)  print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" Src_SQL.sql |awk '{print $1}' >Ou.txt
done < PATTERN.txt 

Output

CODE

ABCD
WXYZ 

In the scenario below the same code does not give correct output.
 

CODE

CREATE TABLE ALU.ABC_1
(
  ABC VARCHAR(64) NOT NULL 
, XYZ NUMERIC(37,10) NOT NULL 
); 

CODE

while IFS='' read -r pattern; do
 
    awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/)  print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" Src_SQL.sql |awk '{print $1}' >Ou.txt
done < PATTERN.txt 
Here the output loses the last record: XYZ is missing.
Output

CODE

ABC
, 

The problem occurs when the comma (,) comes before the column name; in the other examples the comma (,) is at the end of the line.
I am also getting an error in the scenario below: the 1st record is dropped.
 

CODE

CREATE TABLE ALU.ABC_1
(  ABC VARCHAR(64) NOT NULL ,
 XYZ NUMERIC(37,10) NOT NULL 
);
OUTPUT
xyz

It drops the 1st line (ABC), because it is on the same line as the opening parenthesis '('.
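
Both symptoms seem to come from relying on line layout: `print $1` returns the comma when the comma leads the line, and the rule that skips the "(" line also skips a column written on that line. One possible way around this, offered only as a sketch and not as a tested answer from the thread, is to strip the delimiters from each line before taking the first word. The function name `column_names` is hypothetical, and the sketch assumes that no column starts on the CREATE TABLE line itself and that the body contains only column definitions:

```shell
#!/bin/sh
# Hypothetical helper: $1 = exact table name (plain string), $2 = SQL file.
column_names() {
    awk '
        # True if any whitespace-separated field equals the name exactly.
        function has_name(   i) {
            for (i = 1; i <= NF; i++)
                if ($i == name) return 1
            return 0
        }
        has_name() { flag = 1; next }        # body starts on the next line
        flag {
            line = $0
            last = (line ~ /\);/)            # statement ends on this line
            sub(/\);.*/, "", line)           # drop the closing ");"
            sub(/^[ \t]*\(/, "", line)       # drop an opening "(" at line start
            sub(/^[ \t]*,/, "", line)        # drop a leading comma
            if (split(line, part) > 0) {     # first remaining word is the column
                col = part[1]
                sub(/,$/, "", col)           # drop a comma glued to the name
                print col
            }
            if (last) flag = 0
        }
    ' name="$1" "$2"
}

# The two problem layouts from the post above.
leading_comma=$(mktemp)
cat > "$leading_comma" <<'EOF'
CREATE TABLE ALU.ABC_1
(
  ABC VARCHAR(64) NOT NULL
, XYZ NUMERIC(37,10) NOT NULL
);
EOF

same_line=$(mktemp)
cat > "$same_line" <<'EOF'
CREATE TABLE ALU.ABC_1
(  ABC VARCHAR(64) NOT NULL ,
 XYZ NUMERIC(37,10) NOT NULL
);
EOF

# Each call prints ABC and XYZ on separate lines.
column_names 'ALU.ABC_1' "$leading_comma"
column_names 'ALU.ABC_1' "$same_line"

rm -f "$leading_comma" "$same_line"
```

A table-level clause inside the body, such as a PRIMARY KEY line, would be reported as if it were a column, so real DDL may need extra filtering.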
