join to files does not work

Hi all,

I have these two .dat files (I only show the first 20 lines for both):

CODE --> awk

GO:0005509	PDCD6
GO:0004672	CDK1
GO:0005524	CDK1
GO:0005634	CDK1
GO:0005737	CDK1
GO:0006468	CDK1
GO:0005615	SERPINB6
GO:0006629	APOC2
GO:0006869	APOC2
GO:0008047	APOC2
GO:0042627	APOC2
GO:0043085	APOC2
GO:0001932	TADA2L
GO:0003677	TADA2L
GO:0005671	TADA2L
GO:0006357	TADA2L
GO:0007067	TADA2L
GO:0008270	TADA2L
GO:0016573	TADA2L

CODE --> awk

GO:0000001	mitochondrion inheritance
GO:0000002	mitochondrial genome maintenance
GO:0000003	reproduction
GO:0000005	ribosomal chaperone activity
GO:0000006	high affinity zinc uptake transmembrane transporter activity
GO:0000007	low-affinity zinc ion transmembrane transporter activity
GO:0000008	thioredoxin
GO:0000009	alpha-1,6-mannosyltransferase activity
GO:0000010	trans-hexaprenyltranstransferase activity
GO:0000011	vacuole inheritance
GO:0000012	single strand break repair
GO:0000014	single-stranded DNA specific endodeoxyribonuclease activity
GO:0000015	phosphopyruvate hydratase complex
GO:0000016	lactase activity
GO:0000017	alpha-glucoside transport
GO:0000018	regulation of DNA recombination
GO:0000019	regulation of mitotic recombination
GO:0000020	negative regulation of recombination within rDNA repeats

When I try to join the two files, I get only a few results (exactly 10). The complete code is:

CODE --> awk

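ls *gene_association* | while read -r file; do
    echo "@@@ File: $file"

    # New file "assoc_specie.txt"
    IFS='_' read -r -a array <<< "$file"
    # NOTE: SPECIE is never assigned in the code as posted; it presumably
    # comes from one element of "array", but which one is not shown.

    # Filtering comments (lines starting with "!")
    grep -v '^!' "$file" > "assoc_$SPECIE.txt"
    # Keep column 5 (GO term) and column 3 (gene symbol), tab-separated
    gawk 'BEGIN{OFS="\t";FS="\t"}{print $5, $3}' "assoc_$SPECIE.txt" > "goTerms_$SPECIE.dat"
    join "goTerms_$SPECIE.dat" gene_ontology.dat > join.dat
done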


I don't know what I am doing wrong, but it's obvious that join is not showing all the results.

Thanks in advance

PS: The assoc_specie.txt file has this format (only the first line is shown):

CODE --> awk

UniProtKB	A0A024QZ42	PDCD6		GO:0005509	GO_REF:0000002	IEA	InterPro:IPR002048	F	HCG1985580, isoform CRA_c	A0A024QZ42_HUMAN|PDCD6|hCG_1985580	protein	taxon:9606	20160312	InterPro

RE: join to files does not work

Both files must be sorted on the join field, so it looks like your first .dat file needs to be sorted before you pass it to join.
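
A minimal sketch of that fix, assuming both files are tab-separated with the GO term in the first field. The helper name sorted_join, the file names, and the sample rows are only illustrative; the descriptions in the second file are placeholders, not real gene_ontology.dat content. LC_ALL=C is set so that sort and join agree on the collation order, and the tab is passed with -t so that descriptions containing spaces stay in one field.

```shell
#!/bin/sh
# Sketch only: names and sample rows are illustrative, not the poster's real data.
set -eu
LC_ALL=C; export LC_ALL      # sort and join must use the same collation order
TAB=$(printf '\t')           # literal tab, used as the field separator

# sorted_join FILE1 FILE2 (hypothetical helper):
# sort each tab-separated file on field 1, then join them on that field.
sorted_join() {
    tmp1=$(mktemp)
    tmp2=$(mktemp)
    sort -t "$TAB" -k1,1 "$1" > "$tmp1"
    sort -t "$TAB" -k1,1 "$2" > "$tmp2"
    join -t "$TAB" "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}

workdir=$(mktemp -d)

# GO term / gene pairs in the unsorted order shown in the question
printf 'GO:0005509\tPDCD6\nGO:0004672\tCDK1\nGO:0005524\tCDK1\n' \
    > "$workdir/goTerms.dat"

# GO term / description pairs; the descriptions are placeholders
printf 'GO:0004672\tplaceholder description A\n' >  "$workdir/gene_ontology.dat"
printf 'GO:0005509\tplaceholder description B\n' >> "$workdir/gene_ontology.dat"
printf 'GO:0005524\tplaceholder description C\n' >> "$workdir/gene_ontology.dat"

joined=$(sorted_join "$workdir/goTerms.dat" "$workdir/gene_ontology.dat")
printf '%s\n' "$joined"

rm -rf "$workdir"
```

With the sample rows above, all three GO terms appear in the output, each followed by its gene and its description. In the original loop the same idea would mean sorting goTerms_$SPECIE.dat (and gene_ontology.dat, if it is not already sorted under the same locale) before the join line.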

In order to understand recursion, you must first understand recursion.
