join to files does not work
(OP)
Hi all,
I have these two .dat files (I only show the first 20 lines for both):
CODE --> awk
GO:0005509 PDCD6
GO:0004672 CDK1
GO:0005524 CDK1
GO:0005634 CDK1
GO:0005737 CDK1
GO:0006468 CDK1
GO:0005615 SERPINB6
GO:0006629 APOC2
GO:0006869 APOC2
GO:0008047 APOC2
GO:0042627 APOC2
GO:0043085 APOC2
GO:0001932 TADA2L
GO:0003677 TADA2L
GO:0005671 TADA2L
GO:0006357 TADA2L
GO:0007067 TADA2L
GO:0008270 TADA2L
GO:0016573 TADA2L
(...)
CODE --> awk
GO:0000001 mitochondrion inheritance
GO:0000002 mitochondrial genome maintenance
GO:0000003 reproduction
GO:0000005 ribosomal chaperone activity
GO:0000006 high affinity zinc uptake transmembrane transporter activity
GO:0000007 low-affinity zinc ion transmembrane transporter activity
GO:0000008 thioredoxin
GO:0000009 alpha-1,6-mannosyltransferase activity
GO:0000010 trans-hexaprenyltranstransferase activity
GO:0000011 vacuole inheritance
GO:0000012 single strand break repair
GO:0000014 single-stranded DNA specific endodeoxyribonuclease activity
GO:0000015 phosphopyruvate hydratase complex
GO:0000016 lactase activity
GO:0000017 alpha-glucoside transport
GO:0000018 regulation of DNA recombination
GO:0000019 regulation of mitotic recombination
GO:0000020 negative regulation of recombination within rDNA repeats
(...)
When I try to make a join for both files, I only get a few results (exactly 10). The complete code is:
CODE --> awk
ls *gene_association* | while read file; do
    echo; echo @@@ Archivo: $file; echo;

    # New file "assoc_specie.txt"
    IFS='_' read -r -a array <<< "$file"
    SPECIE=${array[2]}

    # Filtering comments (!comment...)
    cat $file | grep -v '!' > assoc_$SPECIE.txt;

    gawk 'BEGIN{OFS="\t";FS="\t"}{print $5, $3}' assoc_$SPECIE.txt > goTerms_$SPECIE.dat;

    join goTerms_$SPECIE.dat gene_ontology.dat > join.dat
    echo
done;
I don't know what I am doing wrong, but it's obvious that join is not showing all the results.
Thanks in advance
PS: assoc_specie.txt file has this format (only showing first line):
CODE --> awk
UniProtKB A0A024QZ42 PDCD6 GO:0005509 GO_REF:0000002 IEA InterPro:IPR002048 F HCG1985580, isoform CRA_c A0A024QZ42_HUMAN|PDCD6|hCG_1985580 protein taxon:9606 20160312 InterPro (...)
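A possible cause, going only by the sample above: join expects both inputs to be sorted on the join field, and the first file is not (GO:0005509 comes before GO:0004672). With unsorted input, join can silently skip most of the matching lines. Also, join splits on blanks by default, so the multi-word GO names would be broken into several fields unless a tab delimiter is given. Below is a minimal sketch of sorting both files first and then joining on a tab. It assumes both files are tab-separated (only the first one is known to be, from the gawk OFS); the demo file names and the gene names GENE_A, GENE_B and GENE_C are made up for illustration.

```shell
#!/bin/sh
# Sketch: sort both inputs on the join field before calling join.
set -eu
LC_ALL=C; export LC_ALL      # same collation order for sort and join
TAB=$(printf '\t')           # literal tab, used as the field delimiter

workdir=$(mktemp -d)         # scratch directory for the demo files
cd "$workdir"

# Tiny stand-ins for goTerms_$SPECIE.dat (deliberately unsorted) and gene_ontology.dat.
# GENE_A/B/C are hypothetical; the GO ids and names come from the sample above.
printf 'GO:0000003\tGENE_B\nGO:0000001\tGENE_A\nGO:0000003\tGENE_C\n' > goTerms_demo.dat
printf 'GO:0000001\tmitochondrion inheritance\nGO:0000002\tmitochondrial genome maintenance\nGO:0000003\treproduction\n' > gene_ontology.dat

# Sort each file on field 1 only, using the tab as the separator.
sort -t "$TAB" -k1,1 goTerms_demo.dat  > goTerms_demo.sorted.dat
sort -t "$TAB" -k1,1 gene_ontology.dat > gene_ontology.sorted.dat

# Join on field 1; -t keeps multi-word GO names together as a single field.
join -t "$TAB" goTerms_demo.sorted.dat gene_ontology.sorted.dat > join.dat

cat join.dat
```

In the loop above this would mean sorting goTerms_$SPECIE.dat (and gene_ontology.dat once, outside the loop) before the join line. Setting LC_ALL=C for both commands matters because sort and join have to agree on the ordering.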
RE: join to files does not work
In order to understand recursion, you must first understand recursion.