AWK for file .fastq

Status
Not open for further replies.

WayneSzalinsky

Technical User
Feb 20, 2015
Hi folks!
Here is an example of a .fastq file:

@IRIS:7:1:17:800#0/1
GGAAACACTACTTAGGCTTATAAGATCNGGTTGCGG
+IRIS:7:1:17:800#0/1
ababbaaabaaaaa`]`ba`]`aaaaYD\\_a``XT
@IRIS:7:1:17:150#0/1
TGATGTACTATGCATATGAACTTGTATGCAAAGTGG
+IRIS:7:1:17:150#0/1
abaabaa`aaaaaaa^ba_]]aaa^aaaaa_^][aa
@IRIS:7:1:18:16#0/1
AGGTTCGTGTTGAGTGTTGCCTCTTTTTCTGTTAAT
+IRIS:7:1:18:16#0/1
abb`bbbabbbaWa`baba_Z``babaa[_aa`\X_
@IRIS:7:1:19:1343#0/1
GGCATCTCCAGAGGAGGCTGTACCTGTGGAATAGCA
+IRIS:7:1:19:1343#0/1
abbbbbbbbababaaba_a\FXQXNVKO^F[RWYSQ

and so on for thousands of records.
Each record is made up of four lines, i.e.:

@IRIS:7:1:17:800#0/1
GGAAACACTACTTAGGCTTATAAGATCNGGTTGCGG
+IRIS:7:1:17:800#0/1
ababbaaabaaaaa`]`ba`]`aaaaYD\\_a``XT

Now, I was wondering if anyone knows how to extract random records from a .fastq file using AWK or sed.
Thanks!

 
Hi

Code:
gawk 'ENDFILE{srand();print r=int(NR/4*rand())*4}FNR!=NR&&FNR>r&&FNR<=r+4' .fastq .fastq
(Note that the input file name is specified twice: the first pass counts the lines, the second pass prints the chosen record.)

Tested with GNU awk; requires version 4.0.0 or later. (If necessary, it is possible to replace the ENDFILE pattern with a workaround.)
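
One possible workaround, for an awk without ENDFILE, is to pick the offset on the first line of the second pass, where NR - 1 equals the line count of the first pass. This is only a sketch of that idea, not necessarily the workaround the poster had in mind; the function name random_fastq_record and the file name reads.fastq are made up for illustration, and the input is assumed to hold strict four-line records.

```shell
# Hypothetical helper: print one randomly chosen 4-line record from a FASTQ file.
# Assumes strict 4-line records. Uses only POSIX awk features (no ENDFILE).
# The file is read twice, as in the gawk one-liner.
random_fastq_record() {
    awk '
        # First line of the second pass: NR - 1 is the line count of the first pass.
        NR > FNR && FNR == 1 {
            srand()                              # seed from the time of day
            r = int((NR - 1) / 4 * rand()) * 4   # offset of the chosen record: 0, 4, 8, ...
        }
        # Second pass only: print the four lines that follow the offset.
        NR > FNR && FNR > r && FNR <= r + 4
    ' "$1" "$1"
}
```

It would be called as `random_fastq_record reads.fastq`. Unlike the one-liner, this version does not print the offset itself. Because srand() without an argument seeds from the clock, two calls within the same second will most likely return the same record.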


Feherke.
feherke.ga
 