1. Find comma "," and replace with "nothing"
2. Find \s{3,5} and replace with comma space ", "
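The two steps above can be sketched with Python's re module. This is only an illustration: the sample line is hypothetical, since the original data file is not shown in these notes.

```python
import re

# Hypothetical input line; the original data file is not shown in these notes.
line = "August 2013,   44   55    33"

# Step 1: remove every comma (replace with nothing).
no_commas = line.replace(",", "")

# Step 2: turn each run of 3 to 5 whitespace characters into ", ".
csv_line = re.sub(r"\s{3,5}", ", ", no_commas)

print(csv_line)
```

The single space inside "August 2013" is left alone because \s{3,5} needs at least three whitespace characters in a row.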
1. Find \s\w.\s and replace with two spaces "  "
2. Find (\w+), (\w+) and replace with (\2 \1)
3. Find (\w+\d) and replace with \(\1)
4. Find (\w{3})(/r) and replace with (\1\)
5. Find \s{2,} and replace with a single space " "
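A rough sketch of steps 1, 2 and 5 from the list above, again in Python. The sample text is hypothetical, the replacement in step 2 is assumed to be \2 \1 without literal parentheses, and steps 3 and 4 are left out because they depend on data that is not shown here.

```python
import re

# Hypothetical input; the real data for this problem is not included here.
text = "Darwin, Charles R. naturalist"

# Step 1: replace whitespace + one word character + any character + whitespace
# (here the middle initial " R. ") with two spaces.
no_initial = re.sub(r"\s\w.\s", "  ", text)

# Step 2: swap "Last, First" into "First Last" (replacement assumed to be \2 \1).
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", no_initial)

# Step 5: collapse any run of two or more whitespace characters to one space.
tidy = re.sub(r"\s{2,}", " ", swapped)

print(tidy)
```

The unescaped . in step 1 matches any character, not only a period, so on other data it could match more than a middle initial.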
1. Find , \w+ \w+ and replace with "nothing"
1. Find , (\w)\w{5,}"space" and replace with , \1_\2
1. Find , (\w)\w{5,}"space" and replace with , \1_\2
2. Find (\w_)(\w{3})(\w+) and replace with \1\2.
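Step 2 above can be tried on its own in Python. The input name is hypothetical and is assumed to already be in the form that step 1 is meant to produce. The trailing period in the note is treated here as sentence punctuation, not as part of the replacement.

```python
import re

# Hypothetical input in the form step 1 is meant to produce.
name = "C_pennsylvanicus"

# Keep the initial and underscore (\1) plus the next three letters (\2),
# and drop the rest of the word (\3).
short = re.sub(r"(\w_)(\w{3})(\w+)", r"\1\2", name)

print(short)
```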
1. Downloaded Xenopus laevis file from NCBI
2. Used the command: grep '^>' rna.fna > Xenopus_headers.txt
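What the grep command does can be mimicked in Python on a tiny example. This is a sketch and not a replacement for grep: the FASTA text below is made up and only stands in for rna.fna.

```python
import re

# Hypothetical miniature FASTA content standing in for rna.fna.
fasta = """>XM_000001.1 hypothetical record one
ACGTACGT
>XR_000002.1 hypothetical record two
GGCCTTAA
"""

# Keep only the lines that begin with ">", as grep '^>' does.
headers = [ln for ln in fasta.splitlines() if re.match(r"^>", ln)]

print("\n".join(headers))
```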
grep prints the lines that start with >. The ^ in the regular expression
specifies the start of a line. The results are then written to a new file
called Xenopus_headers.txt. The beginning of the txt file is shown above.

1. sed 's/>/ \n>/g' rna.fna | sed -n '/rRNA/,/^ /p' > rRNA.txt
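The s and g in the first sed command substitute all occurrences of the
string in each line, so every > gets a space and a newline placed in front
of it. The pipe then passes the output of the first sed command, run on the
rna.fna file, to the second sed command. There the -n option turns off
automatic printing and the p flag prints the selected range, which runs from
a line that includes rRNA through the next line that starts with a space, so
each selected line is displayed only once. This output is then written to a
new file called rRNA.txt.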
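The logic of the sed pipeline can be traced in Python on a small made-up example. This is only a sketch: the FASTA records are hypothetical, and it assumes \n in the sed replacement is read as a newline (as GNU sed does).

```python
# Hypothetical miniature FASTA content standing in for rna.fna.
fasta = """>rec1 hypothetical mRNA
ACGT
>rec2 hypothetical 18S rRNA
GGCC
TTAA
>rec3 hypothetical mRNA
CCCC
"""

# First sed: put a line holding a single space before every ">".
marked = fasta.replace(">", " \n>")

# Second sed (-n with /rRNA/,/^ /p): print from a line containing "rRNA"
# through the next line that starts with a space.
selected, printing = [], False
for ln in marked.splitlines():
    if not printing and "rRNA" in ln:
        printing = True
        selected.append(ln)
    elif printing:
        selected.append(ln)
        if ln.startswith(" "):
            printing = False

print("\n".join(selected))
```

The inserted space line acts as an end marker for each record, which is why the range stops before the next header.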