DHara: linux - Regex replace on specific column with SED/AWK

Monday, 15 February 2010

I have data that looks like this (Tab delimited):

    Organ   K_Clust_No  Analysis
    LN      K200        C12     Gene Ontology
    LN      K200        C116    Gene Ontology
    CN      K200        C2      Gene Ontology

What I want to do is remove the C from the 3rd column of every line, except the header row:

    Organ   K_Clust_No  Analysis
    LN      K200        12      Gene Ontology
    LN      K200        116     Gene Ontology
    CN      K200        2       Gene Ontology

This won't do, because it also affects the other columns and the header row:

    sed 's/C//'

What is the right way to do it?
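To make the problem concrete, here is a small sketch of what the plain substitution does. The two sample lines are hypothetical, modelled on the data above, and the variable names are only for illustration.

```shell
# Hypothetical sample lines modelled on the data above (tab-delimited).
header=$(printf 'Organ\tK_Clust_No\tAnalysis')
row=$(printf 'CN\tK200\tC2\tGene Ontology')

# s/C// deletes the first C on each line, wherever it happens to be.
bad_header=$(printf '%s\n' "$header" | sed 's/C//')
bad_row=$(printf '%s\n' "$row" | sed 's/C//')

printf '%s\n' "$bad_header"   # the C inside K_Clust_No is removed
printf '%s\n' "$bad_row"      # CN loses its C; the C2 in column 3 is untouched
```

The header gets damaged, and in the CN row the wrong column is edited, which is why the substitution needs to be restricted by line and by column.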
 
awk is a good tool for this:

    $ awk -F'\t' -v OFS='\t' 'NR >= 2 { sub(/^C/, "", $3) } 1' file
    Organ   K_Clust_No  Analysis
    LN      K200        12      Gene Ontology
    LN      K200        116     Gene Ontology
    CN      K200        2       Gene Ontology
How it works:

    -F'\t'
  Use tab as the field delimiter on input.

    -v OFS='\t'
  Use tab as the field delimiter on output.

    NR >= 2 { sub(/^C/, "", $3) }
  Remove the leading C from field 3, only on lines 2 and later.

    1
  This is awk's cryptic shorthand for print.
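The pieces above can be tried end to end with a small sketch. The sample data is hypothetical (modelled on the question), and the col variable is an illustrative addition, not part of the original command. It only shows that the field number can be passed in with -v rather than hard-coded.

```shell
# Hypothetical sample data modelled on the question (tab-delimited).
sample=$(printf 'Organ\tK_Clust_No\tAnalysis\nLN\tK200\tC12\tGene Ontology\nCN\tK200\tC2\tGene Ontology')

# col is an illustrative variable: the number of the field to clean up.
# NR >= 2 skips the header; the trailing 1 prints every line.
result=$(printf '%s\n' "$sample" |
    awk -F'\t' -v OFS='\t' -v col=3 'NR >= 2 { sub(/^C/, "", $col) } 1')

printf '%s\n' "$result"
```

Because sub() is anchored with ^ and applied to a single field, a C anywhere else on the line (such as the one in CN) is left alone.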

Using sed:

    $ sed -r '2,$ s/(([^\t]+\t){2})C/\1/' file
    Organ   K_Clust_No  Analysis
    LN      K200        12      Gene Ontology
    LN      K200        116     Gene Ontology
    CN      K200        2       Gene Ontology

    -r
  Use extended regular expressions. (On Mac OS X or other BSD platforms, use -E instead.)

    2,$ s/(([^\t]+\t){2})C/\1/
  This substitution is applied only to lines from 2 to the end of the file.

    (([^\t]+\t){2})
  Matches the first two tab-separated columns. This assumes that exactly one tab separates each column. Because this part of the regex is enclosed in parentheses, whatever it matches is later available as \1.

    C
  Matches the C.

    \1
  Replaces the matched text with just the first two columns, without the C.
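For platforms where -r is not available, a variant along the following lines may be more portable. This is a sketch under two assumptions: the sample data is hypothetical, and the tab is passed in through a shell variable (tab, an illustrative name), because \t inside a regular expression is a GNU extension that other sed implementations may not honor. The added ^ anchor is also not in the original command; it just pins the match to the start of the line.

```shell
# Hypothetical sample data modelled on the question (tab-delimited).
sample=$(printf 'Organ\tK_Clust_No\tAnalysis\nLN\tK200\tC12\tGene Ontology\nCN\tK200\tC2\tGene Ontology')

# A literal tab character, held in a variable so the regex does not rely on \t.
tab=$(printf '\t')

# -E selects extended regular expressions; 2,$ limits the edit to lines 2 onward.
result=$(printf '%s\n' "$sample" |
    sed -E "2,\$ s/^(([^$tab]+$tab){2})C/\1/")

printf '%s\n' "$result"
```

The expression is in double quotes so the shell can expand $tab, which is why the $ in the address 2,$ has to be escaped.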
