Sunday 15 January 2012

Processing the data in a matrix in R


I'm totally new to R and to problems like this. I have a matrix with the following data:

  [1,] NA                  "/home/psycodelic/Desktop/r_source//AAIT.csv"
  [2,] "50.6606864064485"  "/home/psycodelic/Desktop/r_source//AAPL.csv"
  [3,] "20.697618553608"   "/home/psycodelic/Desktop/r_source//BSFT.csv"
  [4,] ".585775171228343"  "/home/psycodelic/Desktop/r_source//BSPM.csv"
  [5,] "1.07713703069294"  "/home/psycodelic/Desktop/r_source//BSQR.csv"
  [6,] NA                  "/home/psycodelic/Desktop/r_source//CAPN.csv"
  [7,] NA                  "/home/psycodelic/Desktop/r_source//CAPNW.csv"

I am trying to do the following:

~ Shorten the file path, i.e. keep only the part around the .csv (the file name)

~ Remove all entries containing NA

~ Convert the values to integers

~ Sort by value (lowest on top)

~ Create a new matrix with only the top 10 (i.e. the 10 lowest) values and print it

Can anyone help me with this?

Note: As written, your question is not a good fit for this site. You should include the code you have tried and say where it is causing you problems. As it stands, you are likely to keep getting downvotes, and your question could be closed. Please take the time to improve your question.

That said, you are new to R, and I will try to help you get started. But please consult the help center to get started on Stack Overflow.


Assume that your matrix is stored in a variable named mat. Convert it to a data frame first.

  df <- as.data.frame(mat, stringsAsFactors = FALSE)

This data frame will have columns called V1 and V2. If you want, you can rename them with colnames(df) <- c("whatever", "some"). I will assume that you have kept the original names.
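To make the following steps reproducible, here is a minimal sketch that rebuilds the matrix from the question and converts it. The values are copied from the question as best they can be read; the helper variable src_dir is only there to keep the lines short and is not part of the original.

```r
# Rebuild the matrix from the question: values in column 1, paths in column 2.
src_dir <- "/home/psycodelic/Desktop/r_source//"
mat <- cbind(
  c(NA, "50.6606864064485", "20.697618553608", ".585775171228343",
    "1.07713703069294", NA, NA),
  paste0(src_dir,
         c("AAIT", "AAPL", "BSFT", "BSPM", "BSQR", "CAPN", "CAPNW"),
         ".csv")
)

# stringsAsFactors = FALSE keeps both columns as plain character vectors.
df <- as.data.frame(mat, stringsAsFactors = FALSE)

print(colnames(df))  # default names given to an unnamed matrix
str(df)
```

Keeping the columns as character (rather than factor) matters later, because strsplit() and the numeric conversion both expect strings.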

Now get the file name from the second column.

  # strsplit() returns one list element per path; keep the last piece of each
  df$file.name <- with(df, sapply(strsplit(as.character(V2), "/"), tail, n = 1))

A lot is going on here, so let's take it one piece at a time. Look at what strsplit() does to a single path:

  > strsplit("/home/psycodelic/Desktop/r_source//AAPL.csv", "/")[[1]]
  [1] ""           "home"       "psycodelic" "Desktop"    "r_source"   ""           "AAPL.csv"

strsplit() returns a list with one element per input string. Here there is only one, which we access with [[1]]. That element is a vector of strings, the result of splitting the input on the / character.

You only want the file name, which is the last part of the file path, so we use tail(..., n = 1). sapply() applies that to every path in the column, so each row gets its own file name.

with(df, ...) ensures that V2 is evaluated in the context of df.
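Applied to a whole column of paths, the split-and-take-last idea has to run once per path. A minimal sketch, using two paths from the question, that also compares the result with base R's basename() and, in case only the bare name is wanted, drops the .csv extension with sub(). The variable names paths, file_names and tickers are illustrative.

```r
paths <- c("/home/psycodelic/Desktop/r_source//AAPL.csv",
           "/home/psycodelic/Desktop/r_source//BSFT.csv")

# Manual route: split every path on "/" and keep the last piece of each.
file_names <- sapply(strsplit(paths, "/"), tail, n = 1)

# Built-in route: basename() returns the last path component directly.
stopifnot(identical(file_names, basename(paths)))

# Optionally strip a trailing ".csv" to leave only the bare name.
tickers <- sub("\\.csv$", "", file_names)
print(file_names)
print(tickers)
```

basename() is the shorter choice in practice; the manual route is shown only because it mirrors the explanation above.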

At this point df has three columns: the two original ones, plus the file name. The next step is to remove the NA values, which we can do with the na.omit() function.

  df <- na.omit(df)
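A small sketch of what na.omit() does to a data frame of this shape. The toy values are illustrative, not taken from the question. Note that it only drops real missing values (printed as an unquoted NA, as in the question's matrix), not the string "NA".

```r
# Toy data frame in the same shape as df (illustrative values only).
toy <- data.frame(V1 = c(NA, "50.66", "20.70", NA),
                  V2 = c("a.csv", "b.csv", "c.csv", "d.csv"),
                  stringsAsFactors = FALSE)

clean <- na.omit(toy)  # keeps only rows with no missing value in any column
print(clean)
print(nrow(clean))
```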

Be careful about blindly wrapping V1 in as.integer(). Convert the values to floating-point numbers first:

  df$V1 <- as.numeric(as.character(df$V1))
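To see why the conversion deserves some care, here is a short sketch on three values from the question. It compares as.integer() (which truncates decimal strings), as.numeric() followed by round(), and the factor case, where going through as.character() first avoids getting the internal level codes. The variable names are illustrative.

```r
x <- c("50.6606864064485", "1.07713703069294", ".585775171228343")

truncated <- as.integer(x)   # decimal strings are truncated toward zero
num       <- as.numeric(x)   # keeps the decimals
rounded   <- round(num)      # nearest whole number, if integers are wanted

# If the column is a factor, convert via character first:
# as.numeric() on a factor returns the internal level codes instead.
f        <- factor(x)
from_fac <- as.numeric(as.character(f))

print(truncated)
print(rounded)
print(from_fac)
```

Whether truncation or rounding is the right way to get integers depends on what the values mean, which the question does not say.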

Nearly done! Sort in ascending order:

  df <- df[with(df, order(V1)), ]
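order() returns the row indices that would sort the column, so indexing the data frame with it reorders the rows. A minimal sketch on illustrative data, showing both directions, since the question mentions "descending" and "lowest on top" in the same breath:

```r
toy <- data.frame(V1 = c(50.66, 0.59, 20.70),
                  file.name = c("AAPL.csv", "BSPM.csv", "BSFT.csv"),
                  stringsAsFactors = FALSE)

asc  <- toy[order(toy$V1), ]                     # lowest value first
desc <- toy[order(toy$V1, decreasing = TRUE), ]  # highest value first

print(asc)
print(desc)
```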

Now create a new data frame and print it.

  (df2 <- head(df, n = 10))

Wrapping the assignment in parentheses both assigns the result and prints it.
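Putting the steps together, one possible sketch is a small helper function. The name lowest_n and the toy matrix are hypothetical, and the helper assumes an unnamed two-column character matrix (so the columns come out as V1 and V2).

```r
# Hypothetical helper (not from the original answer) chaining the steps above.
lowest_n <- function(mat, n = 10) {
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  df$file.name <- basename(df$V2)   # last path component
  df <- na.omit(df)                 # drop rows with missing values
  df$V1 <- as.numeric(df$V1)        # strings -> numbers
  df <- df[order(df$V1), ]          # lowest value first
  head(df, n = n)                   # at most n rows
}

# Toy input in the same layout as the question's matrix.
toy <- cbind(c(NA, "50.66", "0.59", "20.70"),
             c("/tmp/A.csv", "/tmp/B.csv", "/tmp/C.csv", "/tmp/D.csv"))

(result <- lowest_n(toy, n = 2))
```

head() returns fewer than n rows when the data frame is shorter, so the helper also works on the seven-row matrix from the question.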

