Friday, 15 April 2011

loops - R ddply rollingmean help: Need to capture rolling mean by Unique ID


I am struggling to get my desired output using ddply. I believe I am on the right track, but I think I am failing to output the data from a loop within the loop. Sample data:

  Player, Career_Game, Date, ERA, Pitches
  Gio Gonzalez, 176, August 1, 3.0, 86
  Gio Gonzalez, 177, August 5, 4.01, 89
  Gio Gonzalez, 178, August 10, 4, 11
  Gio Gonzalez, 179, August 16, 4.06, 102
  Gio Gonzalez, 180, August 21, 3.83, 97
  ..............
  Jordan Zimmermann, 114, April 4, 1.8, 81
  Jordan Zimmermann, 115, April 9, 8.1, 57
  Jordan Zimmermann, 116, April 14, 5.27, 93
  Jordan Zimmermann, 117, April 19, 3.92, ..............

I'll call this data frame BB.

What I am trying to accomplish: for every player and every game, I want the average over the previous games, say the previous 5.

  Pitchers_5 = data.frame(ddply(BB, ~ Player, tail, n = 5, numcolwise(mean)))

This successfully gives the averages (over Career_Game 176 to 180) for the last five games of each player. However, I would like to get this average for every observation. So for Career_Game 177 the code would average games 172 to 176, then for 178 it would move on and recalculate over the previous 5 games, and so on. Using the data above, when the code reaches Gio Gonzalez's career game 181, the result would look like this (average of the last 5 games):

  Gio Gonzalez, 178, Date (not required), 3.78, 77

Update: A comment inspired me to look at the rollmean function of the zoo package. I have since read a few posts and answers similar to my problem, but I have not gotten further (see link). That link solves a very similar problem except in 2 respects: it calculates the rolling mean of blood pressure by unique ID into one new column, whereas I want to calculate the rolling mean of many columns. It also includes the current blood pressure observation in the mean it calculates. For example, I see ....
If I want to calculate the rolling mean for Gio Gonzalez's 180th game, then I mean games 175 to 179. The results of game 180 itself should not be included.
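To make the target concrete, here is a minimal base-R sketch of "mean of the previous k games, current game excluded, per player". It is only an illustration of the requirement, not the approach taken in the answer: the helper `prev_mean` and the output column names `ERA_prev3` and `Pitches_prev3` are hypothetical, and a window of 3 is used instead of 5 because the sample has at most five rows per player.

```r
# Sample rows taken from the post (Date omitted because it is not needed here).
BB <- data.frame(
  Player      = rep(c("Gio Gonzalez", "Jordan Zimmermann"), times = c(5, 4)),
  Career_Game = c(176:180, 114:117),
  ERA         = c(3.0, 4.01, 4.0, 4.06, 3.83, 1.8, 8.1, 5.27, 3.92),
  Pitches     = c(86, 89, 11, 102, 97, 81, 57, 93, 100)
)

# Hypothetical helper: mean of the k values strictly before each position.
# Positions with fewer than k earlier values stay NA.
prev_mean <- function(v, k = 3) {
  out <- rep(NA_real_, length(v))
  for (i in seq_along(v)) {
    if (i > k) out[i] <- mean(v[(i - k):(i - 1)])
  }
  out
}

# ave() applies the helper within each player and keeps the original row order.
BB$ERA_prev3     <- ave(BB$ERA,     BB$Player, FUN = prev_mean)
BB$Pitches_prev3 <- ave(BB$Pitches, BB$Player, FUN = prev_mean)

print(BB)
```

For game 179 this averages games 176 to 178 only, so the row's own ERA and pitch count never enter its own mean.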

Assume that you want the rolling mean of ERA and Pitches and, because the sample data set is small, use 3 instead of 5 for illustration:

  library(plyr)
  library(zoo)
  cbind(BB, ddply(BB, ~ Player, function(x)
    rollapply(x[c("ERA", "Pitches")], list(-(1:3)), mean, fill = NA)))[-6]

giving:

                Player Career_Game      Date  ERA Pitches    ERA.1 Pitches.1
  1       Gio Gonzalez         176  August 1 3.00      86       NA        NA
  2       Gio Gonzalez         177  August 5 4.01      89       NA        NA
  3       Gio Gonzalez         178 August 10 4.00      11       NA        NA
  4       Gio Gonzalez         179 August 16 4.06     102 3.670000  62.00000
  5       Gio Gonzalez         180 August 21 3.83      97 4.023333  67.33333
  6  Jordan Zimmermann         114   April 4 1.80      81       NA        NA
  7  Jordan Zimmermann         115   April 9 8.10      57       NA        NA
  8  Jordan Zimmermann         116  April 14 5.27      93       NA        NA
  9  Jordan Zimmermann         117  April 19 3.92     100 5.056667  77.00000

If some groups have fewer than 4 rows, use the following instead. If a group has a single row, NA is returned. If it has 2 or 3 rows, k is reduced so that it still gives something.

  f <- function(x) {
    x <- as.matrix(x[c("ERA", "Pitches")])
    k <- min(3, nrow(x) - 1)  # shrink the window for groups with fewer than 4 rows
    if (k) rollapply(x, list(-(1:k)), mean, fill = NA) else NA * x
  }
  cbind(BB, ddply(BB, ~ Player, f))[-6]
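The part of these snippets that excludes the current game is the list passed as the width to rollapply: a list width is read as offsets relative to the current position, so -(1:3) means "the three rows before this one". The following small sketch on a plain vector (the names `x`, `prev3` and `incl3` are made up for the illustration) contrasts that with a window that includes the current value:

```r
library(zoo)

x <- c(1, 2, 3, 4, 5, 6)

# Offsets -1, -2, -3: mean of the three values BEFORE each position.
prev3 <- rollapply(x, width = list(-(1:3)), FUN = mean, fill = NA)
print(prev3)  # NA NA NA 2 3 4

# Offsets 0, -1, -2: the current value is part of the window.
incl3 <- rollapply(x, width = list(-(0:2)), FUN = mean, fill = NA)
print(incl3)  # NA NA 2 3 4 5
```

With fill = NA the result keeps the input length, which is what lets it be bound back onto BB column by column.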

Note: We used this as the input:

  Lines <- "Player, Career_Game, Date, ERA, Pitches
  Gio Gonzalez, 176, August 1, 3.0, 86
  Gio Gonzalez, 177, August 5, 4.01, 89
  Gio Gonzalez, 178, August 10, 4, 11
  Gio Gonzalez, 179, August 16, 4.06, 102
  Gio Gonzalez, 180, August 21, 3.83, 97
  Jordan Zimmermann, 114, April 4, 1.8, 81
  Jordan Zimmermann, 115, April 9, 8.1, 57
  Jordan Zimmermann, 116, April 14, 5.27, 93
  Jordan Zimmermann, 117, April 19, 3.92, 100"
  BB <- read.csv(text = Lines, strip.white = TRUE, as.is = TRUE)

Update: Revised to use plyr as requested. Also added a modification that handles smaller groups.

