Sunday 15 March 2015

r - Creating a random sample from a data frame with a nested structure


This question builds on an earlier SO post.

I am trying to take a random sample of rows from a data frame that has a nested structure.

Using the following dummy dataset (modified from iris):

     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  1           5.1         3.5          1.4         0.2     setosa
  2           4.9         3.0          1.4         0.2     setosa
  3           4.7         3.2          1.3         0.2     setosa
  4           5.3         2.9          1.5         0.2     setosa
  5           5.2         3.7          1.3         0.2  virginica
  6           4.7         3.2          1.5         0.2  virginica
  7           3.9         3.1          1.4         0.2  virginica
  8           4.7         3.2          1.3         0.2  virginica
  9           4.0         3.1          1.5         0.2 versicolor
  10          5.0         3.6          1.4         0.2 versicolor
  11          4.6         3.1          1.5         0.2 versicolor
  12          5.0         3.6          1.5         0.2 versicolor
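To make the question reproducible, the dummy dataset above could be built roughly as follows. This is only a sketch: the object name `dummy` is made up for illustration, and the species labels assume the standard spellings used in `iris`.

```r
# Rebuild the 12-row dummy dataset shown above.
# `dummy` is a hypothetical name; species spellings are assumed to match iris.
dummy <- data.frame(
  Sepal.Length = c(5.1, 4.9, 4.7, 5.3, 5.2, 4.7, 3.9, 4.7, 4.0, 5.0, 4.6, 5.0),
  Sepal.Width  = c(3.5, 3.0, 3.2, 2.9, 3.7, 3.2, 3.1, 3.2, 3.1, 3.6, 3.1, 3.6),
  Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.3, 1.5, 1.4, 1.3, 1.5, 1.4, 1.5, 1.5),
  Petal.Width  = rep(0.2, 12),
  Species      = factor(rep(c("setosa", "virginica", "versicolor"), each = 4))
)

# Four rows per species, i.e. the nesting the question is about.
table(dummy$Species)
```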

The code below works fine for taking a simple random sample of 2 rows:

  iris[sample(nrow(iris), 2), ]

However, what I want to do is take a sample of 2 rows for each level of a particular variable. For example, create a random sample of 2 rows for each level of the variable 'Species', such as:

     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  1           5.1         3.5          1.4         0.2     setosa
  4           5.3         2.9          1.5         0.2     setosa
  6           4.7         3.2          1.5         0.2  virginica
  7           3.9         3.1          1.4         0.2  virginica
  11          4.6         3.1          1.5         0.2 versicolor
  12          5.0         3.6          1.5         0.2 versicolor

Thanks for your help!

This is very easy with dplyr:

  library(dplyr)

  # Sample 2 rows within each level of Species
  iris %>%
    group_by(Species) %>%
    sample_n(size = 2)
  #   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  # 1          4.6         4.6          1.4         1.4     setosa
  # 2          5.2         3.5          1.5         0.2     setosa
  # 3          6.5         2.8          4.6         1.5 versicolor
  # 4          5.7         2.8          4.5         1.3 versicolor
  # 5          5.8         2.8          5.1         2.4  virginica
  # 6          7.7         2.6          6.9         2.3  virginica
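If dplyr is not available, roughly the same result can be had in base R by applying the question's own `sample(nrow(...), 2)` idiom to each species separately. This is a sketch rather than the answer's method; the names `pieces` and `by_species` and the seed value are arbitrary choices for illustration.

```r
set.seed(123)  # arbitrary seed, only so the draw is repeatable

# Split iris into one data frame per species, then sample 2 rows from each.
pieces <- lapply(split(iris, iris$Species), function(d) {
  d[sample(nrow(d), 2), ]
})

# Stack the per-species samples back into a single data frame.
by_species <- do.call(rbind, pieces)

# Each species should now contribute exactly 2 rows.
table(by_species$Species)
```

Because the sampling is random, the rows picked will differ from those printed above unless the same seed and method are used.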

You can group by as many columns as you want:

  CO2 %>% group_by(Type, Treatment) %>% sample_n(size = 2)
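The same idea with several grouping columns can be sketched without dplyr by splitting row indices on every combination of the grouping variables. `sample_per_group` below is a hypothetical helper written for this illustration, not a function from base R or dplyr, and it assumes every group has at least `n` rows.

```r
# Hypothetical helper: sample n rows from each combination of the columns in `cols`.
sample_per_group <- function(df, cols, n) {
  # Split the row indices by every observed combination of the grouping columns.
  groups <- split(seq_len(nrow(df)), df[cols], drop = TRUE)
  # sample.int() on the group size avoids sample()'s special case for a
  # single number; it errors if a group has fewer than n rows.
  picked <- lapply(groups, function(rows) rows[sample.int(length(rows), n)])
  df[unlist(picked, use.names = FALSE), , drop = FALSE]
}

set.seed(1)  # arbitrary seed, only so the draw is repeatable
co2_sample <- sample_per_group(as.data.frame(CO2), c("Type", "Treatment"), 2)

# CO2 has 2 Types x 2 Treatments, so 4 groups x 2 rows = 8 rows.
nrow(co2_sample)
```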
