Tuesday 15 February 2011

python - How to split a pandas dataframe that contains lists as fields into a multi-indexed dataframe?


I have a pandas dataframe in which some fields contain lists (not nested arrays), and I would like to split it into a multi-indexed data structure. Here's an example of what I'm trying to achieve. I have a dataframe of this form:

      | Model   | Company  | Url  | Criteria                 | Rating
    1 | Model 1 | Company1 | Url1 | [Criteria 1, Criteria 2] | [Rating 1, Rating 2]
    2 | Model 2 | Company2 | Url2 | [Criteria 4, Criteria 5] | [Rating 4, Rating 5]

and I want to turn it into:

               | Model   | Company  | Url  | Rating
    -------------------------------------------------
    Criteria 1 | Model 1 | Company1 | Url1 | Rating 1
    Criteria 2 | Model 1 | Company1 | Url1 | Rating 2
    Criteria 3 | Model 1 | Company1 | Url1 | Rating 3
    Criteria 4 | Model 2 | Company2 | Url2 | Rating 4
    Criteria 5 | Model 2 | Company2 | Url2 | Rating 5
    Criteria 6 | Model 2 | Company2 | Url2 | Rating 6

    import sys

    # You do not need StringIO if you are reading the data from a file
    if sys.version_info[0] < 3:
        from StringIO import StringIO
    else:
        from io import StringIO
    import pandas as pd

    dfstr = StringIO("""|Model|Company|Url|Criteria|Rating
    1|Model1|Company1|Url1|[Criteria1, Criteria2]|[Rating1, Rating2]
    2|Model2|Company2|Url2|[Criteria4, Criteria5]|[Rating4, Rating5]
    """)

    # read_csv replaces DataFrame.from_csv, which was removed in pandas 1.0
    df = pd.read_csv(dfstr, sep='|', index_col=0)
    df.columns = ['Model', 'Company', 'Url', 'Criteria', 'Rating']  # no whitespace

    def slist(liststr):
        # Lists without quotes are a pain: parse "[a, b]" by hand
        return liststr.strip('[]').split(', ')

    def line2df(line):
        # Create a small sub-dataframe for each line
        tmp = pd.DataFrame(list(zip(slist(line[1].Criteria), slist(line[1].Rating))))
        for val in ['Model', 'Company', 'Url']:
            tmp[val] = line[1][val]
        return tmp

    outdf = pd.concat([line2df(line) for line in df.iterrows()])
    outdf.index = outdf[0]  # use the criteria as the index
    outdf.drop(0, axis=1, inplace=True)
    outdf.columns = ['Rating', 'Model', 'Company', 'Url']  # rename the columns
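On a recent pandas, the row-by-row approach above can probably be shortened with `DataFrame.explode`. The following is only a sketch under a few assumptions: pandas 1.3 or newer (needed to explode several columns at once), fields that already hold real Python lists rather than strings, and `Criteria` and `Rating` lists of equal length within each row. The sample values and the name `long_df` are illustrative, not taken from the original data.

```python
import pandas as pd

# Illustrative data; the list fields are real Python lists here (an assumption)
df = pd.DataFrame(
    {
        "Model": ["Model1", "Model2"],
        "Company": ["Company1", "Company2"],
        "Url": ["Url1", "Url2"],
        "Criteria": [["Criteria1", "Criteria2"], ["Criteria4", "Criteria5"]],
        "Rating": [["Rating1", "Rating2"], ["Rating4", "Rating5"]],
    },
    index=[1, 2],
)

# Exploding both columns together keeps each criterion paired with its rating;
# this multi-column form requires pandas >= 1.3
long_df = df.explode(["Criteria", "Rating"]).set_index("Criteria")

print(long_df)
```

If a true multi-index is wanted, passing a list of columns to `set_index`, for example `set_index(["Model", "Criteria"])`, should give one index level per column.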
