I have processed a large data set and generated a dictionary from it. Now I want to create a dataframe from this dictionary. The dictionary's values are lists of tuples. From those tuples, I need to find the unique values to use as the dataframe columns:
d = {'0001': [('Skiing', 0.789), ('Snow', 0.65), ('winter', 0.56)],
     '0002': [('drama', 0.89), ('comedy', 0.678), ('action', -0.42), ('winter', ...), ('Children', 0.12)],
     '0003': [('Action', 0.89), ('funny', 0.58), ('game', 0.12)],
     '0004': [('Dark', 0.89), ('cartoon', -0.89), ('comedy', 0.678), ('mystery', 0.678), ('crime', 0.12), ('adult', -0.423)],
     '0005': [('Action', 0.12)],
     '0006': [('drama', -0.49), ('funny', 0.378), ('Spence', 0.12), ('Thriller', 0.78)],
     '0007': [('Dark', 0.79), ('Mystery', 0.88), ('Crime', 0.32), ('Adult', ...
(The dictionary has approximately 800,000 records.) I iterate over the dictionary to find the unique headers.
I believe this step takes a long time to process. There may also be a problem with my code, because it is very slow, and when I then build the data frame row by row, the process slows down even further:
df = pd.DataFrame(columns=col_headers, index=entities)
for k in d:
    df.loc[k] = pd.Series(d[k])
df.fillna(0.0, axis=1)
How can I speed this up to reduce the processing time?
You can use pd.DataFrame.from_dict, but you also need to unpack each inner list of key-value pairs into a dictionary:
df = pd.DataFrame.from_dict({k: dict(v) for k, v in d.items()}, orient="index").fillna(0)
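To see what the one-liner above produces, here is a minimal sketch on three of the records from the question. The sample is only illustrative; the real dictionary is far larger, and no timing is claimed here.

```python
import pandas as pd

# Small sample in the same shape as the question's data:
# each key maps to a list of (label, score) tuples.
d = {
    "0001": [("Skiing", 0.789), ("Snow", 0.65), ("winter", 0.56)],
    "0003": [("Action", 0.89), ("funny", 0.58), ("game", 0.12)],
    "0005": [("Action", 0.12)],
}

# dict(v) turns each list of tuples into {label: score}; from_dict with
# orient="index" then makes one row per key and one column per distinct label.
df = pd.DataFrame.from_dict(
    {k: dict(v) for k, v in d.items()}, orient="index"
).fillna(0)  # labels missing from a record become 0

print(df)
```

The column headers fall out of the construction as the union of all labels, so a separate pass to collect unique headers is not needed, and the whole frame is built in one call instead of one row at a time.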
Additionally, if you want to make the case of the column headings consistent:
df.columns = [c.lower() for c in df.columns]
And if you want to go all the way, you can also sort the columns:
df = df.sort_index(axis=1)
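One caveat: if the data contains both "Action" and "action", lowercasing the headings after the frame is built leaves two columns with the same name. A possible variation, sketched here as an assumption rather than as part of the answer above, is to normalize the labels while building the inner dictionaries, so both spellings land in one column:

```python
import pandas as pd

# Two records from the question whose labels differ only by case.
d = {
    "0002": [("drama", 0.89), ("comedy", 0.678), ("action", -0.42)],
    "0003": [("Action", 0.89), ("funny", 0.58), ("game", 0.12)],
}

# Strip and lowercase each label before the frame exists, so "Action" and
# "action" collapse into a single column instead of two same-named ones.
rows = {
    k: {name.strip().lower(): score for name, score in v}
    for k, v in d.items()
}

# Build the frame, fill missing labels with 0, and sort the columns by name.
df_norm = (
    pd.DataFrame.from_dict(rows, orient="index")
    .fillna(0)
    .sort_index(axis=1)
)

print(df_norm)
```

If a single record contains the same label in two spellings, the later tuple overwrites the earlier one in this sketch, so check whether that is acceptable for your data.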