Friday, 15 May 2015

python - Pandas filter not working as expected


I have a pandas dataframe from which I need to delete the rows that do not match a regex pattern in a given column. The column I need to run the regex against is formatted as last name, first name, and I want to remove every row whose value in that column does not match that format. I'm trying to use the pandas filter method, and I have tried commands like these:
edited_df = idf['name'].filter(regex="([aA-zZ]*)([,]{1})([aA-zZ]*)")
edited_df = idf['name'].filter(regex="/([aA-zZ]*)([,]{1})([aA-zZ]*)/")
However, doing this raises the following error:
TypeError: can't use a string pattern on a bytes-like object
type(idf['name']) is a Series, and each entry in it is a string, which I checked with type(idf['name'][1]).
I have seen this question, but I want to make my program more modular, so that a list of names does not have to be adjusted every time a name is added.
I have tested my regex against test strings and it matches as required, so I assume that I am using the filter method incorrectly. Any help is greatly appreciated.
Also, less importantly: is it possible to access the capture groups created by the regex in the modified dataframe?

Thanks to the comments, here's how to solve this problem:
First, drop the NaN values using dropna:
idf.dropna(subset=['name'], inplace=True)
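As a rough sketch of why this step matters: a boolean mask that still contains missing values can raise an error when it is used for indexing, so either the rows with a missing name are dropped first, or str.contains is told to treat missing values as non-matches through its na parameter. The sample data below is made up; only the column name 'name' comes from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'name' comes from the post.
idf = pd.DataFrame({"name": ["Smith, John", np.nan, "no comma here"]})

# Option 1: drop the rows with a missing name first, as in the post.
cleaned = idf.dropna(subset=["name"])

# Option 2: leave the frame untouched and count missing names as non-matches.
mask_no_nan = idf["name"].str.contains(",", na=False)
without_missing = idf[mask_no_nan]
```

Option 2 avoids modifying idf in place, which may be preferable if the original frame is still needed elsewhere.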
Then filter using str.contains instead:
edited_df = idf[idf['name'].str.contains(r"([aA-zZ]*)([,]{1})([aA-zZ]*)")]
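Two caveats seem worth noting. Series.filter selects by index labels rather than by values, which is why it is the wrong tool here. And str.contains searches anywhere in the string, so with the pattern above any value containing a comma would pass, since both letter groups may be empty. The sketch below is one possible way to tighten the match and also get at the capture groups asked about in the question. The anchored pattern, the group names last and first, and the sample data are all my own assumptions, not part of the original solution.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'name' comes from the post.
idf = pd.DataFrame({"name": ["Smith, John", "Madonna", "Doe, Jane", "42, ??"]})

# Assumed stricter pattern: letters, a comma, optional whitespace, letters.
# The named groups 'last' and 'first' are illustrative, not from the post.
pattern = r"^(?P<last>[A-Za-z]+),\s*(?P<first>[A-Za-z]+)$"

# str.extract returns one column per capture group;
# rows that do not match the pattern get missing values.
parts = idf["name"].str.extract(pattern)

# Keep only the rows where the pattern matched, then attach the captured groups.
matched = parts["last"].notna()
edited_df = idf[matched].join(parts[matched])
```

Here edited_df keeps the original name column and gains last and first columns, so the captured groups are available next to the filtered rows.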

