Inconsistent DataFrame.groupby returned type when grouped value is unique #2893

Closed
@sv3ndk

I am using groupby().apply() to compute new columns in a dataframe, for example like this:

from pandas import DataFrame

df1 = DataFrame([{"val1": 1, "val2": 20}, {"val1": 1, "val2": 19}, {"val1": 2, "val2": 27}, {"val1": 2, "val2": 12}])
def func(dataf):
    # center val2 on the mean of its group
    return dataf["val2"] - dataf["val2"].mean()
print(type(df1.groupby("val1").apply(func)))     # this is a Series
df1["centered"] = df1.groupby("val1").apply(func)
print(df1)

However, if the grouped-by column ("val1") contains only a single distinct value, the same groupby returns a DataFrame instead of a Series, and assigning the result to a column then fails:

df2 = DataFrame([{"val1": 1, "val2": 20}, {"val1": 1, "val2": 19}, {"val1": 1, "val2": 27}, {"val1": 1, "val2": 12}])
def func(dataf):
    # center val2 on the mean of its group
    return dataf["val2"] - dataf["val2"].mean()
print(type(df2.groupby("val1").apply(func)))     # this is a DataFrame
df2["centered"] = df2.groupby("val1").apply(func)           # this fails: cannot assign a DataFrame to a column
print(df2)

As a result, my code is littered with checks on whether the grouped-by column has a single distinct value:

if len(df2["val1"].unique()) == 1:
    # only one group: apply the function to the whole frame directly
    df2["centered"] = func(df2)
else:
    df2["centered"] = df2.groupby("val1").apply(func)

Am I taking the wrong approach here? If not, would it be possible to have a consistent return type?

Thanks!
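
For readers hitting the same problem, one possible way to avoid the special case is groupby().transform(), which returns a result aligned to the original index however many groups there are. This is only a sketch and not an answer taken from this thread; the helper name center_within_groups is hypothetical, and the frames simply reuse the values from the examples above.

```python
import pandas as pd


def center_within_groups(df, key, value):
    """Return `value` minus its per-group mean, aligned to df's index.

    Hypothetical helper, not part of pandas.
    """
    # transform("mean") broadcasts each group's mean back onto the rows
    # of that group, so the result has the same index as df.
    group_means = df.groupby(key)[value].transform("mean")
    return df[value] - group_means


# Same data as in the report: two groups, then a single group.
df1 = pd.DataFrame({"val1": [1, 1, 2, 2], "val2": [20, 19, 27, 12]})
df2 = pd.DataFrame({"val1": [1, 1, 1, 1], "val2": [20, 19, 27, 12]})

for frame in (df1, df2):
    frame["centered"] = center_within_groups(frame, "val1", "val2")
    print(frame)
```

Because the subtraction happens on two Series that share the frame's index, the assignment to a column works the same way in both cases and no check on the number of groups is needed.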
