df_count

hhpy.ds.df_count(x: str, df: pandas.core.frame.DataFrame, hue: Optional[str] = None, sort_by_count: bool = True, top_nr: int = 5, x_base: Optional[float] = None, x_min: Optional[float] = None, x_max: Optional[float] = None, other_name: str = 'other', other_to_na: bool = False, na: Union[bool, str] = 'drop') → pandas.core.frame.DataFrame

Create a DataFrame of value counts. Because it supports hue levels, it is useful for plots; for an application, see countplot().

Parameters:

x – Main variable: the name of a column in the DataFrame, or vector data
df – Pandas DataFrame containing the data; other objects are implicitly cast to DataFrame
hue – Name of the column to split by level [optional]
sort_by_count – Whether to sort the DataFrame by value counts [optional]
top_nr – Number of unique levels to keep when applying top_n_coding() [optional]
x_base – If supplied, cast x to integer multiples of x_base. Useful for float data, where nearly equal values would otherwise each be counted as a separate unique value [optional]
x_min – Limit the range of valid numeric x values to those greater than or equal to x_min [optional]
x_max – Limit the range of valid numeric x values to those less than or equal to x_max [optional]
other_name – Name assigned to the levels that are grouped together as other [optional]
other_to_na – Whether to cast all other elements to NaN [optional]
na – Whether to keep NA values (True, ‘keep’), in which case they are implicitly cast to string, or to drop them (False, ‘drop’) [optional]
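
The preprocessing options above (x_base, x_min, x_max, top_nr, other_name) can be approximated in plain pandas. The following is a minimal sketch, not hhpy's implementation: the helper names round_to_base and top_n_code are hypothetical, and the rounding and tie-breaking rules actually used by df_count may differ.

```python
import pandas as pd


def round_to_base(s: pd.Series, base: float) -> pd.Series:
    """Snap numeric values to the nearest integer multiple of base (hypothetical helper)."""
    return (s / base).round() * base


def top_n_code(s: pd.Series, top_nr: int = 5, other_name: str = "other") -> pd.Series:
    """Keep the top_nr most frequent levels and relabel the rest (hypothetical helper)."""
    keep = s.value_counts().index[:top_nr]
    return s.where(s.isin(keep), other_name)


values = pd.Series([0.9, 1.1, 1.4, 2.6, 7.0])

# Analogous to x_min=0.0 and x_max=5.0: both bounds are inclusive.
in_range = values[(values >= 0.0) & (values <= 5.0)]

# Analogous to x_base=1.0: close floats collapse onto the same value.
binned = round_to_base(in_range, base=1.0)

levels = pd.Series(["a", "a", "a", "b", "b", "c", "d"])

# Analogous to top_nr=2 with the default other_name.
coded = top_n_code(levels, top_nr=2)
```

In this sketch, 0.9, 1.1 and 1.4 all fall into the same bin, so they contribute to a single count instead of three separate ones.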

Returns:

pandas DataFrame containing the counts by x (and by hue if it is supplied)
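
For orientation, a count table of this general shape can be built with a pandas groupby. This is a sketch under assumptions: the example data, the column name count and the sort order are illustrative and are not confirmed by the documentation above.

```python
import pandas as pd

# Illustrative data: "species" plays the role of x, "sex" the role of hue.
df = pd.DataFrame({
    "species": ["cat", "dog", "cat", "cat", "dog", "bird"],
    "sex": ["f", "m", "m", "f", "f", "m"],
})

counts = (
    df.groupby(["species", "sex"])
    .size()                                   # rows per (x, hue) combination
    .reset_index(name="count")                # assumed column name
    .sort_values("count", ascending=False, kind="stable")  # like sort_by_count=True
    .reset_index(drop=True)
)
```

Each row of counts holds one combination of x and hue together with its frequency, which is the layout a grouped count plot needs.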