df_count

hhpy.ds.df_count(x: str, df: pandas.core.frame.DataFrame, hue: Optional[str] = None, sort_by_count: bool = True, top_nr: int = 5, x_base: Optional[float] = None, x_min: Optional[float] = None, x_max: Optional[float] = None, other_name: str = 'other', other_to_na: bool = False, na: Union[bool, str] = 'drop') → pandas.core.frame.DataFrame

Create a DataFrame of value counts. Supports hue levels and is therefore useful for plots; for an application, see countplot().

Parameters:
  • x – Main variable, name of a column in the DataFrame or vector data
  • df – Pandas DataFrame containing the data; other objects are implicitly cast to DataFrame
  • hue – Name of the column to split by level [optional]
  • sort_by_count – Whether to sort the DataFrame by value counts [optional]
  • top_nr – Number of unique levels to keep when applying top_n_coding() [optional]
  • x_base – If supplied, cast x to integer multiples of x_base. Useful for float data, where nearly equal values would otherwise each be counted as a separate level [optional]
  • x_min – Restrict valid numeric x values to those greater than or equal to x_min [optional]
  • x_max – Restrict valid numeric x values to those less than or equal to x_max [optional]
  • other_name – Name of the level that collects all values not among the top levels [optional]
  • other_to_na – Whether to cast all other elements to NaN [optional]
  • na – Whether to keep NA values (True, ‘keep’), implicitly casting them to string, or drop them (False, ‘drop’) [optional]
Returns:

pandas DataFrame containing the counts by x (and by hue if it is supplied)
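
Example:

The following is a minimal pandas-only sketch of the kind of table described above, not hhpy's actual implementation. The toy data, the column names color and size, and the output column name count are assumptions made for illustration; the variables top_nr and other_name only mirror the documented parameters of the same name, and na='drop' is approximated with dropna.

```python
import pandas as pd

# Toy data (hypothetical, for illustration only)
df = pd.DataFrame({
    "color": ["red", "red", "blue", "green", "blue", "red", "pink", None],
    "size":  ["S", "M", "S", "S", "M", "S", "M", "S"],
})

top_nr = 2            # number of levels to keep (mirrors the top_nr parameter)
other_name = "other"  # label for all remaining levels (mirrors other_name)

# Approximates na='drop': discard rows where x is missing
data = df.dropna(subset=["color"]).copy()

# Keep the top_nr most frequent levels of x, group the rest under other_name
top_levels = data["color"].value_counts().index[:top_nr]
data["color"] = data["color"].where(data["color"].isin(top_levels), other_name)

# Count by x and by hue, then sort by count in descending order
counts = (
    data.groupby(["color", "size"])
        .size()
        .reset_index(name="count")  # "count" is an assumed column name
        .sort_values("count", ascending=False, ignore_index=True)
)
print(counts)
```

With hhpy installed, the call corresponding to the signature above would presumably be df_count(x='color', df=df, hue='size', top_nr=2); the exact column names and ordering of its result may differ from this sketch.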