top_n_coding

hhpy.ds.top_n_coding(s: Sequence[T_co], n: int, other_name: str = 'other', na_to_other: bool = False, other_to_na: bool = False, w: Optional[Sequence[T_co]] = None) → pandas.core.series.Series

Returns a modified version of the pandas Series in which every element that is not among the n most frequent values is recoded as other_name ('other' by default).

Parameters:
  • s – pandas Series to adjust
  • n – Number of unique values to keep
  • other_name – Label assigned to the recoded elements [optional]
  • na_to_other – Whether to recode missing values as other_name [optional]
  • other_to_na – Whether to set all other elements to NaN instead of other_name [optional]
  • w – Weights; if given, the weights are summed per value instead of counting the entries in s [optional]
Returns:

Adjusted pandas Series
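
Example:

The following is a minimal sketch of the behavior described above, written against plain pandas. It is not hhpy's implementation: the function name top_n_coding_sketch is hypothetical. Two choices are assumptions made here, not documented behavior: ties are broken by first occurrence, and when other_to_na is set, na_to_other leaves missing values as NaN.

```python
from typing import Optional, Sequence

import numpy as np
import pandas as pd


def top_n_coding_sketch(
    s: pd.Series,
    n: int,
    other_name: str = "other",
    na_to_other: bool = False,
    other_to_na: bool = False,
    w: Optional[Sequence[float]] = None,
) -> pd.Series:
    """Hypothetical re-implementation of the documented behavior."""
    s = pd.Series(s)
    if w is None:
        # Unweighted: rank values by how often they occur (NaN is ignored).
        scores = s.value_counts()
    else:
        # Weighted: rank values by the sum of their weights (NaN keys are dropped).
        weights = pd.Series(list(w), index=s.index)
        scores = weights.groupby(s).sum()
    top = scores.nlargest(n).index

    is_na = s.isna()
    is_other = ~s.isin(top) & ~is_na

    # Assumption: with other_to_na=True, everything outside the top n becomes NaN.
    fill = np.nan if other_to_na else other_name
    out = s.astype(object).copy()
    out[is_other] = fill
    if na_to_other:
        out[is_na] = fill
    return out


s = pd.Series(["a", "a", "a", "b", "b", "c", None])

# Keep the two most frequent values; "c" is recoded, the missing value is kept.
coded = top_n_coding_sketch(s, n=2)

# With weights, "c" (total weight 10) and "a" (total weight 3) are kept instead.
weighted = top_n_coding_sketch(s, n=2, w=[1, 1, 1, 1, 1, 10, 1])
```

With these inputs, coded keeps 'a' and 'b' and recodes 'c' as 'other', while weighted keeps 'c' and 'a' and recodes 'b'. The missing entry stays missing in both cases because na_to_other defaults to False.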