top_n_coding

hhpy.ds.top_n_coding(s: Sequence[T_co], n: int, other_name: str = 'other', na_to_other: bool = False, w: Optional[Sequence[T_co]] = None) → pandas.core.series.Series

Returns a modified version of the pandas Series in which every element not among the top n values is recoded as other_name (default 'other').

Parameters:
  • s – pandas Series to adjust
  • n – number of top values to keep
  • other_name – label assigned to elements outside the top n [optional]
  • na_to_other – whether to recode missing elements as other_name as well [optional]
  • w – weights; if given, the weights are summed per value instead of counting the entries in s [optional]
Returns:

adjusted pandas Series
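
The recoding described above can be illustrated with a minimal pandas-only sketch. This is not hhpy's implementation: the function name top_n_coding_sketch is hypothetical, and the tie-breaking and the handling of missing values shown here are assumptions that may differ from the library's behavior.

```python
from typing import Optional, Sequence

import pandas as pd


def top_n_coding_sketch(
    s: Sequence,
    n: int,
    other_name: str = "other",
    na_to_other: bool = False,
    w: Optional[Sequence] = None,
) -> pd.Series:
    """Hypothetical re-creation of top-n recoding; not the hhpy source."""
    s = pd.Series(s)
    if w is None:
        # Rank values by how often they occur (missing values are ignored).
        scores = s.value_counts()
    else:
        # Rank values by the sum of their weights instead of their counts.
        weights = pd.Series(list(w), index=s.index)
        scores = weights.groupby(s).sum().sort_values(ascending=False)
    top = scores.index[:n]
    keep = s.isin(top)
    if not na_to_other:
        # Assumption: missing values stay missing unless na_to_other is set.
        keep = keep | s.isna()
    # Entries where keep is False are replaced by other_name.
    return s.where(keep, other_name)


colors = ["red", "red", "red", "blue", "blue", "green"]
print(top_n_coding_sketch(colors, n=2).tolist())
# ['red', 'red', 'red', 'blue', 'blue', 'other']
print(top_n_coding_sketch(colors, n=1, w=[1, 1, 1, 1, 1, 10]).tolist())
# ['other', 'other', 'other', 'other', 'other', 'green']
```

In the second call the single 'green' entry outweighs the three 'red' entries (10 versus 3), so it is the only value kept.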