pairwise_corrplot

hhpy.plotting.pairwise_corrplot(data: pandas.core.frame.DataFrame, corr_cutoff: float = 0, col_wrap: int = 4, hue: str = None, hue_order: Union[list, str] = None, width: float = 7, height: float = 7, trendline: bool = True, alpha: float = 0.75, ax: matplotlib.axes._axes.Axes = None, target: str = None, palette: Union[Mapping[KT, VT_co], Sequence[T_co], str] = ['xkcd:blue', 'xkcd:red', 'xkcd:green', 'xkcd:cyan', 'xkcd:magenta', 'xkcd:golden yellow', 'xkcd:dark cyan', 'xkcd:red orange', 'xkcd:dark yellow', 'xkcd:easter green', 'xkcd:baby blue', 'xkcd:light brown', 'xkcd:strong pink', 'xkcd:light navy blue', 'xkcd:deep blue', 'xkcd:deep red', 'xkcd:ultramarine blue', 'xkcd:sea green', 'xkcd:plum', 'xkcd:old pink', 'xkcd:lawn green', 'xkcd:amber', 'xkcd:green blue', 'xkcd:yellow green', 'xkcd:dark mustard', 'xkcd:bright lime', 'xkcd:aquamarine', 'xkcd:very light blue', 'xkcd:light grey blue', 'xkcd:dark sage', 'xkcd:dark peach', 'xkcd:shocking pink'], max_n: int = 10000, random_state: int = None, sample_warn: bool = True, return_fig_ax: bool = True, **kwargs) → Optional[tuple]

Draws a pairwise correlation plot for all variables in the DataFrame. By default, only those pairs with a correlation coefficient of >= corr_cutoff are plotted.
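The cutoff rule can be illustrated with plain pandas. The sketch below is not hhpy's implementation: select_pairs is a hypothetical helper, and it assumes that the filter compares the absolute Pearson correlation against corr_cutoff and that target restricts the result to pairs containing that column.

```python
import itertools

import numpy as np
import pandas as pd


def select_pairs(data, corr_cutoff=0.0, target=None):
    """Hypothetical helper: list (col_a, col_b, r) with |r| >= corr_cutoff."""
    # Pearson correlation matrix over the numeric columns only
    corr = data.select_dtypes(include="number").corr()
    pairs = []
    for a, b in itertools.combinations(corr.columns, 2):
        # Assumption: with a target, only pairs involving the target are kept
        if target is not None and target not in (a, b):
            continue
        r = corr.loc[a, b]
        # NaN correlations (e.g. constant columns) fail this test and are dropped
        if abs(r) >= corr_cutoff:
            pairs.append((a, b, float(r)))
    return pairs


x = np.arange(10.0)
df = pd.DataFrame({
    "x": x,
    "y": 2 * x + 1,     # perfectly correlated with x
    "z": [1, -1] * 5,   # only weakly correlated with x and y
})

for a, b, r in select_pairs(df, corr_cutoff=0.5):
    print(f"{a} vs {b}: r = {r:.3f}")
```

With corr_cutoff=0.5 only the x/y pair passes the filter. The corresponding library call would presumably be pairwise_corrplot(data=df, corr_cutoff=0.5), using the parameters from the signature above.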

Parameters:
  • data – Pandas DataFrame containing named data
  • corr_cutoff – Filter out all correlations whose absolute value is below the cutoff [optional]
  • col_wrap – After how many columns to create a new line of subplots [optional]
  • hue – Further split the plot by the levels of this variable [optional]
  • hue_order

    Either a string describing how the (hue) levels are to be ordered or an explicit list of levels to be used for plotting. Accepted strings are:

    • sorted: following python standard sorting conventions (alphabetical for string, ascending for value)
    • inv: following python standard sorting conventions but in inverse order
    • count: sorted by value counts
    • mean, mean_ascending, mean_descending: sorted by mean value, defaults to descending
    • median, median_ascending, median_descending: sorted by median value, defaults to descending
  • width – Width of each individual subplot [optional]
  • height – Height of each individual subplot [optional]
  • trendline – Whether to add a trendline [optional]
  • alpha – Alpha transparency level [optional]
  • ax – The matplotlib.pyplot.Axes object to plot on, defaults to current axis [optional]
  • target – Target variable name, if specified only correlations with the target are shown [optional]
  • palette – Collection of colors to be used for plotting. Can be a dictionary with names for each level, a list of colors, or an individual color name. Must be valid colors known to pyplot [optional]
  • max_n – Maximum number of samples to be used for plotting. If this number is exceeded, max_n samples are drawn at random from the data, which triggers a warning unless sample_warn is set to False. Set to False or None to use all samples for plotting. [optional]
  • random_state – Random state (seed) used for drawing the random samples [optional]
  • sample_warn – Whether to trigger a warning if the data has more samples than max_n [optional]
  • return_fig_ax – Whether to return the figure and axes objects as a tuple, to be captured as fig, ax = …. If False, pyplot.show() is called and the function returns None [optional]
  • kwargs – other keyword arguments passed to pyplot.subplots
Returns:

The figure and axes objects as a tuple if return_fig_ax is True, else None
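The interaction of max_n, random_state and sample_warn described in the parameter list can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: limit_samples is a hypothetical helper and the warning text is invented for the example.

```python
import warnings

import pandas as pd


def limit_samples(data, max_n=10000, random_state=None, sample_warn=True):
    """Hypothetical helper: return at most max_n randomly drawn rows."""
    # False or None disables sampling; small data is passed through unchanged
    if not max_n or len(data) <= max_n:
        return data
    if sample_warn:
        warnings.warn(
            f"data has {len(data)} rows, drawing {max_n} at random for plotting"
        )
    # A fixed random_state makes the drawn subset reproducible
    return data.sample(n=max_n, random_state=random_state)


big = pd.DataFrame({"v": range(50)})
subset = limit_samples(big, max_n=10, random_state=0, sample_warn=False)
print(len(subset))
```

Passing the same random_state twice yields the same subset, which keeps repeated plots of a large DataFrame comparable.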