Categorical plots

A categorical plot combines the features of a bar plot and a scatter plot to visualize categorical data, typically showing how different groups or categories compare on a single continuous variable.

By plotting a continuous variable against each category, categorical plots provide a clear and intuitive way to compare distributions, identify correlations, and explore patterns in the data.

import plotly.express as px
import statsplotly

Styling plots

Using marker size to encode the tip amount:

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="sex",
    plot_type="stripplot",
    size="tip",
    slicer="sex",
)
fig.show()

Using marker opacity to encode the tip amount:

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="sex",
    plot_type="stripplot",
    opacity="tip",
    slicer="sex",
)
fig.show()
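
To give an intuition for what it means to encode a numeric column as marker opacity, here is a minimal standalone sketch that rescales values into an opacity range. The `rescale` helper, the `[0.2, 1.0]` range, and the toy tip values are assumptions made for illustration; statsplotly's actual mapping may differ.

```python
def rescale(values, low=0.2, high=1.0):
    """Linearly map values onto [low, high] (hypothetical helper, not statsplotly's API)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values are identical: fall back to a constant opacity.
        return [high for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


# Toy tip amounts, made up for illustration only.
toy_tips = [1.0, 2.0, 3.5, 6.0]
opacities = rescale(toy_tips)
```

With this kind of mapping, the smallest tip is drawn at the lowest opacity and the largest tip is fully opaque.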

Controlling data “spread”

The amount of spread along the categorical axis can be controlled with the jitter parameter:

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="sex",
    plot_type="stripplot",
    opacity="tip",
    jitter=0.2,
)
fig.show()
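
As a rough picture of what jitter does, the sketch below scatters points uniformly around a category position. It assumes that jitter acts as the half-width of the spread; the `jitter_positions` helper is hypothetical and is not how statsplotly necessarily implements the parameter.

```python
import random


def jitter_positions(category_index, n_points, jitter=0.2, seed=0):
    """Return x positions spread uniformly around a category position.

    Assumption: `jitter` is the half-width of the spread around the category.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [category_index + rng.uniform(-jitter, jitter) for _ in range(n_points)]


narrow = jitter_positions(0, 100, jitter=0.2)  # points stay close to the category
wide = jitter_positions(0, 100, jitter=0.5)    # points spread further apart
```

A larger jitter value reduces overplotting at the cost of blurring the boundary between neighbouring categories.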

Controlling data points color

Color can be specified independently of the slicer:

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="time",
    plot_type="stripplot",
    opacity=0.8,
    color="sex",
    slicer="day",
    jitter=0.5,
)
fig.show()

Specifying categorical plot type

The plot_type parameter controls the kind of categorical plot (see categorical plot type):

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="sex",
    plot_type="violinplot",
    size=8,
    slicer="smoker",
    marker="diamond",
)
fig.show()

Processing data under the hood

Slices can be re-ordered and filtered by providing the slice_order argument.

The data can also be normalized, for example by centering, i.e., shifting the data so that its mean is 0 (see normalization type):

df = px.data.tips()

fig = statsplotly.catplot(
    data=df,
    y="total_bill",
    x="sex",
    plot_type="boxplot",
    slicer="day",
    normalizer="center",
    slice_order=["Fri", "Sat", "Sun"],
)
fig.show()
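
The two processing steps above can be pictured with a small standalone sketch. The helpers `order_slices` and `center` and the toy values are hypothetical. The sketch also assumes that centering is applied to each slice separately, which this page does not specify.

```python
from statistics import fmean


def order_slices(slices, slice_order):
    """Keep only the slices named in slice_order, in that order."""
    return {name: slices[name] for name in slice_order if name in slices}


def center(values):
    """Subtract the mean so that the returned values average to zero."""
    mean = fmean(values)
    return [v - mean for v in values]


# Toy bill amounts per day, made up for illustration only.
toy_slices = {
    "Thur": [10.0, 14.0],
    "Fri": [12.0, 18.0],
    "Sat": [20.0, 22.0, 24.0],
    "Sun": [16.0, 20.0],
}

ordered = order_slices(toy_slices, ["Fri", "Sat", "Sun"])  # "Thur" is filtered out
centered = {name: center(values) for name, values in ordered.items()}
```

After centering, values read as deviations from the mean, which makes the spread of groups with different baselines easier to compare.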

Full details of the API: catplot().