Bar plots

A bar plot uses bars of different heights or colors to compare values across categories or groups.

Bar plots are often used to represent a measure of central tendency, together with an estimate of the associated error.

import plotly.express as px
import statsplotly

df = px.data.stocks()

fig = statsplotly.barplot(
    data=df.set_index("date").melt(
        ignore_index=False, var_name="company", value_name="stock_value"
    ),
    y="stock_value",
    x="date",
    barmode="stack",
    slicer="company",
)
fig.show()

Controlling bar colors

Color can be specified independently of the slicer by providing the color parameter.

To keep track of data slices, the slicer identifier is indicated on the corresponding bars.

df = px.data.medals_long()

fig = statsplotly.barplot(
    data=df,
    barmode="group",
    x="nation",
    y="count",
    color="count",
    slicer="medal",
)
fig.show()

Aggregating data and displaying error bars

Bar plots are often used to summarize the central tendency of data. This is accomplished with the aggregation_func argument. Error bars can be added with the error_bar argument.

Supplying only one dimension is equivalent to a count plot, with optional normalization.

Below we aggregate the fraction of tips distributed each day:

df = px.data.tips()

fig = statsplotly.barplot(
    data=df,
    x="day",
    slicer="sex",
    color_palette="tab10",
    aggregation_func="fraction",
)

fig.show()
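
As a rough illustration of what a normalized count represents, the sketch below computes per-day fractions by hand using only the standard library. The toy records and the fraction_per_category helper are hypothetical. Normalizing within each slice, rather than over all rows, is an assumption made for the example and not a statement about how statsplotly computes "fraction".

```python
from collections import Counter

# Hypothetical toy records: (day, sex) pairs standing in for rows of a dataset.
records = [
    ("Thur", "Female"), ("Thur", "Male"),
    ("Fri", "Female"),
    ("Sat", "Male"), ("Sat", "Male"), ("Sat", "Female"),
    ("Sun", "Male"), ("Sun", "Male"),
]


def fraction_per_category(rows, slice_name):
    """Return {day: share of the slice's rows that fall on that day}.

    Assumption: fractions are normalized within the slice, so they sum to 1
    for each value of the slicing variable.
    """
    counts = Counter(day for day, sex in rows if sex == slice_name)
    total = sum(counts.values())
    return {day: n / total for day, n in counts.items()}


male_fractions = fraction_per_category(records, "Male")
female_fractions = fraction_per_category(records, "Female")
print(male_fractions)
print(female_fractions)
```

Each dictionary corresponds to one group of bars: the keys are the categories on the categorical axis and the values are the bar heights.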

Supplying both x and y dimensions together with an aggregation_func argument aggregates the numeric dimension across the values of the other dimension.

Here we perform bootstrap resampling to draw a 95% confidence interval error bar:

df = px.data.tips()

fig = statsplotly.barplot(
    data=df,
    y="total_bill",
    x="day",
    slicer="sex",
    color_palette="tab10",
    aggregation_func="mean",
    error_bar="bootstrap",
)

fig.show()
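
To make the idea of a bootstrap confidence interval concrete, here is a minimal sketch of a percentile bootstrap for the mean, written with the standard library only. The bootstrap_ci helper, the number of resamples, the fixed seed, and the toy values are all assumptions chosen for illustration. The procedure statsplotly actually uses for error_bar="bootstrap" may differ.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Recompute the statistic on resamples drawn with replacement.
    estimates = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))
    # Keep the central `level` share of the bootstrap estimates.
    alpha = (1 - level) / 2
    lower = estimates[round(alpha * n_resamples)]
    upper = estimates[round((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical toy values standing in for one bar's worth of observations.
bills = [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88, 15.04, 14.78]
lower, upper = bootstrap_ci(bills)
print(f"mean={statistics.mean(bills):.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

In a bar plot, the mean gives the bar height and the interval bounds give the ends of the error bar.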

The aggregation_func and error_bar arguments also accept callables:

import numpy as np

df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    x="day",
    y="total_bill",
    slicer="sex",
    aggregation_func=np.max,
    error_bar=lambda x: (x.min(), None),
)

fig.show()

Horizontal bar plots

By default, the numeric dimension is the one that gets aggregated. Swapping the x and y dimensions thus produces a horizontal plot.

Below we plot the median with the interquartile range:

df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    y="day",
    x="total_bill",
    slicer="sex",
    aggregation_func="median",
    error_bar="iqr",
)

fig.show()
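
For reference, the quantities summarized by a median bar with an interquartile range can be computed by hand as sketched below. The median_with_iqr helper and the toy values are hypothetical, and the "inclusive" quartile method is an assumption. The interpolation rule used by statsplotly for error_bar="iqr" is not specified here and may differ.

```python
import statistics


def median_with_iqr(values):
    """Return (median, (first quartile, third quartile)) for `values`."""
    # n=4 yields the three quartile cut points; the middle one is the median.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Hypothetical toy values standing in for one bar's worth of observations.
bills = [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88, 15.04, 14.78]
median, (q1, q3) = median_with_iqr(bills)
print(f"median={median:.2f}, IQR=({q1:.2f}, {q3:.2f})")
```

The median sets the bar length and the two quartiles mark the ends of the error bar, along whichever axis carries the numeric dimension.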

Full details of the API: barplot().