Bar plots
A bar plot compares values across categories, using bars whose height (and optionally color) encodes the data for each group.
Bar plots are often used to represent a measure of central tendency, with an estimation of the associated error.
import plotly.io as pio
pio.renderers.default = "sphinx_gallery"
import plotly.express as px
import statsplotly
df = px.data.stocks()
fig = statsplotly.barplot(
    data=df.set_index("date").melt(
        ignore_index=False, var_name="company", value_name="stock_value"
    ),
    y="stock_value",
    x="date",
    barmode="stack",
    slicer="company",
)
fig.show()
Controlling bar colors
Color can be specified independently of the slicer by providing the color parameter.
To keep track of data slices, the slicer identifier is indicated on the corresponding bars.
df = px.data.medals_long()
fig = statsplotly.barplot(
    data=df,
    barmode="group",
    x="nation",
    y="count",
    color="count",
    slicer="medal",
)
fig.show()
Aggregating data and displaying error bars
Bar plots are often used to summarize the central tendency of data. This is accomplished with the aggregation_func argument. Error bars can also be specified with the error_bar argument.
Supplying only one dimension is equivalent to a count plot, with optional normalization.
Below we aggregate the fraction of tips distributed each day:
df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    x="day",
    slicer="sex",
    color_palette="tab10",
    aggregation_func="fraction",
)
fig.show()
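As a rough illustration of what a fraction aggregation computes, the sketch below counts the observations in each category and divides by the total, using only the standard library. The sample days are made up for the example, and whether statsplotly normalizes within each slice or over the whole dataset is not stated here, so the single-group case shown is an assumption.

```python
from collections import Counter

# Hypothetical sample: one entry per recorded tip, labelled by day.
days = ["Thur", "Fri", "Sat", "Sat", "Sun", "Sun", "Sun", "Sat"]

counts = Counter(days)
total = sum(counts.values())

# Fraction of all observations that fall in each category.
fractions = {day: n / total for day, n in counts.items()}

for day, frac in fractions.items():
    print(f"{day}: {frac:.3f}")
```

The fractions sum to one, which is what distinguishes a normalized count plot from a raw count plot.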
Supplying x and y dimensions with an aggregation_func argument aggregates the numeric dimension across values of the other dimension.
Here we perform bootstrap resampling to draw a 95% confidence interval error bar:
df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    y="total_bill",
    x="day",
    slicer="sex",
    color_palette="tab10",
    aggregation_func="mean",
    error_bar="bootstrap",
)
fig.show()
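For readers unfamiliar with bootstrap error bars, here is a minimal standard-library sketch of a percentile bootstrap confidence interval for the mean. It is not statsplotly's implementation: the percentile method, the number of resamples, the helper name bootstrap_ci and the sample values are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative helper)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Keep the central `confidence` mass of the resampled means.
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical bill amounts for a single group.
bills = [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88, 15.04, 14.78]
low, high = bootstrap_ci(bills)
print(f"mean={statistics.fmean(bills):.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

The interval is computed separately for each bar, so groups with few observations will generally show wider error bars.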
The aggregation_func and error_bar arguments also accept Callable arguments:
import numpy as np
df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    x="day",
    y="total_bill",
    slicer="sex",
    aggregation_func=np.max,
    error_bar=lambda x: (x.min(), None),
)
fig.show()
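Judging from the example above, a custom aggregation function receives the values of one group and returns a single number, while a custom error bar function returns a (lower, upper) pair in which None leaves a bound unset. That contract is inferred from the example rather than taken from the API reference, and the function names below are hypothetical. A sketch with named functions instead of a lambda:

```python
def largest(values):
    """Aggregate one group of values to its maximum."""
    return max(values)


def lower_bound_only(values):
    """Return (lower, upper) error bounds; None is assumed to mean 'no upper bound'."""
    return (min(values), None)


# Hypothetical values for a single group.
group = [12.5, 30.0, 18.25]
print(largest(group))
print(lower_bound_only(group))
```

Named functions like these are easier to test and document than inline lambdas, and could presumably be passed as aggregation_func and error_bar in the same way.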
Horizontal bar plots
The numeric dimension is the default aggregated dimension. Swapping the x and y dimensions thus produces a horizontal plot.
Below we plot the median with the interquartile range:
df = px.data.tips()
fig = statsplotly.barplot(
    data=df,
    y="day",
    x="total_bill",
    slicer="sex",
    aggregation_func="median",
    error_bar="iqr",
)
fig.show()
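To make the quantities in this plot concrete, the sketch below computes a median and its interquartile range with the standard library. The sample values are invented, and the quartile interpolation rule ("inclusive" here) is an assumption; statsplotly may use a different convention and give slightly different bounds.

```python
import statistics

# Hypothetical bill amounts for a single group.
bills = [8.77, 10.34, 14.78, 15.04, 16.99, 21.01, 23.68, 24.59, 25.29, 26.88]

median = statistics.median(bills)
# quantiles(n=4) returns the three quartile cut points: Q1, Q2 (the median), Q3.
q1, _, q3 = statistics.quantiles(bills, n=4, method="inclusive")

print(f"median={median:.2f}, IQR=({q1:.2f}, {q3:.2f})")
```

Unlike a symmetric standard-error bar, the interquartile range can extend unevenly on either side of the median when the data are skewed.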
Full details of the API: barplot().