LESSON 07 · INTERMEDIATE · 60 MIN · ◆ 2 INSTRUMENTS

Seaborn: Statistical Data Visualization

Elevate your visualizations with seaborn. Create beautiful statistical graphics with minimal code: distributions, relationships, and categorical plots.

TIP

Learning Objectives: After this lesson, you'll create beautiful statistical graphics with minimal code using seaborn—distributions, relationships, categorical comparisons, and styled presentations.

TIP

How to read this lesson. Seaborn isn't bundled in this in-browser Python sandbox, but matplotlib is — and seaborn is a thin, opinionated layer over matplotlib. So every runnable cell below renders a real chart with matplotlib, and each block names the one-line seaborn equivalent (marked ≡ sns.…). You learn what the chart is, and the exact seaborn call you'd reach for in your own environment.

Why Seaborn?

Before writing any code, get a feel for what a statistical chart actually communicates. The plotter below shows a frequency distribution—the same kind of shape sns.histplot() produces.

Fig. 02. Graph Plotter: interactive plotting tool for visualizing data and relationships

Try it: Switch the chart type and tweak the values, then watch the shape redraw. Notice how a tall center with tapering sides reads as a "normal" distribution—the exact pattern the histogram code below generates.

Seaborn is built on matplotlib but provides a high-level interface for creating attractive statistical graphics. What takes ten lines in matplotlib often takes one in seaborn.

Plotting one scatter per category in raw matplotlib is a loop:

```python
import matplotlib.pyplot as plt

# `data` is assumed to be a pandas DataFrame with columns 'x', 'y', and 'category'.
fig, ax = plt.subplots()
for group in data['category'].unique():
    subset = data[data['category'] == group]
    ax.scatter(subset['x'], subset['y'], label=group)
ax.legend()
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.show()
```

Seaborn collapses the whole loop — grouping, coloring, and the legend — into a single call:

sns.scatterplot(data=data, x='x', y='y', hue='category')

That economy is the whole point. Seaborn gives you beautiful defaults, built-in statistical aggregation (means, confidence intervals, KDEs), automatic legends and labels, native pandas integration, and a deep catalog of specialized statistical plots.

Setting Up Seaborn

Seaborn ships with sensible defaults, and you choose a style and a palette once per session:

  • Styles (sns.set_theme(style=...)): whitegrid and darkgrid add gridlines; white, dark, and ticks keep things minimal.
  • Palettes (sns.set_palette(...)): deep is the default, muted/pastel are softer, husl/bright are colorful, and colorblind is accessibility-safe.

In a real project you import seaborn and pick a theme and palette once, up front:

```python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

sns.set_theme()  # Apply seaborn's default look

# Themes: overall background + grid (each call replaces the previous one, so pick one)
sns.set_theme(style='whitegrid')  # White background with a grid (great for stats)
sns.set_theme(style='darkgrid')   # Gray background with a white grid (seaborn's default)
sns.set_theme(style='white')      # Minimal, no grid
sns.set_theme(style='ticks')      # Axis ticks, no grid

# Palettes: the sequence of colors used for categories
sns.set_palette('deep')        # Default
sns.set_palette('muted')       # Softer
sns.set_palette('colorblind')  # Accessibility-safe
```

The single most useful habit: reach for colorblind when coloring categories, or a perceptually uniform colormap like viridis when coloring continuous values, so your charts stay readable for everyone.

Distribution Plots

Understanding data distributions is fundamental to data analysis.

Histograms and KDE

(The frequency-distribution plotter at the top of this lesson previews the shape this code produces.)

Fig. 04. Python Code Executor: interactive Python code execution environment
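A minimal sketch in this lesson's matplotlib-first style: a density histogram with a Gaussian KDE curve drawn over it. The sample is synthetic, and the helper `gaussian_kde` and its rule-of-thumb bandwidth are illustrative assumptions, so the curve may differ slightly from the one seaborn would draw.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic sample (an assumption for illustration): 500 draws from N(50, 10).
rng = np.random.default_rng(42)
values = rng.normal(loc=50, scale=10, size=500)


def gaussian_kde(sample, grid):
    """Evaluate a Gaussian kernel density estimate of `sample` at each grid point."""
    n = sample.size
    # Normal-reference rule of thumb for the bandwidth; seaborn's default may differ.
    bandwidth = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (n * bandwidth)


grid = np.linspace(values.min() - 10, values.max() + 10, 400)
density = gaussian_kde(values, grid)

fig, ax = plt.subplots(figsize=(7, 4))
# density=True rescales the bars so the histogram and the KDE share one y-axis.
ax.hist(values, bins=30, density=True, alpha=0.5, edgecolor='white')
ax.plot(grid, density, linewidth=2)
ax.set_xlabel('value')
ax.set_ylabel('density')
ax.set_title('Histogram with KDE')
# ≡ sns.histplot(x=values, bins=30, stat='density', kde=True)
plt.show()
```

The histogram shows the raw counts per bin, while the KDE smooths them into a continuous curve. A wider bandwidth gives a smoother curve but hides more detail.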

Box Plots and Violin Plots

Fig. 06. Python Code Executor: interactive Python code execution environment
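As a rough illustration, the sketch below draws a box plot and a violin plot of the same three synthetic groups side by side. The group names, the data, and the long-form DataFrame `df` mentioned in the seaborn comments are all assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Three synthetic groups with different centers and spreads (assumed data).
rng = np.random.default_rng(0)
names = ['A', 'B', 'C']
groups = [
    rng.normal(50, 5, 200),
    rng.normal(60, 10, 200),
    rng.normal(55, 3, 200),
]
positions = [1, 2, 3]

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)

# Box plot: median, quartiles, whiskers, and outliers for each group.
box = ax_box.boxplot(groups, positions=positions)
ax_box.set_title('Box plot')
# ≡ sns.boxplot(data=df, x='group', y='value')   (df is a hypothetical long-form DataFrame)

# Violin plot: a mirrored density estimate per group, with the median marked.
ax_violin.violinplot(groups, positions=positions, showmedians=True)
ax_violin.set_title('Violin plot')
# ≡ sns.violinplot(data=df, x='group', y='value')

for ax in (ax_box, ax_violin):
    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    ax.set_xlabel('group')
ax_box.set_ylabel('value')
plt.show()
```

The box plot summarizes each group with five numbers. The violin plot shows the full shape of the distribution, which matters when a group is skewed or has more than one peak.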

Relationship Plots

Explore relationships between variables.

Scatter Plots with Regression

Fig. 08. Graph Plotter: interactive plotting tool for visualizing data and relationships
Fig. 10. Python Code Executor: interactive Python code execution environment
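One way to sketch a scatter plot with a regression line in plain matplotlib is to fit a straight line with `np.polyfit` and draw it over the points. The data here is synthetic, generated from an assumed trend y = 2x + 1 plus noise.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (an assumption): a linear trend y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 2.0, 100)

# Ordinary least-squares fit of a degree-1 polynomial: y ≈ slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
x_line = np.linspace(x.min(), x.max(), 100)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(x, y, s=15, alpha=0.6)
ax.plot(x_line, slope * x_line + intercept, color='crimson',
        label=f'fit: y = {slope:.2f}x + {intercept:.2f}')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()
# ≡ sns.regplot(x=x, y=y)
# regplot also shades a bootstrapped confidence band around the line; this sketch omits it.
plt.show()
```

The fitted slope and intercept should land close to the values used to generate the data, which is a quick sanity check that the fit is doing what you expect.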

Pair Plots

Visualize relationships between all pairs of variables:

Fig. 12. Python Code Executor: interactive Python code execution environment
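A pair plot is a square grid with one row and one column per variable: histograms on the diagonal and pairwise scatter plots everywhere else. The sketch below builds that grid by hand in matplotlib. The three variables and their values are synthetic assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic columns (assumed): weight depends on height, age is independent.
rng = np.random.default_rng(2)
n_rows = 200
height = rng.normal(170, 8, n_rows)
columns = {
    'height': height,
    'weight': 0.9 * height - 85 + rng.normal(0, 6, n_rows),
    'age': rng.uniform(18, 65, n_rows),
}
names = list(columns)
n = len(names)

fig, axes = plt.subplots(n, n, figsize=(8, 8))
for i, row_name in enumerate(names):
    for j, col_name in enumerate(names):
        ax = axes[i, j]
        if i == j:
            # Diagonal: distribution of a single variable.
            ax.hist(columns[row_name], bins=20)
        else:
            # Off-diagonal: column variable on x, row variable on y.
            ax.scatter(columns[col_name], columns[row_name], s=8, alpha=0.6)
        if i == n - 1:
            ax.set_xlabel(col_name)
        if j == 0:
            ax.set_ylabel(row_name)
fig.tight_layout()
# ≡ sns.pairplot(pd.DataFrame(columns))
plt.show()
```

In the resulting grid, the height/weight panels show a clear upward trend while the panels involving age look like shapeless clouds, which is exactly the kind of pattern a pair plot is meant to surface quickly.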

Heatmaps

Visualize matrices and correlations:

Fig. 14. Python Code Executor: interactive Python code execution environment
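A common use of a heatmap is a correlation matrix. The sketch below computes Pearson correlations with `np.corrcoef` and renders them with `imshow`, writing each value into its cell. The variables are synthetic assumptions, and `df` in the seaborn comment is a hypothetical DataFrame holding the same columns.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic variables (assumed): weight tracks height, age is independent.
rng = np.random.default_rng(3)
n = 300
height = rng.normal(170, 8, n)
weight = 0.9 * height - 85 + rng.normal(0, 6, n)
age = rng.uniform(18, 65, n)
names = ['height', 'weight', 'age']

# Each row passed to corrcoef is one variable, so the result is a 3x3 matrix.
corr = np.corrcoef(np.vstack([height, weight, age]))

fig, ax = plt.subplots(figsize=(5, 4))
# Fix the color scale to [-1, 1] so that zero correlation sits at the neutral midpoint.
image = ax.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
fig.colorbar(image, ax=ax, label='Pearson r')
ax.set_xticks(range(len(names)))
ax.set_xticklabels(names)
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
for i in range(len(names)):
    for j in range(len(names)):
        ax.text(j, i, f'{corr[i, j]:.2f}', ha='center', va='center')
# ≡ sns.heatmap(df.corr(), annot=True, fmt='.2f', cmap='coolwarm', vmin=-1, vmax=1)
plt.show()
```

A diverging colormap such as coolwarm suits correlations because positive and negative values get visually distinct colors.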

Categorical Plots

Compare categories effectively.

Bar Plots with Error Bars

Fig. 16. Python Code Executor: interactive Python code execution environment
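By default, seaborn's bar plot draws the mean of each category with a bootstrapped 95% confidence interval as the error bar. The sketch below reproduces that idea by hand. The treatment names, the scores, and the helper `bootstrap_ci` are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic scores per treatment (assumed data).
rng = np.random.default_rng(4)
samples = {
    'control': rng.normal(20, 4, 60),
    'drug A': rng.normal(24, 4, 60),
    'drug B': rng.normal(27, 6, 60),
}


def bootstrap_ci(values, n_boot=1000, level=95):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    # Resample with replacement n_boot times and take the mean of each resample.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    boot_means = values[idx].mean(axis=1)
    tail = (100 - level) / 2
    low, high = np.percentile(boot_means, [tail, 100 - tail])
    return low, high


names = list(samples)
means = np.array([samples[name].mean() for name in names])
intervals = np.array([bootstrap_ci(samples[name]) for name in names])  # shape (3, 2)
# Matplotlib expects error bars as distances below and above each bar height.
yerr = np.vstack([means - intervals[:, 0], intervals[:, 1] - means])

fig, ax = plt.subplots()
ax.bar(names, means, yerr=yerr, capsize=6)
ax.set_xlabel('treatment')
ax.set_ylabel('mean score')
# ≡ sns.barplot(data=df, x='treatment', y='score')   (df is a hypothetical long-form DataFrame)
plt.show()
```

When two error bars barely overlap or do not overlap at all, the difference between those category means is less likely to be sampling noise. A formal test is still the way to confirm it.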

Count Plots

Fig. 18. Python Code Executor: interactive Python code execution environment
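A count plot is a bar chart of how many rows fall into each category. A minimal sketch, using a small made-up column of labels:

```python
import numpy as np
import matplotlib.pyplot as plt

# A small, made-up categorical column (an assumption for illustration).
species = np.array(['cat', 'dog', 'dog', 'bird', 'cat',
                    'dog', 'fish', 'dog', 'cat', 'bird'])

# np.unique returns the sorted distinct labels and how often each one occurs.
categories, counts = np.unique(species, return_counts=True)

fig, ax = plt.subplots()
bars = ax.bar(categories, counts)
ax.set_xlabel('species')
ax.set_ylabel('count')
# ≡ sns.countplot(x=species, order=categories)
plt.show()
```

Unlike a bar plot of means, a count plot needs no numeric column. The bar height is simply the number of observations per category.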

Strip and Swarm Plots

Show individual data points:

Fig. 20. Python Code Executor: interactive Python code execution environment
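A strip plot draws every observation as a point and adds random horizontal jitter so that points in the same category do not sit on top of each other. The sketch below does this by hand. The dose groups, their values, and the DataFrame `df` in the seaborn comments are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic responses per dose level (assumed data).
rng = np.random.default_rng(5)
groups = {
    'low': rng.normal(10, 2, 40),
    'medium': rng.normal(15, 3, 40),
    'high': rng.normal(22, 4, 40),
}
jitter = 0.15  # maximum horizontal offset on either side of the category position

fig, ax = plt.subplots()
for position, (name, values) in enumerate(groups.items()):
    # Random horizontal jitter separates points that share the same category.
    offsets = rng.uniform(-jitter, jitter, size=values.size)
    ax.scatter(position + offsets, values, s=12, alpha=0.7)
ax.set_xticks(range(len(groups)))
ax.set_xticklabels(list(groups))
ax.set_xlabel('dose')
ax.set_ylabel('response')
# ≡ sns.stripplot(data=df, x='dose', y='response', jitter=0.15)
# sns.swarmplot(data=df, x='dose', y='response') instead positions points so none overlap.
plt.show()
```

Jitter is purely cosmetic: the horizontal offsets carry no information, only the vertical position does.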

Combining Plots

Fig. 22. Python Code Executor: interactive Python code execution environment

FacetGrid: Multiple Subplots

Create a grid of plots based on data subsets:

Fig. 24. Python Code Executor: interactive Python code execution environment
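The facet idea, one small panel per data subset on shared axes, can be sketched in matplotlib with `plt.subplots` and a loop. The regions, the order values, and the DataFrame `df` in the seaborn comment are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# One synthetic sample per subset (assumed data).
rng = np.random.default_rng(6)
subsets = {
    'north': rng.normal(20, 5, 150),
    'south': rng.normal(25, 6, 150),
    'west': rng.normal(30, 4, 150),
}
# Shared bin edges and shared axes keep the panels directly comparable.
bins = np.linspace(0, 50, 26)

fig, axes = plt.subplots(1, len(subsets), figsize=(10, 3), sharex=True, sharey=True)
for ax, (name, values) in zip(axes, subsets.items()):
    ax.hist(values, bins=bins)
    ax.set_title(f'region = {name}')
    ax.set_xlabel('order_value')
axes[0].set_ylabel('count')
fig.tight_layout()
# ≡ sns.FacetGrid(df, col='region').map_dataframe(sns.histplot, x='order_value')
plt.show()
```

Sharing the x and y axes is the important design choice: without it, each panel rescales itself and differences between subsets become hard to judge by eye.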

Complete Example: EDA Dashboard

Fig. 26. Python Code Executor: interactive Python code execution environment

Key Takeaways

Distribution plots: histplot(), kdeplot(), boxplot(), violinplot()

Relationship plots: scatterplot(), regplot(), pairplot(), heatmap()

Categorical plots: barplot(), countplot(), stripplot(), swarmplot()

FacetGrid: Create grids of plots by data subsets

Styling: set_theme(), color palettes, context settings

Integration: Works seamlessly with pandas DataFrames

Connections: Seaborn in Practice

🔗 Connection to Statistics

| Statistical Concept | Seaborn Plot |
| --- | --- |
| Distribution | histplot(), kdeplot() |
| Central tendency | barplot() (shows mean) |
| Spread | boxplot(), violinplot() |
| Correlation | heatmap(), pairplot() |
| Regression | regplot(), lmplot() |

🔗 When to Use Which Plot

| Question | Plot Type |
| --- | --- |
| What's the distribution? | histogram, KDE, box |
| Compare categories? | bar, count, box |
| Relationship between 2 vars? | scatter, regression |
| All pairwise relationships? | pairplot |
| Matrix of values? | heatmap |

Practice Exercises

Exercise 1: Complete Visualization

Fig. 28. Python Code Executor: interactive Python code execution environment

Next Steps

Now you're ready for Exploratory Data Analysis (EDA): a systematic approach to understanding real datasets by combining all the techniques you've learned. EDA is the next lesson.


Further Reading

Modern Declarative Alternatives

  • Altair: Vega-Lite-based grammar of graphics. Most principled "declarative viz" library in Python.
  • plotnine: ggplot2 ported to Python. If you came from R, start here.

Books

  • Book: Storytelling with Data — Cole Nussbaumer Knaflic. The why of visualization.
  • Book: Fundamentals of Data Visualization — Claus Wilke (free online). Library-agnostic; required reading for anyone making serious charts.