Airline On‑Time Performance

Turning flight delays into clear, visual stories.

Hi, I'm Wiktor — a data analyst passionate about aviation and interactive visualizations. This portfolio showcases an end‑to‑end analysis of airline on‑time performance and delay causes, based on data published by the U.S. Bureau of Transportation Statistics.

From geospatial maps and rankings to a heatmap of predicted delay risk, each visualization is built in R and delivered as a clean, responsive web experience.

Dataset

About the data

The analysis uses airline on‑time statistics and delay causes reported to the Bureau of Transportation Statistics (BTS). The dataset aggregates flights by airline, airport and month, with delay minutes broken down by cause, enabling both high‑level trends and detailed operational insights.

Each record in the dataset describes the performance of a given airline at a specific airport in a given month. For that combination, it includes the number of operated flights, total delay minutes and a breakdown across key delay categories: carrier, weather, National Airspace System (NAS), security and late‑arriving aircraft.
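As a rough illustration of that record layout, here is a minimal Python sketch of one airline × airport × month row (the portfolio's own charts are built in R). The class name, field names and numbers are assumptions chosen for readability, not the actual BTS column names or values, and the sketch assumes total delay minutes equal the sum of the five cause categories.

```python
from dataclasses import dataclass

# Hypothetical record layout; names are illustrative, not BTS column names.
@dataclass
class DelayRecord:
    airline: str
    airport: str
    year: int
    month: int
    flights: int              # operated flights in this cell
    carrier_min: float        # delay minutes attributed to each cause
    weather_min: float
    nas_min: float
    security_min: float
    late_aircraft_min: float

    def by_cause(self) -> dict:
        """Delay minutes keyed by cause name."""
        return {
            "carrier": self.carrier_min,
            "weather": self.weather_min,
            "nas": self.nas_min,
            "security": self.security_min,
            "late_aircraft": self.late_aircraft_min,
        }

    @property
    def total_delay_min(self) -> float:
        # Assumption: the total is the sum of the five cause categories.
        return sum(self.by_cause().values())

    def cause_shares(self) -> dict:
        """Share of this record's delay minutes per cause (0.0 if no delay)."""
        total = self.total_delay_min
        return {c: (m / total if total else 0.0) for c, m in self.by_cause().items()}

# Made-up example row, not real data.
example = DelayRecord("AIR1", "AAA", 2023, 7, 1000,
                      4000.0, 500.0, 2500.0, 0.0, 5000.0)
shares = example.cause_shares()
```

Keeping the cause breakdown next to the flight count in a single record is what lets later views normalize by traffic or split by cause without another join.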

This structure makes it possible to explore patterns across time, compare airlines and airports on a fair basis, and attribute delays to their underlying operational causes. All visualizations in this portfolio are built on top of this aggregated view.

Source: BTS On‑Time Statistics
Granularity: Airline × Airport × Month
Measures: Flights, delay minutes, delay causes
Focus: Arrival delays (≥ 15 minutes)
Geospatial

Flight delay map

An interactive view of how average arrival delays vary across airports, combining network structure with performance metrics.

This interactive map visualizes arrival delays across selected airports. Each point represents an airport, with color and size reflecting its average arrival delay and traffic volume. The goal is to make performance differences visible in a way that aligns with how we naturally think about aviation: as a spatial network.

By hovering and zooming, you can quickly identify which hubs consistently perform well, which struggle with delays, and how these patterns align with the broader U.S. air traffic system.

Delay causes

Monthly delay composition

A heatmap of delay minutes by cause and month, revealing which issues dominate and when they peak.

The heatmap shows how each delay cause contributes to total delay minutes throughout the year. Late‑arriving aircraft stand out as the dominant driver, especially in the summer peak, highlighting how knock‑on delays snowball across the network.

Weather plays a smaller role than many passengers expect, while NAS and carrier‑related delays remain relatively stable. This view helps shift discussions from generic “bad weather” explanations to more precise, data‑driven narratives about operational risk.
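One way the cause × month grid behind such a heatmap could be computed is sketched below in Python, for illustration only (the portfolio's charts are built in R). The row layout, the function name `monthly_composition` and all numbers are assumptions, not the project's actual code or BTS figures.

```python
from collections import defaultdict

# Made-up rows: (month, cause, delay_minutes). Not real BTS figures.
rows = [
    (1, "carrier", 300.0), (1, "weather", 100.0), (1, "late_aircraft", 400.0),
    (7, "carrier", 350.0), (7, "weather", 150.0), (7, "late_aircraft", 1000.0),
    (7, "carrier", 150.0),  # a second airline/airport cell in the same month
]

def monthly_composition(rows):
    """Return {month: {cause: share of that month's total delay minutes}}."""
    minutes = defaultdict(lambda: defaultdict(float))
    for month, cause, delay_min in rows:
        minutes[month][cause] += delay_min  # sum over airlines and airports

    shares = {}
    for month, by_cause in minutes.items():
        total = sum(by_cause.values())
        shares[month] = {
            cause: (m / total if total else 0.0) for cause, m in by_cause.items()
        }
    return shares

composition = monthly_composition(rows)
```

Plotting raw minutes shows when delays peak, while plotting shares shows which cause dominates within each month. The sketch returns shares, and the raw totals are the intermediate `minutes` grid.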

Seasonality

Monthly delay trend

How average arrival delays evolve over the year, revealing peak stress periods in the air transport system.

This line chart tracks how average arrival delays change month by month. A clear mid‑year peak aligns with the summer travel season, when passenger demand, airport congestion and tight schedules combine to push the system to its limits.

A secondary rise around the holiday season reflects a similar pattern: high demand, constrained capacity and limited room to absorb disruptions. Together, these peaks show when airlines need the most resilience in their operations.

Performance

Top 10 best airlines

A fair comparison of airlines based on average delay per operated flight, not just per delayed flight.

This ranking compares airlines using the total accumulated delay minutes divided by the total number of operated flights. In other words, it measures how many minutes of arrival delay each flight contributes on average, regardless of whether it was classified as “on time” or “delayed”.
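The metric described above can be sketched in a few lines of Python, for illustration only (the portfolio's charts are built in R). The row layout, the function names and the numbers are assumptions, not the project's code or real BTS data.

```python
from collections import defaultdict

# Made-up rows: (airline, airport, flights, arrival_delay_minutes).
rows = [
    ("AIR1", "AAA", 1000, 9000.0),
    ("AIR1", "BBB", 500, 6000.0),
    ("AIR2", "AAA", 200, 1000.0),
    ("AIR2", "BBB", 800, 11000.0),
]

def avg_delay_per_flight(rows, key_index=0):
    """Total delay minutes divided by total operated flights, per group.

    key_index=0 groups by airline, key_index=1 groups by airport.
    """
    flights = defaultdict(int)
    minutes = defaultdict(float)
    for row in rows:
        key = row[key_index]
        flights[key] += row[2]
        minutes[key] += row[3]
    # Ratio of sums, so each cell counts in proportion to its traffic.
    return {k: minutes[k] / flights[k] for k in flights if flights[k] > 0}

def rank(scores, top_n=10):
    """Lowest average delay first."""
    return sorted(scores.items(), key=lambda item: item[1])[:top_n]

by_airline = avg_delay_per_flight(rows, key_index=0)
by_airport = avg_delay_per_flight(rows, key_index=1)
top_airlines = rank(by_airline)
```

In this made-up data, AIR1 averages 10 minutes per flight and AIR2 averages 12. Averaging the per-airport ratios instead would give 10.5 and 9.375 and flip the order, because AIR2's small, punctual operation at AAA would count as much as its much larger operation at BBB. Dividing the summed minutes by the summed flights avoids that distortion.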

By avoiding metrics that only consider already delayed flights, this view rewards airlines that consistently run a punctual operation — not just those that manage to “recover” once a delay has already happened.

Performance

Top 10 best airports

Evaluating airports with the same metric as airlines: average arrival delay per flight.

This ranking applies the same methodology to airports: the total arrival delay minutes are divided by the total number of arriving flights. This normalizes for traffic volume and highlights which airports manage to move large numbers of flights while still maintaining punctual operations.

The result is a performance leaderboard that is more informative than raw delay totals alone, and more nuanced than simple “on‑time percentages”.

Flow of delay

From cause to airline

A Sankey diagram tracing how delay minutes flow from specific causes into each airline, visualizing where operational pain really accumulates.

The Sankey diagram connects delay causes to airlines, with the width of each flow proportional to total delay minutes. This highlights, at a glance, which carriers are most affected by late‑arriving aircraft, weather, NAS or carrier‑driven issues. It turns a long table of numbers into an immediate picture of where operational weaknesses lie.
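A Sankey chart is typically fed a list of links, each with a source, a target and a value. The Python sketch below shows one way such links could be derived from cause-level delay minutes. It is illustrative only (the portfolio's charts are built in R), and the row layout, the `sankey_links` name and the numbers are assumptions.

```python
from collections import defaultdict

# Made-up rows: (airline, cause, delay_minutes). Not real BTS figures.
rows = [
    ("AIR1", "late_aircraft", 5000.0),
    ("AIR1", "carrier", 3000.0),
    ("AIR2", "late_aircraft", 2000.0),
    ("AIR2", "weather", 500.0),
    ("AIR1", "late_aircraft", 1000.0),  # another airport/month cell
]

def sankey_links(rows):
    """Aggregate delay minutes into cause -> airline links."""
    totals = defaultdict(float)
    for airline, cause, delay_min in rows:
        totals[(cause, airline)] += delay_min
    # One link per (cause, airline) pair; the value drives the flow width.
    return [
        {"source": cause, "target": airline, "value": value}
        for (cause, airline), value in sorted(totals.items())
    ]

links = sankey_links(rows)
```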

Predictive view

Predicted delay risk model

A logistic regression model estimating the probability that a flight will arrive ≥ 15 minutes late, summarized as a heatmap by airline and month.

Instead of predicting exact delay minutes, this model focuses on delay risk: the probability that a flight will arrive at least 15 minutes late. It uses aggregated data by airline, airport and month, and fits a logistic regression model that captures structural differences in performance and seasonality.
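To make the idea concrete, here is a minimal Python sketch of a logistic regression fitted to aggregated counts by plain gradient ascent. It is illustrative only: the portfolio's model is built in R, and its actual features and fitting routine are not described here. The sketch assumes each aggregated cell carries a count of flights delayed by 15 minutes or more alongside its total flights, and it uses a single hypothetical `is_summer` feature with made-up numbers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up cells: (features, delayed_flights, total_flights).
# features = [intercept, is_summer]; both are illustrative assumptions.
cells = [
    ([1.0, 0.0], 150, 1000),  # off-peak cell: 15% of flights delayed
    ([1.0, 1.0], 300, 1000),  # summer cell: 30% of flights delayed
]

def fit_logit(cells, lr=0.5, epochs=3000):
    """Maximize the binomial log-likelihood with gradient ascent."""
    n_features = len(cells[0][0])
    weights = [0.0] * n_features
    total_flights = sum(flights for _, _, flights in cells)
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, delayed, flights in cells:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            residual = delayed - flights * p  # observed minus expected delays
            for j, xj in enumerate(x):
                grad[j] += residual * xj
        # Scale by total flights so the step size does not depend on volume.
        weights = [w + lr * g / total_flights for w, g in zip(weights, grad)]
    return weights

def predict_risk(weights, x):
    """Predicted probability of arriving at least 15 minutes late."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

weights = fit_logit(cells)
off_peak_risk = predict_risk(weights, [1.0, 0.0])
summer_risk = predict_risk(weights, [1.0, 1.0])
```

Because each cell contributes its delayed and total flight counts, busy cells pull the fit harder than sparse ones, which is the main reason to model counts rather than unweighted cell-level rates.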

The heatmap then summarizes the model’s predictions at the airline–month level, weighted by the number of operated flights. Darker cells indicate higher structural delay risk, helping to pinpoint which carriers face the greatest challenges, and during which parts of the year those challenges are most intense.
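The flight-weighted roll-up to airline–month cells could look like the Python sketch below. It is illustrative only (the portfolio's charts are built in R), and the row layout, the `airline_month_risk` name and the numbers are assumptions.

```python
from collections import defaultdict

# Made-up predictions: (airline, airport, month, flights, predicted_risk).
predictions = [
    ("AIR1", "AAA", 7, 900, 0.30),
    ("AIR1", "BBB", 7, 100, 0.10),
    ("AIR1", "AAA", 1, 500, 0.20),
    ("AIR2", "AAA", 7, 400, 0.25),
]

def airline_month_risk(predictions):
    """Flight-weighted mean predicted risk per (airline, month)."""
    weighted_sum = defaultdict(float)
    flights_sum = defaultdict(int)
    for airline, _airport, month, flights, risk in predictions:
        key = (airline, month)
        weighted_sum[key] += flights * risk
        flights_sum[key] += flights
    return {k: weighted_sum[k] / flights_sum[k]
            for k in flights_sum if flights_sum[k] > 0}

risk_grid = airline_month_risk(predictions)
```

In this made-up data, AIR1 in month 7 comes out at 0.28 rather than the unweighted 0.20, because the busier airport contributes nine times as many flights as the quieter one.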

Summary

Key insights from the analysis

A short summary of what this project reveals about airline performance and what it demonstrates about my approach to data analysis.

Insight 01
Late aircraft dominates delay minutes

Across months, late‑arriving aircraft are the single largest driver of delay minutes, especially in peak summer. This suggests that improving recovery strategies and aircraft rotations may have more impact than focusing solely on weather.

Insight 02
Seasonality is strong and predictable

Both the monthly delay trend and the predicted risk model highlight two clear stress periods: the summer travel peak and the holiday season. Planning capacity, buffers and staffing around these windows is critical for reliability.

Insight 03
Fair comparisons require fair metrics

Ranking airlines and airports by average delay per operated flight provides a more honest view than on‑time percentages or raw delay totals. It balances volume, punctuality and severity into a single, interpretable measure.

Insight 04
From descriptive to predictive

The delay risk model moves beyond describing what happened and starts to model the conditions under which delays are more likely. Even with aggregated data, this predictive perspective is valuable for setting expectations and prioritizing risk.