The bet on a curve's shape, not its level

A maturity-aware graph model turns a tiny forecasting edge into a smooth, near-market-neutral return stream from commodity calendar spreads, with a correlation to the S&P 500 of about -0.02.

Most machine-learning work on futures markets bets on direction: is oil going up or down? That carries exactly the volatile risk professional traders spend their careers trying to shed. This paper does the opposite. It brings graph learning to calendar spreads — a serious institutional strategy that bets on the shape of a price curve rather than its level — and ends up with a return stream almost entirely uncorrelated with the stock market.

Betting on shape, not level

A futures contract is an agreement to buy or sell something — crude oil, corn, natural gas — at a set price on a set future date. For any one commodity, many contracts trade at once, each expiring at a different time: one month out, three months, six. The set of their prices across maturities is the futures curve, or term structure; the time until a contract expires is its time-to-maturity.

A calendar spread ignores the level of that curve and trades its shape. Instead of betting oil rises or falls, you go long one maturity and short another of the same commodity — say, long the six-month contract and short the three-month. The legs are balanced so the position is roughly neutral to the overall price of oil. What's left is a bet on the relationship between maturities: the cost-of-carry structure, far more stable than the underlying price. You've shed the scary part and kept the steady part.
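
To make the "shape, not level" point concrete, here is a minimal sketch of an equal-dollar spread. The prices are hypothetical, and equal notional on each leg is an assumption made for illustration; the paper's own balancing rule may differ.

```python
def spread_return(long_entry, long_exit, short_entry, short_exit):
    """Return per dollar of notional on each leg of an equal-dollar spread."""
    long_leg = long_exit / long_entry - 1.0      # gain on the long contract
    short_leg = short_exit / short_entry - 1.0   # move in the shorted contract
    return long_leg - short_leg


# Hypothetical prices: six-month oil at 80, three-month oil at 82.
# Case 1: the whole curve rises 10 percent, so the shape is unchanged.
level_move = spread_return(80.0, 88.0, 82.0, 90.2)

# Case 2: the six-month contract gains 2 percent while the three-month stays put.
shape_move = spread_return(80.0, 81.6, 82.0, 82.0)

print(f"level move: {level_move:.6f}")
print(f"shape move: {shape_move:.6f}")
```

A proportional move in the whole curve nets out to roughly nothing, while a change in the gap between the two maturities shows up in full.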

The authors identify two gaps. Despite how widely calendar spreads are traded, there were essentially no machine-learning methods purpose-built for them — learning-based futures strategies had chased directional bets, carrying the very price risk spreads exist to avoid. And prior graph models for futures treated every contract as a generic, interchangeable node, ignoring maturity. Maturity is the whole point.

A graph that knows about maturity

The paper's own example crystallises it. Suppose there's a war, and petrol inventories are low now but expected to recover in six months. A three-month petrol contract and a six-month petrol contract are then in genuinely different situations — one reflects the shortage, the other the recovery. So when you're predicting the six-month diesel contract, you should not treat three-month and six-month petrol as the same kind of neighbour; their relationship to your target depends on how far apart their maturities are. A model that blurs all contracts of a commodity together throws that structure away.

When the structure is real, the model that honours it tends to beat the model that flattens it.

So the core idea is to represent the entire futures market as a two-level — hierarchical — graph. The upper level holds the commodities; the lower level, the individual contracts. Edges connect them three ways: commodities link to each other by their correlations; each contract links to its underlying commodity; and contracts of the same commodity link to their nearest-maturity neighbours. The model is a graph neural network — each node refines its understanding by passing messages along its edges — but taught explicitly that maturity matters.
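
The three edge types can be sketched as plain lists, before any neural network enters the picture. This is an illustrative construction, not the authors' code: the function name, the correlation threshold, and the input layout are all assumptions.

```python
from itertools import combinations


def build_edges(contracts, correlations, corr_threshold=0.5):
    """Build the three edge lists of a two-level futures graph.

    contracts:      dict mapping commodity name -> list of maturities (weeks)
    correlations:   dict mapping an alphabetically ordered pair of commodity
                    names -> correlation between the two commodities
    corr_threshold: hypothetical cut-off for keeping a commodity link
    """
    edges = {
        "commodity-commodity": [],
        "contract-commodity": [],
        "maturity-neighbour": [],
    }

    # Upper level: link commodities whose correlation is strong enough.
    for a, b in combinations(sorted(contracts), 2):
        rho = correlations.get((a, b), 0.0)
        if abs(rho) >= corr_threshold:
            edges["commodity-commodity"].append((a, b, rho))

    for commodity, maturities in contracts.items():
        ordered = sorted(maturities)
        # Between levels: each contract links to its underlying commodity.
        for m in ordered:
            edges["contract-commodity"].append(((commodity, m), commodity))
        # Lower level: each contract links to its nearest-maturity neighbour.
        for near, far in zip(ordered, ordered[1:]):
            edges["maturity-neighbour"].append(((commodity, near), (commodity, far)))

    return edges


example = build_edges(
    contracts={"crude": [4, 13, 26], "corn": [13, 26]},
    correlations={("corn", "crude"): 0.6},
)
for kind, found in example.items():
    print(kind, len(found))
```

Keeping the edge types in separate lists mirrors the idea that each kind of relationship gets its own treatment rather than being blended into one generic adjacency.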

A neat wrinkle makes this work across commodities, which expire on different dates — crude's calendar isn't corn's. The authors interpolate everything onto a shared "virtual" maturity grid: the same ruler for every commodity, with rungs at one week, two weeks, and so on out to a year. Now "the thirteen-week point of crude" can meaningfully talk to "the thirteen-week point of corn." Information then flows two distinct ways, each with its own learned parameters: across commodities at the same maturity, and across nearby maturities within a commodity. That separation is the maturity-awareness the baselines lacked.
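
A minimal sketch of that shared ruler follows. It assumes linear interpolation between observed contracts and flat values beyond the nearest and furthest ones; the source does not say which interpolation scheme the authors use, and the prices here are hypothetical.

```python
# Shared "virtual" maturity grid: one rung per week, out to a year.
WEEKLY_GRID = list(range(1, 53))


def interpolate_curve(maturities, prices, grid=WEEKLY_GRID):
    """Map observed (maturity, price) points onto a common maturity grid.

    Assumes distinct maturities. Linear between observed points, flat
    outside the observed range (both are assumptions for illustration).
    """
    points = sorted(zip(maturities, prices))
    curve = []
    for g in grid:
        if g <= points[0][0]:
            curve.append(points[0][1])       # before the first contract
        elif g >= points[-1][0]:
            curve.append(points[-1][1])      # beyond the last contract
        else:
            for (m0, p0), (m1, p1) in zip(points, points[1:]):
                if m0 <= g <= m1:
                    weight = (g - m0) / (m1 - m0)
                    curve.append(p0 + weight * (p1 - p0))
                    break
    return curve


# Hypothetical crude contracts expiring in 3, 8 and 21 weeks.
crude_curve = interpolate_curve([3, 8, 21], [80.0, 81.0, 83.6])
thirteen_week_crude = crude_curve[WEEKLY_GRID.index(13)]
print(f"13-week crude: {thirteen_week_crude:.2f}")
```

Once every commodity is expressed on the same grid, index 13 means the same thing for crude as for corn, which is what lets same-maturity points be compared across commodities.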

The model predicts each contract's relative price movement, and a projection step turns those predictions into balanced, dollar-neutral spread positions. The authors also back the approach with theory: propositions arguing that, under stated conditions, calendar spreads have lower variance, a better information ratio, and lower exposure to the underlying price than long-only positions. They check those conditions empirically and find them satisfied the large majority of the time.
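
One simple way to picture a projection onto dollar-neutral positions is to subtract the average prediction within a commodity, so the weights sum to zero, and then scale them to a fixed gross size. This is an assumed stand-in, not necessarily the paper's projection step, and the predictions below are hypothetical.

```python
def dollar_neutral_weights(predictions):
    """Turn per-maturity predictions for one commodity into spread weights.

    Subtracting the mean is the orthogonal projection onto the set of
    weight vectors that sum to zero; dividing by the gross exposure then
    fixes the total size of the position at 1.
    """
    average = sum(predictions.values()) / len(predictions)
    centred = {m: p - average for m, p in predictions.items()}
    gross = sum(abs(w) for w in centred.values())
    if gross == 0.0:
        # All predictions identical: no view on shape, so no position.
        return {m: 0.0 for m in predictions}
    return {m: w / gross for m, w in centred.items()}


# Hypothetical predicted relative moves at 4, 13 and 26 weeks.
weights = dollar_neutral_weights({4: 0.01, 13: 0.02, 26: 0.06})
print(weights)
print("net exposure:", sum(weights.values()))
```

Note what the demeaning throws away: if the model expects every maturity to rise by the same amount, the position is empty. Only disagreement between maturities produces a trade.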

The payoff: a return that marches alone

The dataset spans daily commodity futures from 1977 through 2025 across the major U.S. exchanges, with a trading test of roughly 2016 to 2025 and annual retraining. The headline is a daily information ratio of about 0.085 and a Sortino ratio (a downside-risk-adjusted measure) of about 0.124. Those are improvements of roughly 96 and 82 percent, respectively, over the competing methods, and of 75 and more than 100 percent over the S&P 500 on the same two measures.
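
For readers who want the two headline measures spelled out, here is a sketch using common textbook definitions. The paper's exact conventions (benchmark, target return, sample versus population deviation) are not given in this summary, so treat these as assumptions; the return series is made up.

```python
import math
from statistics import mean, stdev


def information_ratio(returns, benchmark=None):
    """Mean active return divided by its standard deviation (per period)."""
    if benchmark is None:
        active = list(returns)                      # zero benchmark assumed
    else:
        active = [r - b for r, b in zip(returns, benchmark)]
    return mean(active) / stdev(active)


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by downside deviation below the target.

    Raises ZeroDivisionError if no return falls below the target.
    """
    shortfalls = [min(r - target, 0.0) ** 2 for r in returns]
    downside_deviation = math.sqrt(sum(shortfalls) / len(returns))
    return (mean(returns) - target) / downside_deviation


def annualise(daily_ratio, periods_per_year=252):
    """Square-root-of-time scaling, assuming independent daily returns."""
    return daily_ratio * math.sqrt(periods_per_year)


# Made-up daily returns, purely to exercise the functions.
daily = [0.01, -0.01, 0.02, 0.0]
ir = information_ratio(daily)
sortino = sortino_ratio(daily)
print(f"information ratio: {ir:.4f}")
print(f"Sortino ratio:     {sortino:.4f}")
```

Under the square-root-of-252 convention, a daily ratio of 0.085 would scale to roughly 1.35 a year. That is a back-of-the-envelope conversion under the stated assumption, not a figure reported in the paper.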

Against simple long-only commodity baselines, the spread strategy posts more than twice the risk-adjusted return, less than one-sixth the volatility, and less than one-fifteenth the maximum drawdown: a smoother, far less violent return stream. And the property that makes it valuable in a portfolio is that its correlation with the S&P 500 is essentially zero, slightly negative, at around -0.02. A positive risk-adjusted return that moves independently of the stock market is a real diversifier, the kind allocators prize.
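
Maximum drawdown and correlation are both easy to compute from a return series. The sketch below uses standard definitions, with drawdown measured on compounded wealth and correlation as the Pearson coefficient; the numbers are invented, and the paper's precise conventions are not stated in this summary.

```python
import math


def max_drawdown(returns):
    """Largest peak-to-trough fall in compounded wealth, as a fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length return series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Invented series: up 10 percent, down 20 percent, up 5 percent.
drawdown = max_drawdown([0.10, -0.20, 0.05])
print(f"max drawdown: {drawdown:.2%}")

# Two invented series that move in exact opposition.
opposed = pearson_correlation([0.01, -0.02, 0.03], [-0.01, 0.02, -0.03])
print(f"correlation: {opposed:.2f}")
```

A correlation near zero, as reported for the spread strategy against the S&P 500, means that knowing the stock market's move on a given day tells you almost nothing about the strategy's.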

The maturity-aware graph also beats the maturity-agnostic graph baseline, plus ridge regression, gradient boosting, and a plain neural network. On raw prediction error the edge is tiny in absolute terms, though statistically significant. The payoff comes less from a forecasting breakthrough than from clever trade construction: respecting the structure, stripping out the price risk, and turning a small predictive edge into a smooth, market-neutral return. Ablations confirm the structure earns its keep — removing the within-commodity, maturity-neighbour connections hurts most.

The honest caveats

Returns are reported under a simplified unit-margin assumption and, crucially, without transaction costs baked in; the authors report turnover separately as a stand-in, but the headline figures aren't directly tradeable. The theoretical advantages are conditional: the propositions don't prove spreads always win, only that the conditions for superiority hold often. It's a single asset class, and the raw forecasting improvement is small, so the story leans heavily on trade construction. The trading test is essentially the 2016-to-2025 window, and its profits are sensitive to a few big regime events, notably COVID in 2020 and the energy shock in 2022. And the paper is a preprint, so it has not yet been through peer review.
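
To see why turnover works as a stand-in for costs, consider a rough sketch that charges a flat fee per unit of position traded. The fee level and the weights are hypothetical, and a flat proportional cost is an assumption; real futures costs also depend on spreads, roll schedules, and market impact.

```python
def turnover(previous_weights, new_weights):
    """Sum of absolute weight changes between two rebalances."""
    contracts = set(previous_weights) | set(new_weights)
    return sum(
        abs(new_weights.get(c, 0.0) - previous_weights.get(c, 0.0))
        for c in contracts
    )


def net_return(gross_return, traded, cost_per_unit):
    """Gross return minus a flat proportional cost on the amount traded."""
    return gross_return - cost_per_unit * traded


# Hypothetical rebalance: the short leg moves from the 3-month to the 9-month.
yesterday = {"3m": -0.5, "6m": 0.5}
today = {"6m": 0.5, "9m": -0.5}

traded = turnover(yesterday, today)
after_costs = net_return(gross_return=0.0010, traded=traded, cost_per_unit=0.0002)
print(f"turnover: {traded:.2f}")
print(f"net return: {after_costs:.4f}")
```

The smaller the per-trade edge, the more a given level of turnover eats into it, which is why cost-free headline figures deserve caution.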

Why it matters

There's a narrow reason and a broad one. The narrow: calendar spreads are a large institutional use case, and this is a principled, structure-aware way to build them — producing a near-market-neutral return stream. The broad one is why it's worth reading at all. It's a clean demonstration that respecting real-world structure beats treating every series in isolation. The market has a true shape — contracts share an underlying, and their relationships depend on maturity. A flat model has to rediscover that from scratch, if it can. Encoding it directly — the hierarchy, the maturity-dependence — is what produces the lower-risk, uncorrelated, statistically significant edge. Occasionally, as here, honouring the structure doesn't just sharpen a forecast; it builds a return stream that marches to its own drummer, independent of the market everyone else is riding.