Pattern Statistics and How to Read Them

Multiple sources publish testing results and the numbers vary depending on methodology. Methodological awareness is genuinely more valuable than a memorized table.

What you'll learn

Identify four public source types for candlestick pattern statistics and describe what each covers
Apply six evaluation questions to any published claim about pattern reliability
Explain why the same pattern can show different reliability figures across credible sources
State the honest empirical consensus: most patterns work modestly above 50% in isolation, very few dramatically above 60%
Explain why context — location, confluence, volume, confirmation — separates tradeable signals from noise

Why Methodological Awareness Matters More Than Any Single Number

Instead of attaching a single reliability number to each pattern in a handout, teach students that multiple sources publish testing results and the numbers vary depending on methodology. Different researchers define 'successful pattern' differently. Some require a specific percentage move within a specific time window; some measure simply whether price moved in the predicted direction at all; some adjust for trading costs while others don't. The numbers across sources can disagree substantially for the same pattern.

That methodological awareness is genuinely more valuable than a memorized table. A student who knows 'three different studies report reliability between 55% and 75% for bullish engulfing depending on how they measure success, and Bulkowski's site has his own figures' is better prepared than one who memorized a single number from a handout.
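To make the point concrete for students, here is a minimal sketch showing how one set of outcomes produces three different "reliability" figures under three success definitions. The return values, the 1% move threshold, and the 0.1% cost figure are all invented for illustration and are not taken from any published study.

```python
# Hypothetical forward returns (percent) after ten occurrences of one pattern.
# Positive means price moved in the predicted direction. Illustrative data only.
forward_returns_pct = [0.05, 1.8, -0.4, 0.2, 2.5, -1.1, 0.08, 0.6, -0.3, 3.1]

MIN_MOVE_PCT = 1.0       # assumed threshold for a "meaningful" move
ASSUMED_COST_PCT = 0.1   # assumed round-trip trading cost


def hit_rate(returns, is_success):
    """Share of occurrences that count as a success under the given rule."""
    hits = sum(1 for r in returns if is_success(r))
    return hits / len(returns)


# Definition 1: any move at all in the predicted direction.
direction_only = hit_rate(forward_returns_pct, lambda r: r > 0)
# Definition 2: the move must reach a minimum size.
threshold_move = hit_rate(forward_returns_pct, lambda r: r >= MIN_MOVE_PCT)
# Definition 3: the move must still be positive after costs.
net_of_costs = hit_rate(forward_returns_pct, lambda r: r - ASSUMED_COST_PCT > 0)

print(f"direction only: {direction_only:.0%}")
print(f"minimum move:   {threshold_move:.0%}")
print(f"net of costs:   {net_of_costs:.0%}")
```

The data never changes between the three calls; only the definition of success does. That is the mechanism behind much of the disagreement between credible sources.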

Sources for Candlestick Pattern Statistics

Bulkowski's pattern site (thepatternsite.com) is the most extensive single source of candlestick pattern statistics available publicly. The site covers over 100 different candle patterns, including identification guidelines and performance statistics, organized by both visual and alphabetical indexes.

What students will find when they visit a specific pattern page: reversal-versus-continuation rates, frequency rankings (how often the pattern appears across the dataset), overall performance rankings (the pattern's rank among all tested candles), average price moves over a 10-day window following the pattern's breakout, and performance differences across bull and bear markets. As an example of what the entries look like: for the shooting star, the site reports that the pattern acts as a reversal 59% of the time, with an overall performance rank Bulkowski describes as mid-list.

Bulkowski's methodology, briefly: he identifies a pattern, waits for what he calls a 'breakout' (a close above or below the pattern's price range), then measures price behavior over the following 10 days. 'Performance' in his framework means the size of the average move after breakout, not just whether direction was predicted correctly. This is an important methodological detail for students: many sources measure success differently, and the numbers across sources aren't directly comparable as a result.

The deeper analysis, the full comparative ranking tables, and the trading tactics for each pattern are in his book Encyclopedia of Candlestick Charts (Wiley, 960 pages). The site is a substantial preview of that work but not a substitute for it.

A note for your students: Bulkowski himself describes many patterns as 'near random' in performance, even when their reliability numbers look high in isolation. For the shooting star, despite the 59% reversal rate, he writes that he considers that 'near random' performance. This kind of editorial framing is part of why students should read the source directly rather than just lift numbers. The context around the number often matters more than the number itself.
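For students who think in code, the breakout-then-measure idea described above can be sketched roughly as follows. This is a simplified reading of the prose description, not Bulkowski's actual procedure: the function name, measuring from the breakout close, and using the close exactly `horizon` bars later are all assumptions made for illustration, and the price series is invented.

```python
def breakout_move(closes, pattern_high, pattern_low, pattern_end, horizon=10):
    """Find the first close outside the pattern's price range after the
    pattern ends, then return (direction, percent move over `horizon` bars
    measured from that breakout close). Returns None if no breakout occurs
    or there are not enough bars after it."""
    for i in range(pattern_end + 1, len(closes)):
        if closes[i] > pattern_high:
            direction = "up"
        elif closes[i] < pattern_low:
            direction = "down"
        else:
            continue  # still inside the pattern's range, keep waiting
        if i + horizon >= len(closes):
            return None  # breakout found, but too little data to measure
        move_pct = (closes[i + horizon] - closes[i]) / closes[i] * 100
        return direction, move_pct
    return None


# Invented closing prices; the pattern spans bars 0-1 with range 98 to 102.
closes = [100, 99, 101, 97, 96, 95, 94, 95, 93, 92, 91, 90, 89, 87.3]
result = breakout_move(closes, pattern_high=102, pattern_low=98, pattern_end=1)
print(result)
```

Note that this measures the size of the move after the breakout, which is a different quantity from a simple "did price go the right way" hit rate.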

Several peer-reviewed papers have tested candlestick patterns systematically. The picture from academic literature is generally more skeptical than the picture from trading-focused sources — many studies conclude that candlestick patterns don't produce statistically significant excess returns after accounting for transaction costs. This is itself important information for your students. Names students can search for in Google Scholar or SSRN: Marshall, Young, and Rose (multiple papers testing candlestick performance in various markets); Horton (statistical testing of two-day patterns); Lu, Shiu, and Liu (work on three-line patterns in Asian markets); Caginalp and Laurent (early academic work on candlestick reversals). These are starting points — students who find one relevant paper can use its citation list to find related work. Academic methodologies vary widely. Some studies use strict statistical thresholds for 'successful' pattern outcomes; some use bootstrapping; some use machine-learning approaches to evaluate predictive power. The disagreement among studies is often more about methodology than about underlying truth.

Trading firms occasionally publish backtested results on candlestick patterns. Quality varies enormously and methodology is often unclear or unstated. Students should treat these results with the same critical eye they'd apply to any industry research — useful as data points, not as authoritative truth.

Several charting platforms (TradingView, ThinkOrSwim, MetaTrader and others) allow users to backtest candlestick patterns on their own historical data. For students who want to verify pattern reliability on the specific instruments and timeframes they actually trade, this is the most useful approach — the published statistics aggregate across thousands of instruments and may not reflect performance on any particular one. A student trading SPY on the daily timeframe will get more useful results from backtesting on SPY-daily than from any aggregate study.
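Outside of a charting platform, the same kind of check can be sketched in a few lines of Python on exported price data. The example below is a minimal sketch under stated assumptions: the `Bar` type, the engulfing rule (one common textbook body-engulfs-body definition, which varies between sources), the `lookahead` success rule, and the sample bars are all hypothetical choices for illustration.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def is_bullish_engulfing(prev: Bar, cur: Bar) -> bool:
    """Assumed rule: a down candle followed by an up candle whose body
    covers the previous candle's body. Sources define this differently."""
    return (prev.close < prev.open and cur.close > cur.open
            and cur.open <= prev.close and cur.close >= prev.open)


def backtest(bars, lookahead=3):
    """Return (occurrences, wins). A win means the close `lookahead` bars
    after the pattern bar is above the pattern bar's close."""
    occurrences = wins = 0
    for i in range(1, len(bars) - lookahead):
        if is_bullish_engulfing(bars[i - 1], bars[i]):
            occurrences += 1
            if bars[i + lookahead].close > bars[i].close:
                wins += 1
    return occurrences, wins


# Invented daily bars; real use would load the student's own instrument.
bars = [
    Bar(10.0, 10.2, 9.4, 9.5),
    Bar(9.4, 10.3, 9.3, 10.2),
    Bar(10.2, 10.6, 10.1, 10.5),
    Bar(10.5, 10.9, 10.4, 10.8),
    Bar(10.8, 10.9, 10.0, 10.1),
    Bar(10.0, 11.0, 9.9, 10.9),
    Bar(10.9, 11.0, 10.5, 10.6),
    Bar(10.6, 10.7, 10.2, 10.3),
]
occurrences, wins = backtest(bars, lookahead=2)
print(f"{wins} wins out of {occurrences} occurrences")
```

Changing `lookahead` or the engulfing rule changes the result, which is a useful way for students to see that their own backtest also embeds methodological choices.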

How to Evaluate the Statistics Students Find

Students who find a pattern reliability number need to ask several questions before applying it:

What counted as success? A 70% reliability rate means very different things if 'success' was defined as 'price moved at least 0.1% in the predicted direction within one day' versus 'price reached the measured-move target within 10 days.' Always check the success definition.
What market and timeframe was tested? Patterns that work on US stocks may behave differently on forex, commodities, or crypto. Patterns that work on daily charts may behave differently on 5-minute charts. Aggregate statistics across markets and timeframes obscure these differences.
What was the sample size? A pattern with a 75% reliability rate from 50 occurrences is much less reliable as a statistic than one with a 60% reliability rate from 5,000 occurrences. Small samples produce noisy estimates.
Were transaction costs included? A pattern with a positive raw edge may have a negative net edge after spreads, commissions, and slippage. Academic studies often include cost adjustments; trading-focused sources often don't.
Has the result been replicated? A single study showing a pattern works is much weaker evidence than multiple independent studies finding similar results. Patterns where the literature broadly agrees (engulfing in either direction, morning/evening star at trend extremes) are on firmer ground than patterns where one source claims high reliability and others find little.
Has the pattern been arbitraged away? Even if a pattern showed an edge in historical data, persistent profitable patterns tend to attract traders who exploit them until the edge disappears. Always ask whether the data tested is recent enough to reflect current market behavior.
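The sample-size question can be made quantitative with a confidence interval. The sketch below uses the Wilson score interval for a binomial proportion and applies it to the two cases from the sample-size question (75% from 50 occurrences, 60% from 5,000). It assumes occurrences are independent, which real pattern occurrences may not be, so treat the intervals as a rough guide.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for an observed success rate `p_hat` over `n`
    trials. z=1.96 corresponds to roughly 95% confidence."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


low_small, high_small = wilson_interval(0.75, 50)
low_large, high_large = wilson_interval(0.60, 5000)

print(f"75% from 50 occurrences:    {low_small:.1%} to {high_small:.1%}")
print(f"60% from 5,000 occurrences: {low_large:.1%} to {high_large:.1%}")
```

Under these assumptions the small-sample figure is consistent with a true rate anywhere from roughly 62% to 85%, while the large-sample figure is pinned down to roughly 59% to 61%. The headline 75% is the less informative number.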

How to Apply This

The exercise has two halves. First half: students visit Bulkowski's site directly, read several pattern pages they're already familiar with from earlier lessons, and bring back observations. This teaches them to extract information from a real research source rather than rely on a handout.

Second half: students apply the six evaluation questions above to a published claim about pattern reliability — could be from Bulkowski's site, from an academic paper, from a trading firm's blog post, or from a YouTube video on candlestick trading. The deliverable is a written critique that identifies the methodology, the assumptions, and what the student would need to verify before relying on the claim.

This lesson teaches a research skill that serves students far beyond candlesticks. A student who can critically evaluate published statistics about trading patterns can critically evaluate published statistics about anything, which is one of the more durable skills any trader can develop. The principle that students should have access to public information and decide for themselves is well served by this approach. They get full access (by going to the sources directly), they get the methodological context that a number alone wouldn't provide, and they develop the critical-evaluation skill that makes the numbers useful rather than dangerous.

One trading-analysis source offers a summary worth sharing with your students because it captures the honest state of the field. It noted that there are over 100 named candlestick patterns in the technical analysis canon, that most of them have success rates barely above a coin flip when backtested, and that every single pattern needs context to work: a hammer 'in the middle of nowhere' is meaningless. This is a defensible summary of where the empirical literature stands. Many patterns work modestly above 50% in isolation, very few work dramatically above 60%, and context (location, confluence, volume, confirmation) is what separates tradeable signals from noise. For your students, that's a more useful framing than memorizing per-pattern statistics. The numbers tell you which patterns have a slight edge, and that edge is small enough that execution discipline, position sizing, and risk management matter more than which pattern you traded.

Key Takeaways

Multiple sources publish testing results and the numbers vary depending on methodology — methodological awareness is more valuable than any single memorized figure
Four public source types: Bulkowski's pattern site (thepatternsite.com), academic literature (Marshall/Young/Rose, Horton, Lu/Shiu/Liu, Caginalp/Laurent), industry research (variable quality), and independent backtesting platforms
Six evaluation questions: What counted as success? What market and timeframe? What was the sample size? Were transaction costs included? Has the result been replicated? Has the pattern been arbitraged away?
Most patterns work modestly above 50% in isolation, very few work dramatically above 60% — context (location, confluence, volume, confirmation) is what separates tradeable signals from noise
Bulkowski himself describes many patterns as 'near random' despite their reliability numbers — the editorial framing around a number often matters more than the number itself
The best reliability figure is from your own backtesting on your specific instrument and timeframe — aggregate statistics may not reflect performance on any particular one

Quiz — 3 Questions

A source reports the bullish engulfing has a 74% success rate. A second credible source reports 56%. A student asks which figure to trust. What is the correct answer?

A. Trust the higher figure: a higher success rate indicates more rigorous testing

B. Trust the lower figure: conservative estimates always produce better trading outcomes

C. Neither figure is trustworthy on its own. The numbers vary depending on methodology: different researchers define 'successful pattern' differently. Ask what counted as success, what market and timeframe was tested, and what was the sample size before weighting either figure

D. Average the two figures and use 65% as the working estimate
