Instead of attaching a single reliability number to each pattern in a handout, teach students that multiple sources publish testing results and the numbers vary depending on methodology. Different researchers define 'successful pattern' differently. Some require a specific percentage move within a specific time window; some measure simply whether price moved in the predicted direction at all; some adjust for trading costs while others don't. The numbers across sources can disagree substantially for the same pattern.
That methodological awareness is genuinely more valuable than a memorized table. A student who knows 'three different studies report reliability between 55% and 75% for bullish engulfing depending on how they measure success, and Bulkowski's site has his own figures' is better prepared than one who memorized a single number from a handout.
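To make the point concrete for students, here is a minimal sketch showing how one set of outcomes produces three different "reliability" figures depending on the success rule. The return values, the 0.25% cost, and the 1% minimum move are all made-up assumptions for illustration, not figures from any study.

```python
# Hypothetical forward returns (in percent) after ten occurrences of one pattern.
# These values are invented for illustration only.
forward_returns_pct = [2.5, 0.3, -1.2, 4.0, 0.1, -0.4, 1.5, 0.2, -2.0, 3.1]

COST_PCT = 0.25      # assumed round-trip trading cost, in percent
MIN_MOVE_PCT = 1.0   # assumed minimum move for the stricter definition


def hit_rate(returns, is_success):
    """Fraction of occurrences that count as a success under the given rule."""
    hits = sum(1 for r in returns if is_success(r))
    return hits / len(returns)


# Rule 1: price moved in the predicted direction at all.
direction_only = hit_rate(forward_returns_pct, lambda r: r > 0)
# Rule 2: the move was still positive after subtracting trading costs.
after_costs = hit_rate(forward_returns_pct, lambda r: r - COST_PCT > 0)
# Rule 3: the move reached a minimum size.
minimum_move = hit_rate(forward_returns_pct, lambda r: r >= MIN_MOVE_PCT)

print(f"direction only: {direction_only:.0%}")
print(f"after costs:    {after_costs:.0%}")
print(f"minimum move:   {minimum_move:.0%}")
```

The same ten outcomes yield three different headline numbers, which is the kind of spread students should expect when comparing sources.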
Bulkowski's site is the most extensive single source of candlestick pattern statistics available publicly. It covers over 100 different candle patterns, including identification guidelines and performance statistics, organized by both visual and alphabetical indexes.
What students will find when they visit a specific pattern page: reversal-versus-continuation rates, frequency rankings (how often the pattern appears across the dataset), overall performance rankings (the pattern's rank among all tested candles), average price moves over a 10-day window following the pattern's breakout, and performance differences across bull and bear markets. As an example of what the entries look like: for the shooting star, the site reports that the pattern acts as a reversal 59% of the time, with an overall performance rank Bulkowski describes as mid-list.
Bulkowski's methodology, briefly: he identifies a pattern, waits for what he calls a 'breakout' (a close above or below the pattern's price range), then measures price behavior over the following 10 days. 'Performance' in his framework means the size of the average move after breakout, not just whether direction was predicted correctly. This is an important methodological detail for students: many sources measure success differently, so the numbers across sources aren't directly comparable.
The deeper analysis, the full comparative ranking tables, and the trading tactics for each pattern are in his book Encyclopedia of Candlestick Charts (Wiley, 960 pages). The site is a substantial preview of that work but not a substitute for it.
A note for your students: Bulkowski himself describes many patterns as 'near random' in performance, even when their reliability numbers look high in isolation. For the shooting star, despite the 59% reversal rate, he writes that he considers the result 'near random' performance.
This kind of editorial framing is part of why students should read the source directly rather than just lift numbers — the context around the number often matters more than the number itself.
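The breakout-then-measure procedure described above can be sketched roughly as follows. This is a simplified reading of the prose description, not Bulkowski's actual code: the function name, the choice to measure from the breakout close to the close ten bars later, and the sample prices are all assumptions for illustration.

```python
def breakout_move(closes, pattern_high, pattern_low, pattern_end, horizon=10):
    """Find the first close outside the pattern's range, then measure the move.

    Returns (direction, pct_move), or None if there is no breakout or not
    enough bars after the breakout to cover the horizon.
    """
    for i in range(pattern_end + 1, len(closes)):
        if closes[i] > pattern_high:
            direction = "up"
        elif closes[i] < pattern_low:
            direction = "down"
        else:
            continue  # still inside the pattern's range, keep waiting
        j = i + horizon
        if j >= len(closes):
            return None  # breakout found, but too little data to measure
        pct_move = (closes[j] - closes[i]) / closes[i] * 100
        return direction, pct_move
    return None


# Hypothetical daily closes; the pattern spans bars 0-1 with high 102.0 and low 99.5.
closes = [100.0, 101.0, 100.5, 99.0, 98.0, 97.0, 96.0,
          97.0, 95.0, 94.0, 93.0, 94.0, 92.0, 89.1]
result = breakout_move(closes, pattern_high=102.0, pattern_low=99.5, pattern_end=1)
print(result)
```

Note that a bar with no breakout yet is simply skipped, so two researchers who cap the waiting period differently would already report different numbers for the same pattern.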
Several peer-reviewed papers have tested candlestick patterns systematically. The picture from academic literature is generally more skeptical than the picture from trading-focused sources: many studies conclude that candlestick patterns don't produce statistically significant excess returns after accounting for transaction costs. This is itself important information for your students.
Names students can search for in Google Scholar or SSRN: Marshall, Young, and Rose (multiple papers testing candlestick performance in various markets); Horton (statistical testing of two-day patterns); Lu, Shiu, and Liu (work on three-line patterns in Asian markets); Caginalp and Laurent (early academic work on candlestick reversals). These are starting points. Students who find one relevant paper can use its citation list to find related work.
Academic methodologies vary widely. Some studies use strict statistical thresholds for 'successful' pattern outcomes; some use bootstrapping; some use machine-learning approaches to evaluate predictive power. The disagreement among studies is often more about methodology than about underlying truth.
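As one illustration of what a statistical threshold can look like, the sketch below computes a one-sided exact binomial tail: the chance that a fair coin would produce at least the observed number of hits. It is not the method of any paper named above, and the sample sizes are hypothetical. It also says nothing about returns after costs, which is what many of the academic studies actually test.

```python
from math import comb


def binomial_tail(hits, n, p=0.5):
    """P(X >= hits) for X ~ Binomial(n, p).

    With p = 0.5 this is the chance a coin flip does at least as well
    as the observed hit count over n pattern occurrences.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(hits, n + 1))


# The same 59% hit rate at two hypothetical sample sizes.
for hits, n in [(59, 100), (590, 1000)]:
    print(f"{hits}/{n} hits: tail probability {binomial_tail(hits, n):.2e}")
```

The same percentage is weak evidence at a small sample size and much stronger at a large one, which is one reason students should ask how many occurrences stand behind a published figure.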
Trading firms occasionally publish backtested results on candlestick patterns. Quality varies enormously and methodology is often unclear or unstated. Students should treat these results with the same critical eye they'd apply to any industry research — useful as data points, not as authoritative truth.
Several charting platforms (TradingView, ThinkOrSwim, MetaTrader and others) allow users to backtest candlestick patterns on their own historical data. For students who want to verify pattern reliability on the specific instruments and timeframes they actually trade, this is the most useful approach — the published statistics aggregate across thousands of instruments and may not reflect performance on any particular one. A student trading SPY on the daily timeframe will get more useful results from backtesting on SPY-daily than from any aggregate study.
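For students who prefer to see the logic outside a charting platform, a self-run backtest can be sketched in a few lines. Everything here is an assumption for illustration: the engulfing rule is one common body-based definition (sources differ), success is defined as a higher close after a fixed number of bars, and the candles are synthetic. Students would substitute their own instrument's history.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def is_bullish_engulfing(prev, cur):
    """Assumed rule: a down candle followed by an up candle whose body covers it."""
    return (prev.close < prev.open and cur.close > cur.open
            and cur.open <= prev.close and cur.close >= prev.open)


def backtest(candles, horizon=2):
    """Count signals and how many were followed by a higher close `horizon` bars later."""
    signals = hits = 0
    # Stop early enough that every signal has `horizon` bars after it.
    for i in range(1, len(candles) - horizon):
        if is_bullish_engulfing(candles[i - 1], candles[i]):
            signals += 1
            if candles[i + horizon].close > candles[i].close:
                hits += 1
    return signals, hits


# Synthetic candles for illustration; replace with real history for one instrument.
candles = [
    Candle(10.0, 10.2, 9.4, 9.5),
    Candle(9.4, 10.3, 9.3, 10.2),
    Candle(10.2, 10.6, 10.1, 10.5),
    Candle(10.5, 10.9, 10.4, 10.8),
    Candle(10.8, 10.9, 10.2, 10.3),
    Candle(10.2, 11.0, 10.1, 10.9),
    Candle(10.9, 11.0, 10.5, 10.6),
    Candle(10.6, 10.7, 10.2, 10.4),
]
signals, hits = backtest(candles)
print(f"{hits} of {signals} signals closed higher after 2 bars")
```

Changing `horizon` or the engulfing rule changes the result, so students should record both alongside any hit rate they report.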
Students who find a pattern reliability number need to ask several questions before applying it:
The exercise has two halves. In the first half, students visit Bulkowski's site directly, read several pattern pages they're already familiar with from earlier lessons, and bring back observations. This teaches them to extract information from a real research source rather than rely on a handout.
In the second half, students apply the six evaluation questions above to a published claim about pattern reliability, which could come from Bulkowski's site, an academic paper, a trading firm's blog post, or a YouTube video on candlestick trading. The deliverable is a written critique that identifies the methodology, the assumptions, and what the student would need to verify before relying on the claim.
This lesson teaches a research skill that serves students far beyond candlesticks. A student who can critically evaluate published statistics about trading patterns can critically evaluate published statistics about anything — which is one of the more durable skills any trader can develop. The principle you articulated earlier — students should have access to public information and decide for themselves — is well served by this approach. They get full access (by going to the sources directly), they get the methodological context that a number alone wouldn't provide, and they develop the critical-evaluation skill that makes the numbers useful rather than dangerous.
One of the search results said something that's worth quoting for your students because it captures the honest state of the field. A trading-analysis source noted that there are over 100 named candlestick patterns in the technical analysis canon, that most of them have success rates barely above a coin flip when backtested, and that every single pattern needs context to work — a hammer 'in the middle of nowhere' is meaningless. This is a defensible summary of where the empirical literature stands. Many patterns work modestly above 50% in isolation, very few work dramatically above 60%, and context (location, confluence, volume, confirmation) is what separates tradeable signals from noise. For your students, that's actually a more useful framing than memorizing per-pattern statistics. The numbers tell you which patterns have a slight edge. The edge is small enough that execution discipline, position sizing, and risk management matter more than which pattern you traded.
Key Takeaways
A source reports the bullish engulfing has a 74% success rate. A second credible source reports 56%. A student asks which figure to trust. What is the correct answer?