
Georgia Tech · CS 7646: Machine Learning for Trading · Fall 2025

Algorithmic Trading Strategy System

Learner: +30.8% in sample · Learner: +8.2% out of sample · Benchmark: −8.4% out of sample
Python · NumPy · Pandas · Machine Learning · Random Forest · Technical Analysis

Overview

Strategy Performance Lab

Strategy learner: +8.2% out of sample
Manual strategy: -5.9% out of sample
Benchmark: -8.4% out of sample

The final project in CS 7646 is a head to head comparison between two trading strategies: one built by hand using domain knowledge and rules, the other learned by a machine from historical data. Both trade JPM stock starting with $100,000. The in sample training period runs from January 2008 to December 2009. The out of sample evaluation period runs from January 2010 to December 2011, a window that neither strategy was allowed to see during development.

The benchmark in both cases is a simple buy and hold: buy 1,000 shares on the first trading day and hold. The goal is to beat it, particularly out of sample where overfitting is exposed.
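The benchmark arithmetic can be sketched in a few lines. This is a minimal illustration, assuming no commission or market impact on the single purchase; the function names and the prices are made up for the example and are not JPM data.

```python
import numpy as np

def buy_and_hold_values(prices, shares=1000, start_cash=100_000.0):
    """Daily portfolio value when `shares` are bought on day 0 and held."""
    prices = np.asarray(prices, dtype=float)
    cash_left = start_cash - shares * prices[0]  # cash remaining after the purchase
    return cash_left + shares * prices           # cash plus marked-to-market stock

def cumulative_return(values):
    """Total return from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0

prices = [40.0, 42.0, 38.0, 36.0]                # hypothetical prices
values = buy_and_hold_values(prices)
print(values, cumulative_return(values))
```

Because only part of the $100,000 is invested, the portfolio return is smaller in magnitude than the stock's own return over the same period.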

Indicators

Signal Inputs

BBP: oversold, band position
RSI: entry zone, relative strength
MOM: trend, 10-day movement
EMA: reversion, distance from mean

Both strategies use the same four technical indicators. Using the same signal inputs lets the comparison isolate how much the decision logic matters, rather than what information was available.

Bollinger %B (BBP) measures where price sits within its 20 day Bollinger Band. A value near 0 means price is at the lower band (historically oversold); near 1 means it is at the upper band (historically overbought). RSI is a 14 period Relative Strength Index calculated with exponential weighting. Values below 40 indicate oversold conditions; above 60 indicate overbought.
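The two indicators above might be computed roughly as follows. This is a sketch under stated assumptions: the bands are taken as a 20 day simple moving average plus or minus two standard deviations, and the exponential weighting for RSI is assumed to be Wilder style smoothing (alpha = 1/14). The function names and both of those choices are illustrative, not confirmed details of the project code.

```python
import numpy as np
import pandas as pd

def bollinger_pct_b(prices: pd.Series, window: int = 20, num_std: float = 2.0) -> pd.Series:
    """Position of price inside its Bollinger Band: 0 = lower band, 1 = upper band."""
    sma = prices.rolling(window).mean()
    std = prices.rolling(window).std()
    lower = sma - num_std * std
    upper = sma + num_std * std
    return (prices - lower) / (upper - lower)

def rsi_ewm(prices: pd.Series, period: int = 14) -> pd.Series:
    """RSI with exponentially weighted average gains and losses (assumed alpha = 1/period)."""
    delta = prices.diff()
    gain = delta.clip(lower=0.0)       # up moves, zero otherwise
    loss = (-delta).clip(lower=0.0)    # down moves as positive numbers, zero otherwise
    avg_gain = gain.ewm(alpha=1.0 / period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1.0 / period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

# Synthetic steadily rising series, only to exercise the functions
rising = pd.Series(np.linspace(30.0, 50.0, 60))
bbp = bollinger_pct_b(rising)
rsi = rsi_ewm(rising)
print(bbp.iloc[-1], rsi.iloc[-1])
```

The first 19 BBP values are NaN because the 20 day window is not yet full; those warm-up rows have to be handled before the indicator is fed to any rule or learner.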

Momentum measures how much price has moved over the past 10 days, expressed as a ratio. It captures direction and magnitude of recent price movement without smoothing. EMA Distance measures how far price has stretched from its 20 period exponential moving average, normalized as a percentage. When price is significantly above its EMA it tends to revert down toward the mean; when it is significantly below, it tends to revert up.
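A compact sketch of these two indicators, assuming momentum is defined as price divided by the price 10 days earlier, minus one, and that the EMA uses pandas' `span=20` convention. Both definitions and the function names are assumptions made for illustration.

```python
import pandas as pd

def momentum(prices: pd.Series, lookback: int = 10) -> pd.Series:
    """Fractional price change over `lookback` days (NaN during warm-up)."""
    return prices / prices.shift(lookback) - 1.0

def ema_distance(prices: pd.Series, span: int = 20) -> pd.Series:
    """Distance of price from its EMA, as a fraction of the EMA."""
    ema = prices.ewm(span=span, adjust=False).mean()
    return (prices - ema) / ema

# Ten flat days followed by a 10% jump (made-up numbers)
prices = pd.Series([100.0] * 10 + [110.0])
mom = momentum(prices)
dist = ema_distance(prices)
print(mom.iloc[-1], dist.iloc[-1])
```

After the jump, momentum reads about +10% while the EMA has only started to catch up, so EMA Distance is positive: price is stretched above its trend.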

Manual Strategy

The manual strategy uses a three indicator confirmation rule for entries. A long position is entered when RSI falls below 40, BBP falls below 0.3, and Momentum is negative. A short position is entered when RSI rises above 60, BBP rises above 0.8, and Momentum is positive. Requiring agreement across multiple indicators reduces false signals at the cost of fewer total trades.

Exit logic is simpler. A long position is closed when RSI rises above 55, indicating the oversold condition has resolved. A short position is closed when RSI falls below 45 or EMA Distance turns negative, suggesting the overbought condition has reversed or price has dropped back below its trend. The EMA Distance condition on short exits adds a momentum based escape valve for sudden reversals.
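The entry and exit rules above can be read as a small state machine over three positions: long (+1), flat (0), and short (-1). The sketch below uses the thresholds from the text; the function name, the choice to check exits before entries, and the assumption that a position always passes through flat before reversing are illustrative and not confirmed by the project code.

```python
def manual_positions(rsi, bbp, mom, ema_dist):
    """Return a daily position list (+1 long, 0 flat, -1 short) from indicator series."""
    position = 0
    out = []
    for r, b, m, e in zip(rsi, bbp, mom, ema_dist):
        if position == 1 and r > 55:
            position = 0                      # long exit: oversold condition resolved
        elif position == -1 and (r < 45 or e < 0):
            position = 0                      # short exit: RSI cooled or price below EMA
        elif position == 0:
            if r < 40 and b < 0.3 and m < 0:
                position = 1                  # all three indicators agree: oversold
            elif r > 60 and b > 0.8 and m > 0:
                position = -1                 # all three indicators agree: overbought
        out.append(position)
    return out

# Five made-up days of indicator values
rsi      = [35, 50, 58, 65, 62]
bbp      = [0.2, 0.5, 0.6, 0.9, 0.7]
mom      = [-0.05, 0.0, 0.02, 0.04, 0.01]
ema_dist = [-0.03, 0.0, 0.01, 0.02, -0.01]
positions = manual_positions(rsi, bbp, mom, ema_dist)
print(positions)
```

Comparisons against NaN are false in Python, so indicator warm-up days simply leave the position unchanged in this sketch.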

In sample, the strategy returned +15.6% against a benchmark of +1.2%. The manual rules were developed exclusively on in sample data, and the out of sample period was never examined during development. Out of sample, the strategy returned -5.9%, compared to the benchmark's -8.4%. The rules didn't generate positive returns in the 2010–2011 period, but they still beat passive holding.

Strategy Learner

From Market Data to Trading Signal

Feature matrix: price, momentum, RSI, BBP, EMA distance
Forward labels: LONG / SHORT / CASH from 5-day returns
Bagged forest: 20 random-tree learners vote into a single trading policy.
Signal: LONG / CASH / SHORT

The strategy learner frames trading as a classification problem. The same four indicators plus raw price are used as input features. For each day in the training set, a label is generated by looking N=5 trading days forward: if the return exceeds +1% plus the market impact cost, the label is LONG; if it drops below -1% minus impact, the label is SHORT; otherwise it is CASH.

The impact sensitive threshold is the key design decision. Higher market impact raises the return needed to justify a trade, so the learner naturally reduces trade frequency as costs rise. This makes the model aware of transaction economics during training rather than only during execution.
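The labeling rule can be sketched directly from the description: compute the 5 day forward return and compare it against a threshold of 1% plus impact. The function name and the numeric encoding (+1 LONG, -1 SHORT, 0 CASH) are assumptions for illustration; the prices are made up.

```python
import numpy as np

def make_labels(prices, n_days=5, base_threshold=0.01, impact=0.0):
    """Label each day +1 (LONG), -1 (SHORT), or 0 (CASH) from its n_days forward return.

    The last n_days rows have no forward return, so the output is n_days shorter
    than the input.
    """
    prices = np.asarray(prices, dtype=float)
    fwd_ret = prices[n_days:] / prices[:-n_days] - 1.0
    cutoff = base_threshold + impact          # higher impact demands a larger move
    labels = np.zeros(fwd_ret.shape[0], dtype=int)
    labels[fwd_ret > cutoff] = 1
    labels[fwd_ret < -cutoff] = -1
    return labels

prices = [100, 100, 100, 100, 100, 102, 96, 100.5]   # hypothetical prices
no_impact = make_labels(prices, impact=0.0)
high_impact = make_labels(prices, impact=0.02)
print(no_impact, high_impact)
```

With zero impact, the +2% forward return on the first day earns a LONG label. At impact 0.02 the cutoff is 3%, so the same day becomes CASH, while the -4% day stays SHORT.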

The classifier is a Random Forest: a BagLearner of 20 RTLearners (a Random Tree variant), each built with a minimum leaf size of 5 to prevent degenerate overfitting on the roughly 500 in sample data points. Training calls add_evidence() and freezes the model. Inference calls testPolicy() using the same features on the new time period, with no retraining.

The learner returned +30.8% in sample, roughly double the manual strategy, and +8.2% out of sample. The benchmark returned -8.4% in the same out of sample period, meaning the learner generated positive returns in a period where passive holding lost money.

ML Implementation

The feature matrix has five columns: raw price, Momentum, RSI, Bollinger %B, and EMA Distance. No external normalization is applied. RTLearner splits on raw feature values, so the scale of each indicator relative to itself is what matters, not inter-feature comparability. Missing values from indicator lookback warm up periods are filled with zero before training. The last N rows of each training window are dropped because they cannot have valid forward looking labels.
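The two preprocessing steps described here, zero filling the warm up gaps and trimming the last N rows, might look like this in pandas. The function name, column names, and numbers are hypothetical.

```python
import numpy as np
import pandas as pd

def training_matrix(features: pd.DataFrame, n_days: int = 5) -> np.ndarray:
    """Zero-fill warm-up NaNs and drop the last n_days rows, which have no label."""
    filled = features.fillna(0.0)
    return filled.iloc[:-n_days].to_numpy(dtype=float)

# Eight made-up rows; NaNs mimic indicator lookback warm-up
demo = pd.DataFrame({
    "price":    [100.0, 101.0, 102.0, 101.5, 103.0, 104.0, 103.5, 105.0],
    "momentum": [np.nan, np.nan, 0.02, 0.005, 0.01, 0.025, 0.015, 0.034],
    "rsi":      [np.nan, 55.0, 58.0, 52.0, 60.0, 63.0, 59.0, 65.0],
    "bbp":      [np.nan, np.nan, np.nan, 0.6, 0.7, 0.8, 0.7, 0.9],
    "ema_dist": [0.0, 0.005, 0.01, 0.002, 0.012, 0.015, 0.008, 0.018],
})
X = training_matrix(demo, n_days=5)
print(X.shape)
```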

The RTLearner is a Random Tree: a variation on a standard decision tree that selects both the split feature and the split value randomly (uniform draw from the feature's observed range) rather than searching for the optimal split. This makes each tree faster to build and less correlated with its siblings, which is exactly what bagging needs to work. Each tree grows until leaf nodes contain five or fewer samples, preventing the degenerate case where individual trees memorize the roughly 500 training points.
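A minimal random tree along the lines described above could be sketched as follows. It is not the project's RTLearner: the nested dict representation, the function names, and the choice to store the mean label at each leaf are assumptions made for this example.

```python
import numpy as np

def build_random_tree(X, y, leaf_size=5, rng=None):
    """Recursively build a tree with a random split feature and a random split value."""
    rng = np.random.default_rng() if rng is None else rng
    # Stop when the node is small enough or all labels already agree
    if X.shape[0] <= leaf_size or np.all(y == y[0]):
        return {"leaf": float(np.mean(y))}
    feat = int(rng.integers(X.shape[1]))             # random feature
    lo, hi = X[:, feat].min(), X[:, feat].max()
    if lo == hi:                                     # constant feature: cannot split here
        return {"leaf": float(np.mean(y))}
    split = float(rng.uniform(lo, hi))               # uniform draw from the observed range
    left = X[:, feat] <= split
    if left.all() or not left.any():                 # guard against a one-sided split
        return {"leaf": float(np.mean(y))}
    return {
        "feat": feat,
        "split": split,
        "left": build_random_tree(X[left], y[left], leaf_size, rng),
        "right": build_random_tree(X[~left], y[~left], leaf_size, rng),
    }

def query_tree(node, x):
    """Walk one sample down the tree and return the leaf value."""
    while "leaf" not in node:
        node = node["left"] if x[node["feat"]] <= node["split"] else node["right"]
    return node["leaf"]

# Synthetic data: the label depends only on the sign of feature 0
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.where(X[:, 0] > 0, 1, -1)
tree = build_random_tree(X, y, leaf_size=5, rng=rng)
preds = np.array([query_tree(tree, row) for row in X])
print(preds[:5])
```

No split search happens anywhere in the build, which is where the speed and the tree to tree variation come from. In this sketch, drawing a constant feature ends the branch early, so a leaf can occasionally hold more than `leaf_size` samples.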

The BagLearner wraps 20 RTLearners. Each bag trains on a bootstrap sample so each tree sees a different slice of the data. At inference time all 20 trees produce a prediction and the mean of their outputs determines the signal: positive means LONG, negative means SHORT, near zero means CASH. Averaging across decorrelated trees substantially reduces variance compared to any single tree. The model is fully frozen after add_evidence() returns; testPolicy() only calls query(), with no further model updates.
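The bagging step can be sketched independently of the tree implementation. In the example below, `StumpLearner` is a deliberately simple stand-in base learner, not the project's RTLearner; `BagSketch`, `to_signal`, and the `cash_band` width of 0.2 are likewise hypothetical, since the text does not say how wide the near zero zone is.

```python
import numpy as np

class StumpLearner:
    """Stand-in base learner: one split at the median of a randomly chosen feature."""
    def __init__(self, rng):
        self.rng = rng

    def add_evidence(self, X, y):
        self.feat = int(self.rng.integers(X.shape[1]))
        self.split = float(np.median(X[:, self.feat]))
        left = X[:, self.feat] <= self.split
        self.left_val = float(y[left].mean()) if left.any() else float(y.mean())
        self.right_val = float(y[~left].mean()) if (~left).any() else float(y.mean())

    def query(self, X):
        return np.where(X[:, self.feat] <= self.split, self.left_val, self.right_val)

class BagSketch:
    """Train each learner on a bootstrap sample and average their predictions."""
    def __init__(self, make_learner, bags=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.learners = [make_learner(self.rng) for _ in range(bags)]

    def add_evidence(self, X, y):
        n = X.shape[0]
        for learner in self.learners:
            idx = self.rng.integers(0, n, size=n)    # sample n rows with replacement
            learner.add_evidence(X[idx], y[idx])

    def query(self, X):
        return np.mean([learner.query(X) for learner in self.learners], axis=0)

def to_signal(mean_vote, cash_band=0.2):
    """Map the averaged vote to +1 (LONG), -1 (SHORT), or 0 (CASH) inside the band."""
    return np.where(mean_vote > cash_band, 1, np.where(mean_vote < -cash_band, -1, 0))

# Synthetic data: the label depends only on the sign of feature 0
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
bag = BagSketch(StumpLearner, bags=20, seed=0)
bag.add_evidence(X, y)
votes = bag.query(X)
signals = to_signal(votes)
print(signals[:10])
```

Nothing in `query` changes the learners, which mirrors the frozen model behavior described above: training happens once in `add_evidence`, and inference is a pure read.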

Experiments

Market Impact Changes the Policy

Impact 0.000: many trades
Impact 0.005: fewer trades
Impact 0.010: selective
Impact 0.020: high conviction

Raising market impact widens the label threshold, so the learner trains itself to trade less often.

Experiment 1 was a direct three way comparison between the Manual Strategy, Strategy Learner, and Benchmark. The learner outperformed the manual strategy in sample (+30.8% vs +15.6%) and was the only strategy to generate positive returns out of sample (+8.2%), while both the manual strategy (-5.9%) and the benchmark (-8.4%) lost money. The manual strategy's in sample edge didn't transfer cleanly to new data, a typical symptom of hand tuned rules that capture historical noise rather than durable patterns. The learner generalized better because it found signal combinations the rules missed.

Experiment 2 tested whether the impact sensitive label design actually changed trading behavior. The learner was trained and evaluated four times on the in sample period with impact values of 0.0, 0.005, 0.01, and 0.02, with commission fixed at $0 to isolate impact as the only cost variable.

The mechanism is straightforward: at impact=0.0 the LONG and SHORT thresholds are ±1%, so any day with a 5 day forward return beyond those bounds gets labeled as a signal. At impact=0.02 those thresholds widen to +3% and -3%. Far fewer days in the training set clear that bar, so the model learns a more conservative policy with fewer, higher conviction entries. The results confirmed this monotonically: trade count fell and cumulative return declined as impact rose. The important result is not the return degradation, which is expected when costs rise. It is that the behavior change came from the label construction during training, not from a post hoc filter applied at inference time.
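The threshold widening can be checked on any fixed set of forward returns: the number of days labeled LONG or SHORT can only stay the same or shrink as impact rises. The sketch below uses synthetic returns, not JPM data, and counts labeled days rather than trades made by a trained learner; the function name is hypothetical.

```python
import numpy as np

def signal_day_count(fwd_returns, impact, base_threshold=0.01):
    """Number of days whose forward return clears the +/-(1% + impact) threshold."""
    cutoff = base_threshold + impact
    r = np.asarray(fwd_returns, dtype=float)
    return int(np.sum((r > cutoff) | (r < -cutoff)))

rng = np.random.default_rng(0)
fwd = rng.normal(0.0, 0.04, size=500)        # synthetic 5 day returns
impacts = (0.0, 0.005, 0.01, 0.02)
counts = [signal_day_count(fwd, i) for i in impacts]
for impact, count in zip(impacts, counts):
    print(f"impact={impact:.3f}  cutoff=+/-{0.01 + impact:.3f}  signal days={count}")
```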