How This Works
A plain-English guide for anyone — no finance or CS background needed

What is a stock?

When a company wants to raise money, it sells small slices of ownership called shares or stocks. If you own a share of Apple, you own a tiny piece of Apple. If Apple does well, your share becomes worth more. If Apple does poorly, your share is worth less.

The stock market is just the marketplace where people buy and sell these shares every day. Prices change every second based on what people think a company is worth.

Core financial terms

Return
How much money you made (or lost)
If you invested $1,000 and now have $1,200, your return is +20%. If you have $800, it's −20%. Simple.
S&P 500
The standard "how the market did" number
An index of the 500 biggest US companies. If you had just bought the whole market, how would you have done? That's the S&P 500. It returned +46.3% in our test period.
Alpha
How much better you did than the market
If the market returned +46.3% and our strategy returned +67.7%, that's +21.4 percentage points of alpha. Alpha is the "extra" you earned by being smarter than just buying everything.
Sharpe Ratio
Return relative to how bumpy the ride was
1.89 means for every unit of risk taken, you got 1.89 units of return. Above 1.0 is generally considered good. A strategy that makes 50% but swings wildly is less useful than one that makes 30% smoothly.
Drawdown
The worst drop from a peak
If your portfolio hit $12,000, then dropped to $11,100, that's a −7.5% drawdown. Our maximum drawdown was −7.6%, meaning that even at the worst point you would have been down only $760 for every $10,000 of peak portfolio value.
Rebalancing
Buying and selling to reset your portfolio
Every month the system looks at all 50 stocks, picks the best 15, and buys/sells to own exactly those 15 in the right proportions. This is one "rebalance period".
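The return, alpha, and drawdown definitions above can be checked with a few lines of arithmetic. This is a minimal sketch using the numbers from the examples in this section; the function names and the four-point portfolio series are made up for illustration and are not taken from the project's code.

```python
def simple_return(start_value, end_value):
    """Fractional gain or loss, e.g. 0.20 for +20%."""
    return end_value / start_value - 1.0


def alpha(strategy_return, benchmark_return):
    """Excess return over the benchmark, as a simple difference."""
    return strategy_return - benchmark_return


def max_drawdown(values):
    """Largest peak-to-trough decline in a value series, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                  # highest value seen so far
        worst = min(worst, value / peak - 1.0)   # deepest drop below that peak
    return worst


gain = simple_return(1_000, 1_200)       # the +20% example
loss = simple_return(1_000, 800)         # the -20% example
excess = alpha(0.677, 0.463)             # strategy vs. S&P 500
# Illustrative series: peak at $12,000, trough at $11,100.
drawdown = max_drawdown([10_000, 12_000, 11_100, 11_800])

print(f"{gain:+.1%} {loss:+.1%} {excess:+.1%} {drawdown:+.1%}")
```

Note that drawdown is measured from the running peak, not from the amount originally invested.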

50 hand-picked large-cap US stocks

The model does not automatically discover which stocks to consider. The universe of 50 candidates was manually curated: the names were chosen because they are well-known, have years of full price history, and are highly liquid. A liquidity screen (minimum average daily volume) is applied each rebalance period, but in practice 49–50 of the 50 stocks pass every time.

Important caveat

This is a closed, biased universe. All 50 names are S&P 500 mega/large-caps that have survived to today — meaning any company that collapsed is automatically excluded. A rigorous live system would use a dynamic, historically-accurate S&P 500 constituent list.

■ Technology & Semiconductors
AAPL, MSFT, GOOGL, AMZN, NVDA, META, AVGO, CRM, CSCO, ACN, TXN, QCOM
■ Financials
JPM, V, MA, BAC, MS, GS, BLK, SPGI, BK, AXP, C, USB, PYPL
■ Healthcare & Pharma
JNJ, UNH, LLY, ABBV, MRK, TMO, ABT, AMGN, BMY
■ Consumer Staples & Discretionary
PG, KO, HD, MCD, NKE, WMT, COST, MO, PM, PEP, CL, LOW
■ Industrials, Energy & Other
HON, CAT, LMT, XOM

The system runs a fixed sequence every month. Think of it like a recipe that never deviates.

01
Download price data
The code calls yfinance — a free library that fetches real daily stock prices from Yahoo Finance. For each of our 50 stocks, it downloads years of daily open/close/high/low/volume data. This is the raw material everything else is built from.
Analogy

Like downloading the complete scorebook for 50 players before picking your fantasy sports lineup.

02
Liquidity screen
Only stocks with high average daily trading volume pass. This ensures we can actually buy and sell them without moving the price. Thinly traded stocks get eliminated. 49–50 out of 50 typically pass.
03
Compute 10 features for each stock
Raw prices are transformed into 10 numbers that capture different aspects of each stock's recent behavior:

Momentum features: Has this stock been going up recently? (1-week reversal, 1/3/12 month momentum)
Trend features: Is it consistently above its long-term average? (200-day SMA ratio, trend score)
Risk features: How jumpy is it? How correlated with the overall market? (60-day beta, 60-day volatility)
Activity feature: Are people trading it more than usual? (volume trend ratio)
Analogy

Instead of judging a job applicant just by their name, you now have 10 specific test scores. The ML model learns which scores actually predict job performance.
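To make the idea of "turning prices into features" concrete, here is a rough sketch of a few of the features named above, computed from a list of daily closing prices. The function names, the trading-day counts (21 per month, 252 per year), and the synthetic price path are all assumptions for illustration; the project's actual feature definitions may differ.

```python
import math
import statistics


def momentum(closes, lookback):
    """Price change over the last `lookback` trading days, as a fraction."""
    return closes[-1] / closes[-1 - lookback] - 1.0


def sma_ratio(closes, window=200):
    """Latest close divided by the average of the last `window` closes."""
    return closes[-1] / statistics.fmean(closes[-window:])


def annualized_volatility(closes, window=60):
    """Std. dev. of the last `window` daily returns, scaled by sqrt(252)."""
    recent = closes[-(window + 1):]
    daily = [b / a - 1.0 for a, b in zip(recent, recent[1:])]
    return statistics.stdev(daily) * math.sqrt(252)


# Synthetic price path (not real data): alternating +1.1% and -0.9% days.
closes = [100.0]
for day in range(1, 260):
    step = 0.011 if day % 2 else -0.009
    closes.append(closes[-1] * (1.0 + step))

features = {
    "mom_1m": momentum(closes, 21),     # assumes ~21 trading days per month
    "mom_3m": momentum(closes, 63),
    "mom_12m": momentum(closes, 252),
    "sma_200": sma_ratio(closes, 200),
    "vol_60": annualized_volatility(closes, 60),
}
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Each function reads only prices up to the most recent close, which is what keeps the features free of future information.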

04
Train the ML model on past data
The model looks at the past 12 months of (features → outcomes) pairs. For each past month, it knows what these features were AND which stocks then beat the average the next month. It learns the pattern between features and outcomes. Critically, it only uses past data — never the future.
05
Score this month's stocks
Now the trained model scores the 50 current stocks — "given this month's features, how likely is each stock to outperform next month?" This gives each stock a probability score between 0% and 100%. Pick the top 15.
06
Size the positions and rebalance
The 15 selected stocks are assigned weights using the Kelly formula (explained in the next section). High-confidence picks get more of your money. The portfolio is rebalanced — buy what you need more of, sell what you need less of. Transaction costs of 0.1% per trade are subtracted.
07
Measure performance, repeat next month
At the end of the month, the return is recorded. Then the whole sequence repeats from step 1 with the newest price data. Over 23 months (Feb 2023 – Dec 2024), this produced +67.7% cumulative return.
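Steps 05 and 06 above boil down to "sort by score, keep the top names, and pay a cost on whatever you trade". The sketch below illustrates that under one loud assumption: the 0.1% cost is charged on turnover (the total portfolio weight bought or sold), which is only one possible reading of "0.1% per trade". The function names, scores, and weights are hypothetical.

```python
def top_n(scores, n=15):
    """Return the n tickers with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def rebalance_cost(old_weights, new_weights, cost_rate=0.001):
    """Cost as a fraction of portfolio value: cost_rate x total traded weight.

    Assumption: cost is proportional to turnover (sum of absolute weight changes).
    """
    tickers = set(old_weights) | set(new_weights)
    turnover = sum(
        abs(new_weights.get(t, 0.0) - old_weights.get(t, 0.0)) for t in tickers
    )
    return cost_rate * turnover


# Hypothetical scores for a four-stock universe, keeping the best two.
scores = {"AAPL": 0.61, "MSFT": 0.48, "NVDA": 0.79, "XOM": 0.35}
picks = top_n(scores, n=2)

# Hypothetical weights before and after the rebalance.
old = {"AAPL": 0.5, "MSFT": 0.5}
new = {"AAPL": 0.5, "NVDA": 0.5}
cost = rebalance_cost(old, new)   # sell 0.5 of MSFT, buy 0.5 of NVDA

print(picks, f"{cost:.4%}")
```

Under this assumption a month with no changes costs nothing, while a complete portfolio swap (turnover of 2.0) would cost 0.2% of portfolio value.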

Three models vote together

Instead of relying on one model, we run three different ML models simultaneously and blend their predictions. Each one sees the same 10 input features but learns differently.

LR
Logistic Regression — the straight line
Assigns a weight to each of the 10 features, multiplies each feature by its weight, adds the results together, and squashes the total into a probability between 0 and 1. The weights are learned by gradient descent: it nudges each weight slightly up or down until the error on training data stops improving. Fast, stable, and tells you exactly which features matter most (the weights are directly readable).

Weakness: can only draw a straight boundary — if the signal between two features is curved ("high momentum is good, but only when volatility is low"), LR misses it. Weight: 34%.
RF
Random Forest — the committee vote
Builds 200 independent decision trees, each trained on a random subset of the training data and a random subset of the 10 features. Each tree independently votes "outperform" or "underperform." The forest's prediction is the fraction of trees that voted yes.

Because each tree sees different data, the trees specialize in different market conditions. Averaging their votes cancels out individual errors. Catches complex interactions LR misses. Weakness: slower, and needs more training data to be reliable. Weight: 33%.
GBM
Gradient Boosting — the sequential corrector
Starts with a weak tree, measures its errors, and adds a second tree that specifically corrects those errors. Then a third tree corrects the remaining errors. Each tree's correction is scaled down by a small learning rate (0.05) to prevent overcorrecting. After 100 rounds, the final prediction is the sum of 100 small corrections.

Best for capturing complex patterns in smaller datasets. Weakness: if the first few trees overfit, errors compound. Weight: 33%.
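The three descriptions above differ mainly in how a final probability is produced. The toy sketch below shows only that last step for each model type; it does not train anything, and every weight, vote count, and tree output in it is invented for illustration. A real system would typically use a library such as scikit-learn rather than hand-written functions like these.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_score(features, weights, bias=0.0):
    """Logistic regression: weighted sum of the features, then a sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def forest_score(votes):
    """Random forest (as described above): fraction of trees voting 'outperform'."""
    return sum(votes) / len(votes)


def boosted_score(initial_log_odds, tree_outputs, learning_rate=0.05):
    """Gradient boosting: starting guess plus many small, scaled corrections."""
    z = initial_log_odds + learning_rate * sum(tree_outputs)
    return sigmoid(z)


# All inputs below are made up for illustration.
lr_prob = logistic_score([1.2, -0.5, 0.3], [0.8, 0.1, -0.4])
rf_prob = forest_score([True] * 152 + [False] * 48)   # 152 of 200 trees say yes
gbm_prob = boosted_score(0.0, [0.3] * 100)            # 100 rounds of corrections

print(f"LR={lr_prob:.3f} RF={rf_prob:.3f} GBM={gbm_prob:.3f}")
```

With all weights at zero the logistic score is exactly 0.5, which is a useful sanity check: a model that has learned nothing expresses no opinion.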

A concrete example: scoring NVDA in Jun 2023

Here's what the actual input features looked like for NVDA entering June 2023 (the month it returned +18%):

Feature | Value | What it means
12-month momentum | +168% | Dominant uptrend over the past year; very bullish signal
3-month momentum | +52% | Short-term continuation, acceleration in recent months
1-month reversal | +8% | Positive last month (not a mean-reversion candidate)
200-day SMA ratio | 1.68 | Price is 68% above its long-run average; strong trend
Volume trend | 1.42 | Trading 42% above average daily volume; increased interest
60-day volatility | 0.48 | High volatility; model must weigh risk carefully
60-day beta | 1.9 | Moves 1.9× the market; amplified risk/reward

Given all 10 feature values (seven of which are shown above), each model independently produces a probability estimate:

Model | Outperform probability | Weight | Contribution
Logistic Regression | 0.81 | 34% | 0.275
Random Forest | 0.76 | 33% | 0.251
Gradient Boosting | 0.79 | 33% | 0.261
Ensemble score | 0.787 | → ranked 1st, selected

NVDA's ensemble score of 0.787 ranked it #1 among all 50 stocks. It received the highest Kelly weight (10.9%) and returned +18.2% that month.
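The ensemble score is simply a weighted average of the three model probabilities. The short sketch below reproduces the NVDA arithmetic from the table; the dictionary layout and function name are illustrative choices, not the project's actual code.

```python
# Blend weights as stated for the three models.
WEIGHTS = {"LR": 0.34, "RF": 0.33, "GBM": 0.33}


def ensemble_score(probabilities, weights=WEIGHTS):
    """Weighted average of per-model outperform probabilities."""
    return sum(weights[name] * probabilities[name] for name in weights)


# NVDA's per-model probabilities from the table above.
nvda = {"LR": 0.81, "RF": 0.76, "GBM": 0.79}
score = ensemble_score(nvda)

# Per-model contributions: 0.81 x 0.34, 0.76 x 0.33, 0.79 x 0.33.
contributions = {name: WEIGHTS[name] * nvda[name] for name in WEIGHTS}

print(round(score, 3))
```

Because the weights sum to 1, the blended score always stays between the lowest and highest individual model probability.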

The key insight

The ML model doesn't need to predict exactly how much a stock returns. It only needs to rank stocks correctly — identifying which are most likely to beat the median. Even a model that's only right 55% of the time can generate consistent alpha if it's correctly identifying the relative ranking of 50 stocks month after month.
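Framing the task as "beat the median" means each stock's training label is just a yes/no flag. The sketch below shows one way such labels could be built from next-month returns; the tickers and returns are made up, and whether the project compares against the median or the cross-sectional average is an assumption here.

```python
import statistics


def outperform_labels(next_month_returns):
    """Label each stock 1 if its return beat the cross-sectional median, else 0."""
    median = statistics.median(next_month_returns.values())
    return {
        ticker: int(ret > median)
        for ticker, ret in next_month_returns.items()
    }


# Hypothetical next-month returns for four placeholder tickers.
returns = {"AAA": 0.05, "BBB": -0.02, "CCC": 0.01, "DDD": 0.03}
labels = outperform_labels(returns)

print(labels)
```

A side effect of this framing is that roughly half the stocks are labeled 1 every month, regardless of whether the market as a whole rose or fell.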

Which features matter most?

Based on the Random Forest's feature importance scores across all 23 test periods, the most predictive signals were:

#1
12-month momentum
The single most powerful predictor. Stocks that have risen strongly over the past year tend to keep rising the next month (momentum effect — well-documented in academic finance).
#2
200-day SMA ratio
Whether the stock is above or below its long-run trend line. Stocks above their 200-day average are in defined uptrends.
#3
Volume trend
Rising volume alongside rising price signals institutional accumulation — more conviction behind the move.
#4–5
3-month momentum & Beta
Short-term trend continuation plus market sensitivity. High-beta stocks get penalized in BEAR regimes automatically through the regime scale factor.

Kelly Criterion — how much to bet

Once we know which 15 stocks to buy, we still need to decide how much to put in each one. Not all stocks in our top-15 are equally confident picks.

Kelly Formula
f* = (p × b − q) / b

p = probability of winning (from ML model, e.g. 0.72)
q = probability of losing = 1 − p (e.g. 0.28)
b = average win / average loss ratio (estimated from training data)
f* = fraction of your bankroll to bet
Plain English

If you flip a coin that lands heads 70% of the time, Kelly tells you exactly what fraction of your money to bet each flip to grow your wealth fastest over time. Bet too little and you leave money on the table. Bet too much and one bad streak destroys you.

We use fractional Kelly (25%) — we bet only a quarter of what Kelly suggests. This is more conservative, but also more stable. A hard cap of 20% per stock prevents any single name from dominating.
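The formula, the 25% fraction, and the 20% cap can be combined in a few lines. This is a minimal sketch: the function names are invented, and it assumes that negative Kelly values are floored at zero and that the cap is applied after the 25% scaling, neither of which the text above specifies.

```python
def kelly_fraction(p, b):
    """Full Kelly bet size: f* = (p * b - q) / b, with q = 1 - p."""
    q = 1.0 - p
    return (p * b - q) / b


def position_size(p, b, kelly_multiplier=0.25, cap=0.20):
    """Fractional Kelly with a hard per-stock cap.

    Assumptions: negative edges are floored at zero, and the cap is
    applied after scaling by the Kelly multiplier.
    """
    raw = max(kelly_fraction(p, b), 0.0)
    return min(raw * kelly_multiplier, cap)


# The coin from the plain-English example: 70% heads at even odds (b = 1).
coin_full = kelly_fraction(0.70, 1.0)      # full Kelly
coin_quarter = position_size(0.70, 1.0)    # quarter Kelly

# An extreme, made-up edge to show the 20% cap taking effect.
capped = position_size(0.99, 10.0)

print(f"{coin_full:.2f} {coin_quarter:.2f} {capped:.2f}")
```

For the 70% coin at even odds, full Kelly says to bet 40% of the bankroll each flip, and quarter Kelly cuts that to 10%.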

Macro regime detection

Once a month, the system also looks at SPY (the S&P 500 ETF) and asks: what kind of market environment are we in right now? It fits a Gaussian Mixture Model — essentially a statistical clustering algorithm — to classify the market as:

Bull 🐂
Low volatility, rising trend
Full position sizing: 100% of the fractional-Kelly weights. The environment is stable and trending up, so positions are taken at full size.
Neutral 😐
Mixed signals
75% of the fractional-Kelly weights. The market is choppy or directionless, so positions are scaled back slightly to manage risk.
Bear 🐻
High volatility, falling trend
50% of the fractional-Kelly weights. The market is under stress, so all position sizes are automatically cut in half. Our system detected Bear in Aug–Sep 2024.
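Once a regime label is available, applying it is a simple multiplication. The sketch below covers only that scaling step, not the Gaussian Mixture Model that produces the label; the dictionary keys, function name, and example weights are hypothetical, and holding the freed-up portion as cash is an assumption.

```python
# Scale factors from the three regimes described above.
REGIME_SCALE = {"BULL": 1.0, "NEUTRAL": 0.75, "BEAR": 0.5}


def scale_weights(weights, regime):
    """Multiply every position weight by the regime's scale factor."""
    factor = REGIME_SCALE[regime]
    return {ticker: w * factor for ticker, w in weights.items()}


# Hypothetical position weights before regime scaling.
base = {"NVDA": 0.10, "AAPL": 0.08}
bear = scale_weights(base, "BEAR")

# Assumption: whatever is not invested sits in cash.
cash = 1.0 - sum(bear.values())

print(bear, round(cash, 2))
```

Because every weight is multiplied by the same factor, the ranking of the picks is unchanged; only the total amount invested shrinks.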

Why most backtests lie

It is surprisingly easy to build a strategy that looks amazing in a backtest but fails completely in real life. The most common causes:

Look-ahead bias
Using tomorrow's data to make today's decision. Example: training on "which stocks went up this month" to decide what to buy this month. This system avoids it by using only data from strictly past periods to train each month's model.
Survivorship bias
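Only testing on stocks that still exist today, ignoring companies that went bankrupt. This system only reduces the problem: it restricts itself to large, liquid, actively traded names that existed throughout the full period, but because those names were chosen with hindsight, some survivorship bias remains (as the universe caveat above notes).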
Ignoring transaction costs
Pretending you can trade for free. In reality, every buy/sell costs money (broker fees, bid/ask spread). This system deducts 0.1% per rebalance and 0.2% annual FX drag.
Walk-forward validation
The correct approach. For each test month, you train exclusively on months before it (this system uses the preceding 12), then test on that one month. Then move forward one month, retrain, test again. Every result shown on this site is out-of-sample: the model never had access to the data it's being judged on.
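Walk-forward validation is easiest to see as a loop over (training window, test month) pairs. The sketch below generates those pairs with a rolling 12-month window. The month list starting in February 2022 is an assumption chosen so that the first test month lands on Feb 2023; the function name is illustrative.

```python
def walk_forward_splits(months, train_window=12):
    """Yield (training months, test month) pairs, oldest first.

    Every training month comes strictly before its test month.
    """
    for i in range(train_window, len(months)):
        yield months[i - train_window:i], months[i]


# Assumed calendar: Feb 2022 through Dec 2024, as "YYYY-MM" strings.
months = [f"{y}-{m:02d}" for y in (2022, 2023, 2024) for m in range(1, 13)][1:]

splits = list(walk_forward_splits(months))
first_train, first_test = splits[0]
last_train, last_test = splits[-1]

print(len(splits), first_test, last_test)
print(first_train[0], "to", first_train[-1])
```

Since the "YYYY-MM" strings sort chronologically, the no-look-ahead property can be checked directly: the latest training month is always earlier than the test month.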

What the numbers actually mean

+67.7% Total Return
$10,000 became $16,770
Over 23 months (Feb 2023 – Dec 2024). This is the compounded return across all rebalancing periods, net of all costs.
+21.4% Alpha
67.7% − 46.3% (SPY)
An investor who simply bought the S&P 500 index fund over the same period and did nothing would have made +46.3%. The ML strategy made +21.4% more.
Sharpe 1.89
Decent risk-adjusted performance
Above 1.0 is generally considered good. SPY's Sharpe ratio for the same period was roughly 2.1, which reflects an unusually smooth market in 2023–2024 (no major crashes). On a risk-adjusted basis, then, the strategy's 1.89 did not beat simply holding the index.
Max −7.6% Drawdown
Worst case: Sep 2023
The portfolio recovered fully during Oct 2023. For comparison, the S&P 500 drew down roughly 10% in the same period, so the strategy held up better in the stress period.
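The headline figures come from compounding monthly returns and from dividing average return by its variability. The sketch below shows both calculations. The six monthly returns are invented for illustration (they are not the strategy's actual returns), and the Sharpe calculation assumes a zero risk-free rate and annualizes by the square root of 12, which may differ from the project's convention.

```python
import math
import statistics


def cumulative_return(monthly_returns):
    """Compound a series of monthly returns into one total return."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0


def annualized_sharpe(monthly_returns, monthly_risk_free=0.0):
    """Mean excess return over its std. dev., annualized with sqrt(12).

    Assumption: risk-free rate defaults to zero.
    """
    excess = [r - monthly_risk_free for r in monthly_returns]
    return statistics.fmean(excess) / statistics.stdev(excess) * math.sqrt(12)


# Made-up monthly returns, not the strategy's actual results.
sample = [0.04, -0.01, 0.03, 0.02, -0.02, 0.05]
total = cumulative_return(sample)
sharpe = annualized_sharpe(sample)

# The headline figure: +67.7% on a $10,000 starting balance.
final_value = 10_000 * (1 + 0.677)

print(f"{total:.2%} {sharpe:.2f} {final_value:,.0f}")
```

Compounding matters: the six sample returns add up to 11%, but multiplied together they produce slightly more than that.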

Honest caveats

This is a paper trading simulation, not real money. Things that differ in live trading:

Slippage — when you place a large buy order, the price moves against you before it fills. This simulation doesn't model that.

23 months is a short track record. Professional funds require 3–5 years of live performance before taking numbers seriously. 2023–2024 was a particularly good market environment.

Parameter sensitivity. Some of the configuration choices (Kelly fraction = 25%, N_STOCKS = 15, etc.) were set before the backtest. Even so, overfitting remains a risk: a researcher who tests 1,000 different configurations will find some that look good on this period purely by chance.

The walk-forward validation and cost accounting make this result more honest than most student projects, but real-world deployment would require significantly more validation.