A Crypto Trader on a Raspberry Pi

Reinforcement Learning Trading Projects

In early 2021 it was hard to avoid crypto, and harder still to avoid the idea that you could automate your way to being clever about it. I didn’t think I’d beat the market. I did want to find out what it actually takes to build a trading agent end to end, from price feed to a bot that places orders on its own.

So I built one, and I ran it on a Raspberry Pi sitting on my desk.

The setup

The Pi was the whole point. A trading bot wants to be on all the time — markets don’t keep office hours — and I didn’t want a laptop fan spinning through the night. A Raspberry Pi pulls a few watts, holds a wallet, talks to an exchange API, and quietly does its thing. It’s a surprisingly good home for a long-running agent.

On top of that I built the trader itself.

The agent

I started from an open-source RL framework rather than writing the plumbing from scratch, then replaced the parts I cared about with a bespoke agent. The interesting decisions were about what the agent gets to see and how it represents the world.

For the state representation I used deep belief networks to learn a compressed encoding of the market features before handing them to the policy. Raw OHLCV candles are noisy and high-dimensional; learning a lower-dimensional representation first gave the agent something more stable to act on.
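To make the idea concrete, here is a minimal sketch of a deep belief network as a stack of restricted Boltzmann machines, each trained greedily with one step of contrastive divergence. It is not the project's code: the class and function names, the layer sizes, the learning rate, and the toy feature rows (assumed to be already scaled to the range 0 to 1) are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """One restricted Boltzmann machine layer; a DBN stacks several of these."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible, self.n_hidden = n_visible, n_hidden
        # Small random weights, zero biases.
        self.w = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_h[j] + sum(v[i] * self.w[i][j]
                                          for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b_v[i] + sum(h[j] * self.w[i][j]
                                          for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_update(self, v0, lr=0.1):
        """One contrastive-divergence (CD-1) update for a single example."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)   # reconstruction
        h1 = self.hidden_probs(v1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b_h[j] += lr * (h0[j] - h1[j])


def train_dbn(data, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise training: each RBM learns on the previous layer's output."""
    layers, inputs = [], data
    for k, n_hidden in enumerate(layer_sizes):
        rbm = TinyRBM(len(inputs[0]), n_hidden, seed=seed + k)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v)
        layers.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


def encode(layers, v):
    """Push one feature vector through every layer to get its compressed code."""
    for rbm in layers:
        v = rbm.hidden_probs(v)
    return v


# Hypothetical market features, already scaled to [0, 1].
features = [
    [0.9, 0.8, 0.1, 0.2, 0.7, 0.1],
    [0.8, 0.9, 0.2, 0.1, 0.8, 0.2],
    [0.1, 0.2, 0.9, 0.8, 0.2, 0.9],
    [0.2, 0.1, 0.8, 0.9, 0.1, 0.8],
]
layers = train_dbn(features, layer_sizes=[4, 2])
code = encode(layers, features[0])  # 6 inputs compressed to 2 values
```

The policy would then see `code` rather than the raw feature row. A real encoder would need far more data and tuning than this toy does.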

The part I was most curious about was sentiment. Price history alone is a thin signal — everyone has it, and it says nothing about why the market is moving. So I fed in sentiment analysis data alongside the price features, on the theory that crowd mood leads price often enough to be worth modelling. The agent’s action space was the usual buy / sell / hold, and the reward was tied to portfolio value.
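A stripped-down environment shows how those pieces fit together. This is a sketch under assumptions, not the framework's API: the `TradingEnv` class, the all-in / all-out position sizing, the one-step return feature, and the sentiment scores in the range -1 to 1 are all made up for illustration. The reward is the change in portfolio value over one step.

```python
from dataclasses import dataclass

HOLD, BUY, SELL = 0, 1, 2


@dataclass
class TradingEnv:
    prices: list       # one closing price per time step
    sentiment: list    # one sentiment score per time step (assumed in [-1, 1])
    cash: float = 1000.0
    coins: float = 0.0
    t: int = 0

    def portfolio_value(self):
        return self.cash + self.coins * self.prices[self.t]

    def observe(self):
        # Observation = (last one-step return, current sentiment score).
        prev = self.prices[self.t - 1] if self.t > 0 else self.prices[0]
        ret = self.prices[self.t] / prev - 1.0
        return (ret, self.sentiment[self.t])

    def step(self, action):
        before = self.portfolio_value()
        price = self.prices[self.t]
        if action == BUY and self.cash > 0:
            self.coins += self.cash / price   # go all in (simplification)
            self.cash = 0.0
        elif action == SELL and self.coins > 0:
            self.cash += self.coins * price   # sell everything
            self.coins = 0.0
        self.t += 1
        reward = self.portfolio_value() - before
        done = self.t == len(self.prices) - 1
        return self.observe(), reward, done


env = TradingEnv(prices=[100.0, 110.0, 99.0], sentiment=[0.2, 0.5, -0.3])
obs, reward, done = env.step(BUY)     # price rises 100 -> 110 while holding coins
obs2, reward2, done2 = env.step(SELL)  # exit before the drop to 99
```

Fees and slippage are deliberately left out here, which is exactly the kind of simplification that makes a simulated agent look better than a live one.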

What actually happened

The honest result is the one everyone who builds a trading bot eventually arrives at: backtests lie. It’s easy to build something that looks brilliant on historical data and falls apart the moment it meets a live market with fees, slippage and latency. Markets are also non-stationary and adversarial: the pattern your agent learned last month is partly gone because enough other people learned it too.
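A small worked example shows how quickly trading costs eat an edge. The sketch below replays the same signals twice, once with no costs and once with a fee and slippage applied to every fill. The `backtest` function, the price series, and the 0.5% fee and 0.1% slippage figures are illustrative assumptions, not numbers from any real exchange or from this project.

```python
def backtest(prices, signals, fee_rate=0.0, slippage=0.0, cash=1000.0):
    """Replay buy/sell/hold signals and return the final portfolio value.

    fee_rate is charged on the traded amount; slippage moves each fill
    price against the trader by that fraction.
    """
    coins = 0.0
    for price, signal in zip(prices, signals):
        if signal == "buy" and cash > 0:
            fill = price * (1 + slippage)            # pay slightly more
            coins = cash * (1 - fee_rate) / fill
            cash = 0.0
        elif signal == "sell" and coins > 0:
            fill = price * (1 - slippage)            # receive slightly less
            cash = coins * fill * (1 - fee_rate)
            coins = 0.0
    return cash + coins * prices[-1]


# A strategy that catches three 1% moves perfectly.
prices = [100.0, 101.0, 100.0, 101.0, 100.0, 101.0]
signals = ["buy", "sell", "buy", "sell", "buy", "sell"]

ideal = backtest(prices, signals)
realistic = backtest(prices, signals, fee_rate=0.005, slippage=0.001)
print(f"no costs:   {ideal:.2f}")
print(f"with costs: {realistic:.2f}")
```

With no costs the strategy compounds three 1% gains. With these assumed costs, each round trip loses more than the 1% it captures, so the same signals end below the starting balance.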

What I got out of it was worth far more than any trades. I learned how to structure an RL problem where the environment is real and hostile rather than a tidy simulator, how much representation learning matters before you ever get to the policy, and how to run an autonomous agent unattended on cheap hardware without it doing anything catastrophic. Those lessons came back around in my research, where evaluating agents under randomness and non-stationarity turned out to be the actual hard problem.
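One common way to keep an unattended agent from doing something catastrophic is to put hard limits between the policy and the exchange. The sketch below is an assumption about what such a guard could look like, not a description of the one in this project: `RiskGuard`, the 10% per-order cap, and the 20% drawdown kill switch are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RiskGuard:
    max_order_fraction: float = 0.1   # no single order above 10% of the portfolio
    max_drawdown: float = 0.2         # stop trading after a 20% fall from the peak
    peak_value: float = 0.0
    halted: bool = False

    def check(self, order_value, portfolio_value):
        """Return True if the order may be sent; False if it must be blocked."""
        self.peak_value = max(self.peak_value, portfolio_value)
        if (self.peak_value > 0
                and portfolio_value < self.peak_value * (1 - self.max_drawdown)):
            self.halted = True        # stays halted until a human resets it
        if self.halted:
            return False
        return order_value <= portfolio_value * self.max_order_fraction


guard = RiskGuard()
small_ok = guard.check(order_value=50.0, portfolio_value=1000.0)    # within limits
too_big = guard.check(order_value=200.0, portfolio_value=1000.0)    # over the cap
after_crash = guard.check(order_value=50.0, portfolio_value=790.0)  # trips the switch
still_off = guard.check(order_value=10.0, portfolio_value=1000.0)   # remains halted
```

The point of the design is that the guard does not trust the agent: once the drawdown limit trips, nothing the policy outputs can place another order.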

The code lives at cryptotrader.ai. I would not trade my own money with it. I’m very glad I built it.