Suppose you have trained an algorithm that predicts the stock market correctly in about 70% of all cases, and you would now like to start using it in real life. Chances are that you will not get what you expect! On every stock market, you have to pay fees for trading (both for buying and for selling stocks). In this article, I will explain how I mathematically minimized the risk for my Bitcoin stock prediction algorithm. My algorithm and risk minimization were successfully tested on a Bitcoin stock market called Kraken.

## Give me a Model!

We first need a model! In this model, some characteristics of the stock market are captured. Here, I will make several assumptions and simplifications to the problem:

### Fees

I assume that a fee is a percentage of the price. On Kraken, the maker fee is 0.16% and the taker fee is 0.26%. I take the total fee for a transaction to be 0.16% + 0.26% = 0.42% (= 0.0042), and I rounded this up to 0.005 for simplicity. Also notice that I took the worst-case scenario. On many occasions (for example, if you trade in large volumes), this fee will be lower. I call the fee amount $latex \beta$. If I spend an amount $latex A$ on the stock market, then I am left with $latex A \times (1 - \beta)$ after buying and selling the stocks, since I have to pay the fees.

#### Example

Suppose that the buying fees are 5% and the selling fees are 5%. Then, $latex \beta = 0.10$ (that is, the total transaction fees are 10%). If I make a transaction and invest €100,- (so $latex A=100$), I have to pay €10,- in fees. After the transaction, I am left with $latex A \times (1 - \beta) = 100 \times (1-0.10) = 90$, so with €90,-.
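The fee arithmetic can be written as a one-line helper. This is only a small sketch; the function name `after_fees` is my own choice for illustration and does not come from any trading library.

```python
def after_fees(amount: float, beta: float) -> float:
    """Return what is left of `amount` after paying the total fee fraction `beta`."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    return amount * (1.0 - beta)


print(after_fees(100.0, 0.10))   # the 10% example: about 90
print(after_fees(100.0, 0.005))  # the rounded Kraken fee: about 99.5
```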

### Price jumps

Suppose that $latex \gamma$ is the average jump between two consecutive prices, where consecutive prices are one hour apart. I assume that the stock market behaves like a random walk, which means that the gaps get bigger over longer time spans: day-to-day gaps are bigger than hour-to-hour gaps. Under the random-walk assumption, we know the following relation: $latex \gamma_1 = \gamma_0 \cdot \sqrt{T}$. Please, stay with me! I will explain it. Suppose $latex \gamma_0$ is our average hour-to-hour gap between prices and we want to know the average day-to-day gap $latex \gamma_1$. We can then simply take $latex T=24$ (since $latex 24$ hours fit in a day). Suppose the hour-to-hour gap is 25% = 0.25 (that is, a price increases or decreases on average by 25%). Then, $latex \gamma_1 = 0.25 \cdot \sqrt{24} \approx 1.22$. So, day-to-day prices differ by approximately 122%. This makes sense: the day-to-day gaps should be larger, since a day-to-day gap consists of many hour-to-hour gaps. For Kraken, I took hour-to-hour gaps and estimated $latex \gamma = 0.002$ (0.2%) by looking at actual gaps. That means that day-to-day gaps are just under 1% (as an exercise: try to compute this yourself); I use the rounded-down value of 0.9% in the rest of this article.
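The square-root scaling is easy to check in Python. The sketch below also shows one possible way to estimate $latex \gamma$ from a price series, namely as the mean absolute relative change between consecutive prices. That estimator is an assumption on my part for illustration, and both function names are hypothetical.

```python
import math
from typing import Sequence


def scale_gap(gamma_0: float, periods: int) -> float:
    """Scale an average gap over `periods` base intervals, assuming a random walk."""
    return gamma_0 * math.sqrt(periods)


def average_gap(prices: Sequence[float]) -> float:
    """Mean absolute relative change between consecutive prices (assumed estimator)."""
    changes = [abs(b - a) / a for a, b in zip(prices, prices[1:])]
    return sum(changes) / len(changes)


print(scale_gap(0.25, 24))   # the 25% illustration: about 1.22
print(scale_gap(0.002, 24))  # hourly gap of 0.2% scaled to one day: about 0.0098

# Made-up prices, purely to show the call; this is not real Kraken data.
print(average_gap([100.0, 100.2, 100.0, 100.1]))
```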

### Predictions

Suppose we have a classifier which can predict a future stock price with some accuracy $latex \alpha$. As a simple model, the classifier could predict whether a stock goes up or down, and we can then measure how well it performed on real data. For example, $latex \alpha = 0.7$ means that the classifier predicts the movement correctly in 70% of the cases (that is, 7 out of 10 predictions are correct).

### Capital

Let $latex X_t$ be the amount of money we have at time $latex t$. We start with $latex X_0$ and trade for $latex T$ timesteps. We are interested in the following question: will we end up with more money than we started with, that is, is $latex X_T > X_0$? We will try to answer this question with our model!

## The actual Model

In summary, $latex \beta$ is the total fee per trade, $latex \gamma$ is the average gap between two consecutive stock prices, and $latex \alpha$ is the accuracy of our classifier. $latex X_t$ is our capital at time $latex t$ and $latex T$ is the horizon or endpoint. With this, we can come up with the following model:

$latex X_{t+1} = \begin{cases}
X_t (1 - \beta)(1 + \gamma) & \mbox{with probability } \alpha\\
X_t (1 - \beta)(1 - \gamma) & \mbox{with probability } 1 - \alpha
\end{cases}$

So, what does this mean? $latex X_{t+1}$ is the capital at the next timestep and $latex X_t$ is the capital at the current timestep. Our goal is to compute the future amount at time $latex T$, that is, $latex X_T$. At each timestep, one of two scenarios can happen: either the classifier predicted correctly (this happens with probability $latex \alpha$), or it did not (with probability $latex 1 - \alpha$). In both cases, we have to pay the fees (this is the factor $latex 1 - \beta$). If we predicted the stock movement correctly, we gain the gap (the factor $latex 1 + \gamma$), since we moved up with the stock. Otherwise, we lose the gap (the factor $latex 1 - \gamma$).
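To get a feeling for the model, it can be simulated directly. The following is a minimal Monte Carlo sketch, not my original simulation code: the function names, the default of 1000 runs and the fixed seed are all choices made for this illustration.

```python
import random


def simulate(x0, alpha, beta, gamma, steps, rng):
    """Run the capital model for `steps` trades and return the final capital."""
    x = x0
    for _ in range(steps):
        if rng.random() < alpha:   # correct prediction: gain the gap
            x *= (1.0 - beta) * (1.0 + gamma)
        else:                      # wrong prediction: lose the gap
            x *= (1.0 - beta) * (1.0 - gamma)
    return x


def average_final_capital(x0, alpha, beta, gamma, steps, runs=1000, seed=0):
    """Average the final capital over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(simulate(x0, alpha, beta, gamma, steps, rng) for _ in range(runs))
    return total / runs


# Illustrative values: 70% accuracy, 0.5% fees, 0.2% hourly gaps, 24 hourly trades.
print(average_final_capital(1000.0, alpha=0.7, beta=0.005, gamma=0.002, steps=24))
```

Passing the random number generator into `simulate` keeps the runs reproducible for a given seed, which makes it easier to compare different values of $latex \alpha$, $latex \beta$ and $latex \gamma$.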

### Computing $latex \alpha$ for given $latex \gamma$ and $latex \beta$

Suppose we know the average gap size $latex \gamma$ and the fees $latex \beta$. Can we compute the minimum accuracy the classifier needs in order to make a profit? Yes, we can! I will leave out the mathematical details, but it boils down to the following minimum accuracy: $latex \alpha = \log_{\frac{1 + \gamma}{1 - \gamma}}\left(\frac{1}{(1-\beta)(1 - \gamma)}\right)$.
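One way to arrive at this formula is sketched here. It assumes that "break-even" means the expected logarithmic growth of the capital per timestep is zero, which is one reasonable reading of the details left out:

$latex \alpha \log\big((1-\beta)(1+\gamma)\big) + (1-\alpha) \log\big((1-\beta)(1-\gamma)\big) = 0$

Collecting the terms with $latex \alpha$ on the left gives

$latex \alpha \log\frac{1+\gamma}{1-\gamma} = -\log\big((1-\beta)(1-\gamma)\big)$

and dividing by $latex \log\frac{1+\gamma}{1-\gamma}$ yields

$latex \alpha = \frac{\log\frac{1}{(1-\beta)(1-\gamma)}}{\log\frac{1+\gamma}{1-\gamma}} = \log_{\frac{1+\gamma}{1-\gamma}}\left(\frac{1}{(1-\beta)(1-\gamma)}\right)$

Any accuracy above this value gives positive expected log-growth, and any accuracy below it gives negative expected log-growth.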

## Model applied on Kraken

I applied my model to Kraken. Here, $latex \beta$ is fixed since we know the transaction fees. Now we can plot $latex \gamma$ versus $latex \alpha$ and this yields the following result:
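The relation between $latex \gamma$ and $latex \alpha$ can also be tabulated numerically. The sketch below simply evaluates the minimum-accuracy formula for a few gap sizes; the function name `min_accuracy` and the chosen values of $latex \gamma$ are mine, picked for illustration.

```python
import math


def min_accuracy(gamma: float, beta: float) -> float:
    """Break-even accuracy from the formula; a value above 1 means no classifier can profit."""
    base = (1.0 + gamma) / (1.0 - gamma)
    return math.log(1.0 / ((1.0 - beta) * (1.0 - gamma)), base)


beta = 0.005  # the rounded Kraken fee used in this article
for gamma in (0.002, 0.005, 0.009, 0.02, 0.05):
    print(f"gamma={gamma:.3f}  alpha_min={min_accuracy(gamma, beta):.3f}")
```

A result larger than 1 is not a valid probability; it signals that the gap is too small to cover the fees, whatever the accuracy.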

In this graph, you can clearly see that $latex \alpha > 0.5$. This is to be expected: if the classifier classifies incorrectly in most of the cases, it is impossible to make a profit. For relatively small gaps ($latex \gamma < 0.01$), the classifier accuracy needs to be extremely high! There is even a certain $latex \gamma$ below which no classifier could make a profit. We can zoom in on this area and obtain the following graph:

Okay, so we need price gaps of at least about 0.005 (0.5%) in order to make any profit at all. Since we know that the hour-to-hour price gaps on Kraken are 0.002, we cannot make any profit on Kraken with an hour-to-hour classifier. However, we also know that the day-to-day gaps are approximately $latex \gamma=0.009$. For this $latex \gamma$, the formula gives a minimum prediction accuracy of approximately 78%, which is doable. In this way, we can make a profit. Suppose we take a classifier with 90% accuracy, 365 timesteps ($latex T=365$), the fees of Kraken ($latex \beta=0.005$) and the approximated day-to-day gaps of the Bitcoin price ($latex \gamma = 0.009$), and we invest €1000,- at the start ($latex X_0=1000$). Using $latex 1000$ simulations, the average $latex X_T$ then equals €2205,-! So we could more than double our initial capital if we have a good classification algorithm.
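The simulated average can be cross-checked against a closed-form value. Because the timesteps in the model are independent, the expected capital after $latex T$ steps is the starting capital times the expected per-step factor raised to the power $latex T$. The sketch below computes this; the function name is hypothetical.

```python
def expected_final_capital(x0, alpha, beta, gamma, steps):
    """Expected capital after `steps` independent trades under the model."""
    per_step = (1.0 - beta) * (alpha * (1.0 + gamma) + (1.0 - alpha) * (1.0 - gamma))
    return x0 * per_step ** steps


print(expected_final_capital(1000.0, alpha=0.90, beta=0.005, gamma=0.009, steps=365))
```

For these parameters the expected value comes out at roughly €2200,-, which is in line with the simulated average of €2205,-.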

## Conclusions

In the real world, the stock market parameters make it hard for classification algorithms to make a profit. A classifier should have an accuracy of nearly 80% in most cases in order to be valuable. If you have any questions or comments, feel free to post them below! By the way, if you are interested in making money, you should definitely read this article on getting rich using Python on Twitter for predicting the stock market. Also interesting is this mathematically flavoured article on optimizing a portfolio.