Crypto Price Predictions:
How Activation Functions Shape Neural Network Accuracy
Cryptocurrencies—most notably Bitcoin, Ethereum, and Litecoin—have gained worldwide attention for their dramatic price movements and potential to disrupt traditional financial systems. Unlike government-issued money, these digital assets rely on decentralized computer networks known as blockchains, where transactions are verified and recorded without central banks. While many people are drawn to cryptocurrencies in the hope of profiting from rapid price gains, the very volatility that can yield huge returns also poses a major challenge: prices can swing wildly in a matter of hours or days. This makes forecasting crypto prices both essential (for investors, traders, and researchers) and exceedingly difficult.
In this study, the authors examine how different activation functions in neural networks can improve the accuracy of crypto price predictions. An activation function is a mechanism inside a neural network that decides how much “signal” passes from one layer of the model to the next, shaping the final output. For instance, if a neural network is trying to predict tomorrow’s Bitcoin price, each “neuron” in the network processes part of the input data—like price history or trading volumes—and then applies an activation function to decide which information is most important. Three common activation functions are tested here:
- ReLU (Rectified Linear Unit): A simple and fast option, which outputs zero if the input is negative, and outputs that same input if it’s positive. Because it maps every negative input to zero, it effectively discards that part of the signal.
- Sigmoid: This function produces numbers between 0 and 1. It’s great for probabilities but can get “saturated” for extreme inputs (very large or very small), slowing the network’s learning.
- Tanh (Hyperbolic Tangent): Similar to Sigmoid, but outputs values between -1 and 1, making it easier in certain cases to learn patterns that cross zero.
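The paper itself contains no code, but the three functions above are simple enough to sketch directly in NumPy. The sample inputs below are purely illustrative:

```python
import numpy as np

def relu(x):
    # Outputs 0 for negative inputs, x otherwise — fast, but negative signal is lost
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes any input into (0, 1); saturates (near-zero gradient) for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes into (-1, 1); zero-centred, which often helps with data that crosses zero
    return np.tanh(x)

xs = np.array([-2.0, 0.0, 2.0])
print(relu(xs))     # [0. 0. 2.]
print(sigmoid(xs))  # roughly [0.119 0.5   0.881]
print(tanh(xs))     # roughly [-0.964 0.    0.964]
```

Note how ReLU flattens the negative input to zero while Tanh preserves its sign, which is one intuition behind Tanh's advantage on data that swings both up and down.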
By experimenting with these activation functions, the researchers aim to see whether one particular function (or multiple) can reduce the errors in price forecasts. They also explore a variety of neural network architectures. While all these architectures belong to the broad family of Recurrent Neural Networks (RNNs)—which are especially suitable for time-series data—they differ in how they manage information over time. For example, a simple RNN might have trouble remembering anything that happened more than a few steps ago. A GRU (Gated Recurrent Unit) or an LSTM (Long Short-Term Memory) network, on the other hand, uses extra “gates” or “cells” to decide which pieces of past information are relevant and which can be forgotten. This approach often helps them handle longer sequences, which could be crucial when markets move over days or weeks rather than minutes. The authors also test hybrid models like LSTM-GRU and RNN-LSTM, which combine features from different designs in an effort to harness the best of each.
To fully test these approaches, the study looks at three different periods between 2016 and 2022. Each period has distinct economic or market characteristics. One period (starting around 2018) was initially thought to be calmer, but still showcased unexpected volatility in crypto markets. Another period (2020) coincided with the global COVID-19 pandemic, which reshaped economic behavior and introduced big market uncertainties. The final period (2022) covered the early months of the Russia–Ukraine conflict, another global event likely to impact financial markets. The idea is that if a forecasting model can handle these diverse scenarios—ranging from relatively stable to high-stress conditions—it may be more reliable in practice.
During these tests, the authors analyze price data either as a single “univariate” time series (relying only on the closing price over the previous 50 days) or a richer “multivariate” format (adding related information like opening, highest, and lowest daily prices). Typically, adding more data can help the model understand context—such as if the price opened higher than it closed or if there was an especially large difference between the day’s highest and lowest values. Such details might signal growing price momentum or unusual market pressure.
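The 50-day sliding-window setup described above can be sketched as follows. The column layout (closing price first, then synthetic open/high/low columns) is an assumption for illustration, not the paper's exact preprocessing:

```python
import numpy as np

def make_windows(series, lookback=50):
    """Turn a (T, features) series into (samples, lookback, features) windows
    plus next-day closing-price targets. Closing price is assumed in column 0."""
    X, y = [], []
    for t in range(lookback, len(series)):
        X.append(series[t - lookback:t])  # the previous `lookback` days
        y.append(series[t, 0])            # target: the next closing price
    return np.array(X), np.array(y)

T = 200
close = np.cumsum(np.random.default_rng(1).normal(0, 1, T))  # toy price path

# Univariate: the model sees only closing prices
X_uni, y_uni = make_windows(close[:, None])

# Multivariate: add open/high/low as extra feature columns (synthetic here)
ohl = np.stack([close + 0.5, close + 1.0, close - 1.0], axis=1)
X_multi, y_multi = make_windows(np.column_stack([close, ohl]))

print(X_uni.shape, X_multi.shape)  # (150, 50, 1) (150, 50, 4)
```

The only difference between the two setups is the number of feature columns per day; the window length and the target stay the same.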
The study’s main findings can be summarized as follows. First, GRU models frequently emerged as the most accurate forecasters, while the simplest RNN structures tended to lag behind. This makes sense because GRUs have a built-in mechanism to handle longer-term dependencies more effectively than standard RNNs, yet are less complex than LSTMs. Interestingly, LSTM also performed well, but not necessarily better than GRU, especially when tested against the unpredictable shifts of the crypto market. Second, in nearly every test, multivariate models (using opening, highest, and lowest prices alongside closing prices) outperformed the univariate ones (which only tracked closing prices). This suggests that providing the network with more market context leads to better forecasts.
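Rankings like "GRU was most accurate" rest on forecast-error metrics. The summary above doesn't name the exact metrics used, so the two shown here (RMSE and MAPE) are standard choices offered as an assumption, with made-up numbers:

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean squared error: in price units, penalises large misses heavily
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    # Mean absolute percentage error: scale-free, comparable across coins
    # whose prices differ by orders of magnitude
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

actual = np.array([100.0, 110.0, 120.0])    # illustrative prices
forecast = np.array([102.0, 108.0, 121.0])  # illustrative model output
print(rmse(actual, forecast))  # ~1.73
print(mape(actual, forecast))  # ~1.55 (%)
```

Lower is better for both; a scale-free metric like MAPE is what makes models comparable across Bitcoin, Ethereum, and Litecoin despite their very different price levels.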
Another significant discovery was that Tanh often delivered the biggest improvements over the “default” ReLU function, particularly for models that initially did poorly, like the simple RNN. In other words, if an RNN was giving unsatisfactory results with ReLU, switching to Tanh sometimes boosted it to a much more competitive level. The Sigmoid function helped in a few cases, but not as consistently as Tanh.
It’s also worth noting that the level of market volatility at different times affected how well these models did. The authors observed that while COVID-19 in 2020 disrupted global economies, the crypto market itself was less extreme then than in 2018 or 2022, leading to smaller forecast errors overall. The more turbulent 2018 and 2022 periods, by contrast, produced larger errors—showing how even advanced methods can struggle when markets behave in unexpected ways.
In terms of implications, the research underlines a critical insight: the choice of activation function is not just a minor technical detail but can significantly influence the accuracy of predictions. Traders or analysts often spend considerable effort tweaking things like the number of layers or neurons in a model, yet they might overlook how using Tanh instead of ReLU could yield a notable improvement. This is particularly true when dealing with chaotic data, like daily crypto price movements, that may contain abrupt ups and downs and no clear pattern for extended stretches.
The authors also point out that hybrid models can be promising in certain contexts, though no single blueprint emerges as best under all conditions. Sometimes an LSTM-GRU hybrid might shine, and at other times, a straightforward GRU might suffice. This indicates that researchers and practitioners should remain flexible—testing multiple approaches rather than relying on a single “best” practice.
Looking ahead, the authors propose expanding these experiments to include a wider variety of cryptocurrencies beyond the big three (Bitcoin, Ethereum, and Litecoin). There are hundreds, if not thousands, of smaller coins that see highly erratic price behaviors. It would be worthwhile to see if Tanh still gives the same advantage or if another activation function might excel in smaller, less-liquid markets. They also recommend layering in additional data sources, such as economic indicators (inflation rates or interest rates) and even social-media sentiment, since crypto prices are frequently influenced by news, rumors, and online discussions. Lastly, studying periods of extreme market upheaval—like regulatory crackdowns or sudden worldwide events—could reveal which models truly hold up under chaos.
In conclusion, this study shows that activation functions can have a larger effect on crypto price forecasting than many might suspect. It also demonstrates how combining thoughtful architecture choices (like GRU or LSTM) with an appropriate activation function (often Tanh) can result in markedly better predictions. By refining these tools, both professionals and everyday enthusiasts can make more informed decisions about buying or selling digital assets, thereby navigating the crypto space with greater confidence and less risk. Although uncertainty is a defining feature of cryptocurrencies, careful tuning of neural networks can help tame some of this unpredictability—and potentially unlock new opportunities in an ever-evolving market.
Vancsura, L., Tatay, T., & Bareith, T. (2024). Investigating the Role of Activation Functions in Predicting the Price of Cryptocurrencies during Critical Economic Periods. Virtual Economics, 7(4), 64–91.
https://doi.org/10.34021/ve.2024.07.04(4)