The opening and closing values correspond to the prices of the first and last trade, respectively. High and low are the maximum and minimum prices of all trades in the candle (they may coincide with the open and close). Finally, volume is the sum of all assets traded (for example, in an ETH-USD pair, volume is measured as the number of Ethereum traded during the candle). By convention, when a candle closes at a higher price than it opened, we paint it green (or leave it blank); if the closing price is lower than the opening price, we paint it red (or fill it with black).
Here is a very simple yet fast Python implementation for creating tick candles:
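A minimal sketch of the idea, assuming trades arrive as a chronologically ordered list of (price, amount) tuples. The function name `tick_bars`, the parameter `ticks_per_bar` and the sample trades are illustrative assumptions, and plain Python is used here so the example runs without extra libraries:

```python
def tick_bars(trades, ticks_per_bar):
    """Group (price, amount) trades into OHLCV bars of `ticks_per_bar` trades each."""
    bars = []
    # Walk through the trades in fixed-size chunks. An incomplete final
    # chunk is dropped, because that bar has not closed yet.
    for start in range(0, len(trades) - ticks_per_bar + 1, ticks_per_bar):
        chunk = trades[start:start + ticks_per_bar]
        prices = [price for price, _ in chunk]
        bars.append({
            "open": prices[0],      # price of the first trade in the bar
            "high": max(prices),    # maximum trade price in the bar
            "low": min(prices),     # minimum trade price in the bar
            "close": prices[-1],    # price of the last trade in the bar
            "volume": sum(amount for _, amount in chunk),
        })
    return bars


# Hypothetical trades: (price, amount) in chronological order.
trades = [(100.0, 0.5), (101.0, 1.0), (99.0, 2.0), (100.5, 0.25),
          (102.0, 1.0), (101.5, 0.75), (103.0, 0.5)]
bars = tick_bars(trades, ticks_per_bar=3)
```

With seven trades and three ticks per bar, this produces two complete bars; the seventh trade waits for its bar to fill.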
And here is a visual of how tick bars look compared to standard time-based candlesticks. In this case, we show the 4-hour and 1000-tick bars for the BTC-USD trading pair, as well as all the trade prices between 21–01–2020 and 02–20–2020. Note that, for the candlesticks, we show an asterisk each time we sample a bar.
Two main observations about these plots:
Yes, tick candlesticks look ugly: chaotic, overlapping, and difficult to read. But remember that they are not meant to be human-friendly; they need to be machine-friendly.
The main reason they are ugly is that they do their job very well. Look at the asterisks: see how more asterisks (and more bars) are packed together during periods of large price changes? And vice versa: when the price doesn't change much, tick bar sampling is much sparser. Essentially, we are creating a system that synchronizes the sampling of candlesticks with the arrival of information in the market (higher activity and price volatility). In the end, we sample more during periods of high activity and less during periods of low activity. Hurray!
What about their statistical properties? Are they better than those of their traditional time-based counterparts?
We will look at two different properties, (1) serial correlation and (2) normality of returns, for each of the 15 cryptocurrency pairs offered on cryptodatum.io, using all the historical bars from the Bitfinex exchange and each of the following time-based and tick candlestick sizes:
Time-based bar sizes: 1 minute, 5 minutes, 15 minutes, 30 minutes, 1 hour, 4 hours, 12 hours, 1 day.
Tick bar sizes: 50, 100, 200, 500, 1000.
3.1 - Serial correlation
Serial correlation measures how much each value of a time series is correlated with the value that follows it (lag = 1), or more generally how much any value i is correlated with the value i + n (lag = n). In our case, we will calculate the serial correlation of the log returns, computed as the first difference of the log of the candlestick closing prices.
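As a rough illustration of that calculation, here is a sketch that turns closing prices into log returns and computes the Pearson correlation between the return series and a lagged copy of itself. The helper names and the toy prices are assumptions made for the example:

```python
import math


def log_returns(closes):
    """First difference of the log of the closing prices."""
    return [math.log(later / earlier) for earlier, later in zip(closes, closes[1:])]


def autocorrelation(values, lag=1):
    """Pearson correlation between the series and itself shifted by `lag` (lag >= 1)."""
    x = values[:-lag]   # values at positions i
    y = values[lag:]    # values at positions i + lag
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy closing prices that zig-zag, so each return is followed by one of
# opposite sign: a textbook case of negative lag-1 autocorrelation.
closes = [100.0, 110.0, 100.0, 110.0, 100.0, 110.0]
returns = log_returns(closes)
lag1 = autocorrelation(returns, lag=1)
lag2 = autocorrelation(returns, lag=2)
```

For this zig-zag series the lag-1 value sits at the negative extreme and the lag-2 value at the positive extreme; real return series land somewhere in between, and values near 0 indicate little serial correlation.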
It turns out that tick bars (labeled tick-*) generally show lower autocorrelation than time-based candlesticks (labeled time-*); that is, their Pearson autocorrelation is closer to 0. The difference is less pronounced for the larger time bars (4h, 12h, 1d), but interestingly even the smallest tick bars (50-tick and 100-tick) yield a very low autocorrelation, which is not the case for the smallest time bars (1 minute, 5 minutes).
Finally, it is interesting to see that several cryptocurrencies (BTC, LTC, ZEC, and ZIL) exhibit quite strong negative autocorrelation in several of their time bars. Roberto Pedace comments on negative autocorrelation here:
Autocorrelation, also known as serial correlation, can exist in the regression model when the order of observations in the data is relevant or important. In other words, autocorrelation is a concern with time-series (and sometimes panel or longitudinal) data. […] No autocorrelation implies a situation where there is no definable relationship between the values of the error term. […] Although not likely, negative autocorrelation is also possible. Negative autocorrelation occurs when an error of a given sign tends to be followed by an error of the opposite sign. For example, positive errors are usually followed by negative errors, and negative errors are usually followed by positive errors.
We will perform an additional statistical test, the Durbin-Watson (DW) test, which also diagnoses the presence of serial correlation. The DW statistic ranges from 0 to 4 and is interpreted as follows:
DW statistic << 2: positive serial correlation
DW statistic ~ 2: no first-order correlation
DW statistic >> 2: negative serial correlation
Essentially, the closer the statistic is to 2, the lower the serial correlation. Here are the results:
The results are in line with the Pearson autocorrelation test, which strengthens the narrative that tick bars exhibit slightly lower autocorrelation than time-based candlesticks.
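For reference, the Durbin-Watson statistic is simple to compute by hand: it is the sum of squared successive differences divided by the sum of squares. The sketch below applies it directly to a series treated as residuals, which assumes the series is roughly zero-mean; the function name and the toy series are illustrative:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    numerator = sum((later - earlier) ** 2
                    for earlier, later in zip(residuals, residuals[1:]))
    denominator = sum(value ** 2 for value in residuals)
    return numerator / denominator


# Each value flips sign relative to the previous one: negative serial correlation.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# Values tend to keep the sign of the previous one: positive serial correlation.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
```

The alternating series yields a statistic well above 2 and the persistent one a statistic well below 2, matching the interpretation given earlier.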
3.2 - Normality of returns
Another property we can look at is the normality of returns, that is, whether the distribution of our log returns follows a normal (aka Gaussian) distribution.
There are several tests we can run to check for normality; we will perform two of them: the Jarque-Bera test, which checks whether the skewness and kurtosis of the data match those of a normal distribution, and the Shapiro-Wilk test, one of the classic tests of whether a sample follows a Gaussian distribution.
In both cases, the null hypothesis is that the sample follows a normal distribution. If the null hypothesis is rejected (p-value lower than the significance level, usually 0.05), there is convincing evidence that the sample does not follow a normal distribution.
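To make the decision rule concrete, here is a sketch of the Jarque-Bera test written from its textbook definition, JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. Under the null hypothesis JB is asymptotically chi-squared with 2 degrees of freedom, whose upper-tail probability is exp(−JB/2). The function name and both toy samples are assumptions for illustration; in practice, `scipy.stats.jarque_bera` and `scipy.stats.shapiro` provide both tests ready-made.

```python
import math
from statistics import NormalDist


def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments (biased, i.e. divided by n).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-squared distribution with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# A deterministic, bell-shaped sample: evenly spaced normal quantiles.
n = 200
bell_shaped = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
# A fat-tailed sample: many small returns plus two large outliers.
fat_tailed = [0.01, -0.01] * 100 + [0.2, -0.2]

_, p_bell = jarque_bera(bell_shaped)
_, p_fat = jarque_bera(fat_tailed)

significance = 0.05
reject_bell = p_bell < significance   # normality not rejected
reject_fat = p_fat < significance     # normality rejected
```

Note that failing to reject the null hypothesis is not proof of normality; with few observations the test simply has little power.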
Let's first look at the p-values for Jarque-Bera:
The results are almost unanimous: log returns do not follow a Gaussian distribution (most p-values < 0.05). If we set our significance level to 0.05, two cryptocurrency pairs (Stellar and Zilliqa) actually seem to follow a Gaussian. Let's take a look at their distributions (kernel density estimates):
Fair enough, some of them may look Gaussian (at least visually). Note, however, that the number of samples (n) is very small (e.g. XLM-USD candle_tick_1000 has n = 195), so I suspect that one of the reasons may be the lack of samples, which does not give Jarque-Bera enough evidence to reject the null hypothesis of normality.
In fact, a quick glance at the CryptoDatum.io database shows that the trading pairs XLM-USD and ZIL-USD were launched only last year, in May and July 2018 respectively, and appear to have fairly low volume.
Mystery solved? :)
Now let's run the Shapiro-Wilk test to see if it fits with the previous results:
Damn, Shapiro, didn't they teach you at school not to copy during an exam? It looks like non-normality of returns is the rule, regardless of the type of bar.



