The Accuracy of the Tick Rule in the Bitcoin Market

The tick rule is one of the most popular trade classification algorithms used when the order initiator is not signed in market data. Using 11.9 million trades of Bitcoin/USD on Bitstamp, this article tests the accuracy of the tick rule in the Bitcoin market. The evidence indicates that the overall success rate of the tick rule is 76.87%. It is also shown that the tick rule tends to fail in discerning trade intentions when a long period of time elapses between trades. Furthermore, order imbalances computed using the tick rule lack sufficient accuracy in the Bitcoin market.

Recent research has examined the microstructure of the Bitcoin market from the viewpoint of order flow or order imbalance using historical trade data. For instance, Dimpfl (2017) inferred Bitcoin trade directions from tick data downloaded from bitcoincharts.com. Feng et al. (2018) employed normalized order imbalance to measure informed trading in the Bitcoin market using microstructure data from the same source. Wang et al. (2020) studied the impact of informed trading, indexed by order imbalance, on Bitcoin return and volatility using data from the same source as Dimpfl (2017) and Feng et al. (2018). Ibikunle et al. (2020) also utilized order imbalance in research on Bitcoin price discovery based on historical trade data from Bitstamp. However, microstructure data with a signed initiator are not always available, so trade directions must be assigned by classification algorithms when the initiator is not indicated. To the best of our knowledge, the tick rule has been applied to classify historical Bitcoin trade data in Dimpfl (2017), Feng et al. (2018), Wang et al. (2020), and Ibikunle et al. (2020), yet the classification accuracy of the tick rule in the Bitcoin market, which consists of multiple online crypto exchanges, remains unknown.
This study investigates the accuracy of the tick rule in the Bitcoin market. The tick-level Bitcoin data commonly used in research on the Bitcoin market microstructure, freely downloadable from bitcoincharts.com, do not indicate the initiator. However, the market data of Bitcoin/US Dollar (USD) with a trade indicator from Kaiko provide an opportunity to test the accuracy of the tick rule in the Bitcoin market. This study focuses on three main questions. First, what is the classification success rate of the tick rule for Bitcoin market data? Second, what factors are associated with the classification accuracy of the tick rule in the Bitcoin market? Third, how accurate are the order imbalances based on the classification results of the tick rule in the Bitcoin market?
The primary results of this study are as follows. First, classification of 11.9 million Bitcoin/USD trades on Bitstamp suggests that the overall classification accuracy of the tick rule is 76.87%, which is close to that observed in stock markets, and daily accuracy ranges from 68.98% to 83.76% over the sample period. Second, this study finds that the likelihood of misclassification increases with the time elapsed between trades. As information spillover exists across Bitcoin exchanges (Brandvold et al., 2015), longer time gaps allow more information to arrive and drive price changes. Furthermore, all of the order imbalances computed using the tick rule are significantly different from the true ones, although the biases of order imbalances calculated using large-size trades are relatively smaller than those using the whole sample.
The empirical findings of this study contribute to the small but growing literature on the Bitcoin market microstructure in terms of research methodology. To the best of our knowledge, this is the first study to assess the classification accuracy of the tick rule in the Bitcoin market. Tick-based classification methods have recently been used in research on the Bitcoin market microstructure (Dimpfl, 2017; Feng et al., 2018; Ibikunle et al., 2020; Wang et al., 2020) without knowledge of the classification accuracy of the tick rule. The conclusions from this study can thus serve as a guide for future methodological choices in research on the Bitcoin market microstructure when the transaction direction is not available.
This study also contributes to the existing literature on trade classification algorithms. Except for Aktas and Kryzanowski (2014) and Carrion and Kolay (2020), the studies reviewed below examine the classification accuracy of the tick rule in slower trading environments, that is, before high-frequency trading became widespread. Using tick data stamped to seconds from the Bitcoin market, this article confirms the finding of Carrion and Kolay (2020) that the classification accuracy of the tick rule in fast trading environments is similar to that in slower ones.
The remainder of this article is organized as follows. "Literature Review" section briefly reviews the literature on the tick rule. "Classification Accuracy" section reports on the empirical analysis of the classification accuracy of the tick rule. "Order Imbalance" section presents the biases of order imbalances based on trade directions assigned by the tick rule. "Discussion" section discusses the empirical findings of this study. Finally, "Conclusions" section concludes the article.

Literature Review
Trade classification approaches are commonly used to discern trade intentions when no trade direction is available in market data. Popular classification methods include the tick rule (or tick test), the quote rule, the Lee-Ready algorithm proposed by Lee and Ready (1991), and bulk volume classification (Easley et al., 2012, 2016). The tick rule assigns trade direction based on price movements when no quote data are available. If the transaction price is higher than that of the previous transaction (uptick), the present transaction is classified as a buyer-initiated order. Conversely, if the transaction price is below the previous price (downtick), the present transaction is classified as a seller-initiated order. If there is no price change (zerotick), the transaction is assigned the same direction as the previous one. The quote rule, in contrast, classifies a trade as a buyer-initiated (seller-initiated) order if the price is above (below) the midpoint of the bid and the ask. The Lee-Ready algorithm (Lee & Ready, 1991) combines the two aforementioned rules: for trades executed at the midpoint, which the quote rule leaves unclassified, it applies the tick rule to discern direction. Finally, bulk volume classification (Easley et al., 2012, 2016) uses the empirical distribution of price changes to infer the probability of buyer-initiated (seller-initiated) volume from the aggregate volume of each bar. As transaction direction is not always available in market microstructure research, classification algorithms are essential for inferring trade intentions.
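As a concrete illustration, the tick rule described above can be sketched in a few lines of Python. This is a minimal sketch: the function name and the +1/−1 sign convention are ours, not from the original study, and the first trade (and any leading zeroticks) is left unclassified.

```python
from typing import List, Optional

def tick_rule(prices: List[float]) -> List[Optional[int]]:
    """Classify each trade as buyer-initiated (+1) or seller-initiated (-1)
    using the tick rule. An uptick is a buy, a downtick is a sell, and a
    zerotick inherits the previous trade's classification. Trades before
    the first price change cannot be classified and are returned as None."""
    signs: List[Optional[int]] = []
    last_sign: Optional[int] = None
    last_price: Optional[float] = None
    for p in prices:
        if last_price is None or p == last_price:
            sign = last_sign  # zerotick: reuse previous direction (None at start)
        elif p > last_price:
            sign = 1          # uptick: buyer-initiated
        else:
            sign = -1         # downtick: seller-initiated
        signs.append(sign)
        last_price, last_sign = p, sign
    return signs
```

For example, `tick_rule([10, 11, 11, 10.5, 10.5])` returns `[None, 1, 1, -1, -1]`: the first trade is unclassifiable, the second is an uptick buy, the third is a zerotick inheriting the buy, and the last two are a downtick sell and its zerotick.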
The tick rule is usually used in research related to market microstructure when only trade data are available. For example, Barber et al. (2009) employed the tick rule to identify directions of partial trades. Bernile et al. (2016) also used the tick rule to compute order imbalance, which measures informed trading activity ahead of the Federal Open Market Committee's policy announcements. Dimpfl (2017), Feng et al. (2018), Wang et al. (2020), and Ibikunle et al. (2020) used the tick rule to assign directions to measure informed trading activity in the Bitcoin market.
Although the tick rule has been used in existing studies, the accuracy of the resulting classifications is a cause for concern. Theissen (2001) noted that research related to market microstructure models can be systematically biased by inaccurate trade classification. The accuracy of the tick rule has therefore been examined extensively in stock markets. Using TORQ data for the U.S. stock market, Odders-White (2000) reported that 78.6% of transactions were correctly classified by the tick rule. Tests conducted by Finucane (2000) using information from the TORQ database from November 1990 to January 1991 showed that the classification accuracy of the tick rule applied to a sample of 144 NYSE firms was 83.0%. Ellis et al. (2000) utilized NASDAQ data from September 27, 1996, to September 29, 1997, and showed that the success rate of the tick rule was 77.66%. Tests by Chakrabarty et al. (2007) on NASDAQ stocks traded on INET and ArcaEx revealed an overall success rate of 75.4% for the tick rule during their sample period. A recent examination by Carrion and Kolay (2020) showed a classification success rate of 78.62% for the tick rule in a sample of data stamped to seconds from the NASDAQ HFT database over a subset of dates during 2008 to 2010, when trades were more frequent than before; across individual stocks, the success rate ranged from 69.75% to 83.34%. Similar studies have also examined non-U.S. stock markets. Aitken and Frino (1996) studied 2 years of trades on the Australian Stock Exchange and reported a success rate of 74% for the tick rule. Using 15 stocks on the Frankfurt Stock Exchange in 1996, Theissen (2001) documented that the tick rule correctly classified 72.2% of transactions. An examination of classification algorithms on data from the Taiwan Stock Exchange by Lu and Wei (2009) revealed an overall success rate of 74.18% for the tick rule.
Aktas and Kryzanowski (2014) examined the accuracy of different classification algorithms using data on the component firms of the BIST-30 index and found that the classification success rate of the tick rule ranged from 84.86% to 92.15% across subsamples. In addition, Omrane and Welch (2016) found a classification success rate of about 68% for the tick rule on 1.2 million trades in a foreign exchange electronic communication network market. To the best of the authors' knowledge, no previous research has examined the accuracy of trade classification algorithms in cryptocurrency markets.

Data and Methodology
Historical market data are required to test the accuracy of the tick rule in the Bitcoin market. Owing to decentralization and a lack of regulation, Bitcoin is traded simultaneously and continuously (24/7) on multiple online crypto exchanges. To examine the classification accuracy of the tick rule, this study uses the market data of Bitstamp, an order-driven online crypto exchange chosen as representative for its liquidity. Founded in Europe in 2011, Bitstamp is one of the oldest and largest global crypto exchanges that allows trading between Bitcoin and USD. The tick-by-tick trade data of Bitstamp were acquired from Kaiko, a cryptocurrency market data provider. Every transaction record includes a unique trade ID, timestamp, transaction price in USD, amount in Bitcoin, and a trade direction indicator. This study includes classifiable trades from December 6, 2017, to October 7, 2018 (Greenwich Mean Time), for a total of 11,919,298 observations. Since all trades are stamped to seconds, some trades share the same timestamp because of fast trading; this study therefore uses the trade ID to determine the order of such trades.
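The tie-breaking step described above, ordering same-second trades by their unique trade ID and then computing the time from the previous trade, can be sketched with pandas. The column names below are illustrative stand-ins, not Kaiko's actual field names.

```python
import pandas as pd

# Hypothetical column names; a real Kaiko export labels these fields differently.
trades = pd.DataFrame({
    "trade_id":  [101, 103, 102, 104],
    "timestamp": pd.to_datetime(
        ["2018-01-01 00:00:01", "2018-01-01 00:00:02",
         "2018-01-01 00:00:02", "2018-01-01 00:00:05"]),
    "price":     [13500.0, 13501.0, 13499.5, 13499.5],
})

# Trades stamped to the same second are ordered by their unique trade ID.
trades = trades.sort_values("trade_id").reset_index(drop=True)

# Time from previous trade in seconds; trades at the same timestamp get 0.
trades["gap_s"] = trades["timestamp"].diff().dt.total_seconds().fillna(0)
```

In this toy sample, trades 102 and 103 share the 00:00:02 timestamp, so the trade ID determines their order and the second of the pair receives a time gap of zero.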
The classification of a specific tick i by the tick rule can be summarized as

    Trade_i = buyer-initiated,   if P_i > P_{i-1};
    Trade_i = seller-initiated,  if P_i < P_{i-1};
    Trade_i = Trade_{i-1},       if P_i = P_{i-1},

where Trade_i is the estimated initiator of tick i, and P_i is the executed price of tick i. As defined in the previous literature, the classification accuracy R_T during a specific period T is

    R_T = n / N,

where N is the total number of transactions, and n is the number of correctly classified transactions.

Table 1 reports the classification accuracy of the tick rule for Bitcoin transaction directions. According to the Bitstamp market data, true seller-initiated (buyer-initiated) orders account for 41.50% (58.50%) of all trades, whereas 45.84% (54.16%) of all trades are classified as seller-initiated (buyer-initiated) orders by the tick rule. The tick rule wrongly classifies 9.40% of all trades as buyer-initiated orders and 13.74% of all trades as seller-initiated orders. Thus, the overall classification success rate of the tick rule on the Bitstamp market data during the sample period is 76.87%.

Figure 1 shows the daily change in the misclassification rate. The misclassification rate of the tick rule varies over time, ranging from 16.24% to 31.02%, meaning that the daily success rate of the tick rule on the Bitcoin/USD transaction data of Bitstamp ranges from 68.98% to 83.76%.

Table 2 displays the misclassification rate for each of the three tick types. The misclassification rates for upticks, downticks, and zeroticks are 16.92%, 26.76%, and 25.28%, respectively, indicating that the classification success rate of the tick rule for upticks is much higher than for the other tick types. In addition, the misclassification of zeroticks contributes the most to the total errors. Figure 2 displays the daily proportion of seller-initiated orders.
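Given true and estimated directions, the overall accuracy R_T = n/N and the per-tick-type misclassification rates of the kind reported in Tables 1 and 2 can be computed as in the following sketch. It assumes +1/−1 direction codes and skips trades the tick rule leaves unclassified; the function name is ours.

```python
def accuracy_by_tick_type(prices, true_signs, est_signs):
    """Overall success rate R_T = n/N plus the misclassification rate for
    each tick type (uptick / downtick / zerotick), skipping trades the
    tick rule leaves unclassified (est_signs entries of None)."""
    stats = {"uptick": [0, 0], "downtick": [0, 0], "zerotick": [0, 0]}  # [errors, total]
    correct = total = 0
    for i in range(1, len(prices)):
        if est_signs[i] is None:
            continue
        if prices[i] > prices[i - 1]:
            kind = "uptick"
        elif prices[i] < prices[i - 1]:
            kind = "downtick"
        else:
            kind = "zerotick"
        hit = est_signs[i] == true_signs[i]
        stats[kind][1] += 1
        stats[kind][0] += 0 if hit else 1
        total += 1
        correct += hit
    overall = correct / total if total else float("nan")
    return overall, {k: (e / t if t else float("nan")) for k, (e, t) in stats.items()}
```

On the Bitstamp sample this kind of tabulation yields the 76.87% overall rate and the uptick/downtick/zerotick breakdown discussed above.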
Although the statistics in Table 1 show that misclassified seller-initiated orders are fewer than those from the buyer side, Plot A in Figure 2 illustrates that the proportion of seller-initiated orders in misclassified trades, defined as the number of intraday misclassified seller-initiated orders divided by the number of intraday misclassified samples, changes from day to day; moreover, this value is not always less than that of the buyer-initiated orders. Plot B presents the proportion of seller-initiated orders in intraday trades, defined as the number of intraday seller-initiated orders divided by the number of intraday trades. The Pearson (Spearman) correlation coefficient between the two time series is 0.76 (0.73). Accordingly, the misclassification rate of seller-initiated trades is positively associated with the proportion of seller-initiated trades in the sample.

Table 3 reports the classification accuracy of the tick rule during two subperiods to examine whether Bitcoin market conditions affect the accuracy of the classification algorithm. Panel A displays the classification accuracy during the first subperiod, from December 7, 2017, through March 31, 2018, which covered a strongly bullish market in December 2017 and a subsequent crash in the Bitcoin price in January 2018. The tick rule wrongly classifies 9.59% of all trades as buyer-initiated orders and 13.57% of all trades as seller-initiated orders; that is, its classification accuracy is 76.84% during the first subperiod. Panel B displays the classification accuracy during the second subperiod, from April 1, 2018, to October 7, 2018, during which the Bitcoin price was relatively stable. The classification accuracy of the tick rule is a similar 76.90% during the second subperiod. Hence, Bitcoin market conditions do not significantly affect the classification accuracy of the tick rule on the whole.

Multivariate Analysis
To analyze the variables associated with misclassification, this study draws on Ellis et al. (2000) and examines the following four variables: true trade direction, time from previous trade, trade size in Bitcoin, and price in USD. As trades are stamped to seconds, the time gaps of trades occurring at the same timestamp are set to zero. Table 4 reports the distribution of the misclassification rate of the tick rule in different subsamples; Panel A divides all trades into four groups according to the length of time from the previous trade. Because these univariate sorts are based on the same data, however, correlations may exist among the variables, so further analysis of the relationship between misclassification and time from previous trade, trade size, and price level is needed.

Note (Table 3). This table reports the sample size of seller- and buyer-initiated transactions during each subperiod. The first column shows the number of seller-initiated transactions, including the number of ticks correctly classified as seller-initiated transactions and the number of ticks misclassified as buyer-initiated transactions. The second column presents the number of buyer-initiated transactions, including the number of ticks misclassified as seller-initiated transactions and the number of ticks correctly classified as buyer-initiated transactions. The numbers in parentheses indicate the corresponding proportions of the total sample during each subperiod.

Table 5 reports the results of multivariate regressions, including ordinary least squares (OLS) and logistic regressions. Multivariate regressions on all trades indicate that seller-initiated order, amount, and price are negatively associated with the likelihood of misclassification, while time from previous trade is positively correlated with it. However, the regression results for the subsamples are not fully in line with those for the whole sample.
The results show that when the time from the previous trade is no longer than 5 s, the likelihood of misclassification is positively associated with seller-initiated order and time from the previous trade, and negatively associated with trade size and price level. When trade size is no more than 0.1 Bitcoin, the likelihood of misclassification increases with all four independent variables. However, for an executed price of no more than 10,000 USD, the likelihood of misclassification decreases with price. In general, the likelihood of misclassification is positively associated with the time between trades in all regressions.

Note (Table 4). This table reports sample sizes and misclassified tick numbers in different subsamples. The first two columns contain subsample sizes and proportions of the total sample, whereas the last two columns provide misclassified tick numbers and proportions of the corresponding subsample.

Note (Table 5). This table reports the regression estimates of the relationship between misclassification and the independent variables, namely true trade direction, time from previous trade, amount in Bitcoin, and Bitcoin price in USD. The dependent variable, misclassification, is a discrete variable that equals one when the transaction direction is erroneously classified and zero otherwise. The last row reports the observations in the regressions: 11,919,298; 10,737,962; 8,198,688; and 7,215,806. As a robustness check, Probit regressions were run for each group and give similar results. *, **, and *** denote statistical significance at the 5%, 1%, and 0.1% levels, respectively.
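To illustrate the form of these regressions, the following sketch fits a logistic regression by gradient ascent in NumPy on synthetic data in which misclassification becomes more likely as the time gap grows. It is a toy reconstruction under assumed data-generating numbers, not the paper's estimation (the paper reports OLS, logistic, and Probit results on the actual Bitstamp data).

```python
import numpy as np

def fit_logit(X, y, iters=500, lr=1.0):
    """Fit a logistic regression by plain gradient ascent (minimal sketch).
    Expects standardized features; prepends an intercept column."""
    X = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted misclassification prob.
        w += lr * X.T @ (y - p) / len(y)   # gradient of the log-likelihood
    return w

rng = np.random.default_rng(0)
n = 20_000
gap = rng.exponential(5.0, n)    # time from previous trade (s), synthetic
size = rng.exponential(0.5, n)   # amount in Bitcoin, synthetic

# Simulate the paper's headline finding: longer gaps raise the odds of error.
p_true = 1.0 / (1.0 + np.exp(-(-1.5 + 0.15 * gap)))
miscls = (rng.random(n) < p_true).astype(float)

X = np.column_stack([gap, size])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize for stable ascent
w = fit_logit(X, miscls)
# w[1] is the (standardized) time-gap coefficient; it comes out positive,
# while w[2], the trade-size coefficient, stays near zero by construction.
```

The recovered time-gap coefficient is positive, mirroring the sign pattern in Table 5; the magnitudes here are artifacts of the synthetic setup.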

Order Imbalance
Order imbalance is usually employed as a measure of informed trading activity that cannot be observed directly.
Order imbalance here is defined as (B − S)/(B + S), where B (S) denotes the buyer-initiated (seller-initiated) quantity of interest. This study examines order imbalances (Barber & Odean, 2008; Bernile et al., 2016; Feng et al., 2018; Ibikunle et al., 2020; Ning & Tse, 2009; Sun & Ibikunle, 2017; Wang et al., 2020) estimated using the tick rule, namely order imbalance based on number of trades (OIN), trade size (OIS), and volume in USD (OID), using market data from Bitstamp. In addition, large-size orders are usually considered to be related to informed trading, because informed traders are prone to using block trades to cut down transaction costs. Following Feng et al. (2018) and Wang et al. (2020), order imbalances computed from large-size orders, defined as trades larger than the 95th percentile of intraday trade sizes, are also compared for robustness.

Figures 3 to 5 present daily order imbalances measured by OIN, OIS, and OID, respectively. Plots A and C in Figure 3 show true OINs, calculated with the true trade directions, and OINs calculated with directions assigned by the tick rule, respectively. Plot E in Figure 3 shows the bias of the estimated OIN, defined as the difference between the true OIN and the OIN estimated using the tick rule. Plots B, D, and F in Figure 3 apply the same method to large-size trades. The two lines in Plots E and F indicate the values of −0.1 and 0.1. The proportions of OIN biases whose absolute value exceeds 0.1 are 148/306 ≈ 48.37% in Plot E and 74/306 ≈ 24.18% in Plot F. Figures 4 and 5 report the daily order imbalances measured by OIS and OID, respectively, using the same method; the proportions of biases whose absolute value exceeds 0.1 are 70/306 ≈ 22.88% (OIS) and 73/306 ≈ 23.86% (OIS95) in Figure 4, and 69/306 ≈ 22.55% (OID) and 72/306 ≈ 23.53% (OID95) in Figure 5. In general, all these order imbalance measures are biased to a certain degree.
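The three measures can be computed directly from signed trades. The sketch below (function names are ours) implements (B − S)/(B + S) weighted by trade count, Bitcoin amount, and USD volume, together with the 95th-percentile large-size filter.

```python
import numpy as np

def order_imbalances(signs, sizes, prices):
    """Compute OIN, OIS, and OID as (B - S) / (B + S), where B and S
    aggregate buyer- (+1) and seller-initiated (-1) trades by count,
    Bitcoin amount, and USD volume, respectively."""
    signs = np.asarray(signs, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    usd = sizes * np.asarray(prices, dtype=float)
    buys, sells = signs > 0, signs < 0

    def oi(weight):
        b, s = weight[buys].sum(), weight[sells].sum()
        return (b - s) / (b + s)

    return {
        "OIN": oi(np.ones_like(signs)),  # by number of trades
        "OIS": oi(sizes),                # by trade size in BTC
        "OID": oi(usd),                  # by volume in USD
    }

def large_size_mask(sizes, q=0.95):
    """Select trades above the q-th quantile of trade sizes, the filter
    used for the large-size (95th percentile) robustness variants."""
    sizes = np.asarray(sizes, dtype=float)
    return sizes >= np.quantile(sizes, q)
```

Applying `order_imbalances` per day with true versus tick-rule directions, and again on the `large_size_mask` subset, reproduces the structure of the comparisons plotted in Figures 3 to 5.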
Table 6 reports the results of a parametric test (the Welch two-sample t test) and a nonparametric test (the Mann-Whitney U test) applied to the differences between the true and estimated order imbalances. On one hand, the Welch two-sample t test reveals that order imbalances estimated using the tick rule are underestimated on the whole and are statistically different from the true order imbalances at the 5% level. On the other hand, the null hypotheses of the Mann-Whitney U test are all rejected at the 1% significance level, indicating that the distributions of the true and estimated order imbalances are not equal. Overall, the statistics in Table 6 suggest that order imbalances computed using the tick rule in the Bitcoin market lack adequate accuracy.

Table 7 reports regressions of the daily return and volatility of Bitcoin on each daily order imbalance to explore whether order imbalances can predict Bitcoin return or volatility. The first three columns display estimates from regressions of Bitcoin daily return: all order imbalances (both the true ones and the estimated ones marked "TR") are positively correlated with daily return at the 1% significance level. However, the adjusted R-squared shows that order imbalances computed using the tick rule, rather than the true ones, explain more of the variation in Bitcoin daily return. The next three columns display estimates from regressions of Bitcoin realized variance (RV, hereafter) multiplied by 10^4. RV, proposed by Andersen and Bollerslev (1998), is commonly used as an ex-post volatility measure in the financial literature. Here, all estimates of order imbalances are negative and statistically significant at the 5% level, which means that realized variance decreases as order imbalances increase.
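The Welch statistic used in the parametric comparison above, with its Satterthwaite degrees of freedom, is simple to compute directly; the minimal NumPy sketch below does so (in practice, scipy.stats.ttest_ind with equal_var=False and scipy.stats.mannwhitneyu provide both tests with p-values).

```python
import numpy as np

def welch_t(a, b):
    """Welch two-sample t statistic and its Satterthwaite-approximated
    degrees of freedom, for samples with unequal variances."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    va = a.var(ddof=1) / len(a)          # squared standard error of mean(a)
    vb = b.var(ddof=1) / len(b)          # squared standard error of mean(b)
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Feeding the daily series of true and estimated order imbalances to `welch_t` yields the mean-difference test of Table 6; the p-value follows from the Student's t distribution with `df` degrees of freedom.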
Nevertheless, the adjusted R-squared for each order imbalance is less than 10%, with order imbalances based on the number of trades, OIN and OIN (TR), outperforming the others. The rest of Table 7 displays estimates from regressions of the positive semi-variance RV_d^+ multiplied by 10^4 and the negative semi-variance RV_d^- multiplied by 10^4. Order imbalances are more strongly correlated with the negative semi-variance RV_d^- and explain more of its variation than that of the positive semi-variance RV_d^+. On the whole, it is not surprising that order imbalances computed using the tick rule outperform the true ones in predicting daily return, because they are ex-post measures based on price changes; in other words, using estimated order imbalances overstates the predictive performance of true order imbalance for daily return during the sample period.
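The univariate return regressions above take the simple form r_t = α + β·OI_t + ε_t. The sketch below estimates β and the adjusted R-squared on synthetic daily data; the data-generating numbers are illustrative only, and the paper additionally adjusts t-values with Newey-West standard errors (available, for example, via statsmodels' `OLS(...).fit(cov_type="HAC", cov_kwds={"maxlags": L})`).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 306  # one observation per sample day, as in the paper's daily series

# Synthetic stand-ins: daily order imbalance and a return with a positive
# OI effect (coefficients chosen for illustration, not taken from Table 7).
oi = rng.uniform(-0.5, 0.5, n)
ret = 0.02 * oi + rng.normal(0.0, 0.01, n)

X = np.column_stack([np.ones(n), oi])          # intercept + order imbalance
beta = np.linalg.lstsq(X, ret, rcond=None)[0]  # OLS point estimates

# Adjusted R-squared of the univariate regression (k = 1 regressor).
resid = ret - X @ beta
r2 = 1.0 - resid.var() / ret.var()
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
# beta[1] recovers the positive OI coefficient on this synthetic sample.
```

Comparing `adj_r2` across the true and tick-rule order imbalance series is exactly the comparison that drives the Table 7 discussion.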

Discussion
Although Bitcoin is listed on multiple unregulated online crypto exchanges, the classification accuracy of the tick rule in the Bitcoin market is similar to that in stock markets. The empirical analysis in this study shows that the overall classification accuracy is 76.87% and the daily classification accuracy ranges from 68.98% to 83.76% in the Bitcoin market. According to previous research, the classification success rate of the tick rule ranges from 72.2% (Theissen, 2001) to 92.15% (Aktas & Kryzanowski, 2014) in U.S. and non-U.S. stock markets. Of the research cited in this work, the study by Carrion and Kolay (2020) examines a similarly fast trading environment using high-frequency NASDAQ data stamped to seconds, and the accuracy of the tick rule assessed here is close to the corresponding range for individual stocks in Carrion and Kolay (2020), namely 69.75% to 83.34%.

Furthermore, the empirical results indicate a positive correlation between the likelihood of misclassification and the time from the previous trade in the Bitcoin market, as shown in Tables 4 and 5. Conversely, Ellis et al. (2000) found a higher classification success rate when trades were slow, owing to a higher turnover rate of quotes. The difference can be attributed to the fact that, because Bitcoin is traded simultaneously on multiple online crypto exchanges, information spillover from other crypto exchanges can affect the price (Brandvold et al., 2015). Consequently, it may be more difficult to discern trade direction from the previous trade when a long period of time elapses between trades. In addition, order imbalances calculated using large-size trades are relatively closer to their true values in the Bitcoin market during the sample period.
As shown in Table 6, the results of the Welch two-sample t test show that the differences in the means of the true and estimated order imbalances are smaller and less statistically significant when the order imbalances are calculated using trade sizes larger than the 95th percentile of intraday trades.

Conclusions
This study investigates the accuracy of the tick rule in the Bitcoin market, wherein Bitcoin is listed on multiple online crypto exchanges rather than traditional regulated exchanges. Although the tick rule has been utilized in research on the microstructure of this innovative market (Dimpfl, 2017; Feng et al., 2018; Ibikunle et al., 2020; Wang et al., 2020), the accuracy of this trade classification method requires examination. This study addresses three issues: the success rate of the tick rule in the Bitcoin market, the factors associated with classification success, and the bias of order imbalances computed using the tick rule, which are commonly used as proxies for informed trading.
This study answers the three above-stated questions through empirical analysis of the tick-by-tick Bitcoin/USD transaction data with signed initiators on Bitstamp from December 6, 2017, to October 7, 2018. First, the overall success rate of the tick rule is 76.87%, and daily accuracy ranges from 68.98% to 83.76% during the sample period. There are fewer misclassified seller-initiated orders than misclassified buyer-initiated ones on the whole, a result associated with the smaller number of seller-initiated trades in the sample. In general, trade classification using the tick rule in the Bitcoin market has limited success. Second, the longer the time between trades, the higher the possibility of misclassification: it is more difficult to discern transaction intentions when transactions are less frequent in this innovative market of multiple online crypto exchanges. Third, the empirical analysis indicates that order imbalances computed using the tick rule in the Bitcoin market lack sufficient accuracy, although order imbalances calculated using large-size trades are relatively closer to their true values. Evidently, attention must be paid to the accuracy of the trade classification algorithm when conducting research on the microstructure of the Bitcoin market.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (C) 17K03657.

Note (Table 7). All t-values reported are adjusted for Newey-West standard errors. RV = realized variance; OI = order imbalance; OIN = order imbalance based on number of trades; OIS = order imbalance based on trade size; OID = order imbalance based on volume in USD; TR = tick rule. *, **, and *** denote statistical significance at the 5%, 1%, and 0.1% levels, respectively.