<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Tr8dr</title>
    <description>Musings on Algorithms, Models, and the Markets</description>
    <link>http://tr8dr.github.io/</link>
    <atom:link href="http://tr8dr.github.io//feed.xml" rel="self" type="application/rss+xml" />
    
      <item>
        <title>Arbitrage In DEFI (p2)</title>
        <description>&lt;p&gt;As mentioned in my prior post &lt;a href=&quot;https://tr8dr.github.io/MEV-p1/&quot;&gt;Arbitrage in DEFI (p1)&lt;/a&gt;, have been building and improving 
a MEV strategy in DEFI to perform both atomic and non-atomic arbitrage, backrunning, liquidations, etc.   In this article
we continue to focus on algorithms to detect and optimise arbitrage paths through the pool graph.&lt;/p&gt;

&lt;p&gt;Let us consider a simple graph of possible flows between 6 pools:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-16/graph1.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;bute-force-approach-for-small-graphs&quot;&gt;Brute-force approach (for small graphs)&lt;/h2&gt;
&lt;p&gt;For smaller graphs, we can avoid evaluating a complex optimisation by simply doing a DFS traversal to
determine all possible acyclic paths from source -&amp;gt; sink:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-16/paths.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Given all possible paths, we can &lt;strong&gt;evaluate the optimal flow through each path&lt;/strong&gt; in order to maximize the outcome.  As we put
more flow (size) through a given pool, we are penalized with an increasingly worse price and outgoing size.  Hence the
&lt;strong&gt;payout for a profitable path will be a concave function&lt;/strong&gt; such as this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-16/profit.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;An optimisation on the path determines the size of the flow through the path that maximizes this function.  For a small
enough graph we could generate all such paths and maximize each one with a simple solver.  However, the
approach of &lt;strong&gt;generating all acyclic paths grows combinatorially and is not feasible&lt;/strong&gt; for a large graph.&lt;/p&gt;
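&lt;p&gt;The brute-force approach above can be sketched directly.  This is a minimal illustration (not the strategy's actual code), assuming a hypothetical adjacency-list graph; the payout function is a stand-in for composing the pools' transfer functions along a path:&lt;/p&gt;

```python
def enumerate_paths(graph, source, sink, path=None):
    """DFS enumeration of all acyclic source -> sink paths."""
    path = (path or []) + [source]
    if source == sink:
        return [path]
    paths = []
    for nxt in graph.get(source, []):
        if nxt not in path:  # revisiting a node on the path would create a cycle
            paths += enumerate_paths(graph, nxt, sink, path)
    return paths

def maximize_payout(payout, lo=0.0, hi=100.0, iters=200):
    """Ternary search for the maximizer of a concave payout(size) function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if payout(m1) < payout(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2
```

&lt;p&gt;For each enumerated path one would maximize its payout and take the best (path, size) pair; the combinatorial growth in the number of paths is exactly what makes this infeasible on large graphs.&lt;/p&gt;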

&lt;h2 id=&quot;optimisation&quot;&gt;Optimisation&lt;/h2&gt;
&lt;p&gt;We want to find the weights through this graph such that the (outgoing - incoming) flow from source -&amp;gt; sink (our wallet) 
is maximized.  We can frame this as a max-flow problem on the graph:&lt;/p&gt;

\[\begin{align*}
\text{maximize} \quad &amp;amp; \sum_{i \in N_{in}(sink)} \Lambda_i w_{i,sink} &amp;amp; \text{(flow into sink)} \\
\text{where:} \quad &amp;amp; \\
&amp;amp; \Lambda_i = g_i(\Delta_i) &amp;amp; \forall i \in N \text{ (node transfer functions)} \\
&amp;amp; \Delta_i = \sum_{j \in N_{in}(i)} \Lambda_j w_{j,i} &amp;amp; \forall i \in N \text{ (node input flows)} \\
&amp;amp; \Lambda_{src} = c &amp;amp; \text{(source capacity)} \\
\text{subject to:} \quad &amp;amp; \\
&amp;amp; \sum_{j \in N_{out}(i)} w_{i,j} \leq 1, \, w_{i,j} \in [0,1] &amp;amp; \forall i \in N \text{ (weight constraints)} \\
\text{where:} \quad &amp;amp; \\
&amp;amp; N \text{ is the set of all nodes} \\
&amp;amp; E \text{ is the set of all edges} \\
&amp;amp; N_{in}(i) \text{ is the set of nodes with edges into node } i \\
&amp;amp; N_{out}(i) \text{ is the set of nodes with edges from node } i \\
&amp;amp; g_i(\cdot) \text{ is the non-linear transfer function for node } i \\
&amp;amp; w_{i,j} \text{ is the weight of edge } (i,j) \\
&amp;amp; \Lambda_i \text{ is the output flow from node } i \\
&amp;amp; \Delta_i \text{ is the input flow to node } i \\
&amp;amp; c \text{ is the source node capacity}
\end{align*}\]
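&lt;p&gt;On an acyclic graph, the node flows in this formulation can be resolved in topological order.  A minimal sketch, with hypothetical names (a dict of per-node transfer functions and per-edge weights):&lt;/p&gt;

```python
def resolve_node_flows(topo_order, preds, g, w, source, c):
    """Resolve the flows defined above:
    Delta_i = sum_j Lambda_j * w[(j, i)] over incoming edges,
    Lambda_i = g[i](Delta_i), with Lambda_source fixed to the capacity c."""
    lam, delta = {source: c}, {source: 0.0}
    for i in topo_order:
        if i == source:
            continue
        delta[i] = sum(lam[j] * w[(j, i)] for j in preds[i])
        lam[i] = g[i](delta[i])
    return delta, lam
```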

&lt;p&gt;Note that the &lt;strong&gt;node transfer function&lt;/strong&gt; (or amount of flow output for a given input) is as follows for a Uniswap V2-like
AMM:&lt;/p&gt;

\[g_i(\Delta) = r_{\Lambda} - \frac{k}{r_{\Delta} + \gamma \Delta}\]

&lt;h3 id=&quot;quasi-convex-optimisation&quot;&gt;Quasi-Convex Optimisation&lt;/h3&gt;
&lt;p&gt;In the paper &lt;a href=&quot;https://arxiv.org/pdf/2204.05238&quot;&gt;Optimal Routing for Constant Function Market Makers&lt;/a&gt;, Angeris et al
show an elegant approach to determining the optimal amounts to trade across a collection of pools, maximizing some
utility function (for example, maximizing output of ETH) within the constraints of CFMMs.&lt;/p&gt;

\[\begin{align*}
\text{maximize} \quad &amp;amp; U(\Psi) &amp;amp; \text{(utility function)} \\
\text{subject to} \quad &amp;amp; \Psi = \sum_{i=1}^m A_i(\Lambda_i - \Delta_i) &amp;amp; \text{(trading function inputs } \Delta \text{, outputs } \Lambda \text{)}\\
&amp;amp; \varphi_i(R_i + \gamma_i\Delta_i - \Lambda_i) = \varphi_i(R_i), &amp;amp; i = 1,\dots,m \text{ (constant product constraint)} \\
&amp;amp; \Delta_i \geq 0, \; \Lambda_i \geq 0 &amp;amp; i = 1,\dots,m \text{ (traded amounts must be non-negative)}
\end{align*}\]

&lt;p&gt;where:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;$(\Delta_i, \Lambda_i)$ are the amounts in and out for the $i$-th pool&lt;/li&gt;
  &lt;li&gt;$A_i$ is the adjacency matrix mapping the $i$-th pool's tokens into the global token set&lt;/li&gt;
  &lt;li&gt;$\varphi_i(\cdot)$ is an expression of the constant product reserve constraint&lt;/li&gt;
  &lt;li&gt;$U(\Psi)$ is the utility function, for example $U(\Psi) = prices_{mkt} \times \Psi$, maximizing the market value of 
the net output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The setup of this problem works beautifully on small to medium-sized acyclic graphs, but becomes unstable or computationally
overwhelmed in the following contexts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;graphs with cyclical feedback
    &lt;ul&gt;
      &lt;li&gt;the possible sub-graphs present different discontinuous gradient functions&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;large graphs with hundreds of thousands of edges
    &lt;ul&gt;
      &lt;li&gt;many of these algorithms have complexity $O(n^3 \log(1/\epsilon))$, which does not scale to problems
in the 10K+ range&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can make this approach workable if we can significantly reduce the graph, focusing on profitable neighborhoods, and 
remove cycles prior to applying the optimisation (I may elaborate on this in a subsequent post).&lt;/p&gt;

&lt;h3 id=&quot;max-flow-optimisation-for-acyclic-graphs&quot;&gt;Max-Flow Optimisation (for acyclic graphs)&lt;/h3&gt;
&lt;p&gt;If the graph is acyclic we can efficiently solve the max-flow problem as follows:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;determine a function to resolve flows given a weight vector (essentially a graph traversal)&lt;/li&gt;
  &lt;li&gt;determine the gradient of the network with respect to weights&lt;/li&gt;
  &lt;li&gt;determine objective function (for example maximize flow into sink node)&lt;/li&gt;
  &lt;li&gt;use L-BFGS-B to solve for the objective&lt;/li&gt;
&lt;/ol&gt;
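&lt;p&gt;The steps above can be sketched on a tiny acyclic graph.  This is an illustration only: a single weight splits the 20 WETH between two ETH -&amp;gt; USDT pools whose outputs merge into one USDT -&amp;gt; ETH pool feeding the sink, with reserve figures borrowed from the example later in this post (assuming the first listed reserve is USDT), and the gradient left to the solver's finite-difference approximation:&lt;/p&gt;

```python
from scipy.optimize import minimize

AMOUNT_IN = 20.0  # WETH pushed in at the source

def v2_out(amount_in, r_in, r_out, fee=0.003):
    """Constant-product output for a given input (the node transfer function)."""
    gamma = 1.0 - fee
    return r_out - (r_in * r_out) / (r_in + gamma * amount_in)

def neg_sink_flow(w):
    """Objective: negative profit at the sink for weight vector w,
    where w[0] is the fraction routed through pool A (the rest to pool B)."""
    usdt_a = v2_out(w[0] * AMOUNT_IN, 10_000, 20_100_184)          # ETH -> USDT
    usdt_b = v2_out((1.0 - w[0]) * AMOUNT_IN, 10_830, 20_020_000)  # ETH -> USDT
    eth_back = v2_out(usdt_a + usdt_b, 20_000_000, 11_040)         # USDT -> ETH
    return -(eth_back - AMOUNT_IN)

result = minimize(neg_sink_flow, x0=[0.5], method="L-BFGS-B", bounds=[(0.0, 1.0)])
```

&lt;p&gt;In practice the weight vector covers every edge and the gradient is derived analytically from the transfer functions rather than approximated numerically.&lt;/p&gt;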

&lt;p&gt;This works well, but does require that the graph has no cycles.  With cycles, a continuous gradient cannot be determined
due to the discontinuity of an edge / cycle being included in the graph or excluded.  If the number of cycles is limited, we
can evaluate each graph alternative separately.  I have an implementation for this that arrives at the global optimum
in approximately $O(n \log(n))$ time.&lt;/p&gt;

&lt;h3 id=&quot;heuristic-optimisation&quot;&gt;Heuristic Optimisation&lt;/h3&gt;
&lt;p&gt;As mentioned above, convex optimisation approaches have issues tackling the complexity and scale of these
graphs.  &lt;strong&gt;Heuristic optimisation&lt;/strong&gt; algorithms such as &lt;a href=&quot;https://en.wikipedia.org/wiki/Differential_evolution&quot;&gt;differential evolution (DE)&lt;/a&gt; 
and &lt;a href=&quot;https://en.wikipedia.org/wiki/Simulated_annealing&quot;&gt;simulated annealing (SA)&lt;/a&gt; are well suited to finding a global optimum 
in the presence of:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;multiple local minima&lt;/li&gt;
  &lt;li&gt;discontinuous solution surfaces&lt;/li&gt;
  &lt;li&gt;mixed integer and continuous domain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Neither of these methods is guaranteed to arrive at the absolute maximum (minimum) due to their stochastic nature;
however, both can get quite close, and can converge arbitrarily close to the optimum with increased iteration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Differential Evolution&lt;/strong&gt; is more capable than &lt;strong&gt;Simulated Annealing&lt;/strong&gt; at exploring the 
solution space, though this benefit comes at the cost of more computation, at least in early iterations.
&lt;strong&gt;DE&lt;/strong&gt; is a special case of a &lt;strong&gt;genetic algorithm&lt;/strong&gt;, one where new individuals from generation to generation are produced with gradient-aware 
operators.  SA is similar to DE in that the parameter vector is perturbed
in each successive generation; however, SA can easily get stuck in one part of the domain depending on the starting
point of the simulation.  DE continues to explore the domain, though it tends to do so less in successive
generations.&lt;/p&gt;

&lt;p&gt;I found that &lt;strong&gt;differential evolution&lt;/strong&gt; was consistently able to arrive close to the global optimum, 
whereas simulated annealing would settle on a local optimum roughly 50% of the time.  SA could be adjusted to
use a sampling (multi-simulation) approach, taking the maximum over those simulations.&lt;/p&gt;

&lt;h3 id=&quot;bellman-ford-optimisation&quot;&gt;Bellman-Ford Optimisation&lt;/h3&gt;
&lt;p&gt;In the BF construction of the problem, we do not solve for max-flow directly; rather, we look for minimum-weight paths through 
the graph, where each edge weight is $-\log(price_i)$.  If a circuit from, say, WETH -&amp;gt; … -&amp;gt; WETH has a negative
cumulative weight, then we have found a profitable path.&lt;/p&gt;
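&lt;p&gt;A minimal sketch of this construction in pure Python, with hypothetical exchange rates (in practice each rate would be the pool's fee-adjusted price):&lt;/p&gt;

```python
import math

def find_negative_cycle(edges, n):
    """Bellman-Ford negative-cycle detection on n nodes.
    edges: list of (u, v, rate) with edge weight -log(rate).
    Returns the nodes of a profitable circuit, or None."""
    dist, pred = [0.0] * n, [None] * n
    x = None
    for _ in range(n):                      # n relaxation rounds
        x = None
        for u, v, rate in edges:
            w = -math.log(rate)
            if dist[u] + w < dist[v] - 1e-12:
                dist[v], pred[v] = dist[u] + w, u
                x = v
    if x is None:                           # no update in the n-th round
        return None
    for _ in range(n):                      # step back onto the cycle itself
        x = pred[x]
    cycle, v = [x], pred[x]
    while v != x:
        cycle.append(v)
        v = pred[v]
    return cycle
```

&lt;p&gt;The cumulative weight around a circuit is $\sum_i -\log(rate_i) = -\log(\prod_i rate_i)$, which is negative exactly when the product of rates exceeds 1, i.e. when the circuit is profitable at unit size.&lt;/p&gt;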

&lt;p&gt;The problem with BF algorithms is that they:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;assume that price remains constant with size
    &lt;ul&gt;
      &lt;li&gt;so we have to test for profitable paths at some unit size, later requiring a secondary optimisation to determine
the appropriate size&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;cannot handle cycles
    &lt;ul&gt;
      &lt;li&gt;there are various heuristic approaches to trimming the graph in the presence of a cycle, but these do not lead to
optimal solutions.  They can still provide viable, if not optimal, paths however.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;example&quot;&gt;Example&lt;/h2&gt;
&lt;p&gt;We want to find profitable paths through the following, given a source with 20 WETH:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Uniswap V2: ETH &amp;lt;-&amp;gt; USDT
    &lt;ul&gt;
      &lt;li&gt;reserves: 20100184, 10000, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Sushiswap V2: USDT &amp;lt;-&amp;gt; ETH
    &lt;ul&gt;
      &lt;li&gt;reserves: 20020000, 10830, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Uniswap V3: USDT &amp;lt;-&amp;gt; ETH
    &lt;ul&gt;
      &lt;li&gt;reserves: 20000000, 11040, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Shibaswap V2: USDT &amp;lt;-&amp;gt; ETH
    &lt;ul&gt;
      &lt;li&gt;reserves: 23000000, 10000, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Sushiswap V2 USDT &amp;lt;-&amp;gt; BTC
    &lt;ul&gt;
      &lt;li&gt;reserves: 101, 5000000, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Sushiswap V2: BTC &amp;lt;-&amp;gt; ETH
    &lt;ul&gt;
      &lt;li&gt;reserves: 100, 3518, fee: 30bps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
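&lt;p&gt;Before optimising, the spot prices implied by these reserves already hint at where the edge is (a sketch, assuming the first reserve figure in each pair is USDT and the second ETH):&lt;/p&gt;

```python
# (USDT reserve, ETH reserve) for the ETH <-> USDT pools in the example
pools = {
    "UniswapV2":   (20_100_184, 10_000),
    "SushiswapV2": (20_020_000, 10_830),
    "UniswapV3":   (20_000_000, 11_040),
    "ShibaswapV2": (23_000_000, 10_000),
}
spot = {name: usdt / eth for name, (usdt, eth) in pools.items()}   # USDT per ETH
richest = max(spot, key=spot.get)   # best pool to sell ETH into
cheapest = min(spot, key=spot.get)  # best pool to buy ETH back from
```

&lt;p&gt;Shibaswap quotes roughly 2300 USDT/ETH against roughly 1812 on Uniswap V3, so flow should sell ETH where it is rich and buy it back where it is cheap; the optimisation finds the sizes at which the marginal prices equalise.&lt;/p&gt;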

&lt;p&gt;Here is the raw graph without optimisation and the graph post optimisation (where we have identified
the optimal flow through the graph from source -&amp;gt; sink):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-16/raw-vs-optimised.png&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;solving-with-differential-evolution&quot;&gt;Solving with Differential Evolution&lt;/h3&gt;
&lt;p&gt;Differential Evolution is not the most efficient approach; however, it is one of the simplest and can achieve the
global optimum.  In a subsequent post I will discuss a hybrid approach that reduces the size of the graph and allows for more economical
solutions.&lt;/p&gt;

&lt;p&gt;The approach to solving the max-flow problem with DE is as follows:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;determine a function to resolve flows given a weight vector
    &lt;ul&gt;
      &lt;li&gt;this is essentially a graph traversal where we evaluate incoming flows x weights and compute outgoing
flows&lt;/li&gt;
      &lt;li&gt;we also terminate and zero-out cycles when detected&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;write a function to enforce constraints $\sum_i w_i \le 1$ and also sparsify weight vectors
    &lt;ul&gt;
      &lt;li&gt;for sparsification I use $\bar{w}^k / \sum_i w_i^k $, pushing values towards 0 or 1&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;determine an objective function
    &lt;ul&gt;
      &lt;li&gt;maximize sum of incoming flows into sink node&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;write or use an existing differential evolution algorithm
    &lt;ul&gt;
      &lt;li&gt;python &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scipy&lt;/code&gt; has a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;differential_evolution&lt;/code&gt; function&lt;/li&gt;
      &lt;li&gt;rust has crates for this as well &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;differential_evolution&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;metaheuristics_nature&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;I’ve written a DE / GA implementation for C++ as well&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
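&lt;p&gt;Step 2 can be sketched as follows.  This is a guess at the shape of the normalize-and-sparsify transform, where the exponent $k$ controls how hard weights are pushed towards 0 or 1:&lt;/p&gt;

```python
import numpy as np

def normalize_and_sparsify(w, k=3.0, eps=1e-4):
    """Renormalize a group of sibling edge weights so they sum to at most 1,
    sharpening towards 0/1 via w^k / sum(w^k)."""
    w = np.asarray(w, dtype=float)
    wk = w ** k
    total = wk.sum()
    if total < eps:              # all-zero group: route no flow
        return np.zeros_like(w)
    return wk / total
```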

&lt;h2 id=&quot;code&quot;&gt;Code&lt;/h2&gt;
&lt;p&gt;Realistically one would implement the optimisation in Rust or C++; however, for a more concise presentation I am providing
some Python code snippets here for the DE approach.&lt;/p&gt;

&lt;h3 id=&quot;pool&quot;&gt;Pool&lt;/h3&gt;
&lt;p&gt;This class defines functionality for a Uniswap V2-like pool or a top-of-book V3 pool.  Here is the output function
defining amount out for a given amount in, according to the CFMM constraints:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;amount_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Compute the amount of money outgoing given incoming amount in the pool.  The amount in should be in
    the units of cin.
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserve0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserve0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserve1&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;gamma&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fee&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserve1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserve0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gamma&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;resolving-flows&quot;&gt;Resolving Flows&lt;/h3&gt;
&lt;p&gt;We want to determine flows through the graph, given a weight vector (where the weights define the % of flow per edge):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Resolve the graph for given weights and amount in
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_flow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Pool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isnan&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isnan&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# normalize incoming edges to sum &amp;lt;= 1 and sparsify
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;incoming_edges&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incoming_edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;normalize_and_sparsify&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incoming_edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        
        &lt;span class=&quot;c1&quot;&gt;# determine amount of inflow
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;inflow&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;incoming_edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;iedge&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge_to_idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ancestor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pools&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iedge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resolve_flow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ancestor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;inflow&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;# zero-out weights on edges that carried no flow
&lt;/span&gt;            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iedge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;amount_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inflow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incoming&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inflow&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outgoing&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([])&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;resolve_flow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;objective&quot;&gt;Objective&lt;/h3&gt;
&lt;p&gt;Our objective function is simple:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;resolve flows across graph for a given weight vector&lt;/li&gt;
  &lt;li&gt;determine output flow vs input flow&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;objective&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ndarray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Maximize profit as: `flow reaching sink` - `amount in` (as a minimization problem)
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resolve_graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pools&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sink&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;incoming&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;optimisation-1&quot;&gt;Optimisation&lt;/h3&gt;
&lt;p&gt;We then call our DE optimiser:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;differential_evolution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;objective&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bounds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;maxiter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxiter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;popsize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;population&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;disp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;profit&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# objective already subtracts amount_in&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
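
&lt;p&gt;To make the objective concrete, here is a minimal, self-contained toy (all reserve numbers are hypothetical): a
two-pool cycle in which profit is a concave function of input size.  A crude grid scan stands in for the DE optimiser:&lt;/p&gt;

```python
def swap_out(dx: float, X: float, Y: float, fee: float = 0.003) -> float:
    # constant-product output: dy = Y*g*dx / (X + g*dx), with g = 1 - fee
    g = 1.0 - fee
    return Y * g * dx / (X + g * dx)

def path_profit(amount_in: float) -> float:
    # USDC -> ETH on pool 1, then ETH -> USDC on a mispriced pool 2
    eth = swap_out(amount_in, 200_000.0, 100.0)   # pool 1: 200k USDC / 100 ETH
    usdc = swap_out(eth, 100.0, 210_000.0)        # pool 2: 100 ETH / 210k USDC
    return usdc - amount_in

# crude stand-in for the optimiser: scan input sizes for the concave maximum
best = max(range(0, 20_000, 50), key=path_profit)
```

&lt;p&gt;Profit rises and then falls as size pushes each pool down its curve; past a certain size the price impact exceeds
the mispricing and the path goes negative, which is exactly the concave shape the optimiser exploits.&lt;/p&gt;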

&lt;h2 id=&quot;upcoming&quot;&gt;Upcoming&lt;/h2&gt;
&lt;p&gt;There is no entirely satisfactory solution to the optimisation problem, in that:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;the presence of cycles invalidates a number of optimisation approaches&lt;/li&gt;
  &lt;li&gt;the scale of the problem is such that traditional optimisers cannot solve it efficiently&lt;/li&gt;
  &lt;li&gt;heuristic optimisation can find the global maximum, modulo the stochastic nature of the algorithm&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In a subsequent post I will elaborate on an approach that reduces the size of the problem significantly, making it
more amenable to a variety of these approaches.&lt;/p&gt;
</description>
        <pubDate>Sat, 16 Nov 2024 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//MEV-p2/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//MEV-p2/</guid>
      </item>
    
      <item>
        <title>Arbitrage In DEFI (p1)</title>
        <description>&lt;p&gt;I have been building and improving a MEV strategy in DEFI to perform both atomic and non-atomic arbitrage, backrunning, liquidations, etc.
In this post I will focus on one of the hard algorithmic problems, namely &lt;strong&gt;determining the optimal size and path of
arbitrage&lt;/strong&gt; through swap pools and other protocols.&lt;/p&gt;

&lt;p&gt;On Ethereum, for example, there are ~700K ERC20 tokens and a few hundred thousand AMM pools (fortunately only a fraction
of these pools and tokens are active).  We can consider the possible transactions and interactions across pools as
a &lt;strong&gt;directed graph&lt;/strong&gt;, where edges represent flows from a wallet or pool to another pool or wallet.   The size of this graph
is enormous, perhaps 200K nodes and a similar number of edges.   Here is a small example of such a graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-11/graph.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;the-problem&quot;&gt;The Problem&lt;/h2&gt;
&lt;p&gt;Given a source (our wallet) we want to &lt;strong&gt;determine if there are profitable arbitrage paths&lt;/strong&gt; through the graph.  Each node
represents a &lt;strong&gt;swap pool&lt;/strong&gt;, where an &lt;strong&gt;amount-in&lt;/strong&gt; of one coin results in some &lt;strong&gt;amount-out&lt;/strong&gt; of another coin.  This
presents itself as a &lt;strong&gt;max-flow&lt;/strong&gt; problem, where we want to:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;maximize the net amount out from source -&amp;gt; sink&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;i.e. amount into our wallet &amp;gt; amount out of our wallet at the end of the arbitrage paths&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;reject arbitrage paths if&lt;/strong&gt;:
    &lt;ul&gt;
      &lt;li&gt;expected cost of arbitrage &amp;gt; profit or minimum profit target, this includes gas cost &amp;amp; priority fee&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AMMs&lt;/strong&gt; such as Uniswap V2 (and top of book V3) use the so-called “constant product” formulation for pricing $XY = k$, where X and Y are 
the reserves of coin 1 and coin 2.&lt;/p&gt;

&lt;p&gt;If we want to trade $\Delta X$ for some outgoing amount of coin $\Delta Y$, and assuming $\gamma = (1-fees)$, we arrive
at the relation:&lt;/p&gt;

\[(X + \gamma \Delta X) (Y - \Delta Y) = k\]

&lt;p&gt;expressing the notion that an incoming amount of $\Delta X$ is added to the X reserves and an outgoing amount of
$\Delta Y$ is removed from the Y reserves, all the while keeping $k$ constant.   Graphically, swapping adjusts size and
price along a hyperbolic curve like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-11/AMM.png&quot; /&gt;&lt;/p&gt;
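
&lt;p&gt;Solving the constant-product relation above for $\Delta Y$ gives $\Delta Y = \frac{Y \gamma \Delta X}{X + \gamma \Delta X}$.
A minimal sketch (reserve numbers are purely illustrative):&lt;/p&gt;

```python
def amount_out(dx: float, X: float, Y: float, fee: float = 0.003) -> float:
    # from (X + g*dx) * (Y - dy) = X*Y, with g = 1 - fee:
    #   dy = Y * g * dx / (X + g * dx)
    g = 1.0 - fee
    return Y * g * dx / (X + g * dx)

# pool holding 100 ETH / 200,000 USDC; swap 1 ETH in
dy = amount_out(1.0, 100.0, 200_000.0)
```

&lt;p&gt;The output sits just below the 2000 USDC spot price, reflecting the fee and price impact; note that doubling the
input returns less than double the output.&lt;/p&gt;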

&lt;p&gt;We can formulate the problem, the &lt;strong&gt;max-flow problem&lt;/strong&gt;, by &lt;strong&gt;determining the weights&lt;/strong&gt; to assign to edges in the graph.  Let us
define the &lt;strong&gt;weights to be in [0,1]&lt;/strong&gt;, representing the % of coin to flow on a given edge from its immediate source node.
Given this setup, the following must be true:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;the sum of weights for outgoing edges must be = 1&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;such that all outgoing flow from a node is handled&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;a weight on a given edge can be 0&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;indeed for a given arbitrage epoch, most weights will be 0 and just the pools and transfers active in the arbitrage
will have non-zero weights&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, assuming we know how much size we are starting with at the source, it should just be a matter of determining the
weights through the graph such that we end with more size than we started with.&lt;/p&gt;

&lt;p&gt;If the pools were linear in terms of input -&amp;gt; output, this is a problem that could be solved with linear algebra; however,
most AMMs have a non-linear output function.  For any given node we could formulate its output based on the weighted
incoming flows:&lt;/p&gt;

\[\begin{align}
amount_{out} (F_i,w_i | i \in edges) &amp;amp;= Y - \frac{k}{X + \gamma \sum_{i \in edges} w_i F_i} \\
\sum_{i \in edges} w_i &amp;amp;= 1
\end{align}\]

&lt;p&gt;where \(F_i\) is the flow from the pool associated with the i’th edge and $w_i$ is the weight assigned to an incoming
edge.  One could create a chain of these equations linking nodes in the graph from source to sink.&lt;/p&gt;
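
&lt;p&gt;A sketch of this per-node output, assuming lists of incoming flows and weights (a hypothetical encoding; per the
constraint above, the weights must sum to 1):&lt;/p&gt;

```python
import math

def node_amount_out(flows, weights, X, Y, fee=0.003):
    # amount_out = Y - k / (X + g * sum(w_i * F_i)), with k = X*Y, g = 1 - fee
    assert math.isclose(sum(weights), 1.0)   # incoming weights must sum to 1
    g = 1.0 - fee
    total_in = sum(w * f for w, f in zip(weights, flows))
    return Y - (X * Y) / (X + g * total_in)

# two incoming edges delivering 600 and 400 USDC, weighted 0.7 / 0.3
out = node_amount_out([600.0, 400.0], [0.7, 0.3], 200_000.0, 100.0)
```

&lt;p&gt;This is algebraically identical to the single-swap form: $Y - \frac{k}{X + \gamma T} = \frac{Y \gamma T}{X + \gamma T}$
for total weighted inflow $T$.&lt;/p&gt;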

&lt;h2 id=&quot;difficulties&quot;&gt;Difficulties&lt;/h2&gt;
&lt;p&gt;The graph of possible trades across pools presents several problems:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;enumerating all possible traversals of the graph &lt;strong&gt;has combinatorial complexity&lt;/strong&gt;.
    &lt;ul&gt;
      &lt;li&gt;the upper bound is on the order of O(n!), if fully connected&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;we encounter &lt;strong&gt;cycles&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;cycles pose problems for various max-flow or distance problems on a graph; for a max-flow problem we want to have
acyclic paths between the source and the sink.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;the amount of &lt;strong&gt;flow is non-linear with size&lt;/strong&gt;, as we have seen above
    &lt;ul&gt;
      &lt;li&gt;in fact Uniswap V2 clones present one of the easier cases for optimisation; Uniswap V3 presents an even more complex
discontinuous function from the point of view of optimisation.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;the-cycle-problem&quot;&gt;The cycle problem&lt;/h3&gt;
&lt;p&gt;A typical graph of AMM pools will have hundreds of cycles.  While we could do a DFS or BFS and eliminate cycles, the
DFS expansion would grow astronomically large.  Let’s consider a small example:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-11/problem-1.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The $w_2$ edge is reasonable if $w_1$ is disconnected ($w_1 = 0$) or vice-versa.  We can introduce indicator variables
$I_k \in \{0, 1\}$ to gate flow and avoid cycles in the graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2024-11-11/problem-2.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To determine where to put these indicator variables, we can do a &lt;strong&gt;breadth-first-search&lt;/strong&gt; (BFS) from the source down towards
the sink and make the following adjustments &lt;strong&gt;when a cycle is detected&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;add indicator variable $I_j$ on the outgoing edge completing the cycle&lt;/li&gt;
  &lt;li&gt;add indicator variable $I_k$ on incoming edges already visited&lt;/li&gt;
  &lt;li&gt;add constraint $I_j + \sum_k I_k = 1$&lt;/li&gt;
&lt;/ol&gt;
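
&lt;p&gt;A simplified sketch of this detection pass (the adjacency encoding and constraint representation are hypothetical;
a production version would also track multi-node cycles discovered deeper in the traversal):&lt;/p&gt;

```python
from collections import deque

def place_indicators(adj, source):
    """BFS from source; an edge pointing to an already-visited node completes
    a cycle, so gate it with an indicator I_j along with the target's
    previously-visited incoming edges I_k, constrained by I_j + sum(I_k) = 1.
    """
    visited = {source}
    constraints = []   # each entry: (closing_edge, [gated incoming edges])
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v in visited:
                incoming = [(p, v) for p in adj
                            if v in adj.get(p, []) and p in visited and p != u]
                constraints.append(((u, v), incoming))
            else:
                visited.add(v)
                queue.append(v)
    return constraints

# toy graph with a single cycle a -> b -> a
adj = {"s": ["a"], "a": ["b"], "b": ["a"]}
cons = place_indicators(adj, "s")
```

&lt;p&gt;On the toy graph, the edge (b, a) closes the cycle and gets $I_j$, while the visited incoming edge (s, a) gets
$I_k$, yielding the single constraint $I_j + I_k = 1$.&lt;/p&gt;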

&lt;p&gt;The above approach eliminates cycles and allows for proper construction of a &lt;strong&gt;max-flow&lt;/strong&gt; problem.&lt;/p&gt;

&lt;h3 id=&quot;the-optimisation-problem&quot;&gt;The optimisation problem&lt;/h3&gt;
&lt;p&gt;Optimisation is a challenge:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;convex or quasi-convex optimisation&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The bilinear $I_j w_j$ weight-indicator products on edges create non-convex constraints.  Unfortunately, this would 
require &lt;strong&gt;Mixed Integer Nonlinear Programming&lt;/strong&gt; to solve if done with a traditional optimiser.  If we get rid of the
binary indicator variables we could use a quasi-convex DQCP optimiser.&lt;/li&gt;
      &lt;li&gt;We could use a continuous function, albeit with cusp-like derivatives to model an indicator-like variable.  This might
not be very stable, however.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;size of problem&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The size of the Jacobian and other matrices involved would be massive, making solving such a system in this way
impractical, even without the integer constraints.&lt;/li&gt;
      &lt;li&gt;there are other approaches that we will discuss that evade this issue.&lt;/li&gt;
      &lt;li&gt;there are also techniques we can use to dramatically reduce the size of the problem&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We will discuss and evaluate these in further posts.&lt;/p&gt;

&lt;h2 id=&quot;upcoming&quot;&gt;Upcoming&lt;/h2&gt;
&lt;p&gt;We will evaluate the following in coming posts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;heuristic optimisation&lt;/li&gt;
  &lt;li&gt;convergence in DQCP with some reformulation of the problem&lt;/li&gt;
  &lt;li&gt;reduction of the problem (by a couple orders of magnitude)&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Mon, 11 Nov 2024 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//MEV-p1/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//MEV-p1/</guid>
      </item>
    
      <item>
        <title>New Crypto Strategy</title>
<description>&lt;p&gt;I have a new stat/arb strategy in crypto that uses an adaptive state-based system to identify mean-reversion (MR) opportunities
on small portfolios of coins.  I have identified ~1600 such portfolios across 200 coins, each of which is traded
as a strategy.  In practice I trade a portfolio of these strategies (which I term, “stratlets”), where the strategy
is defined as a dynamically weighted portfolio of these stratlets.&lt;/p&gt;

&lt;h2 id=&quot;setup&quot;&gt;Setup&lt;/h2&gt;
&lt;p&gt;While each stratlet is adaptive, adjusting internal parameters over time, there are some hyperparameters for the
following layers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;model&lt;/strong&gt;:
    &lt;ul&gt;
      &lt;li&gt;5 dynamic parameters inferred over time&lt;/li&gt;
      &lt;li&gt;1 fixed parameter&lt;/li&gt;
      &lt;li&gt;1 parameter to be optimised&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;trading state machine&lt;/strong&gt;:
    &lt;ul&gt;
      &lt;li&gt;3 parameters to be optimised&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;money management&lt;/strong&gt;:
    &lt;ul&gt;
      &lt;li&gt;2 parameters to be optimised&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In order to avoid overfitting, rather than optimising all 6 free parameters at once, I took the following approach:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;optimise each layer separately:  i.e. the model, trading, and MM are optimised separately&lt;/li&gt;
  &lt;li&gt;restrict the parameter values to a small # of discrete possibilities&lt;/li&gt;
  &lt;li&gt;perturbation analysis to make sure the “solution” is not overly sensitive to the optimal setting&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I assumed 30bps in transaction costs and slippage in backtesting.  Trade profitability is quite high, so is not terribly
sensitive to transaction cost.  I am expecting &amp;lt; 10bps in transaction costs, so the remaining 20bps relates to execution slippage.&lt;/p&gt;

&lt;h2 id=&quot;results&quot;&gt;Results&lt;/h2&gt;
&lt;p&gt;99% of the 1600 stratlets were &lt;strong&gt;very positive out-of-sample&lt;/strong&gt;.  The remaining ~15 losing stratlets had easily identified
issues during the in-sample period, with low to negative performance in-sample.  These 15 portfolios also scored poorly
in terms of stationarity.&lt;/p&gt;

&lt;p&gt;I observed a wide range of risk characteristics across the strategies (Sortino, Calmar, etc), so wanted to understand to
what degree the losing trades are correlated across stratlets.  &lt;strong&gt;If the losses were highly correlated&lt;/strong&gt; then bundling into a portfolio
would &lt;strong&gt;not tend to improve the overall risk&lt;/strong&gt;.  Fortunately, as we shall see, the losses are &lt;strong&gt;relatively uncorrelated&lt;/strong&gt;, leading to
&lt;strong&gt;improved risk characteristics&lt;/strong&gt; when trading a larger # of stratlets.&lt;/p&gt;
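
&lt;p&gt;A toy illustration of this effect (the return series are entirely made up): two streams whose loss periods do not
coincide combine into a portfolio whose Sortino exceeds either input&amp;#39;s:&lt;/p&gt;

```python
import math

def sortino(returns):
    # mean return over downside deviation (zero target, simplified)
    mean = sum(returns) / len(returns)
    dd = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / len(returns))
    return mean / dd

# two made-up stratlets whose losing periods offset each other
a = [0.03, -0.02, 0.02, 0.01, -0.005]
b = [-0.02, 0.03, 0.01, 0.02, -0.005]
portfolio = [(x + y) / 2 for x, y in zip(a, b)]
```

&lt;p&gt;Because the large losses offset, the portfolio&amp;#39;s downside deviation shrinks faster than its mean return,
lifting the combined Sortino.&lt;/p&gt;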

&lt;h3 id=&quot;small-portfolio&quot;&gt;Small Portfolio&lt;/h3&gt;
&lt;p&gt;Here is the performance of a small portfolio based on higher-liquidity stratlets ranked by the Calmar ratio within the validation period.
The individual stratlets have Sortino ratios ranging from 0.6 to 3.2 within this small set; however, &lt;strong&gt;the effective Sortino for the
portfolio is higher&lt;/strong&gt;, at 4.3:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-09-01/top-5.png&quot; width=&quot;700&quot; height=&quot;550&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;medium-portfolio&quot;&gt;Medium Portfolio&lt;/h3&gt;
&lt;p&gt;The risk characteristics improve even further with the larger top-25 portfolio, reaching a Sortino of 10.3:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-09-01/top-25.png&quot; width=&quot;700&quot; height=&quot;550&quot; /&gt;&lt;/p&gt;

&lt;p&gt;So the good news is that the losses appear to be relatively independent across stratlets, allowing us to average these out
with other profitable stratlets during the loss periods.&lt;/p&gt;

&lt;h3 id=&quot;large-portfolio&quot;&gt;Large Portfolio&lt;/h3&gt;
&lt;p&gt;I evaluated a portfolio based solely on the Sortino seen within the training period, selecting the ~600 stratlets with
Sortino &amp;gt; 3.  Note that the validation period was not used in portfolio
selection, providing a longer unseen period, from late March to the present:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-09-01/all.png&quot; width=&quot;700&quot; height=&quot;550&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This appears to be an excellent result, with a Sortino of 22.4 and a 4682% return.  However, the return seems &lt;strong&gt;too good to be true&lt;/strong&gt;,
and it is.  This result &lt;strong&gt;assumes&lt;/strong&gt; we can &lt;strong&gt;execute in size&lt;/strong&gt; (and at equal size) across stratlets.   In reality, scaling up
this strategy would underweight the majority of stratlets and overweight a smaller set with higher liquidity (the 
lower liquidity stratlets have higher returns than the highly liquid ones).&lt;/p&gt;

&lt;h2 id=&quot;notes&quot;&gt;Notes&lt;/h2&gt;
&lt;ol&gt;
  &lt;li&gt;I have not accounted for execution failures (for example missed legs)
    &lt;ul&gt;
      &lt;li&gt;will have to see how this plays out in live&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Many of these stratlets are liquidity constrained, so the actual return when size adjusted will be lower.
    &lt;ul&gt;
      &lt;li&gt;the lowest liquidity portfolios can only handle, maybe, 10K / trade, while the highest, potentially a few
million.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Live performance might degrade by 3 - 5x
    &lt;ul&gt;
      &lt;li&gt;due to the above considerations&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The nature of stat/arb is that over time, the market will become more efficient
    &lt;ul&gt;
      &lt;li&gt;so I expect that the profitability of the strategy will diminish over time.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

</description>
        <pubDate>Wed, 01 Sep 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//NewCryptoStrategy/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//NewCryptoStrategy/</guid>
      </item>
    
      <item>
        <title>Crypto Trading Depth</title>
        <description>&lt;p&gt;I have a collection of crypto stat/arb strategies I plan to trade as a portfolio of strategies.  Each strategy trades a
small mean-reversion portfolio of loosely cointegrated coins, based on a bayesian state-based model.  The returns in 
cryptos for this sort of strategy are phenomenal, however, finding enough size can be difficult for some coin portfolios.&lt;/p&gt;

&lt;p&gt;In my universe of roughly 220 coins, there is significant variation in liquidity, requiring that I size these
strategy portfolios very differently depending on the coin composition.  Ultimately, the most robust answer regarding
achievable scale lies in live-testing; however, I wanted to get an indicative view on liquidity a priori.&lt;/p&gt;

&lt;p&gt;Some relevant measures:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;spread-to-sweep&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;spread required to clear k$ in liquidity (with an aggressive order sweeping multiple levels of the order book).&lt;/li&gt;
      &lt;li&gt;one would rarely sweep the book for a large $ amount unless there was an arbitrage, a massive risk event,
or a desire to impact the market; however, the measure is useful in understanding book depth.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;maximum size achievable within k bps, and time period&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Using trade data we can observe the $ amount that was filled within k bps of a reference price within
a given time period.&lt;/li&gt;
      &lt;li&gt;For different k bps buckets can produce a distribution of achieved size.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
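
&lt;p&gt;The first measure can be sketched directly from a book snapshot (the list-of-levels layout is just an illustrative
convention, not any exchange&amp;#39;s actual format):&lt;/p&gt;

```python
def spread_to_sweep(book, notional):
    # bps from the best price needed to clear `notional` dollars of asks;
    # `book` is a list of (price, quantity) levels, best price first
    best = book[0][0]
    remaining = notional
    for price, qty in book:
        remaining -= price * qty
        if remaining > 0:
            continue
        return (price - best) / best * 1e4
    raise ValueError("book too shallow for requested notional")

# toy ask book: roughly $100k resting at each of three levels
book = [(100.00, 1000), (100.05, 1000), (100.10, 1000)]
bps = spread_to_sweep(book, 250_000)
```

&lt;p&gt;Here a $250k sweep reaches the third level, 10bps away from the top of book.&lt;/p&gt;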

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;
&lt;p&gt;Many of the coins in my portfolio universe are traded on Binance and/or FTX.  Using Binance trade data, I did the
following:&lt;/p&gt;

&lt;p&gt;Create a &lt;strong&gt;distribution of size per price-spread bucket, conditional&lt;/strong&gt; on:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;up, down momentum or none&lt;/li&gt;
  &lt;li&gt;low, medium, high volatility&lt;/li&gt;
  &lt;li&gt;day of week&lt;/li&gt;
  &lt;li&gt;time of day&lt;/li&gt;
  &lt;li&gt;aggressive or passive size&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The approach was to:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;choose a reference price on a sliding window (from a buyer or seller perspective)
    &lt;ul&gt;
      &lt;li&gt;we evaluate the size distribution for some forward period across the time series relative
to the price at the start of the window&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;observe the conditional variables (above) to classify volume&lt;/li&gt;
  &lt;li&gt;evaluate the size / spread cost distribution over periods: 5min, 10min, 15min .. 30min
    &lt;ul&gt;
      &lt;li&gt;replay trades to see what could be captured at various spread to base price levels&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;size by time-of-day, day-of-week&lt;/li&gt;
  &lt;li&gt;size differences as they relate to volatility and momentum&lt;/li&gt;
  &lt;li&gt;size achievable over different time periods&lt;/li&gt;
  &lt;li&gt;size achievable for a given spread cost&lt;/li&gt;
&lt;/ol&gt;
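
&lt;p&gt;The trade replay in step 3 can be sketched as follows; the trade tuple layout and the parameter values are
illustrative only:&lt;/p&gt;

```python
def size_within_bps(trades, ref_price, max_bps, window_s):
    # dollar volume a buyer could plausibly capture within `max_bps` of the
    # reference price during the first `window_s` seconds of the window
    limit = ref_price * (1.0 + max_bps / 1e4)
    total = 0.0
    for ts, price, qty in trades:
        if ts > window_s:
            break
        if price > limit:
            continue   # traded through the cost target; not capturable
        total += price * qty
    return total

# toy tape: (seconds from window start, price, quantity)
trades = [(1, 100.00, 5), (60, 100.01, 5), (200, 100.05, 5), (400, 100.00, 5)]
vol = size_within_bps(trades, 100.00, 2.0, 300)
```

&lt;p&gt;Repeating this for each spread bucket and window length gives the size / cost distributions shown below.&lt;/p&gt;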

&lt;h2 id=&quot;time-of-day-and-day-of-week&quot;&gt;Time of Day and Day of Week&lt;/h2&gt;
&lt;p&gt;I divided coins into major (top 10% by volume), mid (50% to 90%), and minor (below 50%) groups, and examined TOD and DOW patterns
to answer the following questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Is there a substantial difference in TOD / DOW profiles between top and bottom coin groups (no)&lt;/li&gt;
  &lt;li&gt;Is there a time-of-day effect (yes)&lt;/li&gt;
  &lt;li&gt;Is there a day-of-week effect (yes)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/TOD.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/DOW.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;size--cost-distribution&quot;&gt;Size / Cost Distribution&lt;/h2&gt;
&lt;p&gt;I wanted to get a view of how much size is done (passively or aggressively) for a given direction over different time
periods and for some cost (in terms of bps deviation from the reference price).&lt;/p&gt;

&lt;p&gt;We can see a substantial difference in the amount of size traded within 5 minutes between BTC (in the top tier) and
BEAM (bottom tier):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BTC-size.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BEAM-size.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;size-accumulation-by-time-in-market--cost-basis&quot;&gt;Size accumulation by time in market / cost basis&lt;/h3&gt;
&lt;p&gt;The size accumulated within 5min - 30min scaled linearly with the amount of time in the market (for BTC/USDT):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/scale-by-period.png&quot; width=&quot;300&quot; height=&quot;200&quot; /&gt;&lt;/p&gt;

&lt;p&gt;On average, the price hovered in the vicinity of the base price across the 5min - 30min period,
such that the volume achieved within 2bps grew 1:1 in proportion to the amount of time in the market.&lt;/p&gt;

&lt;h3 id=&quot;size-in-the-context-of-momentum&quot;&gt;Size in the context of momentum&lt;/h3&gt;
&lt;p&gt;Using the momentum labeler, I observed the amount of size done in upward, downward, and neutral momentum (here are the stats
for BEAM/USDT, in the 2bps cost category):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BEAM-mv.png&quot; width=&quot;350&quot; height=&quot;250&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As expected, the results show:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Chasing momentum results in, overall, lower size
    &lt;ul&gt;
      &lt;li&gt;for example buying during upward momentum only accumulated 8k$ of size versus 11K$ neutral or 16K$ in downward momentum.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Buying in downward momentum allows for more net accumulation
    &lt;ul&gt;
      &lt;li&gt;needless to say, most size accumulated in this scenario can be passive&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;size-in-the-context-of-different-volatility-regimes&quot;&gt;Size in the context of different volatility regimes&lt;/h3&gt;
&lt;p&gt;I divided volatility into low (&amp;lt;= 30% quantile), medium (30% to 70%), and high (&amp;gt; 70%) categories to see how this would
affect liquidity (again BEAM/USDT, in the 2bps category):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BEAM-vol.png&quot; width=&quot;350&quot; height=&quot;250&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The results show that there is much more liquidity in higher volatility regimes in the bottom ranked coins.  The effect is
less pronounced in the top ranked coins.&lt;/p&gt;

&lt;h2 id=&quot;size-by-coin-5mins&quot;&gt;Size By Coin (5mins)&lt;/h2&gt;
&lt;p&gt;Here is the average size seen over a 5min execution window by coin and cost bucket (on Binance):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;pair&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;2bps&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;5bps&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;10bps&lt;/th&gt;
      &lt;th&gt;15bps&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;BTC/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7.0e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7.8e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8.8e+06&lt;/td&gt;
      &lt;td&gt;9.7e+06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;ETH/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.4e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.7e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.2e+06&lt;/td&gt;
      &lt;td&gt;5.7e+06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;BNB/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2.4e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2.6e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2.8e+06&lt;/td&gt;
      &lt;td&gt;3.0e+06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;XRP/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.7e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.8e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.9e+06&lt;/td&gt;
      &lt;td&gt;2.1e+06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;ADA/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.4e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.4e+06&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.6e+06&lt;/td&gt;
      &lt;td&gt;1.7e+06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;DOT/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8.0e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8.5e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;9.3e+05&lt;/td&gt;
      &lt;td&gt;9.9e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;MATIC/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7.6e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7.9e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8.4e+05&lt;/td&gt;
      &lt;td&gt;8.8e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;LTC/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.6e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;6.0e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;6.6e+05&lt;/td&gt;
      &lt;td&gt;7.0e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;ETC/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.1e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.3e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.6e+05&lt;/td&gt;
      &lt;td&gt;5.9e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;EOS/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.0e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.3e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.7e+05&lt;/td&gt;
      &lt;td&gt;6.1e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;VET/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.9e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.1e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.5e+05&lt;/td&gt;
      &lt;td&gt;5.8e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;CHZ/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.9e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.1e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.4e+05&lt;/td&gt;
      &lt;td&gt;5.7e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;LINK/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.4e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4.7e+05&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5.1e+05&lt;/td&gt;
      &lt;td&gt;5.5e+05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;…&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;…&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;…&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;…&lt;/td&gt;
      &lt;td&gt;…&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;BEAM/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.2e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.3e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.3e+04&lt;/td&gt;
      &lt;td&gt;1.4e+04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;WAN/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.2e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.2e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.3e+04&lt;/td&gt;
      &lt;td&gt;1.3e+04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;KMD/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.2e+04&lt;/td&gt;
      &lt;td&gt;1.3e+04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;NMR/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.2e+04&lt;/td&gt;
      &lt;td&gt;1.2e+04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;TROY/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.0e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td&gt;1.2e+04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;REP/USDT&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.0e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.0e+04&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.1e+04&lt;/td&gt;
      &lt;td&gt;1.2e+04&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;For those interested, I am happy to share the larger generated data set containing distributions by: pair, side,
day-of-week, time-of-day, period, condition &amp;amp; level.&lt;/p&gt;

&lt;h2 id=&quot;excursions&quot;&gt;Excursions&lt;/h2&gt;
&lt;p&gt;The assumption in this size analysis is that we are executing (passively or aggressively) during some fixed period and
targeting some maximum entry cost (for example 2bps).  The price will undoubtedly drift outside of our target cost
region during the execution period, particularly if it is a longer period or in the context of volatility or momentum.&lt;/p&gt;

&lt;p&gt;Hence I was interested to know what the average maximum excursion (in bps) was for a given cost target.  For coins like BEAM,
this could be rather large depending on the volatility context:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BEAM-excursion.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-08-19/BTC-excursion.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;conclusions--notes&quot;&gt;Conclusions &amp;amp; Notes&lt;/h2&gt;
&lt;ol&gt;
  &lt;li&gt;time-of-day is material, with a 45% impact in terms of addressable volume&lt;/li&gt;
  &lt;li&gt;day-of-week variation is fairly marginal, with a 10% degradation over the weekend&lt;/li&gt;
  &lt;li&gt;addressable size in the top pairs is quite substantial, and in the bottom pairs quite marginal&lt;/li&gt;
  &lt;li&gt;some coins trade on multiple exchanges, increasing addressable size&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I am not yet trading the referenced strategies, as I need to set up an offshore entity or partner with a 3rd party.  Also, I still
have some more work to do in terms of portfolio sizing and blending, but otherwise the strategies look very attractive.&lt;/p&gt;
</description>
        <pubDate>Fri, 20 Aug 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//TradingDepth/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//TradingDepth/</guid>
      </item>
    
      <item>
        <title>Pricing Deribit Options</title>
<description>&lt;p&gt;We have been working on some option strategies and wanted to get a sense of how well BTC and ETH options are
priced on Deribit, i.e. is there a substantial IV premium over realized volatility or are options fairly priced.  At first glance,
based on the documentation, it seemed that Deribit options were European options on spot or a spot equivalent.  On
closer inspection, however, and with some follow-ups with Deribit, I determined that this is actually a much
more complex product (to price).&lt;/p&gt;

&lt;p&gt;It turns out that the options have the following features:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;are &lt;strong&gt;“asian” options&lt;/strong&gt;, where the payout is based on the average of the underlier over some observation window
    &lt;ul&gt;
      &lt;li&gt;in particular these options use the average price over a 30 min or 5 min window in determining
the price at expiry&lt;/li&gt;
      &lt;li&gt;the average is an arithmetic average as opposed to a geometric average.  Asian option pricing for
arithmetic averages is more complicated as it does not admit a closed-form solution.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;variable averaging window&lt;/strong&gt; depending on whether the expiry is matched 1:1 with a future
    &lt;ul&gt;
      &lt;li&gt;a 30 minute window is used for options where the expiry does not align exactly with a future&lt;/li&gt;
      &lt;li&gt;a 5 minute window is used for options where the expiry aligns with a future&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The &lt;strong&gt;underlier is a future&lt;/strong&gt; or synthetic future, as opposed to spot
    &lt;ul&gt;
      &lt;li&gt;in practice, the future should converge to spot by expiry, but may have some different dynamics in between&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our original goal was to back out the Black-Scholes equivalent implied volatilities (IV) and determine the volatility
surface. It is worth noting that Black-Scholes (BS) has well-known drawbacks (model error) related to
its assumptions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;constant volatility across the lifetime of the option&lt;/li&gt;
  &lt;li&gt;constant interest rate across the lifetime of the option&lt;/li&gt;
  &lt;li&gt;lognormal price distribution (underestimates the tails)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In order to back out the BS IV for deribit options we will need to:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;work out how to price each “asian” option, given:
    &lt;ul&gt;
      &lt;li&gt;time to expiry, price of underlier, assumed risk-free-rate, volatility&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;iteratively adjust volatility to determine an implied volatility that reprices to the premium
    &lt;ul&gt;
      &lt;li&gt;i.e.  f(spot price, time, rfr, iv) = observed premium&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;pricing-an-asian-option&quot;&gt;Pricing an Asian Option&lt;/h2&gt;
&lt;p&gt;We cannot use the Black-Scholes formula for European options to price an Asian option; however, we can borrow
the price process assumed in BS. The Black-Scholes formulation of the price process evolves according to the
following SDE:&lt;/p&gt;

\[dS_t = r S_t \, dt + \sigma S_t \, dW_t\]

&lt;p&gt;where \(S_t\) is the price, \(r\) is the risk-free-rate, \(\sigma\) the volatility, and \(W_t\) is a Wiener process.  We
will use this later in our solution for pricing an Asian option.&lt;/p&gt;

&lt;p&gt;The payoff of an Asian option is &lt;strong&gt;path dependent&lt;/strong&gt;:  i.e. we care about where the price went across the averaging
period as opposed to only the terminal price at expiry.  Due to this path dependency, while there are some closed-form
approximations, the most accurate way to evaluate these options is with a Monte Carlo (MC) technique.&lt;/p&gt;

&lt;p&gt;With the MC technique, we simulate many price paths to option expiry using the above Black-Scholes SDE.  We observe
the price at K points within the averaging period to determine the terminal price to be used for the option value at
expiry.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-07-28/paths.png&quot; width=&quot;800&quot; height=&quot;400&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Given, say, 100K paths, for each path \(i\) we compute the average \(A_i\) during the observation period and evaluate the payoff
for a call option to be:&lt;/p&gt;

\[e^{-r t} \, max(A_i - K, 0)\]

&lt;p&gt;where \(K\) is the strike, \(r\) is the risk free rate, and \(A_i\) is the observed average on that path.   We
can determine the premium to be the &lt;strong&gt;expected value of the option payoff&lt;/strong&gt;, or in other words, the average payoff of
all of the paths.&lt;/p&gt;

\[premium = \frac{1}{n} \sum_{i = 1}^n e^{-r t} \, \mathrm{payoff}(path_i)\]

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;
&lt;p&gt;The first thing to note is that we need a discretized form of the BS price process.  Integrating for some \(\Delta t\),
this works out to be:&lt;/p&gt;

\[S_{t+\Delta t} = S_t e^{(r - \tfrac{1}{2} \sigma^2)\Delta t + \sigma \sqrt{\Delta t}\,Z}\]

&lt;p&gt;where \(Z\) is a standard normal draw.  Applying this recursively, expressing \(S_{t+\Delta t}\) in terms
of the prior price \(S_t\), we can evolve a path over K time steps until expiry.&lt;/p&gt;
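
&lt;p&gt;As a minimal sketch (the function and parameter names here are my own, not from any library), a single discretized step can be written as:&lt;/p&gt;

```kotlin
import kotlin.math.exp
import kotlin.math.sqrt

// one discretized Black-Scholes (GBM) step:
//   S(t + dt) = S(t) * exp((r - 0.5 * sigma^2) * dt + sigma * sqrt(dt) * z)
// where z is a standard normal draw
fun gbmStep(s: Double, r: Double, sigma: Double, dt: Double, z: Double): Double =
    s * exp((r - 0.5 * sigma * sigma) * dt + sigma * sqrt(dt) * z)
```

&lt;p&gt;Chaining this step, with a fresh draw per step, evolves a full path.&lt;/p&gt;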

&lt;p&gt;We would like to limit the amount of computation required to evaluate a path. Noting that prior to the
averaging period, we have no price path dependency, we can avoid computing the path from \(t = 0\) until \(T - 30min\).
This can be accomplished by:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;sampling the “prior path” up to \(S_{T - 30m}\) from the BS forward price distribution
    &lt;ul&gt;
      &lt;li&gt;given by the above discretization, where \(\Delta t = T - 30m\)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;using the sampled price \(S_{T - 30m}\), create a path with 30 additional prices
    &lt;ul&gt;
      &lt;li&gt;each price on the path post T - 30min will be used in the average for that particular path&lt;/li&gt;
      &lt;li&gt;we assume here that the average is composed of 1min samples&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are other tricks we can deploy to reduce the computation time or improve accuracy, for example:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;choose an appropriate random number source whose distribution properties are suitable for MC
    &lt;ul&gt;
      &lt;li&gt;low-discrepancy sequences such as Sobol and Halton are excellent for MC&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;precompute the random normal samples and reuse in each calculation
    &lt;ul&gt;
      &lt;li&gt;the cost of generating random numbers on the fly can be substantial&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;antithetic variates
    &lt;ul&gt;
      &lt;li&gt;for every random draw we can apply both the + and - variation, creating 2 paths for every draw&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;variance reduction
    &lt;ul&gt;
      &lt;li&gt;reframe the problem (if possible) such that we evaluate a quantity with reduced variance, requiring fewer paths&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;code&quot;&gt;Code&lt;/h3&gt;
&lt;p&gt;Here is a snippet of code for the simple case where we are pricing an Asian option before the averaging
period has started:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;let W[t] be a precomputed cache of random normals drawn from a Sobol sequence&lt;/li&gt;
  &lt;li&gt;let t = time until maturity (in 1 / 365.25 units)&lt;/li&gt;
  &lt;li&gt;let spot = the underlier at the time of evaluation (the future price in this case)&lt;/li&gt;
  &lt;li&gt;let r = risk-free-rate (interest rate)&lt;/li&gt;
  &lt;li&gt;let dir = +1 for a call, -1 for a put&lt;/li&gt;
  &lt;li&gt;let sigma = volatility&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;T1min&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;365.25&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;24&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;60.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;sqrtT1m&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;T1min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;Tprior&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;averageperiod&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T1min&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;sqrtTp&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Tprior&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;cpayoff&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pathi&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;until&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;paths&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;ri&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pathi&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;averageperiod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// compute 2 antithetic paths up to average measurement period: T - observation window&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;path1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spot&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Tprior&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrtTp&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;path2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spot&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Tprior&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrtTp&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// compute average period for the 2 antithetic paths&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;cmean1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;cmean2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;..&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;averageperiod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T1min&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrtT1m&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T1min&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrtT1m&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;cmean1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path1&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;cmean2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path2&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// compute payoffs&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;mean1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmean1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;averageperiod&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;mean2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmean2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;averageperiod&lt;/span&gt;
    
    &lt;span class=&quot;n&quot;&gt;cpayoff&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;max&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dir&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cpayoff&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;max&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dir&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;df&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Math&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;premium&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cpayoff&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;paths&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;If we are already in the averaging period then the code would be adjusted to combine the “fixings” (historical
price points) with the remaining path projected with MC.&lt;/p&gt;
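
&lt;p&gt;A hypothetical sketch of that adjustment (names are illustrative): the fixed portion of the average is already known, so only the remaining observations need to be simulated:&lt;/p&gt;

```kotlin
import kotlin.math.max

// combine already-observed fixings with the simulated remainder of the window;
// fixedSum / fixedCount come from historical prices, `simulated` from MC
fun pathAverage(fixedSum: Double, fixedCount: Int, simulated: DoubleArray): Double =
    (fixedSum + simulated.sum()) / (fixedCount + simulated.size)

// payoff for a call on the averaged price
fun callPayoff(average: Double, strike: Double): Double =
    max(average - strike, 0.0)
```

&lt;p&gt;The deeper we are into the window, the more of the average is fixed, and the less variance remains in the option value.&lt;/p&gt;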

&lt;h2 id=&quot;calculating-iv&quot;&gt;Calculating IV&lt;/h2&gt;
&lt;p&gt;Given the ability, now, to price an Asian option, we can back out the value of sigma (\(\sigma\)), our implied
volatility.  Making use of &lt;strong&gt;zbrent&lt;/strong&gt;, or another efficient root finder, we can solve:&lt;/p&gt;

\[pv(spot, time, rfr, sigma) - premium = 0\]

&lt;p&gt;for some range of sigma.&lt;/p&gt;
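
&lt;p&gt;A bisection variant is enough to sketch the idea (zbrent converges faster in practice); here &lt;code&gt;pricer&lt;/code&gt; stands in for any pricing function, such as the MC Asian pricer above, and the bracketing range is an assumption:&lt;/p&gt;

```kotlin
// back out implied volatility by bisection on sigma, assuming the price is
// monotonically increasing in sigma; the bracket [0.01, 5.0] is an assumed range
fun impliedVol(premium: Double, pricer: (Double) -> Double): Double {
    var lo = 0.01
    var hi = 5.0
    repeat(60) {
        val mid = 0.5 * (lo + hi)
        if (pricer(mid) > premium) hi = mid else lo = mid
    }
    return 0.5 * (lo + hi)
}
```

&lt;p&gt;Each iteration halves the bracket, so 60 iterations pin sigma down to well below any practical tolerance.&lt;/p&gt;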

&lt;h2 id=&quot;notes&quot;&gt;Notes&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Deribit actually uses 6 second sampling rather than 1min sampling during the averaging window
    &lt;ul&gt;
      &lt;li&gt;my current Deribit data is at 1min granularity, so I am assuming a different sampling frequency.&lt;/li&gt;
      &lt;li&gt;Deribit does selectively use either 30min or 5min windows for the average, as described above.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Deribit’s published IVs are priced assuming that the option is a European Black-Scholes option
    &lt;ul&gt;
      &lt;li&gt;their IV is not correct, so use at your own risk&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;the option underlier may be a blend of the two straddling futures if there is no matching future&lt;/li&gt;
&lt;/ul&gt;

</description>
        <pubDate>Wed, 28 Jul 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//AsianOptions/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//AsianOptions/</guid>
      </item>
    
      <item>
        <title>Learning Candlestick Patterns</title>
<description>&lt;p&gt;In the previous posts I described a Reinforcement Learning approach to “Learning the Exit” &lt;a href=&quot;https://tr8dr.github.io/RLp1/&quot;&gt;part 1&lt;/a&gt;,
&lt;a href=&quot;https://tr8dr.github.io/RLp2/&quot;&gt;part 2&lt;/a&gt;.  My initial conclusions there have been:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;reward smoothing&lt;/strong&gt; (with the labeler) &lt;strong&gt;leads to more robust results&lt;/strong&gt; than a reward on position exit
    &lt;ul&gt;
      &lt;li&gt;without smoothing the learning process struggled and had more volatility from epoch to epoch&lt;/li&gt;
      &lt;li&gt;obtained the best results with smoothed reward&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;obtained better results with &lt;strong&gt;discrete actions&lt;/strong&gt; (such as enter, hold, exit) rather than continuous
    &lt;ul&gt;
      &lt;li&gt;continuous action distributions appear to be much harder to learn, and the algorithms, such as DDPG or TD3,
are much slower to evaluate than DQN or PPO2&lt;/li&gt;
      &lt;li&gt;PPO2 produced the best results, roughly 30% better than DQN&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;the &lt;strong&gt;results showed actionable positive P&amp;amp;L&lt;/strong&gt; after accounting for transaction costs and expected slippage
    &lt;ul&gt;
      &lt;li&gt;however I did not get as close as I would have liked to the optimal reward in many scenarios&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In terms of improving RL performance, I thought I could improve the information available to the agent by adding a feature
that classifies the current price-action in terms of:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;in &lt;strong&gt;momentum&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;in &lt;strong&gt;trend&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;momentum or trend &lt;strong&gt;at an end or turning point&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;sideways price movement&lt;/strong&gt; or &lt;strong&gt;noise&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal for the RL strategy is to hold a position through a momentum or trend period in the direction of profit, and to exit
when the features signal an end to momentum or a turning point in direction.   I provided the model with a number of features that
assist in this decision, however I did not have a feature encoding any information that might be present in the
“chart”.  I am not a chartist, however there is undoubtedly information present in the price and volume
activity as seen in a chart.&lt;/p&gt;

&lt;h2 id=&quot;a-possibility&quot;&gt;A Possibility&lt;/h2&gt;
&lt;p&gt;I came across an article &lt;a href=&quot;https://towardsdatascience.com/identifying-candlestick-patterns-using-deep-learning-b7d706726874&quot;&gt;Identifying Candlestick Patterns using Deep Learning&lt;/a&gt;,
which attempts to train a DL model to “become a chart reader”, effectively.  The approach:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;use one of the top trained (million parameter) &lt;strong&gt;deep-learning vision models&lt;/strong&gt; as the basis&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;produced images of candlestick charts&lt;/strong&gt;, incrementing a rolling window across time&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;retrained the outer layers&lt;/strong&gt; to learn a new objective (&lt;strong&gt;transfer learning&lt;/strong&gt;) against these chart images&lt;/li&gt;
  &lt;li&gt;attempted to &lt;strong&gt;predict a fixed period forward return&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The use of &lt;strong&gt;transfer learning&lt;/strong&gt;, that of taking a trained
vision model, stripping off the final layers and retraining a new set of final layers, was interesting.&lt;/p&gt;

&lt;p&gt;However, I had the following reservations regarding the approach:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;complexity&lt;/strong&gt; and &lt;strong&gt;suitability&lt;/strong&gt; of a vision model with a million parameters
    &lt;ul&gt;
      &lt;li&gt;concerns about the complexity this would add to my trading environment&lt;/li&gt;
      &lt;li&gt;the chart domain is much simpler and more regular than the objects these models are trained on, so such models are not ideally suited&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;using &lt;strong&gt;fixed period return labels is not robust&lt;/strong&gt;.
    &lt;ul&gt;
      &lt;li&gt;a positive or negative return could have real support or could just be noise&lt;/li&gt;
      &lt;li&gt;I found that t+5 returns often triggered a non-zero label due to apparent noise, and did not seem to
be part of a move or have prior support in the price or volume.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;a-better-approach&quot;&gt;A Better Approach&lt;/h2&gt;
&lt;p&gt;Rather than use an enormously complicated model, I had the view that I could create a simplified
representation of a “chart” and apply classification.  Like the above approach, I would present the prior K bars in some
form as features to a ML classifier, providing an indication of individual and overall geometry.  Furthermore, I needed
to address the labeling problem, avoiding noisy labels to the extent possible.&lt;/p&gt;

&lt;h3 id=&quot;labeling&quot;&gt;Labeling&lt;/h3&gt;
&lt;p&gt;In terms of labeling, I made use of a variation of our &lt;a href=&quot;https://tr8dr.github.io/labeling/&quot;&gt;labeler&lt;/a&gt; to label in the following
manner:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;+2 / -2 for upward and downward momentum or trend with minimum magnitude of 3.5 x bar ATR&lt;/li&gt;
  &lt;li&gt;-1 / +1 for end of momentum or momentum direction transition periods&lt;/li&gt;
  &lt;li&gt;0 for sideways or noise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-05-12/labeling.png&quot; width=&quot;600&quot; height=&quot;450&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;features&quot;&gt;Features&lt;/h3&gt;
&lt;p&gt;The next question was: what set of features captures the geometry of the chart pattern?  I started with simple
features, with the notion that I could add more sophistication as and if necessary:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;for each bar in our rolling window&lt;/strong&gt; (for example 20 bars):
    &lt;ul&gt;
      &lt;li&gt;normalized open to high&lt;/li&gt;
      &lt;li&gt;normalized open to low&lt;/li&gt;
      &lt;li&gt;normalized open to close&lt;/li&gt;
      &lt;li&gt;percentage change in volume relative to a rolling MA on volume&lt;/li&gt;
      &lt;li&gt;rolling cumulative return from start of window (this gives the model a sense of geometry across the chart)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;generate the above for 1min bars during market hours across a 10 year period&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;I did not experiment with other bar granularities, but expect that “fatter” bars would tend to produce higher
accuracy, since they carry more aggregate information and smooth out noise.  However, I need to balance
timeliness (bar frequency) against the potential gains in accuracy from longer periods.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Just as with the labeling, I used each asset’s ATR as a normalizer so that the same model could be applied across
a portfolio.&lt;/p&gt;
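&lt;p&gt;A minimal sketch of the per-window feature construction described above (the exact normalization in my implementation may differ; names are illustrative):&lt;/p&gt;

```python
import numpy as np

def bar_features(o, h, l, c, volume, vol_ma, atr, window: int = 20) -> np.ndarray:
    """Build the feature vector for one rolling window of `window` bars,
    using the asset ATR as the price normalizer."""
    feats = []
    for t in range(window):
        feats += [
            (h[t] - o[t]) / atr[t],        # normalized open -> high
            (l[t] - o[t]) / atr[t],        # normalized open -> low
            (c[t] - o[t]) / atr[t],        # normalized open -> close
            volume[t] / vol_ma[t] - 1.0,   # % change in volume vs rolling MA
            (c[t] - o[0]) / atr[t],        # cumulative return from window start
        ]
    return np.array(feats)
```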

&lt;h3 id=&quot;results&quot;&gt;Results&lt;/h3&gt;
&lt;p&gt;Using a random forest classifier, I obtained the following accuracies out-of-sample:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-05-12/oos-1.png&quot; width=&quot;500&quot; height=&quot;100&quot; /&gt;&lt;/p&gt;

&lt;p&gt;where the diagonal represents accuracies for labels { -2, -1, 0, +1, +2 } respectively.  I was surprised by the level
of accuracy given the rather simple features.   The transition period labels
{ -1, +1 } did especially well, which is useful.  I took the same model and applied it to other equities, and it gave
similar accuracy, which was reassuring.&lt;/p&gt;

&lt;p&gt;A friend of mine (thanks Steve W.), suggested that I take the prior bar prediction and feed that into the model as a 
new feature.  So for time \(t\), use the label predicted by the original model \(f(x_{t-1})\), augmenting the feature
set and training a new model \(f(x_t, f(x_{t-1}))\).  This improved the results significantly:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-05-12/oos-2.png&quot; width=&quot;500&quot; height=&quot;80&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;improving&lt;/strong&gt; from &lt;strong&gt;69%&lt;/strong&gt; on the directional labels to &lt;strong&gt;74%&lt;/strong&gt;, as well as improving on the neutral and transition labels.&lt;/p&gt;
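&lt;p&gt;The augmentation can be sketched as follows (a hypothetical helper; in practice the prior-bar predictions fed in at training time should be generated out-of-fold to avoid leakage):&lt;/p&gt;

```python
import numpy as np

def augment_with_prior_prediction(X: np.ndarray, prior_pred: np.ndarray) -> np.ndarray:
    """Append f(x_{t-1}), the label predicted for the prior bar, as a new
    feature column; the first bar has no prior, so we use the neutral label 0."""
    shifted = np.concatenate([[0], prior_pred[:-1]]).astype(float).reshape(-1, 1)
    return np.hstack([X.astype(float), shifted])
```

&lt;p&gt;One would train the base model \(f\), predict labels over the training set, augment the feature matrix with this helper, and then train the second model \(f(x_t, f(x_{t-1}))\) on the widened matrix.&lt;/p&gt;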

&lt;h2 id=&quot;discussion&quot;&gt;Discussion&lt;/h2&gt;
&lt;p&gt;While the prediction accuracies are quite high, some thought is required as to how to use this as a trading strategy in its
own right.  Correctly predicting the label for a 1-minute bar is not a return easily captured, and was not the intent of the
labeling.  The directional labels point to a bar belonging to a persistent market move, rather than being individually actionable.
The transitional labels are likewise useful in identifying the end of market moves.&lt;/p&gt;

&lt;p&gt;The end goal was to produce another feature that would assist the RL algorithm in determining optimal exit strategy.  The
signal can, however, be used on its own, taking into account the meaning assigned to these labels.&lt;/p&gt;

&lt;h3 id=&quot;possible-next-steps&quot;&gt;Possible Next Steps&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;produce classification for longer bars, augmenting the finer grain predictions&lt;/li&gt;
  &lt;li&gt;consider an exit approach without RL using this signal (to avoid the extra complexity of RL)&lt;/li&gt;
  &lt;li&gt;add as feature in RL and determine whether there is a significant advantage of RL over signal-based heuristics&lt;/li&gt;
  &lt;li&gt;evaluate the predictions as a trading strategy in its own right&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 12 May 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//BarPatterns/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//BarPatterns/</guid>
      </item>
    
      <item>
        <title>Learning the Exit (part 2)</title>
        <description>&lt;p&gt;As described in my prior post &lt;a href=&quot;https://tr8dr.github.io/RLp1/&quot;&gt;Learning the Exit (part 1)&lt;/a&gt;, I have a model that indicates 
mean reversion entries with ~81% accuracy, however I did not have a good approach for handling the exit.  While 81% of
MR signals had a minimum profit of 25% (of prior amplitude), the mean profit available was 150%, pointing to
a larger profit opportunity to be had if I can better handle the exit.&lt;/p&gt;

&lt;p&gt;I have found it very difficult to predict the magnitude of the optimal MR returns a-priori (at the point of entry) in these
scenarios, achieving an \(r^2\) of only 20% with various regression models.  Hence the approach calls for online
decision making as to how to position and when to exit post entry.&lt;/p&gt;

&lt;p&gt;We want to determine a policy function \(\pi_{\theta}(a | s_t)\) to indicate the appropriate action to be taken
for each bar post entry.  In order to learn this function, be it with a genetic algorithm or reinforcement learning, we
need to define an objective/reward function.&lt;/p&gt;

&lt;h2 id=&quot;objectives&quot;&gt;Objectives&lt;/h2&gt;
&lt;p&gt;The reward function should include the following objectives:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;reward for proceeding on a profitable path&lt;/li&gt;
  &lt;li&gt;penalty for proceeding on an unprofitable path&lt;/li&gt;
  &lt;li&gt;penalty for non-performance in a prolonged sideways move&lt;/li&gt;
  &lt;li&gt;penalty for amount of risk assumed&lt;/li&gt;
  &lt;li&gt;transaction costs + slippage&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;a-reward-approach&quot;&gt;A reward approach&lt;/h2&gt;
&lt;p&gt;Many papers take the approach of &lt;strong&gt;deferring reward until position exit&lt;/strong&gt;, i.e. rewarding \(r_t = 0\) from \(T_{entry}\) to \(T_{exit-1}\)
while in position, and only presenting the cumulative reward at \(T_{exit}\).  However, this approach leads to the &lt;strong&gt;credit assignment problem&lt;/strong&gt;,
where individual actions during the holding period have an unassigned reward, requiring the RL algorithm to approximate
the implied reward through some form of backward induction.&lt;/p&gt;

&lt;p&gt;We will &lt;strong&gt;avoid the credit assignment problem&lt;/strong&gt;, and hopefully achieve faster and more accurate convergence towards a
policy model.  Providing continuous reward should also make the problem more amenable to gradient descent approaches.&lt;/p&gt;

&lt;p&gt;Making use of our &lt;a href=&quot;https://tr8dr.github.io/labeling/&quot;&gt;Labeler&lt;/a&gt;, we can label the cumulative return from the start of
the MR signaling \(T_{start}\) to some maximum \(T_{start} + \Delta T\).  Using these labels, we can determine the extent
of &lt;strong&gt;upward, downward, and sideways&lt;/strong&gt; price (return) movements.&lt;/p&gt;

&lt;h3 id=&quot;take-1&quot;&gt;Take 1&lt;/h3&gt;
&lt;p&gt;For each “bar” we will assign a reward based on the direction of the labeled extents:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;for upward sloping return movements: &lt;strong&gt;reward = dr/dt x position&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;note that dr/dt (slope) is positive, yielding a positive reward&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;for downward sloping return movements: &lt;strong&gt;reward = dr/dt x position&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;note that dr/dt (slope) is negative, yielding a negative reward&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;for sideways movements: &lt;strong&gt;reward = 0&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-27/base.png&quot; width=&quot;800&quot; height=&quot;600&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The cumulative reward in this scheme seems to be headed in the right direction, however it takes on considerable drawdown
risk (-39 bps) in order to achieve an additional 10 bps.  I would like to see the optimum at the end of the 1st upward
section (in this case).&lt;/p&gt;

&lt;h2 id=&quot;take-2&quot;&gt;Take 2&lt;/h2&gt;
&lt;p&gt;The prior reward system did not add extra penalty for drawdown.  Without some additional penalty for drawdowns, the optimal reward
will be achieved at an end point with the maximum return, regardless of unrealized drawdown.  To ameliorate this we
use the following scheme:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;for upward sloping return movements: &lt;strong&gt;reward = dr/dt x position&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;as above&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;for drawdowns: &lt;strong&gt;reward = drawdown penalty x dr/dt x position&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;we add a penalty to discourage riding through drawdowns for minor eventual gains&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;for sideways movements: &lt;strong&gt;reward = sideways penalty x position&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;this acts as a discount factor, penalizing for non-performance during extended sideways moves&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using a penalty of 2.0x for downward moves, the reward and cumulative reward look like:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-27/modified.png&quot; width=&quot;800&quot; height=&quot;600&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is good, as now the optimum shows my preference for not riding through the -39bps of drawdown in order to achieve
an additional 10bps above the 50bp move of the first section.&lt;/p&gt;
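&lt;p&gt;The per-bar reward under this scheme might be sketched as follows (the penalty values are illustrative placeholders, and drawdown here is taken relative to the signed position):&lt;/p&gt;

```python
def bar_reward(slope: float, position: float,
               dd_penalty: float = 2.0, sideways_penalty: float = -0.01,
               sideways: bool = False) -> float:
    """Take-2 reward for one bar: slope is dr/dt of the labeled extent and
    position is signed size.  When slope x position is negative, the position
    is in drawdown, so the (negative) reward is amplified by dd_penalty."""
    if sideways:
        return sideways_penalty * abs(position)  # discount for non-performance
    pnl_rate = slope * position
    return pnl_rate if pnl_rate >= 0 else dd_penalty * pnl_rate
```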

&lt;h2 id=&quot;taking-this-further&quot;&gt;Taking this further&lt;/h2&gt;
&lt;p&gt;There are other aspects one may wish to incorporate into the reward function, for example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;time:
    &lt;ul&gt;
      &lt;li&gt;do we discount rewards as time proceeds further away from signal inception?&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;transaction cost:
    &lt;ul&gt;
      &lt;li&gt;do we incorporate transaction cost at the point of entry or do we spread the cost across the
holding period?  Allocating the transaction cost to the reward at entry would tend to present a negative reward on
entry even when the entry is optimal (or likely to be profitable).&lt;/li&gt;
      &lt;li&gt;we could potentially amortize the transaction cost across the remaining directional move.  If the agent does not hold for
the full extent of the move, the remaining cost is allocated to the exit reward.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;we could also penalize for the amount of noise in a move, preferring less noisy moves.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, I would love to hear from readers regarding your own RL experience, problem setup, etc.&lt;/p&gt;

</description>
        <pubDate>Mon, 26 Apr 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//RLp2/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//RLp2/</guid>
      </item>
    
      <item>
        <title>Learning the Exit (part 1)</title>
        <description>&lt;p&gt;I have a model that indicates mean reversion with ~81% accuracy.  Here is an example of a MR scenario the model identified:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-26/MR.png&quot; width=&quot;500&quot; height=&quot;400&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The strategy is very good at detecting entry points (usually very close to optimal), however it is fairly
crude in terms of how it determines the exit.  Given that 81% of MR entries (with this model) experience a profit of at least
25% of the prior amplitude, one could take the approach of waiting for a pullback of 25%, paired with a
stop-loss.&lt;/p&gt;
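&lt;p&gt;Such a fixed-target regime might look like this (a sketch; the stop fraction is a placeholder, not a number from my testing):&lt;/p&gt;

```python
def fixed_exit_levels(entry: float, prior_amplitude: float, side: int,
                      target_frac: float = 0.25, stop_frac: float = 0.25):
    """Fixed profit-target and stop-loss for an MR entry; side = +1 for a
    long (upward-reverting) entry, -1 for a short."""
    target = entry + side * target_frac * prior_amplitude
    stop = entry - side * stop_frac * prior_amplitude
    return target, stop
```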

&lt;p&gt;While there are various profitable fixed stop-loss and profit target regimes for this model, using a fixed target
ignores market information post entry.  Here are some problems I want to solve:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;opportunity cost&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The average profit available is ~150% of the prior amplitude (due to the scenario this model detects).&lt;/li&gt;
      &lt;li&gt;Selecting a conservative fixed exit target leaves “money on the table”&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;risk reduction&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;many of the risky scenarios involve a runaway trend post signal; this is detectable&lt;/li&gt;
      &lt;li&gt;blindly holding for a k% exit, ignoring information post entry, does not make sense&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;approaches&quot;&gt;Approaches&lt;/h2&gt;
&lt;p&gt;We might start with a labeling approach using the &lt;a href=&quot;https://tr8dr.github.io/labeling/&quot;&gt;Labeler&lt;/a&gt; to identify
the target mean-reversion extent.  The labels would indicate whether we should &lt;strong&gt;enter, build
position, reduce, or exit&lt;/strong&gt;.   Given the labels, we can attempt to &lt;strong&gt;construct a ML driven model&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;supervised classifier to predict individual labels&lt;/strong&gt; (not recommended)
    &lt;ul&gt;
      &lt;li&gt;This is unlikely to work without significant noise or lag.  The features may not have support for all movements or
periods in the movement.  For example some movements may be associated with significant directional volume
and others might not be.  Forcing a model to be able to predict all points in the timeseries is doomed to failure.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;learn a policy&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Using either RL (Reinforcement Learning) or GP (Genetic Programming), learn a policy to decide what action to take at any point in time.
These algorithms are not (or need not be) penalized for missing opportunities, and can instead focus on learning detectable
scenarios.&lt;/li&gt;
      &lt;li&gt;we can make use of the labeler to inform the policy we are trying to learn; however, unlike classification, we are
not “forcing” the model to learn every label.  More on this when we discuss the reward function in the next post.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will focus on &lt;strong&gt;learning a policy&lt;/strong&gt; in this post.&lt;/p&gt;

&lt;h2 id=&quot;application-of-rl&quot;&gt;Application of RL&lt;/h2&gt;
&lt;p&gt;There are many great papers and tutorials for reinforcement learning, so I will not go into great detail here.  Our goal is
to &lt;strong&gt;create a model that decides how to trade our MR exit&lt;/strong&gt;.  RL is framed as a markov process, where we progress
from one state \(s_t\) to the next \(s_{t+1}\), based on the action \(a_t\) taken by a “policy function” \(\pi_{\theta}(a | s_t)\).&lt;/p&gt;

&lt;p&gt;Given our entry signal, the policy function \(\pi_{\theta}(a | s_t)\) will &lt;strong&gt;propose an action \(a_t\) for each bar&lt;/strong&gt; post
MR signal, &lt;strong&gt;informing how to trade&lt;/strong&gt; the prospective MR entry and exit.  The actions it takes can either be discrete 
(for example Buy, Sell, None) or continuous (such as % allocation) in nature.&lt;/p&gt;

&lt;p&gt;We learn \(\pi_{\theta}(a | s_t)\) by &lt;strong&gt;training the policy function across many “episodes”&lt;/strong&gt;.  In our case an episode starts with
the MR signal at time \(T_{start}\) and ends within some maximum holding period \(T_{start} + \Delta T\).  Reinforcement
learning requires some form of feedback in order to effect learning; this is accomplished by assigning a reward
\(r(a_t,s_t)\) for each action given a state.  The learning process attempts to learn a policy function that maximizes the accumulated
reward across episodes (roughly speaking).&lt;/p&gt;

&lt;p&gt;For a given episode, &lt;strong&gt;there will be an optimal sequence of actions&lt;/strong&gt; \(a_{t = 0} .. a_{t=T}\) that &lt;strong&gt;maximizes the cumulative
reward function&lt;/strong&gt; \(Q(s_t,a_t)\).  In Q learning, this cumulative reward
function is expressed as (source &lt;a href=&quot;https://en.wikipedia.org/wiki/Q-learning&quot;&gt;wikipedia&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-26/Q-learning.png&quot; width=&quot;700&quot; height=&quot;120&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Note that the above expression is only applicable to discrete action spaces.  Q-learning was an early implementation
of reinforcement learning, later superseded by a variety of more effective deep-learning based approaches, such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A2C&lt;/li&gt;
  &lt;li&gt;DQN&lt;/li&gt;
  &lt;li&gt;DDPG&lt;/li&gt;
  &lt;li&gt;…&lt;/li&gt;
&lt;/ul&gt;
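&lt;p&gt;As a concrete sketch, the tabular update behind the Q-learning expression above is just:&lt;/p&gt;

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Only applicable to discrete state/action spaces, as noted above."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]
```

&lt;p&gt;Here Q is any mapping from (state, action) to value, e.g. a defaultdict(float).&lt;/p&gt;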

&lt;h3 id=&quot;example&quot;&gt;Example&lt;/h3&gt;
&lt;p&gt;For example in the entry and exit scenario pictured here:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-26/MR.png&quot; width=&quot;500&quot; height=&quot;400&quot; /&gt;&lt;/p&gt;

&lt;p&gt;the optimal action sequence for an episode in our MR trading might be:&lt;/p&gt;

&lt;blockquote&gt;

  &lt;p&gt;{ \(a_0\) = None, .., \(a_9\) = None, \(a_{10}\) = &lt;strong&gt;Sell&lt;/strong&gt;, \(a_{11}\) = None, …, \(a_{80}\) = &lt;strong&gt;Buy&lt;/strong&gt; }&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;In the above example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;the (downward) MR signal was early by 10 bars (the market continued to climb for 10 bars); hence the
optimal action would be “no action” in the first ten bars.&lt;/li&gt;
  &lt;li&gt;the agent entered short at bar 10 and exited at bar 80.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Entering earlier would have reduced the cumulative reward, as an earlier entry would have resulted in negative reward for \(a_0\) through
\(a_{10}\). Likewise, entering later, say at \(a_{20}\), would have reduced the cumulative reward due to opportunity cost, not having
harvested the rewards from \(a_{10}\) to \(a_{20}\).&lt;/p&gt;

&lt;h2 id=&quot;sub-problems&quot;&gt;Sub-Problems&lt;/h2&gt;
&lt;p&gt;Breaking this down into sub-problems:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;determine feature set &amp;amp; state&lt;/strong&gt; likely to support policy learning
    &lt;ul&gt;
      &lt;li&gt;in addition to market features, we want state indicating current position, P&amp;amp;L within the episode, etc.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;defining a “gym”&lt;/strong&gt; where the agent trains on episodes (collection of MR entries and timeseries post prospective entry)
    &lt;ul&gt;
      &lt;li&gt;track current state given actions and environment&lt;/li&gt;
      &lt;li&gt;compute reward&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;determine &lt;strong&gt;reward approach&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;what is the best way to reward so that our training converges towards a model with desirable behavior?&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;determine &lt;strong&gt;actions&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;discrete or continuous action spaces: for example Buy, Sell, Hold (discrete) versus % allocation (continuous).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
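&lt;p&gt;The “gym” sub-problem might be skeletonized as follows (a hypothetical class; a real implementation would follow the gymnasium Env API, carry richer state, and use the reward design discussed in the next post rather than the placeholder here):&lt;/p&gt;

```python
class MRExitEnv:
    """Minimal episode environment for the MR exit problem (sketch)."""
    def __init__(self, features, slopes):
        self.features = features     # per-bar feature vectors post MR signal
        self.slopes = slopes         # per-bar dr/dt of the labeled extent
        self.t = 0
        self.position = 0.0

    def reset(self):
        self.t, self.position = 0, 0.0
        return (*self.features[0], self.position)

    def step(self, action):          # action in {-1: sell, 0: none, +1: buy}
        self.position += action
        # placeholder reward: slope x position (reward design is the next post)
        reward = self.slopes[self.t] * self.position
        self.t += 1
        done = self.t >= len(self.features)
        state = None if done else (*self.features[self.t], self.position)
        return state, reward, done
```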

&lt;p&gt;I will focus on the reward function in the next post.&lt;/p&gt;
</description>
        <pubDate>Mon, 26 Apr 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//RLp1/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//RLp1/</guid>
      </item>
    
      <item>
        <title>Uniswap v3 &amp; Liquidity Provision</title>
        <description>&lt;p&gt;I had a look at Uniswap’s upcoming V3 protocol &lt;a href=&quot;https://uniswap.org/blog/uniswap-v3/&quot;&gt;overview&lt;/a&gt; and 
&lt;a href=&quot;https://uniswap.org/whitepaper-v3.pdf&quot;&gt;white paper&lt;/a&gt;.  There are some substantial improvements above the 
current V2, such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;ability to offer &lt;strong&gt;“concentrated” liquidity&lt;/strong&gt; (we’ll discuss this below)&lt;/li&gt;
  &lt;li&gt;ability to introduce pools with varying fee structure&lt;/li&gt;
  &lt;li&gt;advances in oracle efficiency and new oracle types&lt;/li&gt;
  &lt;li&gt;..&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a market maker, the most important of these is the ability to “concentrate” liquidity.  What does this
mean?  As of V2, one’s &lt;strong&gt;liquidity was uniformly distributed across all price levels&lt;/strong&gt;, as opposed to being concentrated
near the prevailing price.  In limit-order book
terms, this would be equivalent to offering 1/K coins at each price level (given a maximum of K price levels).
This is &lt;strong&gt;very capital inefficient&lt;/strong&gt;, as most trading occurs near the prevailing market price.&lt;/p&gt;
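&lt;p&gt;The inefficiency is easy to quantify in the limit-order-book analogy (the numbers here are purely illustrative):&lt;/p&gt;

```python
def active_fraction(levels_total: int, levels_near_price: int) -> float:
    """Fraction of capital at levels that actually trade, given a uniform
    allocation of 1/K coins to each of K price levels."""
    return levels_near_price / levels_total

# V2-style uniform allocation: say 10 of 10,000 levels sit near the price
uniform = active_fraction(10_000, 10)   # only 0.1% of capital earns fees
# V3-style concentration of the same capital onto those 10 levels
concentrated = active_fraction(10, 10)  # all capital earns fees
```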

&lt;p&gt;Here is a diagram from the whitepaper illustrating the liquidity allocation of V2 versus possible distributions of
liquidity in the uniswap V3 pool:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-04-12/concentration.png&quot; width=&quot;800&quot; height=&quot;350&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In V3, through the use of range position contracts, one can effect any distribution of liquidity desired.&lt;/p&gt;

&lt;h2 id=&quot;problems&quot;&gt;Problems&lt;/h2&gt;
&lt;p&gt;Being able to center liquidity around the mid price (or whatever price is advantageous for inventory) is an
indispensable tool for market makers (and yield farmers for that matter).   However, as nice as this seems on the
surface, it has some significant issues:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;transactions on the Ethereum blockchain &lt;strong&gt;take 10 - 20 minutes to confirm&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;gas fees are expensive&lt;/strong&gt; and trending higher due to the large # of ERC tokens on Ethereum&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A traditional &lt;strong&gt;market maker will adjust orders (liquidity positions) at high frequency&lt;/strong&gt; to remain competitive and keep
liquidity near the prevailing market price.  With a transaction overhead on the Ethereum blockchain measured in 10 - 20
minutes, this would require the market maker to &lt;strong&gt;spread liquidity across a wider price range&lt;/strong&gt; in order to maintain participation
in the market, diluting capital efficiency and incurring more risk.&lt;/p&gt;

&lt;p&gt;The second issue relates to cost.  Centralized exchanges, both traditional and crypto, do not charge fees for order
placement.  The gas fees inherent in updating the liquidity position range on Uniswap would be prohibitive for
traditional market making.&lt;/p&gt;

&lt;h2 id=&quot;a-solution-pegged-liquidity-contract&quot;&gt;A Solution: Pegged Liquidity Contract&lt;/h2&gt;
&lt;p&gt;Uniswap should introduce a &lt;strong&gt;Pegged Liquidity&lt;/strong&gt; order (contract).  The contract would indicate the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;allocate K coins&lt;/strong&gt; according to some distribution &lt;strong&gt;relative to the prevailing price&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;express the distribution&lt;/strong&gt; as:
    &lt;ul&gt;
      &lt;li&gt;x coins at level price +/- offset1&lt;/li&gt;
      &lt;li&gt;y coins at level price +/- offset2&lt;/li&gt;
      &lt;li&gt;…&lt;/li&gt;
      &lt;li&gt;up to a sum of K coins&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;limit the lowest (highest) price&lt;/strong&gt; the contract will offer at
    &lt;ul&gt;
      &lt;li&gt;this avoids following the price down in a flash crash, or below some level at which the market maker considers
offering uneconomical&lt;/li&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As the prevailing market price moves (as determined by the oracle), the &lt;strong&gt;liquidity automatically moves with the price&lt;/strong&gt;.  This will
allow the market maker to participate in as many transactions as possible without having to issue a new directive
on the chain to move the liquidity position when the market moves.  This avoids both the ultra-high latency (10 - 20mins) and
the costs (gas) of moving the liquidity to meet the price.&lt;/p&gt;

&lt;p&gt;One can imagine other variations of this where the liquidity positioning only changes when the price increments 
by more than k%, keeping the liquidity/price offering constant for smaller moves.&lt;/p&gt;
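&lt;p&gt;A sketch of what such a directive could look like (field and function names are hypothetical, not part of any Uniswap API):&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class PeggedLiquidityOrder:
    total_coins: float
    offsets: list           # [(price_offset, coins), ...] summing to total_coins
    floor_price: float      # lowest price we will offer at (flash-crash guard)
    repeg_threshold: float  # re-center only on moves larger than k%

def should_repeg(order: PeggedLiquidityOrder, pegged_price: float,
                 oracle_price: float) -> bool:
    """Re-center the liquidity distribution only when the oracle price has
    moved more than the threshold and remains above the floor."""
    if oracle_price < order.floor_price:
        return False        # stop following the price down
    move = abs(oracle_price - pegged_price) / pegged_price
    return move > order.repeg_threshold
```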

&lt;p&gt;The proposed contract solves some of the concentration problem, however, it cannot go as far in offering the
granularity of control that is possible with centralized order books.  With a faster and cheaper blockchain
we can start approaching the efficiency of a central LOB.&lt;/p&gt;

&lt;h2 id=&quot;future-of-uniswap&quot;&gt;Future of Uniswap&lt;/h2&gt;
&lt;p&gt;Uniswap’s Achilles heel is that it is based on the Ethereum blockchain.  This has served it well until recently as many of the
coins traded on Uniswap are ERC-20 tokens, avoiding the need for Ethereum-transplanted proxy coins.  However, Ethereum
presents substantial problems for a DEX:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;non-deterministic transactions (requiring K blocks to verify a transaction, much like Bitcoin)
    &lt;ul&gt;
      &lt;li&gt;by non-deterministic we mean that the addition of a transaction to a block does not guarantee the transaction;
rather, the transaction can only be assured after a certain number of blocks have been added on the branch containing
the transaction&lt;/li&gt;
      &lt;li&gt;this increases latency substantially&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;scalability concerns (perhaps mitigated with side-chains)
    &lt;ul&gt;
      &lt;li&gt;however side-chains are not the most elegant solution and may present new attack vectors&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;high costs (gas fees)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The future for DEFI / DEX’s is likely to be on one of the newer blockchains, ones that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;provide atomic transactions (ideally)&lt;/li&gt;
  &lt;li&gt;offer higher scalability&lt;/li&gt;
  &lt;li&gt;charge lower transaction costs&lt;/li&gt;
  &lt;li&gt;interoperate with other chains&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this regard, &lt;strong&gt;Cosmos&lt;/strong&gt; looks quite attractive.  There are other blockchains and DEFI projects in competition
for the DEFI space (I will not enumerate them here).  The Ethereum Blockchain is innovating quickly (unlike the Bitcoin 
chain).  The momentum and the early lead of the Ethereum ecosystem could still win out in the end, in spite of the
attractive properties of these newer chains.&lt;/p&gt;

&lt;p&gt;Perhaps, though, the answer is not one L1 chain technology, but rather the projects that allow an &lt;strong&gt;inter&lt;/strong&gt;-net
of blockchains - allowing multiple chains to coexist and seamlessly interoperate.  &lt;strong&gt;Cosmos&lt;/strong&gt; is playing in this direction.&lt;/p&gt;

</description>
        <pubDate>Mon, 12 Apr 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//PeggedLiquidity/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//PeggedLiquidity/</guid>
      </item>
    
      <item>
        <title>Stable Coin Minting &amp; Momentum</title>
        <description>&lt;p&gt;A friend of mine (thanks Adal) suggested that I look at other stable coins, again observing the relationship between 
coin issuance and momentum.  Given that Circle (USDC) and Binance coin (BUSD) are widely traded, focused on these
two coins.&lt;/p&gt;

&lt;p&gt;Binance coin is newer than Circle, so I only have history for BUSD from late 2019.  However, I will compare the signal
from these coins for events in 2020.&lt;/p&gt;

&lt;h2 id=&quot;2020-covid-bounce&quot;&gt;2020 Covid Bounce&lt;/h2&gt;
&lt;p&gt;Here is the combined signal for USDC and BUSD during the covid bounce (this looks like a better read than USDT):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-01-09/covid.png&quot; width=&quot;800&quot; height=&quot;600&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;2020-q4-momentum&quot;&gt;2020 Q4 Momentum&lt;/h2&gt;
&lt;p&gt;USDC and BUSD accumulation is in advance of Q4 momentum:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/2021-01-09/Q4.png&quot; width=&quot;800&quot; height=&quot;600&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;Analyzing stable coin issuance gives us a view on buying interest.  A good signal would blend information from
both of these coins (perhaps looking at the cross-correlation of excitation).&lt;/p&gt;

</description>
        <pubDate>Sat, 09 Jan 2021 11:00:00 +0000</pubDate>
        <link>http://tr8dr.github.io//StableCoinMomentum/</link>
        <guid isPermaLink="true">http://tr8dr.github.io//StableCoinMomentum/</guid>
      </item>
    
  </channel>
</rss>