On the Theory of Asset Allocation Part 2

In this post, I would like to share some research on using portfolio theory to allocate capital efficiently across a pair of trading systems.

Traditional theory applied outright can be problematic. As I mentioned in the previous post, the inputs are not stable over the long run (expected return in particular), so the optimized portfolio's performance will deviate significantly from backtest results. This is similar to developing a system on data that no longer reflects the current market state, which will ultimately bankrupt you.

In my opinion, there are ideas within the traditional framework that are useful. The expected return of individual assets may not be reliable, but the expected return of a well-designed system should be, in a probabilistic sense.

For the past year, I have started to view everything as return streams. By this I mean that rather than differentiating between assets and the trading systems applied to them, one should look at them equally. Although this may sound obvious, I will come back to this subject later and expand on it. This way of thinking has helped me go against traditional methods of system design to build more robust systems that are model-free and parameter-insensitive.

In portfolio theory, the lower the correlation between instruments the better. In this experiment, I will be referring to two trading systems that are different in nature: mean reversion and trend following. Both systems will be applied to the same asset, SPY. System details:

Their daily return correlation is 0.04. The following are their equity curves.

Both of them are profitable and aren't optimized at all. The test period is 1995-2012. Daily data from Yahoo Finance are used, and no commissions or slippage were taken into account. Next is their risk-reward chart, as popularized by traditional theory.

MR = mean reversion, TF = trend following, SPY = the ETF itself. From an asset allocation point of view, MR seems to be the most desirable of the three. Up next, I show the efficient frontier of the two systems from 2000-2004 and then, based on the minimum variance (MV) allocation in that period, I will forward test it. More concretely, I will compare the portfolio-level equity curves from trading the two systems together.

The MV allocation is the leftmost point on the curve; it's the allocation that minimizes portfolio variance. The following are the equity curves from 2005-2012 with different allocations.

Apologies for not plotting the legend! The red curve is buy and hold of SPY, the blue is an equal-weight allocation between the two systems, and the orange curve is the MV allocation (~19% TF and ~81% MR). From a pure return perspective, trading the MV allocation produced the most return, but from a risk-reward standpoint, the equal-weight allocation is better. In my optimization process, I found that the allocation that maximizes the Sharpe ratio would be allocating 100% to the MR system. Now the numbers…
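For a two-system portfolio, the minimum-variance weights have a closed form, so no optimizer is needed. My actual work here used R and Trading Blox; the following is just an illustrative Python sketch, with hypothetical function and variable names:

```python
import numpy as np

def min_variance_weight(returns_a, returns_b):
    """Closed-form minimum-variance weights for a two-asset portfolio.

    w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)
    """
    var_a = np.var(returns_a, ddof=1)
    var_b = np.var(returns_b, ddof=1)
    cov_ab = np.cov(returns_a, returns_b, ddof=1)[0, 1]
    w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)
    return w_a, 1.0 - w_a
```

Feeding it the two systems' daily return series gives the kind of lopsided split quoted above: with near-zero correlation, the lower-volatility stream gets the larger weight.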

If I were to choose, I would go with the equal-weight allocation, as it has, in my opinion, the risk profile that would let me sleep at night. I am not going to discuss the results in more depth as it's well past midnight; maybe another time I will come back and do something different.

Note: the portfolio-level testing was simulated using Trading Blox, while the optimization and plotting were done in R. If you have any questions or comments, please leave a comment below! Email me if you want the TB system files.

Machine Learning

I'm going to do something a bit different today. With this post, I would like to start documenting some of my adventures in Machine Learning (ML). I personally find this subject fascinating and couldn't wait two more years to take the data mining course offered at my school.

A warning: I am not formally trained in data mining, and all of the following ideas are really bits and pieces of knowledge I've accumulated from the web and from books I've been reading over the past month or so.

In ML, problems usually fall into two categories. The first is regression: taking a set of training data and predicting continuous outputs, which are usually numbers. For example, given your height, I would like to predict your weight. The second is classification: taking training data (inputs) and predicting which categories they belong to. Although the output can also take numeric values, these are usually dummy variables representing different factors. So, continuing with our earlier example, given your height and weight, a classification problem would try to predict whether you are male or female.

There are a lot of tools available to the data scientist. A list of different algorithms can be found on the following Wikipedia page. (ML Algos)

Today, I would like to share a really simple ML algorithm called the K-Nearest Neighbor algorithm, or KNN for short. This is a classification algorithm that takes a training data set and predicts which category a new input belongs to. The process is achieved by the following steps:

1. Plot training data on a 2D plane

2. Plot the value that you are trying to predict on the plane

3. Find the nearest K point(s) and hold a vote

4. Assign the new point the majority category among its K nearest neighbors

The K in "KNN" is a user-defined integer parameter specifying how many of the closest points the algorithm takes into consideration when determining the category of an input.
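For intuition, the voting procedure can be sketched from scratch in a few lines (a Python illustration only; the actual experiment below uses R's class::knn):

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = np.linalg.norm(np.asarray(train_X, dtype=float) - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest points
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]        # majority category wins
```

With height/weight rows as the training matrix and gender labels as the categories, a query near the female cluster votes "Female", and vice versa.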

For my little experiment, I will be using weight and height data to predict whether someone is female or male. I've separated the data into training (in-sample) and testing (out-of-sample) sets. The graph below shows all the data points, colored by category.

To gauge performance, I will use the mis-classification percentage (error rate). In the next graph, I plot the out-of-sample error rate as a function of K.

As you can see, there is an obvious downward trend as K increases. This can be attributed to the fact that with more surrounding information, the model can predict more accurately. In the future, I hope to post more on this and other related methods.

I have found tremendous commonalities between ML and trading system development. Although it would be very naive to use ML outright to predict stock or asset prices, I have brainstormed a few ways of using ML to build trading systems and improve existing testing methods. I hope to blog about this in the future. Stay tuned.

Code for the above exercise (drop me an email for the dataset: [email protected])


require(class)
require(ggplot2)

train.raw <- read.csv("hw_training.csv")
test.raw <- read.csv("hw_test.csv")
train <- train.raw[, 2:3]  # Height and Weight columns
test <- test.raw[, 2:3]

# Table to hold the error rate for each value of K
# (don't name this "sum" -- that would mask base R's sum() used below)
k <- 1:10
p <- rep(0, 10)
summary <- cbind(k, p)
colnames(summary) <- c("k", "Mis_Class")

# Step through different values of K
for (i in 1:10) {
  result <- knn(train, test, cl = train.raw[, 1], k = i)  # knn algo
  summary[i, 2] <- (nrow(test) - sum(diag(table(result, test.raw[, 1])))) / nrow(test)
}

# Plot the training data and the error rate as a function of K
ggplot(train.raw, aes(x = Height, y = Weight, color = Gender)) + geom_point()
qplot(summary[, 1], summary[, 2], geom = "line", xlab = "K", ylab = "Mis-Classification (%)")

Popular Valuation Ratios

Whenever you read about or research companies on different websites, they always offer some sort of ratio analysis. When a stock is selling cheap relative to its book value, it seems like a good buy. Some screen for stocks based solely on these ratios. But has the "intelligent" investor ever asked whether they actually have any predictive power for future returns?

In this post, I take the popular valuation ratios and see how they perform. My experiment will simply screen for and buy stocks that are in the top percentile. I believe it is more robust to use percentiles than hard fixed thresholds, as stocks in different industries have different thresholds that characterize their value.
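The percentile screen itself is only a few lines. My backtests were run elsewhere; this is an illustrative Python sketch with made-up tickers:

```python
import pandas as pd

def bottom_percentile_screen(ratios: pd.Series, pct: float = 0.20) -> list:
    """Return the names whose valuation ratio sits in the lowest `pct`
    of the universe (i.e. the 'cheapest' stocks by that ratio)."""
    cutoff = ratios.quantile(pct)
    return sorted(ratios[ratios <= cutoff].index)
```

Run once per rebalance date on the universe's current ratios; because the cutoff is a quantile rather than a fixed number, the screen adapts as the universe's valuations drift over time.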

The ratios I will be testing are Price to Sales, Price to Earnings, Price to Book Value, and Price to Free Cash Flow. The backtest starts in 2001 and ends in 2012, and positions are rebalanced quarterly. When I refer to the top 20th percentile, I mean the stocks that have the lowest ratios in their respective universe. My universe contains 3647 stocks that are liquid and have a market capitalization of 50 million or more; no ADRs are included. All data are adjusted for survivorship bias. Below are the equity curves.

From the above, you can see that compared to buy and hold, an investor who bought the stocks with the lowest respective ratios would have outperformed the market over the last decade.

But looking at the raw numbers, the investor would have had to put up with much higher drawdowns. In the upcoming weeks, I will come back with a rolling return graph to show the consistency of these ratios' performance over time.

SE

Value Investing with Risk Management

When I started researching investing at the age of 16, I only believed in value investing. The idea of buying things cheaply intuitively made sense. You'd find me constantly researching and reading value investing books night after night. As each company I researched had a different business model and economic moat, a lot of my earlier methods of analyzing stocks weren't systematic and therefore couldn't really be tested. Besides, where could a teenager get his hands on data back in 2006?

I stopped all my efforts in discretionary value investing in early 2010, but two years on, I can finally put my earlier ideas to the test. Let me define some of my ideas for researching companies. There are a lot of ways to value a company, from using earnings projections to discount future cash flows, to deriving value from the assets on the balance sheet; most value investors can only agree that value depends on how one defines it. For myself, I didn't like outright projections of earnings due to the myriad of factors and assumptions I had to make about the future. Instead, I placed heavy emphasis on the assets on the balance sheet, which filtered out companies that weren't heavily asset-backed.

Within the balance sheet, I simply looked at how healthy things were. The ratio of liquid assets versus long-term and short-term debt should be healthy. The current price should be trading near book value per share minus liquid cash. Profit margins should be high, indicating market share. One of my favourite ratios compares trailing 12-month net income to total debt: the higher it is, the more assured I am that the company can pay its creditors back. These are just some of the analyses I did, but what comes after is entirely subjective, hence it's hard to know whether it's luck or hard work.

To test my earlier methods of analyzing stocks, I have created a stock screen that buys stocks based on the following two conditions.

1. Closing price of stock must be less than Book Value per share (quarterly)

2. Annual Free Cash flow > Total Annual Debt
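The two conditions translate directly into a dataframe filter. A minimal Python sketch for illustration; the column names here are hypothetical, not from my actual data source:

```python
import pandas as pd

def value_screen(fundamentals: pd.DataFrame) -> pd.DataFrame:
    """Keep only the names satisfying both screen conditions.

    Assumed (hypothetical) columns: close, book_value_per_share,
    free_cash_flow, total_debt."""
    below_book = fundamentals["close"] < fundamentals["book_value_per_share"]
    fcf_covers_debt = fundamentals["free_cash_flow"] > fundamentals["total_debt"]
    return fundamentals[below_book & fcf_covers_debt]
```

At each quarterly rebalance, the screen is re-run on that quarter's fundamentals and the portfolio is rotated into the names that pass.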

The test period is 2001-2012, with quarterly rebalancing. The stock universe is based on the Russell 1000, adjusted for survivorship bias and splits. My benchmark is the S&P 500. The results, assuming 100 dollars invested:

The simple strategy multiplied the initial capital by more than 8 times, compared to the benchmark, which barely moved in the past decade. It's true that no matter how nice a business or its fundamentals are, the market's tide lifts and sinks all boats. Simply running this stock screen, the investor would face drawdowns of 50%, a bit more than holding the S&P 500, which endured a 49% MaxDD. What risk management measures can be implemented to improve performance?

In one of my earlier posts, I mentioned trading the equity curve. The modified strategy takes signals as normal until the total equity value falls below its own 4-quarter SMA, at which point I favour cash over staying invested in equities. The results:

From the above image, one can see that there is indeed a reduction in risk with the modified strategy. Although the Sharpe ratio improved by only 2.8%, MAR improved by 74%. Further research that I think may improve results is ranking each quarter's signals by some measure of price, volume, or volatility.

This was really an eye opener, as I would never have dreamt of actually testing my earlier methods in such a manner. I always thought that value investing was more of an art compared to systematic trading. In the future, I hope to start combining my research in value and momentum to show that combining uncorrelated edges improves performance.

SE

Low Volatility

Engineering Returns did a piece on whether low risk outperforms high risk, and his results confirmed that it indeed does. He measured risk as historical volatility. In this piece, I'd like to repeat the exercise with another measure and apply it to three different universes of stocks to test for robustness.

In the following study, I will be using beta as the measure of volatility. My universe of stocks is separated into three portfolios: the S&P 500, a custom mid-cap, and a custom small-cap. My custom mid-cap portfolio consists of 950 stocks with market caps between 1B and 5B, while my custom small-cap portfolio consists of 1100 stocks with market caps between 300M and 1B. All portfolios and data are from the Thomson Reuters point-in-time database, adjusted for survivorship bias.

To test, each week I will be buying either the top 50 stocks with the highest beta or the bottom 50 stocks with the lowest beta. The rationale is that stocks that are low in beta are less volatile, and vice versa for high-beta stocks. Rinse and repeat each week. The test runs from 2001 to 2012.
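The beta estimate and the weekly ranking step can be sketched as follows. This is an illustrative Python version, not the code behind the actual test; the function names are my own:

```python
import numpy as np

def beta(stock_returns, market_returns):
    """OLS beta: covariance with the market divided by market variance."""
    stock = np.asarray(stock_returns)
    market = np.asarray(market_returns)
    cov = np.cov(stock, market, ddof=1)[0, 1]
    return cov / np.var(market, ddof=1)

def lowest_beta_names(returns_by_name, market_returns, n=50):
    """Rank a universe by trailing beta and return the n lowest-beta names."""
    betas = {name: beta(r, market_returns) for name, r in returns_by_name.items()}
    return sorted(betas, key=betas.get)[:n]
```

Each week the betas are recomputed over the trailing window and the portfolio is rotated into the bottom (or top) 50 names.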

The above chart shows the strategy applied to the S&P 500 universe, assuming 100 dollars initially invested. One can clearly identify the outperformance of the strategy, which also has a smoother equity curve.

The above table shows the corresponding return and performance measures against each respective benchmark. As you can see, the low volatility (Bottom 50) stocks outperformed their high-beta counterparts in all instances, with lower drawdowns and standard deviation of returns. Note, however, that the mid-cap low volatility strategy underperformed its respective benchmark. I do not have an answer for this; it will require further testing to confirm whether size affects the performance of the low volatility anomaly.

SE

Volatility Parity

When I first started out in systems research a year ago, I was told that in this business, if you can achieve return with lower volatility, you will definitely attract people's attention. Since then, I've found myself leaning towards strategies with lower volatility, usually achieved through proper volatility management.

In this post, I'd like to take a look at portfolio volatility using some tools from portfolio theory. I'd like to show that by decomposing volatility, one can better manage a portfolio.

There exists a fine line between academic finance and practitioners of finance; the opposing ideas are whether the markets are efficient or not. I am not going to dive into that discussion, but I stand by the view that there are no fixed rules or equations for the markets. They are ever-changing; therefore, I believe one should treat every concept as a tool.

A bit of equations… Portfolio variance is defined by the following equation. I am only going to use a two-asset example to avoid bringing in the full covariance matrix.

σ_p² = w₁²σ₁² + w₂²σ₂² + 2·w₁w₂·ρ₁₂·σ₁σ₂

where w₁ and w₂ are the portfolio weights, σ₁ and σ₂ the asset volatilities, and ρ₁₂ their correlation.

The variance contribution of each asset is thus:

C₁ = (w₁²σ₁² + w₁w₂·ρ₁₂·σ₁σ₂) / σ_p²
C₂ = (w₂²σ₂² + w₁w₂·ρ₁₂·σ₁σ₂) / σ_p²

Note that C₁ + C₂ = 1, and an individual contribution can turn negative when the correlation is negative.
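The same decomposition extends to any number of assets via the covariance matrix. My charts were built in R; here is a minimal illustrative Python sketch of the computation:

```python
import numpy as np

def variance_contributions(weights, cov):
    """Fraction of total portfolio variance contributed by each asset.

    contribution_i = w_i * (cov @ w)_i / (w' cov w).
    Contributions sum to 1; an individual asset's can be negative."""
    w = np.asarray(weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    port_var = w @ cov @ w
    return w * (cov @ w) / port_var
```

For the rolling graphs below, this is simply evaluated on each trailing 252-day covariance estimate with the fixed 60/40 weights.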

In my opinion, the above equations capture a lot of information that can be used to manage volatility. At any given time, multi-market strategies will have more than one position. If you can size each position so that each one contributes equally to overall portfolio volatility, you will have a much smoother, more balanced, and more diversified portfolio.

In the following graphs, I calculated, according to the above equations, how bonds and stocks contribute to aggregate portfolio variance. For stocks I used SPY and for bonds I used IEF, both exchange-traded funds. This is a rolling 252-day graph with the traditional 60/40 allocation.

From the above graph, one can infer that the volatility contributions are not equal; at times, stocks contribute more than 100% while bonds contribute negatively.

The above graph also gives a pretty good market timing signal. When bonds contribute negatively, the market seems to be in turmoil, and vice versa when stocks contribute more than 100%.

I hope that through this, the reader will understand volatility better and look at just how it affects a portfolio.

Combining Edges

Early on in my research, I found that the academic space provides a lot of ideas and inspirational data. One such idea popped up this morning when I was over at CXO Advisory. They do a great job of gathering and filtering relevant research; I highly recommend them.

In the paper, The Supraview of Return Predictors, the authors combined a total of 333 stock return predictors, varying from accounting-based ratios all the way to price-based predictors like momentum. I thought a picture would be worth a thousand words…

The above graph's Y-axis is the gross Sharpe ratio, while the X-axis is the number of predictors. The different lines are varying correlation levels among the predictors in the portfolio. As the number of predictors included increases (equal-weighted), so does the Sharpe ratio.

The above image shows the Sharpe ratio (Y-axis) versus the correlation (X-axis) among predictors. The concept here has reinforced my own philosophy about combining uncorrelated strategies to achieve exceptional returns.

Images sourced from the paper.

Market Beat

In my continual effort to follow the markets more, rather than just focusing on system development and research, I have created a new page called "Market Beat". Here I will be updating weekly asset class performance. Currently there is only one set of graphics, displaying 52-week highs and lows. Asset classes covered include bonds, equities, precious metals, and US sectors. I hope to add more information soon when I have more ideas. Feel free to suggest some!

Equity Curve Trading

In 2006, Cambria Investment Management published a paper that gained widespread popularity, proposing that applying a long-term 10-month SMA to asset classes (rules below) would effectively cut drawdowns while preserving equity-like returns. Such a simple trend-following method, paired with easy access to exchange-traded funds for exposure, has helped make the idea very popular.

I propose a similar idea. Instead of applying an SMA to the asset classes themselves, one calculates an X-month SMA of a buy-and-hold strategy's equity curve. If the equity curve ever dips below its SMA, we go to cash. In effect, we are forming a hedge against our own equity curve.
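The rule can be sketched in a few lines. This is an illustrative Python version under my own assumptions (the signal is observed at one period's close and acted on the next period; `window` is in the same units as the return series, e.g. months):

```python
import pandas as pd

def equity_curve_filter(strategy_returns: pd.Series, window: int = 10) -> pd.Series:
    """Zero out the strategy's return whenever its own equity curve
    closed below its `window`-period SMA at the previous period."""
    equity = (1.0 + strategy_returns).cumprod()
    sma = equity.rolling(window).mean()
    invested = (equity >= sma) | sma.isna()        # stay in during the SMA warm-up
    invested = invested.shift(1, fill_value=True)  # act on the signal next period
    return strategy_returns.where(invested, 0.0)
```

Replacing the zeros with a bond return series gives the cash-to-bonds variant discussed further below.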

Below are the results for trading the S&P 500 Index. ETF data are limited, so I used the underlying total return index as a proxy. Data from 1988 to 2011 are total return, but 1970 to 1988 are normal composite data, so the numbers are probably a bit off; take them as instructive. The test duration is from 1970 to 2012 (May). No commission or slippage is accounted for. All signals are executed at the following day's open.

The equity curve in the above image shows that the strategy avoided the last two major bear markets. It preserves returns by staying on the sidelines.

The above drawdown chart for the entire test period confirms that the strategy is able to weather storms! Below is the rolling 10-month return of the strategy. You can see that it replicates the upside of buy-and-hold returns very well, and is pretty consistent across time.

I believe a lot of ideas can be paired with this method. One extension I dreamt up was using traditional portfolio optimization to form portfolios of assets that meet criteria like minimum variance or risk parity, then overlaying this method to participate only in the upside, staying in cash or bonds when the equity curve is below its SMA. You are bounded only by your own imagination.


Long Vs Long Short Part 2

I promised that I would do another post on my findings. And here it is.

In my last post on long vs. long/short, I found that a long-only breakout trend-following system showed a systematically higher Sharpe ratio over the intermediate term. Giving it a second thought, I went on to do more sensitivity analysis to check whether this phenomenon held for other trend-following strategies; if so, it may be justified to trade the futures markets long-only.

In this second experiment, I will use a dual moving average crossover system and a Bollinger Band breakout system.

The simulation parameters are the same as in the previous test: 56 instruments across all sectors and 1% of equity.

The dual moving average system has two parameters: the short-term MA and the long-term MA. I classify the short-term MA as anywhere between 10 and 40 days, and the long-term as anywhere between 50 and 200. In the following test, I fixed the short-term MA and stepped through the long-term MA in increments of 10, as I found through stepping that the performance was nearly the same regardless of how I varied the parameters. Again, my fitness function is the annual Sharpe ratio.

From this simple test, we can see that incorporating the short side into an MA strategy always improved Sharpe. I did an additional test allowing only short trades and found that all (not a typo) of the stepped tests that traded short lost money. Initially, I found this counter-intuitive: if the short side is a losing proposition, how can it improve the result when combined? One explanation I find acceptable lies in the underlying nature of the system. Because of the smoothing nature of the moving average, the number of bars and the magnitude of each bar required to reverse a signal are greater than for a breakout system. Given this, we can also infer that the number of false breakouts of an MA strategy is lower than for a pure breakout strategy. Therefore, the MA strategy, with fewer false signals and fewer trades (due to fewer whipsaws), will have a short side that is less volatile. Aggregating long and short together can nevertheless improve performance. (In a follow-up post, I would like to confirm this with additional tests, as currently I am not near my personal laboratory!)
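For reference, the dual moving average signal I'm stepping through can be sketched as follows. A Python illustration only, not the original Trading Blox implementation:

```python
import pandas as pd

def dual_ma_signal(prices: pd.Series, short: int = 20, long: int = 100) -> pd.Series:
    """+1 when the short MA is above the long MA, -1 when below,
    0 during the warm-up period before both MAs exist."""
    fast = prices.rolling(short).mean()
    slow = prices.rolling(long).mean()
    signal = pd.Series(0.0, index=prices.index)
    signal[fast > slow] = 1.0
    signal[fast < slow] = -1.0
    return signal
```

The long-only variant is simply `signal.clip(lower=0.0)`, which is the comparison the stepped tests above are making.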

Last but not least, I used another classic Bollinger Band breakout strategy and varied the length of the MA.

In this graph, I varied the lookback from 20 to 300 days in increments of 10. I found the results to confirm the last test; nothing much can be said, as a picture yields a thousand words. The conclusion is the same.
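The Bollinger Band rule being varied here can be sketched like this. Again a hedged Python illustration with my own assumed conventions (enter on a band break, hold until the opposite band breaks):

```python
import pandas as pd

def bollinger_signal(prices: pd.Series, window: int = 20, width: float = 2.0) -> pd.Series:
    """Go long on a close above the upper band, short on a close below
    the lower band, and hold the last position otherwise."""
    ma = prices.rolling(window).mean()
    sd = prices.rolling(window).std()
    signal = pd.Series(float("nan"), index=prices.index)
    signal[prices > ma + width * sd] = 1.0
    signal[prices < ma - width * sd] = -1.0
    return signal.ffill().fillna(0.0)   # carry the position; flat before any breakout
```

Stepping `window` from 20 to 300 in increments of 10, as above, reruns this same rule at each lookback length.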

All in all, it seems that the choice between trading long-only and long/short wholly depends on the underlying system. This is an acceptable conclusion, since I now have statistics that confirm it. I hope the reader enjoyed this short experiment.

good trading