BRUTE wrote:
1) does (falsely) extrapolating get less likely with bigger datasets?
2) for example, it's certainly (false) extrapolation if brute goes all-in on a stock that did well this year. but if a portfolio does well over most of 45 years, brute feels intuitively more certain. is that intuition wrong?
3) brute wants his portfolio to be mostly passive, and actually values stability. and it might be true that he's engaged in over-optimization in the past.
4) but there are tons of portfolios that did terribly on many counts even in the past - so wouldn't using one of those be worse...
5) for example, while the PP might have been hypothesized before actually looking at the data, it would have historically limited gains pretty strongly. so it would be a case of maybe running into a future problem vs. almost certainly running into the problem the PP has always had. (of course not applicable if the goal of the portfolio is not to grow, but just to protect existing money - which the PP seems to do).
6) the TFP mentioned by FBeyer still had 9-year drawdowns, and didn't do so well in 2008. it did survive 2000 remarkably well though, and has been doing well since 2009.
point being, it seems like there is no free lunch.
I wish we had a whiteboard and a conference room... I don't even know the financials, but the data part alone could take a few hours of lectures. From a data point of view I think these are good questions, and I'll answer them to the best of my current knowledge.
1) No. And Yes.
If 'the process that generates the data' is the same, then more data is better. If the process is different, you need to identify the characteristics of the process to know how to interpret the outcome. Economic conditions change over time, and so the process that generates returns differs over time. Every specific return at some point in time is the result of politics, investor expectations, actual conditions, current momentum and the expected stability of returns over time. History shows that unstable times yield more than stable times do. Times have been quite stable since the end of the Second World War, right? Many argue that the post-war buildup has given investors an unrealistic idea of the long-term gains of general investments, i.e. indexing. Go back 70 years and 'things always go up'. Why is that necessarily the case for the next 30-40 years, in which you and I are supposed to live off of our investments?
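A minimal sketch of that point in Python, with completely made-up regime numbers: no amount of data from the old process rescues an estimate that gets applied to a new process.

import numpy as np

rng = np.random.default_rng(0)

# Toy model (all numbers invented): annual returns are normal, but the
# data-generating process shifts - mean 7% in the "post-war buildup"
# regime, mean 3% afterwards, 15% volatility in both.
old_regime = rng.normal(0.07, 0.15, size=10_000)  # lots of history
new_regime = rng.normal(0.03, 0.15, size=30)      # the next 30 years

estimate = old_regime.mean()  # very precise estimate of the old regime...
realized = new_regime.mean()  # ...and still wrong about the new one

print(f"estimated mean return: {estimate:.2%}")
print(f"realized mean return:  {realized:.2%}")

More data shrinks the error bar around the 7%, but the quantity you estimated so precisely no longer describes the future.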
Statistical analysis can only give you more information after you have applied as much expert knowledge as possible. Math and graphs do nothing on their own, without coupling to domain-specific knowledge. It is the most difficult task of the statistical consultant to elicit that knowledge from the client before, during and after gathering data. The graphs support the hypothesis. You cannot form a hypothesis based on past data and conclude that you are right; you must conduct a new experiment to see if the hypothesis applies to newly acquired data as well. Pattern recognition is a discipline all on its own, and the alluring thing about it is that you seem to get results from data immediately, but what you truly get is 'ideas to test' from past results. It's very hard to explain to people why a conclusion drawn from a past pattern is not a result, but a suggestion. Confusing the two is the basic pitfall of p-hacking. Expert knowledge of a domain is the pillar on which data analysis and pattern recognition rest, not the other way around.
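Here is the pitfall as a toy simulation (numbers invented): generate a pile of "strategies" that are pure noise, pick the best-looking one in the past data, and watch the "result" evaporate on new data.

import numpy as np

rng = np.random.default_rng(1)

# 1000 strategies whose returns are pure noise, 30 years each.
in_sample  = rng.normal(0.0, 0.15, size=(1000, 30))
out_sample = rng.normal(0.0, 0.15, size=(1000, 30))

# "Pattern recognition" on past data: pick the best-looking strategy.
best = in_sample.mean(axis=1).argmax()

print(f"in-sample mean:     {in_sample[best].mean():.2%}")   # looks brilliant
print(f"out-of-sample mean: {out_sample[best].mean():.2%}")  # back to ~0%

The in-sample winner was an idea to test, not a result; the out-of-sample run is the new experiment.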
More data means more power. Power means the ability to detect smaller effects/differences. The actual cause of the difference is up to the expert to hypothesise about; then it's back to the statistician, who has to help conduct a new experiment to see if the hypothesis is correct. That means that with a lot of historical data, we're quite certain that small cap value stocks HAVE indeed generated a higher return than the total stock market. Why that is is up to someone else to find out. Whether that will be the case in the future depends on the underlying mechanism that generated the excess return.
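A quick power sketch (the 2%/year edge and the volatilities are invented for illustration): more years of data make a small, fixed difference easier to detect, which is all "power" means here.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical effect: small cap value beats TSM by 2%/year.
def power(n_years, trials=2000):
    hits = 0
    for _ in range(trials):
        scv = rng.normal(0.09, 0.17, n_years)
        tsm = rng.normal(0.07, 0.17, n_years)
        hits += stats.ttest_ind(scv, tsm).pvalue < 0.05
    return hits / trials

for n in (30, 500, 2000):
    print(f"{n:5d} years of data -> power {power(n):.0%}")

With 30 years you almost never detect the edge; with enough data you almost always do. Detecting it says nothing about why it exists or whether it persists.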
TL;DR: extrapolation works when you know that the domain you're extrapolating to has the same conditions as the domain you're extrapolating from. That is why a higher sampling density on a closed interval gives you better statistics (interpolation), but forecasting beyond that interval does not automatically get better with more data (extrapolation). You require expert knowledge of the subject to make efficient use of a forecast.
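The same TL;DR as a tiny experiment (toy function, invented noise): a flexible fit interpolates nicely inside the densely sampled interval and falls apart the moment you step outside it.

import numpy as np

rng = np.random.default_rng(3)

# Dense sampling on a closed interval [0, 10]...
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.1, x.size)  # noisy observations

coeffs = np.polyfit(x, y, deg=7)  # flexible fit, fine on [0, 10]

for point in (5.0, 14.0):  # inside vs. outside the sampled interval
    print(f"x={point}: fit {np.polyval(coeffs, point):+.2f}, "
          f"truth {np.sin(point):+.2f}")

Inside the interval the fit is close to the truth; at x=14 the polynomial has exploded, and no amount of extra sampling inside [0, 10] would have fixed that.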
2) Uncertainty on the mean. By sampling over an index rather than a single stock you're averaging out the effect of every single company and trying to capture the compound effect. By analysing many companies over a long time frame, compared to one company over a short time frame, you're simply getting a better estimate of the uncertainty on the mean, or the uncertainty on the CAGR if you will. An investor will most likely want to capture one-tailed outliers, not the average, but your knowledge of the CAGR is better. The improved statistics should give you a much better confidence interval when projecting into the future, yes. That is, if the underlying data-generating process is still exactly the same... If that is not the case, then the error from the wrong data-generating model matters far more than the accuracy of your parameter estimates from past times. You can extrapolate quite well if you don't extrapolate very far, but you and I are most likely trying to extrapolate 30 years into the future with our lazy portfolios, aren't we?
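A sketch of what "uncertainty on the CAGR" means in practice, via a bootstrap on 45 invented annual returns:

import numpy as np

rng = np.random.default_rng(4)

returns = rng.normal(0.07, 0.15, size=45)  # 45 made-up annual returns

def cagr(r):
    return np.prod(1.0 + r) ** (1.0 / r.size) - 1.0

# Bootstrap the uncertainty on the CAGR estimate itself.
boot = np.array([cagr(rng.choice(returns, size=returns.size, replace=True))
                 for _ in range(10_000)])
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"CAGR estimate {cagr(returns):.2%}, 95% CI [{lo:.2%}, {hi:.2%}]")

Note that the interval only quantifies sampling noise. If the data-generating process changes, the whole interval is beside the point, which is exactly the caveat above.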
The intuition is not wrong, just a bit off. You're more certain that the returns are actual returns - that the numbers are real, if you will, rather than flukes of mispricing (fuck the EMH) - but the continued realization of those returns in the future depends on point 1) above. The index is better priced, almost by definition, than single companies are. Whether the index is correctly priced is something else entirely.
Investments will make money if the fundamentals to make money are there and if the prevalent investor psychology is there to drive prices up. That is the basic issue with fundamental analysis: a vastly underpriced company will not rise in price until the rest of the stock market catches up. Are the fundamentals there to keep driving the stock market up at the same pace as in the past, for the next 30 years?
3) If you want stability, why are we talking about a portfolio composed of the two most volatile stock indices? The Permanent Portfolio is designed to be stable. The Global Portfolio is designed to be lazy. Choose one of those instead. There is expert knowledge behind the design of those two; they are not backtesting-driven builds.
4) Yes. Do you KNOW and UNDERSTAND why those portfolios fared badly in the past, or can you just see that they fared badly by looking at the graphs on portfoliocharts?
5) The PP does exactly what it was designed to do: preserve capital. Its exact performance, relative to what it was designed to do, is IMO one of the reasons why so many people talk about it. This was not a CAGR/efficient-frontier-optimized portfolio; it had a specific goal from the outset and was built from first principles.
6) I pulled numbers out of my ass (PUMA). My point was not to find the portfolio with the smallest drawdowns in the past - that would most likely be the Golden Butterfly - my point was that your choice of reference, i.e. the TSM, was arbitrary. Again: apply knowledge first, then dig around in the data, not the other way round.
I know I don't know a lot of things. If I knew things, I wouldn't need to diversify my own investments...