White paper: Frec Direct Indexing algorithm

Direct Indexing is an increasingly popular investment strategy that bypasses traditional investment in mutual funds or exchange-traded funds (ETFs) in favor of investing directly in the individual stocks within a given index, such as the S&P 500. By adopting this approach, investors can avoid mutual fund or ETF management fees¹ and gain greater control over their portfolio, allowing customization based on personal preferences or objectives.

Tax loss harvesting individual securities

A core advantage of direct indexing lies in its capacity for tax-loss harvesting (TLH), a tactic used to crystallize taxable losses in order to reduce taxes owed. To employ a TLH strategy, one periodically sells underperforming assets and invests the proceeds in similar, but not “substantially identical”, assets. This creates a taxable loss without significantly altering the investment strategy; the loss can then offset other taxable gains or be used as a deduction. This approach can be especially beneficial for high-income individuals or those in high-tax jurisdictions.

Summary

Many robo-advisor platforms leverage TLH while simultaneously tracking a major index by swapping between similarly performing ETFs during market downturns. This approach has two main drawbacks when contrasted with direct indexing. First, swapping between ETFs that are not “substantially identical” often results in higher tracking error, e.g. an investor may end up with a substantial investment in a Russell 2000 ETF despite aiming to track the S&P 500. Second, because these assets generally reflect broad market performance, they don’t allow for loss harvesting at the individual stock level. For instance, consider a recent 12 months of S&P 500 performance during a recovery period (06/2022 – 06/2023). Here the overall index generated a return of 14.4%, which would not offer an investor in one of the S&P 500 ETFs an opportunity to harvest losses at the end of this time period. However, as illustrated below, a total of 173 individual constituents were trading at a loss at the end of this period, each providing an opportunity to harvest capital losses.

Percent return for individual S&P constituents between the dates Jul 1, 2022 – Jul 1, 2023, sorted in ascending order.
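The same effect can be seen in miniature. The following is a minimal Python sketch using made-up tickers, weights, and returns (none of it is Frec or S&P data): a weighted index posts a positive return while some of its constituents sit at a loss.

```python
# Hypothetical constituents: (start-of-period index weight, 12-month return).
# These tickers and numbers are invented for illustration only.
constituents = {
    "AAA": (0.40, 0.30),
    "BBB": (0.30, 0.15),
    "CCC": (0.20, -0.10),
    "DDD": (0.10, -0.25),
}

# The index return is the weight-averaged constituent return.
index_return = sum(w * r for w, r in constituents.values())

# Constituents trading at a loss are candidates for loss harvesting,
# even though the index as a whole is up.
losers = sorted(t for t, (_, r) in constituents.items() if r < 0)

print(f"index return: {index_return:.1%}")
print(f"constituents at a loss: {losers}")
```

An investor holding only the index would see a gain and nothing to harvest, while an investor holding the four positions directly would see two loss positions.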

For self-directed investors, the effort required to expertly balance index tracking risk with tax optimization is likely too time-consuming and intricate to justify the advantages, so such optimizations are likely best carried out through algorithmic trading software. One should keep in mind, though, that there are many different algorithmic approaches to direct indexing, and that solutions available in the marketplace will differ in terms of both the underlying software and the level of transparency into these proprietary approaches. Below we provide a brief explanation of the approach currently employed at frec.com.

Methodology

Generally speaking, simply harvesting losses for a given portfolio involves some amount of bookkeeping but in itself presents no great challenge for a software application. However, weighing the tradeoff between deviating from the target index allocations and harvesting losses, and deciding how to strategically reinvest any sale proceeds, entails quite a different level of complexity.

The methodology used in Frec Direct Indexing builds on the widely used formulations from the work on Modern Portfolio Theory (MPT), also known as Markowitz Portfolio Optimization [1,3]. In this report we can only provide a high-level description of these concepts, but we encourage the interested reader to research this area independently, as it provides useful context for understanding how our direct indexing algorithms operate. Further, direct indexing involves extending the MPT framework to consider the implications of capital gains taxes and specific tax lots. While many of these details are unlikely to interest the casual reader, more detail on these formulations is provided in the Tax-Aware Portfolio Construction work of Moehle et al. [2], which closely matches the formulations and approach used by Frec.

To understand how the benchmark-tracking vs. loss-harvesting tradeoff is managed, let’s consider some examples. Suppose that the target index on a given day is defined as a list of percentage allocations into individual stocks, e.g. 7.1% AAPL, 6.5% MSFT, 3.2% AMZN, etc. Now, in our direct index portfolio we would like to harvest $100 in losses in AAPL, but doing so would require us to lower the allocation of AAPL to 5.1% and reallocate that capital (2% of the portfolio) elsewhere. Two questions arise: first, how should that 2% be reallocated, and second, is the amount by which we will deviate from the target really worth the $100 loss?

To address the many tradeoffs arising from each individual portfolio we define a scoring function that describes the tradeoffs and the constraints (e.g. avoiding wash sales) explicitly and utilize state-of-the-art numerical solvers to find improved portfolios. The scoring function which the Frec Direct Indexing algorithm attempts to minimize is given as:

As a concise summary, this scoring function effectively penalizes the over- and under-allocation of each stock based on the percentage by which it is over or under its target. In the example above, for the 2% of excess cash, the solution that minimizes the scoring function would reallocate the cash to the other non-AAPL holdings so that each is over-allocated by the same proportion relative to its weight. So, MSFT may move up from 6.5% to 7.5% while AMZN would move from 3.2% to 3.7% (both roughly 1.15x over-allocated).
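The proportional behavior described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Frec's implementation: it assumes a penalty of the form (weight − target)² / target summed over holdings, which is one form that yields equal over-allocation ratios. The function name and the three-stock universe are hypothetical.

```python
def reallocate_proportionally(targets, fixed):
    """Spread the remaining weight across non-fixed holdings in proportion
    to their target weights.

    targets: dict of ticker -> target weight (fraction of the portfolio)
    fixed:   dict of ticker -> weight pinned by a harvesting decision

    Assumption: under a penalty of sum((x - t)**2 / t) with a fixed total,
    the minimizer scales every free holding by the same ratio.
    """
    total = sum(targets.values())
    free_budget = total - sum(fixed.values())
    free_target = sum(w for t, w in targets.items() if t not in fixed)
    scale = free_budget / free_target  # common over-allocation ratio
    weights = {
        t: (fixed[t] if t in fixed else w * scale) for t, w in targets.items()
    }
    return weights, scale


# Hypothetical three-stock index; AAPL is cut from 50% to 40% to harvest a loss.
targets = {"AAPL": 0.50, "MSFT": 0.30, "AMZN": 0.20}
weights, scale = reallocate_proportionally(targets, fixed={"AAPL": 0.40})

print(f"over-allocation ratio: {scale:.2f}x")
for ticker, w in weights.items():
    print(f"{ticker}: {w:.1%}")
```

In this toy universe the freed 10% is split 6% to MSFT and 4% to AMZN, so both end up 1.2x their target weight.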

Finally, we come to whether harvesting that $100 in losses for AAPL is worth what we give up in terms of matching the target index allocations. This tradeoff is computed by taking all of these over- and under-allocation dollar amounts, squaring them, and comparing them directly to the $100 multiplied by the constant factor λ. This factor is set based on hundreds of historical backtesting simulations (detailed below), and we continually optimize this parameter on behalf of our customers in an effort to maximize long-term portfolio returns. If selling and reallocating the 2% of AAPL results in a lower value of the scoring function, then the automated system will do so at the next trading opportunity.
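A minimal sketch of this comparison follows. The exact scoring function is not reproduced in this text, so the form below (squared dollar deviations normalized by the target, minus λ times the harvested loss) is an assumption, as are the function names, the dollar amounts, and the λ values.

```python
def score(positions, targets, harvested_loss, lam):
    """Assumed score (lower is better): tracking penalty minus lam * losses.

    positions, targets: dict of ticker -> dollar amount
    harvested_loss:     dollars of losses realized by the candidate trade
    lam:                tradeoff factor (hypothetical value, not Frec's)
    """
    penalty = sum(
        (positions[t] - targets[t]) ** 2 / targets[t] for t in targets
    )
    return penalty - lam * harvested_loss


def should_harvest(hold, harvest, targets, loss, lam):
    """Harvest only if the candidate portfolio scores lower than holding."""
    return score(harvest, targets, loss, lam) < score(hold, targets, 0.0, lam)


# Hypothetical $50,000 portfolio, currently exactly on target.
targets = {"AAPL": 25_000.0, "MSFT": 15_000.0, "AMZN": 10_000.0}
hold = dict(targets)
# Candidate: sell $5,000 of AAPL for a $100 loss, reallocate proportionally.
harvest = {"AAPL": 20_000.0, "MSFT": 18_000.0, "AMZN": 12_000.0}

for lam in (10.0, 30.0):
    print(lam, should_harvest(hold, harvest, targets, loss=100.0, lam=lam))
```

With these made-up numbers the tracking penalty of the candidate is 2,000, so the harvest is taken only when λ × $100 exceeds that amount. A larger λ makes the algorithm more willing to deviate from the target in exchange for losses.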

More advanced approaches may also include the correlations between the different stocks in the portfolio, which can improve tracking error even further. While our current empirical studies have not shown a significant performance improvement with this approach, we continue to evaluate this direction.
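To illustrate what a correlation-aware penalty looks like, here is a minimal sketch of the standard quadratic form from portfolio theory, in which the deviation from target is weighted by a covariance matrix. It is not part of the approach described above; the two-stock universe, the 20% volatility, and the correlation values are hypothetical.

```python
def tracking_variance(weights, targets, cov):
    """Quadratic form d' * cov * d, where d = weights - targets."""
    d = [w - t for w, t in zip(weights, targets)]
    n = len(d)
    return sum(d[i] * cov[i][j] * d[j] for i in range(n) for j in range(n))


VOL = 0.20  # hypothetical annualized volatility for both stocks


def cov_matrix(corr):
    """2x2 covariance matrix for two stocks with equal volatility."""
    return [
        [VOL * VOL, corr * VOL * VOL],
        [corr * VOL * VOL, VOL * VOL],
    ]


targets = [0.5, 0.5]
weights = [0.6, 0.4]  # overweight one stock, underweight the other

for corr in (0.0, 0.9):
    tv = tracking_variance(weights, targets, cov_matrix(corr))
    print(f"correlation {corr}: tracking variance {tv:.6f}")
```

The same 10% swap contributes far less tracking variance when the two stocks are highly correlated, which is the information a correlation-free penalty cannot use.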

Performance

At Frec we take a scientifically rigorous approach to building our investment products, particularly our automated investment strategies such as direct indexing. While there are a number of existing academic and industry papers exploring the benefits and drawbacks of direct indexing (see e.g. [4,5,6]), it is very challenging to independently verify these findings from the published results. In order to evaluate the performance of our direct indexing implementation for the target index we offer (S&P 500), we elected to run extensive simulations on historical data for S&P 500 constituents going back 20 years.

Data

The quality of any simulation is only as good as the underlying data used. We are particularly proud of the considerable efforts taken to produce high-quality simulations, and we continue to refine our simulations going forward. The dataset used for our simulations comes from two sources. First, from S&P Global we obtained the historical weighting used by the S&P 500 index on a daily basis going back to 2003. Additionally, we license dividend- and split-adjusted daily stock prices from our primary data provider, Xignite. Even with high-quality data sources there were still dozens of data artifacts (e.g. mergers & acquisitions) and inaccuracies that had to be detected and corrected, which was done manually by our engineering teams.

Historical Simulations

In order to provide evidence for the performance of direct indexing, we consider the results of multiple simulations using overlapping 10-year periods, with starting dates every 90 days, over the period of 12/2003 – 06/2022. In each of the 36 simulation runs we iterate through time on a weekly basis, calling the direct indexing algorithm and simulating the resulting trades. Also, in this section we consider the scenario where there is only a single one-time deposit of $50,000 at the beginning of the simulation time period². While direct indexing does indeed work best when regular deposits are used, the performance implications of each of these deposits can generally be considered in isolation (more on this below).

Importantly, two different direct indexing simulations were conducted. The first simulation does not assume that any of the tax losses harvested are reinvested into the portfolio, while the second assumes that any tax losses can be converted into cash quarterly using an assumed tax rate of 42.3%. This latter tax rate represents an investor in the 95th-98th percentile tax bracket, as described in the “Investor A” profile of Khang et al. [5]. To illustrate, consider simulation results from the period 2003-12-17 to 2013-10-23. Below is a time series of capital losses incurred in each of these simulations:

Capital losses for direct indexing with reinvestment (blue, ID: 19fc) and no reinvestment (green, ID: ad6a).

Here, the capital losses for the reinvestment simulation (ID: 19fc) are significantly higher, as cash obtained by tax savings reinvested into the account grants further opportunities for tax loss harvesting. To illustrate how this quarterly reinvestment affects the overall performance we can look at the portfolio value over time for this same example:

Portfolio value for direct indexing with reinvestment (blue, ID: 19fc) and no reinvestment (green, ID: ad6a) compared to the SPY ETF (dotted black), 0.1% advisory fee not included.

Both of these simulation variants are helpful in understanding the benefits of direct indexing, as the results depend heavily on each investor’s individual tax situation. The reinvestment scenario represents near the best case for direct indexing, assuming that the investor has capital gains each year that can be written off and reinvested in the market immediately. The scenario without reinvestment represents something close to the worst case for this time period: at the end of the period the investor has carried forward $25.5k in losses (51% of the initial deposit), the value of which can be assessed by an individual familiar with their own tax situation. Finally, by leaving out reinvestment we can understand how closely the direct indexing portfolio matches the performance of the target S&P 500 index; in this example the final portfolio value is within 0.3% of the target.

Multiple Simulation Results

In order to get a better overall understanding of the performance of direct indexing, we may consider the aggregate results of multiple simulations over different time periods. Here, unfortunately, the data available for backtesting simulations is somewhat limited by the quality and resolution of historical data, especially when we consider that direct indexing is a long-term investing strategy and must therefore be evaluated on time periods relevant to the curious investor.

In light of this, we elected to run overlapping simulations on the historical data in an attempt to make the most of what data is available. Specifically, 10-year periods with 90-day staggered starting dates were used for our analysis. For example, the first simulation ran for the period 2003-12-17 to 2013-10-23, the second simulation ran from 2004-03-16 to 2014-01-23, and so on; overall, a total of 36 such simulations were conducted. Note that the results for these simulations are deterministic, so multiple simulation runs on the same time period are not required or desirable.
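The staggered schedule can be sketched with Python's standard library. The first start date, the 90-day stagger, the count of 36 runs, and the weekly stepping come from the text. The window length in calendar days is an assumption: the quoted end dates do not fall exactly ten calendar years after their start dates, presumably because of trading-day alignment, so the end dates produced here are approximate. All names are hypothetical.

```python
from datetime import date, timedelta

FIRST_START = date(2003, 12, 17)  # first start date quoted in the text
NUM_RUNS = 36
STAGGER = timedelta(days=90)
WINDOW = timedelta(days=3650)     # assumption: "10 years" in calendar days


def simulation_windows():
    """Yield (start, approximate_end) pairs for the staggered runs."""
    for k in range(NUM_RUNS):
        start = FIRST_START + k * STAGGER
        yield start, start + WINDOW


def weekly_steps(start, end):
    """Yield the dates on which a run would invoke the algorithm."""
    step = start
    while step <= end:
        yield step
        step += timedelta(weeks=1)


windows = list(simulation_windows())
print(len(windows), "runs")
print("second run starts", windows[1][0])
print("last run starts", windows[-1][0])
print("weekly steps in first run:", len(list(weekly_steps(*windows[0]))))
```

Because each run is deterministic given its window, enumerating the windows once is enough to define the whole experiment.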

As mentioned above, these simulation runs consider only the single $50,000 deposit setting as we feel that this setting provides the most clarity into the performance of direct indexing. In the regular deposit setting individual deposits can have some interaction, e.g. deposits can help fill in underweight positions sooner rather than waiting for a loss harvest opportunity, but these interactions have a relatively insignificant effect on the total losses harvested. As such, one can likely best understand the benefits of a multiple deposit setting by simply adding up the expected benefit of each deposit individually.

The primary statistics of interest for direct indexing relate to the question of how much capital loss can one expect to capture, and when in the lifetime of the portfolio can these losses be expected to occur. To answer these questions consider the following figure.

Average harvested losses in each year following the initial deposit as a percentage of the deposit; error bars are shown at two standard deviations³.

Here the losses are averaged over all 36 simulations for each year following the deposit and plotted above as a percentage of the initial deposit. After 10 years the direct indexing algorithm harvested losses summing to 45% of the initial deposit ($22,567) when reinvesting and 40% ($19,820) without reinvesting losses. Additionally, this data indicates that the large majority of the tax losses harvested occur in the first few years of a deposit, with more than half of the losses occurring within the first 2 years.

Next we consider the question of how much additional return might be expected from utilizing the direct indexing strategy. To answer this question, simulations where losses are reinvested into the portfolio become necessary, simply because if the losses harvested from direct indexing are never utilized then no financial benefit can be observed. Recall that for the reinvestment case losses are reinvested quarterly at an assumed tax rate of 42.3%, e.g. if $100 of net losses are harvested in Q1 of year 3, then at the end of the quarter an additional $42.30 will be deposited into the account and reallocated by the direct indexing algorithm.
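The reinvestment arithmetic is simple enough to state as a short sketch. The 42.3% rate and the $100 example come from the text; the function name is hypothetical, and treating a quarter with net gains as producing no deposit is an assumption, since the text does not cover that case.

```python
TAX_RATE = 0.423  # assumed tax rate used in the reinvestment simulations


def quarterly_reinvestment(net_losses_harvested, tax_rate=TAX_RATE):
    """Cash deposited at quarter end for that quarter's net harvested losses.

    Assumption: a quarter with no net losses produces no deposit.
    """
    return max(net_losses_harvested, 0.0) * tax_rate


deposit = quarterly_reinvestment(100.0)
print(f"${deposit:.2f} deposited for $100.00 of net losses")
```

Each such deposit is then allocated by the algorithm like any other cash, which is why reinvestment creates further harvesting opportunities.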

Average excess return (portfolio value – benchmark value) as a percentage of the initial deposit; error bars are shown at two standard deviations³.

The above figure shows the excess returns as a percentage of the initial deposit relative to the benchmark (SPY ETF), averaged over all 36 simulation runs with and without reinvestment. Excess return here is defined as portfolio value – benchmark value. The overall average excess return for the reinvestment portfolio is 51.3%, and the average excess compounded annual growth rate (CAGR), a.k.a. alpha, is 2.21%⁴; note that this number does not include the 0.1% advisory fee. The simulations without reinvestment show fairly consistent tracking error over the course of the simulations but do deviate from the index over time. Note that the direction of the deviation is not as meaningful as the magnitude of the observed variation in the excess return⁵. For year 10, the standard deviation of the observed excess return over the 36 simulation runs was 8.2%, or approximately +/- 0.77% annually.
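The subtlety raised in note 4, that the average of per-run CAGRs differs from the CAGR of the average return, can be illustrated with a short sketch. The growth multiples below are invented and are not Frec simulation results; the sketch works with plain growth multiples rather than excess returns.

```python
YEARS = 10


def cagr(growth_multiple, years=YEARS):
    """Compound annual growth rate for a final/initial value ratio."""
    return growth_multiple ** (1.0 / years) - 1.0


# Hypothetical 10-year growth multiples from three simulation runs.
multiples = [2.0, 2.5, 3.0]

avg_of_cagr = sum(cagr(m) for m in multiples) / len(multiples)
cagr_of_avg = cagr(sum(multiples) / len(multiples))

print(f"avg(CAGR(return_i)) = {avg_of_cagr:.4%}")
print(f"CAGR(avg(return_i)) = {cagr_of_avg:.4%}")
```

Because the annualizing function is concave in the growth multiple, averaging the CAGRs gives the smaller of the two figures here, which matches the direction described in note 4.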

Discussion

The primary observation from our perspective is that these historical simulations indicate that direct indexing can harvest a significant amount of losses from a single deposit over a short period of time, while closely tracking the broader index performance. Additionally, these harvested losses tend to occur relatively soon after the deposit is made, which is expected since the cost basis of the purchased assets is closest to their market prices immediately after purchase. Any small drop in prices will therefore lead to an opportunity to harvest losses, leading to a secondary effect: the repurchase of more stocks at their new cost basis.
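The role of cost basis can be made concrete with a small tax-lot sketch. The lot structure, prices, and share counts are hypothetical, and wash-sale constraints are deliberately ignored here.

```python
from dataclasses import dataclass


@dataclass
class TaxLot:
    ticker: str
    shares: float
    cost_basis: float  # purchase price per share


def unrealized_loss(lot, price):
    """Dollar loss available if the lot were sold at `price` (0 if at a gain)."""
    return max(lot.cost_basis - price, 0.0) * lot.shares


# Two hypothetical lots of the same stock bought at different prices.
lots = [
    TaxLot("AAPL", 10, 180.0),  # recent purchase, basis near the market price
    TaxLot("AAPL", 5, 150.0),   # older purchase, now well above its basis
]
price = 170.0

harvestable = [unrealized_loss(lot, price) for lot in lots]
print(harvestable)
```

The recent lot offers a loss after a modest price drop, while the older lot, having grown away from its basis, offers none. This is the mechanism behind losses clustering soon after a deposit.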

While it’s generally a good thing that the benefits occur early, so that they may be realized and reinvested sooner, one can also make the observation that the benefits of direct indexing diminish over time. Again, this generally makes sense as we expect the assets in our portfolio to grow over time away from their cost basis.

If you have any questions about the details, feel free to reach out at help@frec.com or schedule a call here.


References

[1] Markowitz, H.: Portfolio Selection. J. Finance 7(1), 77–91 (1952)

[2] Moehle, N., Kochenderfer, M.J., Boyd, S. et al. Tax-Aware Portfolio Construction via Convex Optimization. J Optim Theory Appl 189, 364–383 (2021). https://doi.org/10.1007/s10957-021-01823-0

[3] Grinold, R., Kahn, R.: Active Portfolio Management, 2nd edn. McGraw-Hill, New York (1999)

[4] Chaudhuri, Shomesh and Burnham, Terence C. and Lo, Andrew W., An Empirical Evaluation of Tax-Loss Harvesting Alpha (March 5, 2019). Available at SSRN: https://ssrn.com/abstract=3351382 or http://dx.doi.org/10.2139/ssrn.3351382

[5] Khang, Kevin and Cummings, Alan and Paradise, Thomas and O’Connor, Brennan, Personalized Indexing: A Portfolio Construction Plan (March 2022), https://corporate.vanguard.com/content/dam/corp/research/pdf/personalized_indexing_a_portfolio_construction_plan.pdf

[6] Wealthfront Inc., Wealthfront’s US Direct Indexing, https://research.wealthfront.com/whitepapers/stock-level-tax-loss-harvesting/

    This white paper describes implementation and performance details of a Direct Indexing approach similar to that used on the Frec platform at the time of writing (10/01/2023); details may differ from the implementation used in the product now and in the future. This paper may be amended at any time to reflect new findings, improve readability, or correct inaccuracies. A list of notable corrections follows. 10/11/2023: Estimates for the variation in the excess returns for direct indexing were updated from 2.7% to 8.2%; the previous value represented the standard deviation of the estimate as opposed to the observations. 10/10/2023: The charts and relevant text were updated with corrected data for the benchmark index in year 10.

    This white paper is for information purposes only and is not intended as tax advice. Frec refers to Frec Markets, Inc. and its wholly owned subsidiaries, Frec Advisers LLC and Frec Securities LLC. Frec does not provide legal or tax advice and does not assume any liability for the tax consequences of any client transaction. Clients should consult with their personal tax advisors regarding the tax consequences of investing with Frec and engaging in these tax strategies, based on their particular circumstances. Clients and their personal tax advisors are responsible for how the transactions conducted in an account are reported to the IRS or any other taxing authority on the investor’s personal tax returns. Frec assumes no responsibility for tax consequences to any investor of any transaction. 

    The S&P 500© index is a product of S&P Dow Jones Indices LLC or its affiliates (“SPDJI”), and has been licensed for use by Frec Markets, Inc. S&P®, S&P 500®, US 500 and The 500 are trademarks of Standard & Poor’s Financial Services LLC (“S&P”); and these trademarks have been licensed for use by SPDJI and sublicensed for certain purposes by Frec Markets, Inc. Frec’s direct indexing strategy is not sponsored, endorsed, sold or promoted by SPDJI, Dow Jones, S&P, their respective affiliates and none of such parties make any representation regarding the advisability of investing in such product(s) nor do they have any liability for any errors, omissions, or interruptions of the S&P 500© index.

    The effectiveness of Frec’s tax-loss harvesting strategy to reduce the tax liability of the client will depend on the client’s entire tax and investment profile, including purchases and dispositions in a client’s (or client’s spouse’s) accounts outside of Frec, the type of investments (e.g., taxable or nontaxable) or holding period (e.g., short-term or long-term). The performance of the new securities purchased through the tax-loss harvesting service may be better or worse than the performance of the securities that are sold for tax-loss harvesting purposes.

    1. Frec’s Direct Indexing does not charge a management fee, but does have a 0.10% annual fee.
    2. The $50,000 initial deposit was chosen to align with Frec’s initial minimum requirement for its direct indexing strategies. It is also an amount that allows for investment in almost all of the stocks of the tracked indices.
    3. Error bars indicate the approximation error of the estimate, not to be confused with the observed variance of the simulation runs.
    4. There is an important subtlety when considering the average CAGR statistic, primarily that avg(CAGR(return_i)) is not equal to CAGR(avg(return_i)). As both numbers are defensible, in our reporting we used the smaller number, avg(CAGR(return_i)).
    5. We consider deviation from the benchmark performance in the direct indexing strategy in either direction to be equally undesirable. If any algorithm could indeed consistently identify stocks that over- or under-perform the benchmark, it could be turned into a profitable pricing strategy, which we assume does not exist in an efficient marketplace.