# iMetrica for Linux Ubuntu 64 now available

The MDFA real-time signal extraction module

My first open-source release of iMetrica for Linux Ubuntu 64 can now be downloaded at my Github, with a Windows 64 version soon to follow. iMetrica is a fast, interactive, GUI-oriented software suite for predictive modeling, multivariate time series analysis, real-time signal extraction, Bayesian financial econometrics, and much more.

The principal use of iMetrica is to provide an interactive environment for the numerical and visual analysis of (multivariate) time series modeling, real-time filtering, and signal extraction. The interactive features in iMetrica boast a modeling and graphics environment for analysts, practitioners, and students of econometrics, finance, and real-time data analysis where no coding or modeling experience is necessary. All the system needs is data which can be piped into the system in many forms, including .csv, .txt, Google/Yahoo Finance, Quandle, .RData, and more. A module for connecting to MySQL databases is currently being developed. One can also simulate their own data from a one or a combination of several different popular data generating models.

With the design intending to be interactive and self-enclosed, one can change modeling data/parameter inputs and see the effects in both graphical and numerical form automatically. This feature is designed to help understand the underlying mechanics of the modeling or filtering process. One can test many attributes of the modeling or filtering process this way both visually and numerically such as sensitivity, nonlinearity, goodness-of-fit, any overfitting issues, stability, etc.

All the computational libraries were written in GNU C and/or Fortran and have been provided as Native libraries to the Java platform via JNI, where Java provides the user-interface, control, graphics, and several other components in a module format, and where each module specializes in a different data analysis paradigm. The modules available in this open-source version of iMetrica are as follows:

1) Data simulation, modeling and fitting using several popular econometric models

• (S)ARIMA, (E)GARCH, (Multivariate) Factor models, Stochastic Volatility, High-frequency volatility models, Cycles/Trends, and more
• Random number generators from several different types of parameterized distributions to create shocks, outliers, regression components, etc.
• Visualize in real-time all components of the modeling process

2) An interactive GUI for multivariate real-time signal extraction using the multivariate direct filter approach (MDFA)

• Construct mulitvariate MA filter designs, classical ARMA ZPA filtering designs, or hybrid filtering designs.
• Analyze all components of the filtering and signal extraction process, from time-delay and smoothing control, to regularization.
• Adaptive real-time filtering
• Construct financial trading signals and forecasts
• Includes a real-time/frequency analysis module using MDFA

3) An interactive GUI for X-13-ARIMA-SEATS called uSimX13

• Perform automatic seasonal adjustment on thousands of economic time series
• Compare SARIMA model choices using several different novel signal extraction diagnostics and tools available only in iMetrica
• Visualize in real-time several components of modeling process
• Analyze forecasts and compare with other models
• All of the most important features of X-13-ARIMA-SEATS included

4) An interactive GUI for RegComponent (State Space and Unobserved Component Models)

• Construct unobserved signal components and time-varying regression components
• Obtain forecasts automatically and compare with other forecasting models

5) Empirical Mode Decomposition

• Applies a fast adaptive EMD algorithm to decompose nonlinear, nonstationary data into a trend and instrinsic modes.
• Visualize all time-frequency components with automatically generated 2D heat maps.

6) Bayesian Time Series Modeling of ARIMA, (E)GARCH, Multivariate Stochastic Volatility, HEAVY models

• Compute and visualize posterior distribtions for all modeling parameters
• Easily compare different model dimensions

7) Financial Trading Strategy Engineering with MDFA

• Construct financial trading signals in the MDFA module and backtest the strategies on any frequency of data
• Perform analysis of the strategies using forward-walk schemes
• Automatically optimize certain components of the signal extraction on in-sample data.
• Features a toolkit for minimizing probability of backtest overfit

Tutorials on how to use iMetrica can be found on this blog and will be added on a weekly basis, with new tools, features, and modules being added and improved on a consistent basis.

Please send any bug reports, comments, complaints, to clisztian@gmail.com.

# High-Frequency Financial Trading with Multivariate Direct Filtering Part Deux: Index Futures

Out-of-sample performance (cash, blue-to-pink line) of an MDFA low-pass filter built using the approach discussed in this article on the STOXX Europe 50 index futures with expiration March 18th (STXEH3) for 200 15-minute interval log-return observations. Only one small loss out-of-sample was recorded from the period of Jan 18th through February 1st, 2013.

The frequency of observations on the index that are to be considered for building trading filters using MDFA is typically only a question of taste and priorities.  The beauty of MDFA lies in not only the versatility and strength in building trading signals for virtually any financial trading priorities, but also in the independence on the underlying observation frequency of the data. In my previous articles, I’ve considered and built high-performing strategies for daily, hourly, and 15 minute log returns, where the focus of the strategy in building the signal began with viewing the periodogram as the main barometer in searching for optimal frequencies on which one should set the low-pass cutoff for the extracting target filter $\Gamma$ function.

Index futures, as in a futures contract on a financial index, as we will see in this article present no new challenges. With the success I had on the 15-minute return observation frequency that I utilized in my previous article in building a signal for the Japanese Yen, I will continue to use the 15 minute intervals for the index futures where I hope to shed some more light on the filter selection process. This includes deducing properties of the intrinsically optimal spectral peaks to trade on. To do this, I present a simple approach I take in these examples by first setting up a bandpass filter over the spectral peak in the periodogram and then study the in-sample and out-of-sample characteristics of this signal, both in performance and consistency. So without further ado, I present my experiments with financial trading on index futures using MDFA, in iMetrica.

### STOXX Europe 50 Index (STXE H3, Expiration March 18 2013)

The STOXX Europe 50 Index, Europe’s leading Blue-chip index, provides a representation of sector leaders in Europe. The index covers 50 stocks from 18 European countries and has the highest trading volume of any European index. One of the first things to notice with the 15-minute log-returns of STXE are the frequent large spikes. These spikes will occur every 27 observations at 13:30 (UTC time zone) due to the fact that there are 26 15-minute periods during the trading hours. These spikes represent the close-to-open jumps that the STOXX Europe 50 index has been subjected to and then reflected in the price of the futures contract. With this ‘seasonal’ pattern so obviously present in the log-return data, the frequency effects of this pattern should be clearly visible in the periodogram of the data. The beauty of MDFA (and iMetrica) is that we have the ability to explicitly engineer a trading signal to take advantage of this ‘seasonal’ pattern by building an appropriate extractor $\Gamma$.

Figure 1: Log-returns of STXE for the 15-min observations from 1-4-2013 to 2-1-2013

Regarding the periodogram of the data, Figure 2 depicts the periodograms for the 15 minute log-returns of STXE (red) and the explanatory series (pink) together on the same discretized frequency domain. Notice that in both log-return series, there is a principal spectral peak found between .23 and .32.   The trick is to localize the spectral peak that accounts for the cyclical pattern that is brought about by the close-to-open variation between 20:00 and 13:30 UTC.

Figure 2: The periodograms for the 15 minute log-returns of STXE (red) and the explanatory series (pink).

In order to see the effects of the MDFA filter when localizing this spectral peak, I use my target $\Gamma$ builder interface in iMetrica to set the necessary cutoffs for the bandpass filter directly covering both spectral peaks of the log-returns, which are found between .23 and .32. This is shown in Figure 3, where the two dashed red lines show indicate the cutoffs and spectral peak is clearly inside these two cutoffs, with the spectral peak for both series occurring in the vicinity of $\pi/12$.  Once the bandpass target $\Gamma$ was fixed on this small frequency range, I set the regularization parameters for the filter coefficients to be $\lambda_{smooth} = .86$,   $\lambda_{decay} = .11$,  and $\lambda_{decay2} = .11$.

Figure 3: Choosing the cutoffs for the band pass filter to localize the spectral peak.

Pinpointing this frequency range that zooms in on the largest spectral peak generates a filter that acts on the intrinsic cycles found in the 15 minute log-returns of the STXE futures index. The resulting trading signal produced by this spectral peak extraction is shown in Figure 4, with the returns (blue to pink line) generated from the trading signal (green) , and the price of the STXE futures index in gray. The cyclical effects in the signal include the close-to-open variations in the data. Notice how the signal predicts the variation of the close-to-open price in the index quite well, seen from the large jumps or falls in price every 27 observations. The performance of this simple design in extracting the spectral peak of STXE yields a 4 percent ROI on 200 observations out-of-sample with only 3 losses out of 20 total trades (85 percent trade success rate), with two of them being accounted for towards the very end of the out-of-sample observations in an uncharacteristic volatile period occurring on January 31st 2013.

Figure 4: The performance in-sample and out-of-sample of the spectral peak localizing bandpass filter.

The two concurrent frequency response (transfer) functions for the two filters acting on the STXE log-return data (purple) and the explanatory series (blue), respectively, are plotted below in Figure 5. Notice the presence of the spectral peaks for both series being accounted for in the vicinity of the frequency $\pi/12$, with mild damping at the peak. Slow damping of the noise in the higher frequencies is aided by the addition of a smoothing expweight parameter that was set at $\alpha = 4$.

Figure 5: The concurrent frequency response functions of the localizing spectral peak band-pass filter.

With the ideal characteristics of a trading signal quite present in this simple bandpass filter, namely smooth decaying filter coefficients, in-sample and out-of-sample performance properties identical, and accurate, consistent trading patterns, it would be hard to imagine on improving the trading signal for this European futures index even more. But we can. We simply keep the spectral peak frequencies intact, but also account for the local bias in log-return data by extending the lower cutoff to frequency zero. This will provide improved systematic trading characteristics by not only predicting the close-to-open variation and jumps, but also handling upswings and downswings, and highly volatile periods much better.

In this new design, I created a low-pass filter by keeping the upper cutoff $\omega_1$ from the band-pass design and setting the lower cutoff to 0. I also increased the smoothing parameter to $\alpha = 32$. In this newly designed filter, we see a vast improvement in the trading structure. As before, the filter was able to deduce the direction of every single close-to-open jump during the 200 out-of-sample observations, but notice that it was also able to become much more flexible in the trading during any upswing/downswing and volatile period. This is seen in more detail in Figure 7, where I added the letter ‘D’ to each of the 5 major buy/sell signals occurring before close.

Figure 6: Performance of filter both in-sample (left of cyan line) and on 210 observations out-of-sample (right of cyan line).

Notice that the signal predicted the jump correctly for each of these major jumps, resulting in large returns. For example, at the first “D” indicator,  the signal indicated sell/short (magenta dashed line) the STXE future index 5 observations before close, namely at 18:45 UTC, before market close at 20:00 UTC. Sure enough, the price of the STXE contract went down during overnight trading hours and opened far below the previous days close, with the filter signaling a buy (green dashed line) 45 minutes into trading. At the mark of the second “D”, we see that on the final observation before market close, the signal produced a buy/long indication, and indeed, the next day the price of the future jumped significantly.

Figure 7: Zooming in on the out-of-sample performance and showing all the signal responses that predicted the major close-to-open jumps.

Only two very small losses of less than .08 percent were accounted for.  One advantage of including the frequency zero along with the spectral peak frequency of STXE is that the local bias can help push-up or pull-down the signal resulting in a more ‘patient’ or ‘diligent’ filter that can better handle long upswings/downswings or volatile periods. This is seen in the improvement of the performance towards the end of the 200 observations out-of-sample, where the filter is more patient in signaling a sell/short after the previous buy. Compare this with the end of the performance from the band-pass filter, Figure 4.  With this trading signal out-of-sample, I computed a 5 percent ROI on the 200 observations out-of-sample with only 2 small losses. The trading statistics for the entire in-sample combined with out-of-sample are shown in Figure 8.

Figure 8: The total performance statistics of the STXEH3 trading signal in-sample plus out-of-sample. The max drop indicates -0 due to the fact that there was a truncation to two decimal places. Thus the losses were less than .01.

### S&P 500 Futures Index (ES H3, Expiration March 18 2013)

In this experiment trading S&P 500 future contracts (E-mini) on observations of 15 minute intervals from Jan 4th to Feb 1st 2013, I apply the same regimental approach as before.  In looking at the log-returns of ESH3 shown in Figure 10, the effect of close-to-open variation seem to be much less prominent here compared to that on the STXE future index. Because of this, the log-returns seem to be much closer to ‘white noise’ on this index. Let that not perturb our pursuit of a high performing trading signal however. The approach I take for extracting the trading signal, as always, begins with the periodogram.

Figure 10: The log-return data of ES H3 at 15 minute intervals from 1-4-2013 to 2-1-2013.

As the large variations in the close-to-open price are not nearly as prominent, it would make sense that the same spectral peak found before at near $\pi/12$ is not nearly as prominent either. We can clearly see this in the periodogram plotted below in Figure 11. In fact, the spectral peak at $\pi/12$ is slightly larger in the explanatory series (pink), thus we should still be able to take advantage of any sort of close-to-open variation that exists in the E-min future index.

Figure 11: Periodograms of ES H3 log-returns (red) and the explanatory series (pink). The red dashed vertical lines are framing the spectral peak between .23 and .32.

With this spectral peak extracted from the series, the resulting trading signal is shown in Figure 12 with the performance of the bandpass signal shown in Figure 13.

Figure 12: The signal built from the extracted spectral peak and the log-return ESH3 data.

One can clearly see that the trading signal performs very well during the consistent cyclical behavior in the ESH3 price, However, when breakdown occurs in this stochastic structure and follows more prominently another frequency, the trading signal dies and no longer trades systematically taking advantage of the intrinsic cycle found near $\pi/12$. This can be seen in the middle 90 or so observations. The price can be seen to follow more closely a random walk and the trading becomes inconsistent. After this period of 90 or so observations however, just after the beginning of the out-of-sample period, the trajectory of the ESH3 follows back on its consistent course with a $\pi/12$ cyclical component it had before.

Figure 13: The performance in-sample and out-of-sample of the simple bandpass filter extracting the spectral peak.

Now to improve on these results, we include the frequency zero by moving the lower cutoff of the previous band-pass filter to $\latex \omega_0 = 0$.  As I mentioned before, this lifts or pushes down the signal from the local bias and will trade much more systematically. I then lessened the amount of smoothing in the expweight function to $\alpha = 24$, down from $\alpha = 36$ as I had on the band-pass filter.  This allows for slightly higher frequencies than $\pi/12$ to be traded on. I then proceeded to adjust the regularization parameters to obtain a healthy dosage of smoothness and decay in the coefficients. The result of this new low-pass filter design is shown in Figure 14.

Figure 14: Performance out-of-sample (right of cyan line) of the ES H3 filter on 200 15 minute observations.

The improvement in the overall systematic trading and performance is clear. Most of the major improvements came from the middle 90 points where the trading became less cyclical. With 6 losses in the band-pass design during this period alone, I was able to turn those losses into two large gains and no losses.  Only one major loss was accounted for during the 200 observation out-of-sample testing of filter from January 18th to February 1st, with an ROI of nearly 4 percent during the 9 trading days. As with the STXE filter in the previous example, I was able to successfully build a filter that correctly predicts close-to-open variations, despite the added difficulty that such variations were much smaller. Both in-sample and out-of-sample, the filter performs consistently, which is exactly what one strives for thanks to regularization.

### ASX Futures (YAPH3, Expiration March 18, 2013)

In the final experiment, I build a trading signal for the Australian Stock Exchange futures, during the same period of the previous two experiments. The log-returns show moderately large jumps/drops in price during the entire sample from Jan 4th to Feb 1st, but not quite as large as in the STXE index. We still should be able to take advantage of these close-to-open variations.

Figure 15: The log-returns of the YAPH3 in 15-minute interval observations.

In looking at the periodograms for both the YAPH3 15 minute log-returns (red) and the explanatory series (pink), it is clear that the spectral peaks don’t align like they did in the previous two exampls. In fact, there hardly exists a dominant spectral peak in the explanatory series, whereas the $\pi/12$ spectral peak in YAPH3 is very prominent. This ultimately might effect the performance of the filter, and consequently the trades. After building the low-pass filter and setting a high smoothing expweight parameter $\alpha = 26.5$, I then set the regularization parameters to be  $\lambda_{smooth} = .85$,   $\lambda_{decay} = .11$,  and $\lambda_{decay2} = .11$ (same as first example).

Figure 16: The periodograms for YAPH3 and explanatory series with spectral peak in YAPH3 framed by the red dashed lines.

The performance of the filter in-sample and out-of-sample is shown in Figure 18. This was one of the more challenging index futures series to work with as I struggled finding an appropriate explanatory series (likely because I was lazy since it was late at night and I was getting tired). Nonetheless, the filter still seems to predict the close-to-open variation on the Australian stock exchange index fairly well. All the major jumps in price are accounted for if you look closely at the trades (green dashed lines are buys/long and magenta lines are sells/shorts) and the corresponding action on the price of the futures contract.   Five losses out-of-sample for a trade success ratio of 72 percent and an ROI out-of-sample on 200 observations of 4.2 percent.  As with all other experiments in building trading signals with MDFA, we check the consistency of the in-sample and out-of-sample performance, and these seem to match up nicely.

Figure 18 : The out-of-sample performance of the low-pass filter on YAPH3.

The filter coefficients for the YAPH3 log-difference series is shown in Figure 19. Notice the perfectly smooth undulating yet decaying structure of the coefficients as the lag increases. What a beauty.

Figure 19: Filter coefficients for the YAPH3 series.

### Conclusion

Studying the trading performance of spectral peaks by first constructing band-pass filters to extract the signal corresponding to the peak in these index futures enabled me to understand how I can better construct the lowpass filter to yield even better performance. In these examples, I demonstrated that the close-to-open variation in the index futures price can be seen in the periodogram and thus be controlled for in the MDFA trading signal construction. This trading frequency corresponds to roughly $\pi/12$ in the 15 minute observation data that I had from Jan 4th to Feb 1st. As I witnessed in my empirical studies using iMetrica, this peak is more prominent when the close-to-open variations are larger and more often, promoting a very cyclical structure in the log-return data.  As I look deeper and deeper into studying the effects of extracting spectral peaks in the periodogram of financial data log-returns and the trading performance, I seem to improve on results even more and building the trading signals becomes even easier.

Stay tuned very soon for a tutorial using R (and MDFA) for one of these examples on high-frequency trading on index futures.  If you have any questions or would like to request a certain index future (out of one of the above examples or another) to be dissected in my second and upcoming R tutorial, feel free to shoot me an email.

Happy extracting!

# High-Frequency Financial Trading with Multivariate Direct Filtering I: FOREX and Futures

Animation 1: Click to see animation of the Japanese Yen filter in action on 164 hourly out-of-sample observations.

In my previous articles, I was working uniquely with daily log-return data from different time spans from a year to a year and a half. This enabled the in-sample period of computing the filter coefficients for the signal extraction to include all the most recent annual phases and seasons of markets, from holiday effects, to the transitioning period of August to September that is regularly highly influential on stock market prices and commodities as trading volume increases a significant amount. One immediate question that is raised in migrating to higher-frequency intraday data is what kind of in-sample/out-of-sample time spans should be used to compute the filter in-sample and then for how long do we apply the filter out-of-sample to produce the trades? Another question that is raised with intraday data is how do we account for the close-to-open variation in price? Certainly, after close, the after-hour bids and asks will force a jump into the next trading day. How do we deal with this jump in an optimal manner? As the observation frequency gets higher, say from one hour to 30 minutes, this close-to-open jump/fall should most likely be larger. I will start by saying that, as you will see in the results of this article, with a clever choice of the extractor $\Gamma$ and explanatory series, MDFA can handle these jumps beautifully (both aesthetically and financially). In fact, I would go so far as to say that the MDFA does a superb job in predicting the overnight variation.

One advantage of building trading signals for higher intraday frequencies is that the signals produce trading strategies that are immediately actionable. Namely one can act upon a signal to enter a long or short position immediately when they happen. In building trading signals for the daily log-return, this is not the case since the observations are not actionable points, namely the log difference of today’s ending price with yesterday’s ending price are produced after hours and thus not actionable during open market hours and only actionable the next trading day. Thus trading on intraday observations can lead to better efficiency in trading.

In this first installment in my series on high-frequency financial trading using multivariate direct filtering in iMetrica, I consider building trading signals on hourly returns of foreign exchange currencies. I’ve received a few requests after my recent articles on the Frequency Effect in seeing iMetrica and MDFA in action on the FOREX sector. So to satisfy those curiosities, I give a series of (financially) satisfying and exciting results in combining MDFA and the FOREX. I won’t give all my secretes away into building these signals (as that would of course wipe out my competitive advantage), but I will give some of the parameters and strategies used so any courageously curious reader may try them at home (or the office). In the conclusion, I give a series of even more tricks and hacks.  The results below speak for themselves  So without further ado, let the games begin.

#### Japanese Yen

Frequency: One hour returns
30 day out-of-sample ROI: 12 percent
Trade success ratio: 92 percent

Yen Filter Parameters: $\lambda$ = 9.2 $\alpha$ = 13.2, $\omega_0 = \pi/5$
Regularization: smooth = .918, decay = .139, decay2 = .79, cross = 0

In the first experiment, I consider hourly log-returns of a ETF index that mimics the Japanese Yen called FXY. As for one of the explanatory series, I consider the hourly log-returns of the price of GOLD which is traded on NASDAQ. The out-of-sample results of the trading signal built using a low-pass filter and the parameters above are shown in Figure 1.  The in-sample trading signal (left of cyan line) was built using 400 hourly observations of the Yen during US market hours dating back to 1 October 2012. The filter was then applied to the out-of-sample data for 180 hours, roughly 30 trading days up until Friday, 1 February 2013.

Figure 1: Out-of-sample results for the Japanese Yen. The in-sample trading signal was built using 400 hourly observations of the Yen during US market hours dating back to October 1st, 2012. The out-of-sample portion passed the cyan line is on 180 hourly observations, about 30 trading days.

This beauty of this filter is that it yields a trading signal exhibiting all the characteristics that one should strive for in building a robust and successful trading filter.

1. Consistency: The in-sample portion of the filter performs exactly as it does out-of-sample (after cyan line) in both trade success ratio and systematic trading performance.
2. Dropdowns: One small dropdown out-of-sample for a loss of only .8 percent (nearly the cost of the transaction).
3. Detects the cycles as it should: Although the filter is not able to pinpoint with perfect accuracy every local small upturn during the descent of the Yen against the dollar, it does detect them nonetheless and knows when to sell at their peaks (the magenta lines).
4. Self-correction: What I love about a robust filter is that it will tend to self-correct itself very quickly to minimize a loss in an erroneous trade. Notice how it did this in the second series of buy-sell transactions during the only loss out-of-sample. The filter detects momentum but quickly sold right before the ensuing downfall. My intuition is that only frequency-based methods such as the MDFA are able to achieve this consistently. This is the sign of a skillfully smart filter.

The coefficients for this Yen filter are shown below. Notice the smoothness of the coefficients from applying the heavy smooth regularization and the strong decay at the very end.  This is exactly the type of smooth/decay combo that one should desire. There is some obvious correlation between the first and second explanatory series in the first 30 lags or so as well. The third explanatory series seems to not provide much support until the middle lags .

Figure 2: Coefficients of the Yen filter. Here we use three different explanatory series to extract the trading signal.

One of the first things that I always recommend doing when first attempting to build a trading signal is to take a glance at the periodogram. Figure 2 shows the periodogram of the log-return data of the Japanese Yen over 580 hours.  Compare this with the periodogram of the same asset using log-returns of daily data over 580 days, shown in Figure 3.  Notice the much larger prominent spectral peaks at the lower frequencies in the daily log-return data. These prominent spectral peaks renders multibandpass filters much more advantageous and to use as we can take advantage of them by placing a band-pass filter directly over them to extract that particular frequency (see my article on multibandpass filters). However, in the hourly data, we don’t see any obvious spectral peaks to consider, thus I chose a low-pass filter and set the cutoff frequency at $\pi/5$, a standard choice, and good place to begin.

Figure 3: Periodogram of hourly log-returns of the Japanese Yen over 580 hours.

Figure 4: Periodogram of Japanese Yen using 580 daily log-return observations. Many more spectral peaks are present in the lower frequencies.

#### Japanese Yen

Frequency: 15 minute returns
7 day out-of-sample ROI: 5 percent
Trade success ratio: 82 percent

Yen Filter Parameters: $\lambda$ = 3.7 $\alpha$ = 13, $\omega_0 = \pi/9$
Regularizationsmooth = .90, decay = .11, decay2 = .09, cross = 0

In the next trading experiment, I consider the Japanese Yen again, only this time I look at trading on even high-frequency log-return data than before, namely on 15 minute log-returns of the Yen from the opening bell to market close.  This presents slightly new challenges than before as the close-to-open jumps are much larger than before, but these larger jumps do not necessarily pose problems for the MDFA. In fact, I look to exploit these and take advantage to gain profit by predicting the direction of the jump.  For this higher frequency experiment, I considered 350 15-minute in-sample observations to build and optimize the trading signal, and then applied it over the span of 200 15-minute out-of-sample observations. This produced the results shown in the Figure 5 below. Out of 17 total trades out-of-sample, there were only 3 small losses each less than .5 percent drops and thus 14 gains during the 200 15-minute out-of-sample time period.  The beauty of this filter is its impeccable ability to predict the close-to-open jump in the price of the Yen. Over the nearly 7 day trading span, it was able to correctly deduce whether to buy or short-sell before market close on every single trading day change. In the figure below, the four largest close-to-open variation in Yen price is marked with a “D” and you can clearly see how well the signal was able to correctly deduce a short-sell before market close. This is also consistent with the in-sample performance as well, where you can notice the buys and/or short-sells at the largest close-to-open jumps (notice the large gain in the in-sample period right before the out-of-sample period begins, when the Yen jumped over 1 percent over night.  This performance is most likely aided by the explanatory time series I used for helping predict the close-to-open variation in the price of the Yen. In this example, I only used two explanatory series (the price of Yen, and another closely related to the Yen).

Figure 5: Out-of-sample performance of the Japanese Yen filter on 15 minute log-return data.

We look at the filter transfer functions to see what frequencies they are being privileged in the construction of the filter. Notice that some noise leaks out passed the frequency cutoff at $\pi/9$, but this is typically normal and a non-issue. I had to balance for both timeliness and smoothness in this filter using both the customization parameters $\lambda$ and $\alpha$. Not much at frequency 0 is emphasized, with more emphasis stemming from the large spectral peak found right at $\pi/9$.

Figure 6: The filter transfer functions.

#### British Pound

Frequency: 30 minute returns
14 day out-of-sample ROI: 4 percent
Trade success ratio: 76 percent

British Pound Filter Parameters: $\lambda$ = 5 $\alpha$ = 15, $\omega_0 = \pi/9$
Regularizationsmooth = .109, decay = .165, decay2 = .19, cross = 0

In this example we consider the frequency of the data to 30 minute returns and attempt to build a robust trading signal for a derivative of the British Pound (BP) on this higher frequency. Instead of using the cash value of the BP, I use 30 minute returns of the BP Futures contract expiring in March (BPH3). Although I don’t have access to tick data from the FOREX, I do have tick data from GLOBEX for the past 5 years.  Thus the futures series won’t be an exact replication of the cash price series of the BP, but it should be quite close due to very low interest rates.

The results of the out-of-sample performance of the BP futures filter are shown in Figure 7. I constructed the filter using an initial in-sample size of 390 30 minute returns dating back to 1 December 2012. After pinpointing a frequency cutoff in the frequency domain for the $\Gamma$ that yielded decent trading results in-sample, I then proceeded to optimize the filter in-sample on smoothness and regularization to achieve similar out-of-sample performance. Applying the resulting filter out-of-sample on 168 30-minute log-return observations of the BP futures series along with 3 explanatory series, I get the results shown below. There were 13 trades made and 10 of them were successful. Notice that the filter does an exquisite job at triggering trades near local optimums associated with the frequencies inside the cutoff of the filter.

Figure 7: The out-of-sample results of the British Pound using 30-minute return data.

In looking at the coefficients of the filter for each series in the extraction, we can clearly see the effects of the regularization: the smoothness of the coefficients the fast decay at the very end. Notice that I never really apply any cross regularization to stress the latitudinal likeliness between the 3 explanatory series as I feel this would detract from the predicting advantages brought by the explanatory series that I used.

Figure 8: The coefficients for the 3 explanatory series of the BP futures,

#### Euro

Frequency: 30 min returns
30 day out-of-sample ROI: 4 percent
Trade success ratio: 71 percent

Euro Filter Parameters: $\lambda$ = 0, $\alpha$ = 6.4, $\omega_0 = \pi/9$
Regularizationsmooth = .85, decay = .27, decay2 = .12, cross = .001

Continuing with the 30 minute frequency of log-returns, in this example I build a trading signal for the Euro futures contract with expiration on 18 March 2013 (UROH3 on the GLOBEX). My in-sample period, being the same as my previous experiment, is from 1 December 2012 to 4 January 2013 on 30 minute returns using three explanatory time series.  In this example, after inspecting the periodogram, I decided upon a low-pass filter with a frequency cutoff of $\pi/9$. After optimizing the customization and applying the filter to one month of 30 minute frequency return data out-of-sample (month of January 2013, after cyan line) we see the performance is akin to the performance in-sample, exactly what one strives for. This is due primarily to the heavy regularization of the filter coefficients involved. Only four very small losses of less than .02 percent are suffered during the out-of-sample span that includes 10 successful trades, with the losses only due to the transaction costs. Without transaction costs, there is only one loss suffered at the very beginning of the out-of-sample period.

Figure 9 : Out-of-sample performance on the 30-min log-returns of Euro futures contract UROH3.

As in the first example using hourly returns, this filter again exhibits the desired characteristics of a robust and high-performing financial trading filter. Notice the out-of-sample performance behaves akin to the in-sample performance, where large upswings and downswings are pinpointed to high-accuracy. In fact, this is where the filter performs best during these periods. No need for taking advantage of a multibandpass filter here, all the profitable trading frequencies are found at less than $\pi/9$.  Just as with the previous two experiments with the Yen and the British Pound, notice that the filter cleanly predicts the close-to-open variation (jump or drop) in the futures value and buys or sells as needed.  This can be seen from many of the large jumps in the out-of-sample period (after cyan line).

One reason why these trading signals perform so well is due to their approximation power of the symmetric filter. In comparing the trading signal (green) with a high-order approximation of the symmetric filter (gray line) transfer function $\Gamma$ shown in Figure 10, we see that trading signal does an outstanding job at approximating the symmetric filter uniformly. Even at the latest observation (the right most point), the asymmetric filter hones in on the symmetric signal (gray line) with near perfection. Most importantly, the signal crosses zero almost exactly where required.  This is exactly what you want when building a high-performing trading signal.

Figure 10: Plot of approximation of the real-time trading signal for UROH3 with a high order approximation of the symmetric filter transfer function.

In looking at the periodogram of the log-return data and the output trading signal differences (colored in blue), we see that the majority of the frequencies were accounted for as expected in comparing the signal with the symmetric signal. Only an inconsequential amount of noise leakage passed the frequency cutoff of $\pi/9$ is found.  Notice the larger trading frequencies, the more prominent spectral peaks, are located just after $\pi/6$. These could be taken into account with a smart multibandpass filter in order to manifest even more trades, but I wanted to keep things simple for my first trials with high-frequency foreign exchange data.  I’m quite content with the results that I’ve achieved so far.

Figure 11: Comparing the periodogram of the signal with the log-return data.

#### Conclusion

I must admit, at first I was a bit skeptical of the effectiveness that the MDFA would have in building any sort of successful trading signal for FOREX/GLOBEX high frequency data. I always considered the FOREX market rather ‘efficient’ due to the fact that it receives one of the highest trading volumes in the world.  Most strategies that supposedly work well on high-frequency FOREX all seem to use some form of technical analysis or charting (techniques I’m particularly not very fond of), most of which are purely time-domain based. The direct filter approach is a completely different beast, utilizing a transformation into the frequency domain and a ‘bending and warping’ of the metric space for the filter coefficients to extract a signal within the noise that is the log-return data of financial assets.  For the MDFA to be very effective at building timely trading signals, the log-returns of the asset need to diverge from white noise a bit, giving room for pinpointing intrinsically important cycles in the data. However, after weeks of experimenting, I have discovered that building financial trading signals using MDFA and iMetrica on FOREX data is as rewarding as any other.

As my confidence has now been bolstered and amplified even more after my experience with building financial trading signals with MDFA and iMetrica for high-frequency data on foreign exchange log-returns at nearly any frequency, I’d be willing to engage in a friendly competition with anyone out there who is certain that they can build better trading strategies using time domain based methods such as technical analysis or any other statistical arbitrage technique.  I strongly believe these frequency based methods are the way to go, and the new wave in financial trading.  But it takes experience and a good eye for the frequency domain and periodograms to get used to. I haven’t seen many trading benchmarks that utilize other types of strategies, but i’m willing to bet that they are not as consistent as these results using this large of an out-of-sample to in-sample ratio (the ratios in these experiments were between .50 and .80).  If anyone would like to take me up on my offer for a friendly competition (or know anyone that would), feel free to contact me.

After working with a multitude of different financial time series and building many different types of filters, I have come to the point where I can almost eyeball many of the filter parameter choices including the most important ones being the extractor $\Gamma$ along with the regularization parameters, without resorting to time consuming, and many times inconsistent, optimization routines.  Thanks to iMetrica, transitioning from visualizing the periodogram to the transfer functions and to the filter coefficients and back to the time domain to compare with the approximate symmetric filter in order to gauge parameter choices is an easy task, and an imperative one if one wants to build successful trading signals using MDFA.

Here are some overall tips and tricks to build your own high performance trading signals on high-frequency data at home:

• Pay close attention to the periodogram. This is your best friend in choosing the extractor $\Gamma$. The best performing signals are not the ones that trade often, but trade on the most important frequencies found in the data. Not all frequencies are created equal. This is true when building either low-pass or multibandpass frequencies.
• When tweaking customization, always begin with $\alpha$, the parameter for smoothness. $\lambda$ for timeliness should be the last resort. In fact, this parameter will most likely be next to useless due to the fact that the log-return of financial data is stationary. You probably won’t ever need it.
• You don’t need many explanatory series. Like most things in life, quality is superior to quantity. Using the log-return data of the asset you’re trading along with one and maybe two explanatory series that somewhat correlate with the financial asset you’re trading on is sufficient. Anymore than that is ridiculous overkill, probably leading to over-fitting (even the power of regularization at your fingertips won’t help you).

In my next article, I will continue with even more high-frequency trading strategies with the MDFA and iMetrica where I will engage in the sector of Funds and ETFs. If any curious reader would like even more advice/hints/comments on how to build these trading signals on high-frequency data for the FOREX (or the coefficients built in these examples), feel free to get in contact with me via email. I’ll be happy to help.

Happy extracting!

# Building a Multi-Bandpass Financial Portfolio

Animation 1: Click the image to view the animation. The changing periodogram for different in-sample sizes and selecting an appropriate band-pass component to the multi-bandpass filter.

In my previous article, the third installment of the Frequency Effect trilogy, I introduced the multi-bandpass (MBP) filter design as a practical device for the extraction of signals in financial data that can be used for trading in multiple types of market environments.  As depicted through various examples using daily log-returns of Google (GOOG) as my trading platform, the MBP demonstrated a promising ability to tackle the issue of combining both lowpass filters to include a local bias and slow moving trend while at the same time providing access to higher trading frequencies for systematic trading during sideways and volatile market trajectories. I identified four different types of market environments and showed through three different examples how one can attempt to pinpoint and trade optimally in these different environments.

After reading a well-written and informative critique of my latest article, I became motivated to continue along on the MBP bandwagon by extending the exploration of engineering robust trading signals using the new design. In Marc’s words (the reviewer) regarding the initial results of this latest design in MDFA signal extraction for financial trading : “I tend to believe that some of the results are not necessarily systematic and that some of the results – Chris’ preference – does not match my own priority. I understand that comparisons across various designs of the triptic may require a fixed empirical framework (Google/Apple on a fixed time span).  But this restricted setting does not allow for more general inference (on other assets and time spans). And some of the critical trades are (at least in my perspective) close to luck.”

### Portfolio selection

In choosing the assets for my portfolio, I arranged a group of companies/commodities whose products or services I use on a consistent basis (as arbitrary as any other portfolio selection method, right?). To this end, I chose  Verizon (VZ) (service provider for my iPhone5), Microsoft (MSFT) (even though I mostly use Linux for my computing needs), Toyota (TM) ( I drive a Camry), Coffee (JO) (my morning espresso keeps the wheels turning), and Gold (GLD) (who doesn’t like Gold, a great hedge to any currency).  For each of these assets, I built a trading signal using various in-sample time periods beginning summer of 2011 and ending toward the end of summer 2012, to ensure all seasonal market effects were included. The out-of-sample time period in which I test the performance of the filter for each asset ranges anywhere from 90 days to 125 days out-of-sample. I tried to keep the selection of in-sample and out-of-sample points as arbitrary as possible.

### Portfolio Performance

And so here we go. The performance of the portfolio.

#### Coffee (NYSEARCA:JO)

• Regularization: smooth = .22, decay = .22, decay2 = .02, cross = 0
• MBP = [0, .2], [.44,.55]
• Out-of-sample performance: 32 percent ROI in 110 days

In order to work with commodities in this portfolio, the easiest way is through the use of ETFs that are traded in open markets just as any other asset. I chose the Dow Jones-UBS Coffee Subindex JO which is intended to reflect the returns that are potentially available through an unleveraged investment in one futures contract on the commodity of coffee as well as the rate of interest that could be earned on cash collateral invested in specified Treasury Bills.  To create the MBP filter for the JO index, I used JO and USO (a US Oil ETF) as the explanatory series from the dates of 5-5-2011 until 1-13-2013 (just a random date I picked from mid 2011, cinqo de mayo) and set the initial low-pass portion for the trend component of the MBP filter to [0, .17]. After a significant amount of regularization was applied, I added a bandpass portion to the filter by initializing an interval at [.4, .5]. This corresponded to the principal spectral peak in the periodogram which was located just below $\pi/6$ for the coffee fund. After setting the number of out-of-sample observations to 110,  I then proceeded to optimize the regularization parameters in-sample while ensuring that the transfer functions of the filter were no greater than 1 at any point in the frequency domain. The result of the filter is plotted below in Figure 1, with the transfer functions of the filters plotted below it. The resulting trading signal from the MBP filter is in green and the out-of-sample portion after the cyan line, with the cumulative return on investment (ROI) percentage in blue-pink and the daily price of JO the coffee fund in gray.

Figure 1: The MBP filter for JO applied 110 Out-of-sample points (after cyan line).

Figure 2: Transfer function for the JO and USO MBP filters.

Notice the out-of-sample portion of 110 observations behaving akin to the in-sample portion before it, with a .97 rank coefficient of the cumulative ROI resulting from the trades. The ROI in the out-of-sample portion was 32 percent total and suffered only 4 small losses out of 18 trades. The concurrent transfer functions of the MBP filter clearly indicate where the principal spectral peak for JO (blue-ish line) is directly under the bandpass portion of the filter. Notice the signal produced no trades during the steepest descent and rise in the price of coffee, while pinpointing precisely at the right moment the major turning point (right after the in-sample period). This is exactly what you would like the MBP signal to achieve.

#### Gold (SPDR Gold Trust, NYSEARCA:GLD)

As one of the more difficult assets to form a well-performing signal both in-sample and out-of-sample using the MBP filter, the GLD (NYSEARCA:GLD) ETF proved to be quite cumbersome in not only locating an optimal bandpass portion to the MBP, but also finding a relevant explaining series for GLD. In the following formulation, I settled upon using a US dollar index given by the PowerShares ETF UUP (NYSEARCA:UUP), as it ended up giving me a very linear performance that is consistent both in-sample and out-of-sample. The parameterization for this filter is given as follows:

• Regularization: smooth = .22, decay = .22, decay2 = .02, cross = 0
• MBP = [0, .2], [.44,.55]
• Out-of-sample performance: 11 percent ROI in 102 days

Figure 3 : Out-of-sample results of the MBP applied to the GLD ETF for 102 observations

Figure 4 : The Transfer Functions for the GLD and DIG filter.

Figure 5: Coefficients for the GLD and DIG filters. Each are of length 76.

The smoothness and decay in the coefficients is quite noticeable along with a slight lag correlation along the middle of the coefficients between lags 10 and 38.  This trio of characteristics in the above three plots is exactly what one strives for in building financial trading signals. 1) The smoothness and decay of the coefficients, 2) the transfer functions of the filter not exceeding 1 in the low and band pass, and 3) linear performance both in-sample and out-of-sample of the trading signal.

#### Verizon (NYSE:VZ)

• Regularization: smooth = .22, decay = 0, decay2 = 0, cross = .24
• MBP = [0, .17], [.58,.68]
• Out-of-sample performance: 44 percent ROI in 124 days trading

The experience of engineering a trading signal for Verizon was one of the longest and more difficult experiences out of the 5 assets in this portfolio. Strangely a very difficult asset to work with. Nevertheless, I was determined to find something that worked. To begin, I ended up using AAPL as my explanatory series (which isn’t a far fetched idea I would imagine. After all, I utilize Verizon as my carrier service for my iPhone 5).  After playing around with the regularization parameters in-sample, I chose a 124 day out-of-day horizon for my Verizon to apply the filter to and test the performance. Surprisingly, the cross regularization seemed to produce very good results both out-of-sample. This was the only asset in the portfolio that required a significant amount of cross regularization, with the parameter touching the vicinity of .24. Another surprise was how high the timeliness parameter $\lambda$ was (40) in order to produce good in-sample and out-of-sample trading results. By far the highest amount of the 5 assets in this study. The amount of smoothing from the weighting function $W(\omega; \alpha)$ was also relatively high, reaching a value of 20.

The out-of-sample performance is shown in Figure 6. Notice how dampened the values of the trading signal are in this example, where the local bias during the long upswings is present, but not visible due to the size of the plot. The out-of-sample performance (after the cyan line) seems to be superior to that of the in-sample portion. This is most likely due to the fact that the majority of the frequencies that we were interested in, near $\pi/6$, failed to become prominent in the data until the out-of-sample portion (there were around 120 trading days not shown in the plot as I only keep a maximum of 250 plotted on the canvas).  With 124 out-of-sample observations, the signal produced a performance of 44 percent ROI. The filter seems to cleanly and consistently pick out local turning points, although not always at their optimal point, but the performance is quite linear, which is exactly what you strive for.

Figure 6: The out-of-sample performance on 124 observations from 7-2012 to 1-13-2013.

Figure 7: Coefficients of lag up to 76 of the Verizon-Apple filter,

In the coefficients for the VZ and AAPL data shown in Figure 7, one can clearly see the distinguishing effects of the cross regularization along with the smooth regularization. Note that no decay regularization was needed in this example, with the resulting number of effective degrees of freedom in the construction of this filter being 48.2 an important number to consider when applying regularization to filter coefficients (filter length was 76),

#### Microsoft (NASDAQ:MSFT)

• Regularization: smooth = .42, decay = .24, decay2 = .15, cross = 0
• MBP = [0, .2], [.59,.72]
• Out-of-sample performance: 31 percent ROI in 90 days trading

In the Microsoft data I used a time span of a year and three months for my in-sample period and a 90 day out-of-sample period from August through 1-13-2012. My explanatory series was GOOG (the search engine Bing and Google seem to have quite the competition going on, so why not) which seemed to correlate rather cleanly with the share price of MSFT. The first step in obtaining a bandpass after setting my lowpass filter to [0, .2] was to locate the principal spectral peak (shown in the periodogram figure below). I then adjusted the width until I had near monotone performance in-sample. Once the customization and regularization parameters were found, I applied the MSFT/AAPL filter to the 90 day out-of-sample period and the result is shown below. Notice that the effect of the local bias and slow moving trends from the lowpass filter are seen in the output trading signal (green) and help in identifying the long down swings found in the share price. During the long down swings, there are no trades due to the local bias from frequency zero.

Figure 8: Microsoft trading signal for 90 out-of-sample observations. The ROI out-of-sample is 31 percent.

Figure 9: Aggregate periodogram of MSFT and Google showing the principal spectral peak directly inside the bandpass.

Figure 10: The coefficients for the MSFT and GOOG series up to lag 76.

With a healthy amount of regularization applied to the coefficient space, we can clearly see the smoothness and decay towards the end of the coefficient lags. The cross regularization parameter provided no improvement to either in-sample or out-of-sample performance and was left set to 0.

Despite the superb performance of the signal out-of-sample with a 31 percent ROI in 90 days in a period which saw the share price descend by 10 percent, and relatively smooth decaying coefficients with consistent performance both in and out-of-sample, I still feel like I could improve on these results with a better explanatory series than AAPL. That is one area of this methodology in which I struggle, namely finding “good” explanatory series to better fortify the in-sample metric space and produce more even more anticipation in the signals. At this point it’s a game of trial and error. I suppose I should find a good market economist to direct these questions to.

#### Toyota (NYSE:TM)

• Regularization: smooth = .90, decay = .14, decay2 = .72, cross = 0
• MBP = [0, .21], [.49,.67]
• Out-of-sample performance: 21 percent ROI in 85 days trading

For the Toyota series, I figured my first explanatory series to test things with would be an asset pertaining to the price of oil. So I decided to dig up some research and found that DIG ( NYSEARCA:DIG), a ProShares ETF, provides direct exposure to the global price of oil and gas (in fact it is leveraged so it corresponds to twice the daily performance of the Dow Jones U.S. Oil & Gas Index).  The out-of-sample performance, with heavy regularization in both smooth and decay, seems to perform quite consistently with in-sample, The signal shows signs of patience during volatile upswings, which is a sign that the local bias and slow moving trend extraction is quietly at work. Otherwise, the gains are consistent with just a few very small losses. At the end of the out-of-sample portion, namely the past several weeks since Black Friday (November 23rd), notice the quick climb in stock price of Toyota. The signal is easily able to deduce this fast climb and is now showing signs of slowdown from the recent rise (the signal is approaching the zero crossing, that’s how I know).  I love what you do for me, Toyota! (If you were living in the US in the1990s, you’ll understand what I’m referring to).

Figure 11: Out-of-sample performance of the Toyota trading signal on 85 trading days.

Figure 12: Coefficients for the TM and DIG log-return series.

Figure 13: The transfer functions for the TM and DIG filter coefficients.

The coefficients for the TM and DIG series depicted in Figure 12 show the heavy amount of smooth and decay (and decay2) regularization, a trio of parameters that was not easy to pinpoint at first without significant leakage above one in the filter transfer functions (shown in Figure 13). One can see that two major spectral peaks are present under the lowpass portion and another large one in the bandpass portion that accounts for the more frequent trades.

### Conclusion

With these trading signals constructed for these five assets, I imagine I have a small but somewhat diverse portfolio, ranging from tech and auto to two popular commodities. I’ll be tracking the performance of these trading signals together combined as a portfolio over the next few months and continuously give updates. As the in-sample periods for the construction of these filters ended around the end of last summer and were already applied to out-of-sample periods ranging from 90 days to 124 (roughly one half to one third of the original in-sample period), with the significant amount of regularization applied, I am quite optimistic that the out-of-sample performance will continue to be the same over the next few months, but of course one can never be too sure of anything when it comes to market behavior. In the worse case scenario, I can always look into digging though my dynamic adaptive filtering and signal extraction toolkit.

Overall, although I’m quite inspired and optimistic with these results. there is still slight room for improvement in building these MBP filters, especially for low volatility sideways markets (for example, the one occurring in the Toyota stock price in the middle of the plot in Figure 11). In general, this is a difficult type of stock price movement in which any type of signal will have success. With low volatility and no trending movements, the log-returns are basically white noise – there is no pertinent information to extract. The markets are currently efficient and there is nothing you can do about it. Only good luck will win (in that case you’re as well off building a signal based on a coin flip). Typically the best you can do in these types of areas is prevent trading altogether with some sort of threshold on the signal, which is an idea I’ve had in my mind recently but haven’t implemented, or make sure any losses are small, which is exactly what my signal achieved in Figure 11 (and which is what any robust signal should do in the first place.)

Lastly, if you have a particular financial asset for which you would like to build a trading signal (similar to the examples shown above), I will be happy to take a stab at it using iMetrica (and/or give you pointers in the right direction if you would prefer to pursue the endeavor yourself). Just send me what asset you would like to trade on, and I’ll build the filter and send you the coefficients along with the parameters used. Offer holds for a limited time only!

Happy extracting.

# The Frequency Effect Part III: Revelations of Multi-Bandpass Filters and Signal Extraction for Financial Trading

Animation of the out-of-sample performance of one of the multibandpass filters built in this article for the daily returns of the price of Google. The resulting trading signal was extracted and yielded a trading performance near 39 percent ROI during an 80 day out-of-sample period on trading shares of Google.

To conclude the trilogy on this recent voyage through various variations on frequency domain configurations and optimizations in financial trading using MDFA and iMetrica, I venture into the world of what I call multi-bandpass filters that I recently implemented in iMetrica.  The motivation of this latest endeavor in highlighting the fundamental importance of the spectral frequency domain in financial trading applications was wanting to gain better control of extracting signals and engineering different trading strategies through many different types of market movement in financial assets. There are typically four different basic types of movement a price pattern will take during its fractalesque voyage throughout the duration that an asset is traded on a financial market. These patterns/trajectories include

1. steady up-trends in share price
2. low volatility sideways patterns (close to white noise)
3. highly volatile sideways patterns (usually cyclical)
4. long downswings/trends in share price.

Using MDFA for signal extraction in financial time series, one typically indicates an a priori trading strategy through the design of the extractor, namely the target function $\Gamma(\omega)$ (see my previous two articles on The Frequency Effect). Designating a lowpass or bandpass filter in the frequency domain will give an indication of what kind of patterns the extracted trading signal will trade on. Traditionally one can set a lowpass with the goal of extracting trends (with the proper amount of timeliness prioritized in the parameterization), or one can opt for a bandpass to extract smaller cyclical events for more systematic trading during volatile periods. But now suppose we could have the best of both worlds at the same time. Namely, be profitable in both steady climbs and long tumbles, while at the same time systematically hacking our way through rough sideways volatile territory, making trades at specific frequencies embedded in the share price actions not found in long trends. The answer is through the construction of multi-band pass filters. Their construction is relatively simple, but as I will demonstrate in this article with many examples, they are a bit more difficult to pinpoint optimally (but it can be done, and the results are beautiful… both aesthetically and financially).

With the multi-bandpass defined as two separate bands given by $A := 1_{[\omega_0, \omega_1]}$$B := 1_{[\omega_2, \omega_3]}$ with $0 \leq \omega_0$ and $\omega_1 < \omega_2$, zero everywhere else, it is easy to see that the motivation here is to seek a detection of both lower frequencies and low-mid frequencies in the data concurrently. With now up to four cutoff frequencies to choose from, this adds yet another few wrinkles in the degrees of freedom in parameterizing the MDFA setup. If choosing and optimizing one cutoff frequency for a simple low-pass filter in addition to customization and regularization parameters wasn’t enough, now imagine extracting signals with the addition of up to three more cutoff frequencies. Despite these additional degrees of freedom in frequency interval selection, I will later give a couple of useful hacks that I’ve found helpful to get one started down the right path toward successful extraction.

With this multi-bandpass definition for $\Gamma$ comes the responsibility to ensure that the customization of smoothness and timeliness is adjusted for the additional passband. The smoothing function $W(\omega; \alpha)$ for $\alpha \geq 0$ that acts on the periodogram (or discrete Fourier transforms in multivariate mode) is now defined piecewise according to the different intervals $[0,\omega_0]$, $[\omega_1, \omega_2]$, and $[\omega_3, \pi]$.  For example, $\alpha = 20$ gives a piecewise quadratic weighting function (an example shown in Figure 1) and for $\alpha = 10$, the weighting function is piecewise linear. In practice, the piecewise power function smooths and rids of unwanted frequencies in the stop band much better than using a piecewise constant function. With these preliminaries defined, we now move on to the first steps in building and applying multiband pass filters.

Figure 1: Plot of the Piecewise Smoothing Function for alpha = 15 on a mutli-band pass filter.

To motivate this newly customized approach to building financial trading signals, I begin with a simple example where I build a trading signal for the daily share price of Google. We begin with a simple lowpass filter defined by $\Gamma(\omega) = 1$ if $\omega \in [0,.17]$, and 0 otherwise. This formulation, as it includes the zero frequency, should provide a local bias as well as extract very slow moving trends. The trick with these filters for building consistent trading performance is ensure a proper grip on the timeliness characteristics of the filter in a very low and narrow filter passage. Regularization and smoothness using the weighting function shouldn’t be too much of a problem or priority as typically just only a small fraction of the available degrees of freedom on the frequency domain are being utilized, so not much concern for overfitting as long as you’re not using too long of a filter.  In my example, I maxed out the timeliness $\lambda$ parameter and set the $\lambda_{smooth}$ regularization parameter to .3. Fortunately, no optimization of any parameter was needed in this example, as the performance was spiffy enough nearly right after gauging the timeliness parameter $\lambda$. Figure 2 shows the resulting extracted trend trading signal in both the in-sample portion (left of the cyan colored line) and applied to 80 out-of-sample points (right of the cyan line, the most recent 80 daily returns of Google, namely 9-29-12 through today, 1-10-13). The blue-pink line shows the progression of the trading account, in return-on-investment percentage. The out-of-sample gains on the trades made were 22 percent ROI during the 80 day period.

Figure 2: The in-sample and out-of-sample gains made by constructing a low-pass filter employing a very high timeliness parameter and small amount of regularization in smoothness. The out-of-sample gains are nearly 30 percent and no losses on any trades.

Although not perfect, the trading signal produces a monotonic performance both in-sample and out-of-sample, which is exactly what you strive for when building these trend signals for trading. The performance out-of-sample is also highly consistent (in regards to trading frequency and no losses on any trades) with the in-sample performance. With only 4 trades being made, they were done at very interesting points in the trajectory of the Google share price. Firstly, notice that the local bias in the largest upswing is accounted for due to the inclusion of frequency zero in the low pass filter. This (positive) local bias continues out-of-sample until, interestingly enough, two days before one of the largest losses in the share price of Google over the past couple years. A slightly earlier exit out of this long position (optimally at the peak before the down turn a few days before) would have been more strategic; perhaps further tweaking of various parameters would have achieved this, but I happy with it for now. The long position resumes a few days after the dust settles from the major loss, and the local bias in the signal helps once again (after trade 2). The next few weeks sees shorter downtrending cyclical effects, and the signal fortunately turns positively increasingly right before another major turning point for an upswing in the share price. Finally, the third transaction ends the long position at another peak (3), perfect timing. The fourth transaction (no loss or gain) was quickly activated after the signal saw another upturn, and thus is now in the long position (hint: Google trending upward).  Figure 3 shows the transfer functions $\hat{\Gamma}$ for both the sets of explanatory log-return data and Figure 4 depicts the coefficients for the filter. Notice that in the coefficients plot, much more weight is being assigned to past values of the log-return data with extreme (min and max values) at around lags 15 and 30 for the GOOG coefficients (blue-ish line). The coefficients are also quite smooth due to the slight amount of smooth regularization imposed.

Figure 3: Transfer functions for the concurrent trend filter applied to GOOG.

Figure 4: The filter coefficients for the log-return data.

Now suppose we wish to extract a trading signal that performs like a trend signal during long sweeping upswings or downswings, and at the same time shares the property that it extracts smaller cyclical swings during a sideways or highly volatile period. This type of signal would be endowed with the advantage that we could engage in a long position during upswings, trade systematically during sideways and volatile times, and on the same token avoid aggressive long-winded downturns in the price. Financial trading can’t get more optimistic then that, right? Here is where the magic of the multi-bandpass comes in. I give my general “how-to” guidelines in the following paragraphs as a step-by-step approach. As a forewarning, these signals are not easy to build, but with some clever optimization and patience they can be done.

In this new formulation, I envision not only being able to extract a local bias embedded in the log-return data but also gain information on other important frequencies to trade on while in sideways markets. To do this, I set up the lowpass filter as I did earlier on $[0,\omega_0]$. The choice of $\omega_0$ is highly dependent on the data and should be located through a priori investigations (as I did above, without the additional bandpass).

Click on the Animation 2: Example of constructing a multiband pass using the Target Filter control panel in iMetrica. Initially, a low-pass filter is set, then the additional bandpass is added by clicking “Multi-Pass” checkbox. The location is then moved to the desired location using the scrollbars. The new filters are computed automaticall if “Auto” is checked on (lower left corner).

Before setting any parameterization regarding customization, regularization, or filter constraints, I perform a quick scan of the periodogram (averaged periodogram if in multivariate mode) to locate what I call principal trading frequencies in the data. In the averaged periodogram, these frequencies are located at the largest spectral peaks, with the most useful ones for our purposes of financial trading typically before $\pi/4$. The largest of these peaks will be defined from here on out as the principal spectral peak (PSP). Figure 6 shows an example of an averaged periodogram of the log-return for GOOG and AAPL with the PSP indicated. You might note that there exists a much larger spectral peak located at $7\pi/12$, but no need to worry about that one (unless you really enjoy transaction costs). I locate this PSP as a starting point for where I want my signal to trade.

Figure 5: Principal spectral peak in the log-return data of GOOG and AAPL.

In the next step, I place a bandpass of width around .15 so that the PSP is dead-centered in the bandpass. Fortunately with iMetrica, this is a seamlessly simple task with just the use of a scrollbar to slide the positioning of this bandpass (and also adjust  the lowpass) to where I desire. Animation 2 above (click on it to see the animation) shows this process of setting a multi-passband in the MDFA Target Filter control panel. Notice as I move the controls for the location of the bandpass, the filter is automatically recomputed and I can see the changes in the frequency response functions $\hat{\Gamma}$ instantaneously.

With the bandpass set along with the lowpass, we can now view how the in-sample performance is behaving at the initial configuration. Slightly tweaking the location of the bandpass might be necessary (width not so much, in my experience between .15 and .20 is sufficient).  The next step in this approach is now to not only adjust for the location of the bandpass while keeping the PSP located somewhat centered, but also adding the effects of regularization to the filter as well. With this additional bandpass, the filter has a tendency to succumb to overfitting if one is not careful enough.

In my first filter construction attempt, I placed my bandpass at $[.49,.65]$ with the PSP directly under it. I then optimized the regularization controls in-sample (a feature I haven’t discussed yet) and slightly tweaked the timeliness parameter (ended up setting it to 3) and my result (drumroll…)  is shown in Figure 6.

Figure 6: The trading performance and signal for the initial attempt at a building a multiband pass fitler.

Not bad for a first attempt. I was actually surprised at how few trades there were out-of-sample. Although there are no losses during the 80 days out-of-sample (after cyan line), and the signal is sort of what I had in mind a priori, the trades are minimal and not yielding any trading action during the period right after the large loss in Google when the market was going sideways and highly volatile. Notice that the trend signal gained from the lowpass filter indeed did its job by providing the local bias during the large upswing and then selling directly at the peak (first magenta dotted line after the cyan line).  There are small transactions (gains) directly after this point, but still not enough during the sideways market after the drop.  I needed to find a way to tweak the parameters and/or cutoff to include higher frequencies in the transactions.

In my second attempt, I kept the regularization parameters as they were but this time increased the bandpass to the interval $[.51, .68]$, with the PSP still underneath the bandpass, but now catching on to a few more higher frequencies then before.  I also slightly increased the length of the filter to see if that had any affect. After optimizing on the timeliness parameter $\lambda$ in-sample, I get a much improved signal. Figure 7 shows this second attempt.

Figure 7: The trading performance and signal for the second attempt at construction a multiband pass filter. This one included a few more higher frequencies.

Upon inspection, this signal behaves more consistently with what I had in mind. Notice that directly out-of-sample during the long upswing, the signal (barely) shows signs of the local bias, but enough not to make any trades fortunately. However, in this signal, we see that filter is much too late in detecting the huge loss posted by Google, and instead sells immediately after (still a profit however). Then during the volatile sideways market, we see more of what we were wishing for; timely trades to the earn the signal a quick 9 percent in the span of a couple weeks. Then the local bias kicks in again and we see not another trade posted during this short upswing, taking advantage of the local trend. This signal earned a near 22 percent ROI during the 80 day out-of-sample trading period, however not as good as the previous signal at  32 percent ROI.

Now my priority was to find another tweak that I could perform to change the trading structure even more. I’d like it to be even more sensitive to quick downturns, but at the same time keep intact the sideways trading from the signal in Figure 7. My immediate intuition was to turn on the i2 filter constraint and optimize the time-shift, similar to what I did in my previous article, part deux of the Frequency Effect. I also lessened the amount of smoothing from my weighting function $W(\omega; \alpha)$, turned off any amount of decay regularization that I had and voila, my final result in Figure 8.

Figure 8: Third attempt at building a multiband pass filter. Here, I turn on i2 filter constraint and optimize the time shift.

While the consistency with the in-sample performance to out-of-sample performance is somewhat less than my previous attempts, out-of-sample performs nearly exactly how I envisioned. There are only two small losses of less than 1 percent each, and the timeliness of choosing when to sell at the tip of the peak in the share price of Google couldn’t have been better. There is systematic trading governed by the added multiband pass filter during the sideways and slight upswing toward the end. Some of the trades are made later than what would be optimal (the green lines enter a long position, magenta sells and enters short position), but for the most part, they are quite consistent.  It’s also very quick in pinpointing its own erronous trades (namely no huge losses in-sample or out of sample). There you have it, a near monotonic performance out-of-sample with 39 percent ROI.

In examining the coefficients of this filter in Figure 9, we see characteristics of a trend filter as coefficients are largely weighting the middle lags much more than than initial or end lags (note that no decay regularization was added to this filter, only smoothness) . While at the same time however, the coefficients also weight the most recent log-return observations unlike the trend filter from Figure 4, in order to extract signals for the more volatile areas. The undulating patterns also assist in obtaining good performance in the cyclical regions.

Figure 9: The coefficients of the final filter depicting characteristics of both a trend and bandpass filter, as expected.

Finally, the frequency response functions of the concurrent filters show the effect of including the PSP in the bandpass (figure 10). Notice, the largest peak in the bandpass function is found directly at the frequency of the PSP, ahh the PSP. I need to study this frequency with more examples to get a more clear picture to what it means. In the meantime, this is the strategy that I would propose. If you have any questions about any of this, feel free to email me. Until next time, happy extracting!

Figure 10: The frequency response functions of the multi-bandpass filter.

# The Frequency Effect Part Deux: Shifting Time at Frequency Zero For Better Trading Performance

Animation 1: The out-of-sample performance over 60 trading days of a signal built using an optimized time-shift criterion. With 5 trades and 4 successful, the ROI is nearly 40 percent over 3 month.

What is an optimized time-shift? Is it important to use when building successful financial trading signals? While the theoretical aspects of the frequency zero and vanishing time-shift can be discussed in a very formal and mathematical manner,  I hope to answer these questions in a more simple (and applicable) way in this article. To do this, I will give an informative and illustrated real world example in this unforeseen continuation of my previous article on the frequency effect a few days ago. I discovered something quite interesting after I got an e-mail from Herr Doktor Marc (Wildi) that nudged me even further into my circus of investigations in carving out optimal frequency intervals for financial trading (see his blog for the exact email and response).  So I thought about it  and soon after I sent my response to Marc, I began to question a few things even further at 3am in the morning while sipping on some Asian raspberry white tea (my sleeping patterns lately have been as erratic as fiscal cliff negotiations), and came up with an idea. Firstly, there has to be a way to include information about the zero-frequency (this wasn’t included in my previous article on optimal frequency selection). Secondly, if I’m seeing promising results using a narrow band-pass approach after optimizing the location and distance, is there anyway to still incorporate the zero-frequency and maybe improve results even more with this additional frequency information?

Frequency zero is an important frequency in the world of nonstationary time series and model-based time series methodologies as it deals with the topic of unit roots, integrated processes,  and (for multivariate data) cointegration. Fortunately for you (and me), I don’t need to dwell further into this mess of a topic that is cointegration since typically, the type of data we want to deal with in financial trading (log-returns) is closer to being stationary (namely close to being white noise, ehem, again, close, but not quite). Nonetheless, a typical sequence of log-return data over time is never zero-mean, and full of interesting turning points at certain frequency bands. In essence, we’d somehow like to take advantage of that and perhaps better locate local turning points intrinsic to the optimal trading frequency range we are dealing with.

The perfect way to do this is through the use of the time-shift value of the filter. The time-shift is defined by the derivative of the frequency response (or transfer) function at zero. Suppose we have an optimal bandpass set at $(\omega_0, \omega_1) \subset [0,\pi]$ where $\omega_0 > 0$. We can introduce a constraint on the filter coefficients so as to impose a vanishing time-shift at frequency zero. As Wildi says on page 24 of the Elements paper: “A vanishing time-shift is highly desirable because turning-points in the filtered series are concomitant with turning-points in the original data.” In fact, we can take this a step further and even impose an arbitrary time-shift with the value $s$ at frequency zero, where $s$ is any real number. In this case, the derivative of the frequency response function (transfer function) $\hat{\Gamma}(\omega)$ at zero is $s$. As explained on page 25 of Elements,  this is implemented as $\frac{d}{d \omega} |_{\omega=0} \sum_{l=0}^{L-1} b_j \exp(-i j \omega) = s$, which implies $b_1 + 2b_2 + \cdots + (L-1) b_{L-1} = s$.

This constraint can be integrated into the MDFA formulation, but then of course adds another parameter to an already full-flight of parameters.  Furthermore, the search for the optimal $s$ with respect to a given financial trading criterion is tricky and takes some hefty computational assistance by a robust (highly nonlinear) optimization routine, but it can be done. In iMetrica I’ve implemented a time-shift turning point optimizer, something that works well so far for my taste buds, but takes a large burden of computational time to find.

To illustrate this methodology in a real financial trading application, I return to the same example I used in my previous article, namely using daily log-returns of GOOG and AAPL from 6-3-2011 to 12-31-2012 to build a trading signal. This time to freshen things up a but, I’m going to target and trade shares of Apple Inc. instead of Google.  Quickly, before I begin, I will swiftly go through the basic steps of building trading signals. If you’re already familiar, feel free to skip down two paragraphs.

As I’ve mentioned in the past, fundamentally the most important step to building a successful and robust trading signal is in choosing an appropriate preliminary in-sample metric space in which the filter coefficients for the signal are computed. This preliminary in-sample metric space represents by far the most critically important aspect of building a successful trading signal and is built using the following ingredients:

• The target and explanatory series (i.e. minute, hourly, daily log-returns of financial assets)
• The time span of in-sample observations (i.e. 6 hours, 20 days, 168 days, 3 years, etc.)

Choosing the appropriate preliminary in-sample metric space is beyond the scope of this article, but will certainly be discussed in a future article.  Once this in-sample metric space has been chosen, one can then proceed by choosing the optimal extractor (the frequency bandpass interval) for the metric space. While concurrently selecting the optimal extractor, one must  begin warping and bending the preliminary metric space through the use of the various customization and regularization tools (see my previous Frequency Effect article, as well as Marc’s Elements paper for an in-depth look at the mathematics of regularization and customization). These are the principle steps.

Now let’s look at an example. In the .gif animation at the top of this article, I featured a signal that I built using this time-shift optimizer and a frequency bandpass extractor heavily centered around the frequency $\pi/12$, which is not a very frequent trading frequency, but has its benefits, as we’ll see. The preliminary metric space was constructed by an in-sample period using the daily log-returns of GOOG and AAPL and AAPL as my target is from 6-4-2011 to 9-25-2012, nearly 16 months of data. Thus we mention that the in-sample includes many important news events from Apple Inc. such as the announcement of the iPad mini, the iPhone 4S and 5, and the unfortunate sad passing of Steve Jobs. I then proceeded to bend the preliminary metric space with a heavy dosage of regularization, but only a tablespoon of customization¹. Finally, I set the time-shift constraint and applied my optimization routine in iMetrica to find the value $s$ that yields the best possible turning-point detector for the in-sample metric space. The result is shown in Figure 1 below in the slide-show. The in-sample signal from the last 12 months or so (no out-of-sample yet applied) is plotted in green, and since I have future data available (more than 60 trading days worth from 9-25 to present), I can also approximate the target symmetric filter (the theoretically optimal target signal) in order to compare things (a quite useful option available with the click of a button in iMetrica I might add). I do this so I can have a good barometer of over-fitting and concurrent filter robustness at the most recent in-sample observation. Figure 1 in the slide-show below, the trading signal is in green, the AAPL log-return data in red, and the approximated target signal in gray (recall that if you can approximate this target signal (in gray) arbitrarily well, you win, big).

Notice that at the very endpoint (the most challenging point to achieve greatness) of the signal in Figure 1, the filter does a very fine job at getting extremely close. In fact, since the theoretical target signal is only a Fourier approximation of order 60, my concurrent signal that I built might even be closer to the ‘true value’, who knows. Achieving exact replication of the target signal (gray) elsewhere is a little less critical in my experience. All that really matters is that it is close in moving above and below zero to the targeted intention (the symmetric filter) and close at the most recent in-sample observation. Figure 2 above shows the signal without the time-shift constraint and optimization. You might be inclined to say that there is no real big difference. In fact, the signal with no time-shift constraint looks even better. It’s hard to make such a conclusion in-sample, but now here is where things get interesting.

We apply the filter to the out-of-sample data, namely the 60 tradings days. Figure 3 shows the out-of-sample performance over these past 60 trading days, roughly October, November, and December, (12-31-2012 was the latest trading day), of the signal without the time-shift constraint. Compare that to Figure 4 which depicts the performance with the constraint and optimization. Hard to tell a difference, but let’s look closer at the vertical lines. These lines can be easily plotted in iMetrica using the plot button below the canvas named Buy Indicators. The green line represents where the long position begins (we buy shares) and the exit of a short position. The magenta line represents where selling the shares occurs and the entering of a short position. These lines, in other words, are the turning point detection lines. They determine where one buys/sells (enter into a long/short position). Compare the two figures in the out-of-sample-portion after the light cyan line (indicated in Figure 4 but not Figure 3, sorry).

Figure 3: Out-of-sample performance of the signal built without time-shift constraint The out-of-sample period beings where the light cyan line is from Figure 4 below.

Figure 4: Out-of-sample performance of the signal built with time-shift constraint and optimized for turning point-detection, The out-of-sample period beings where the light cyan is.

Notice how the optimized time-shift constraint in the trading signal in Figure 4 pinpoints to a close perfection where the turning points are (specifically at points 3, 4,and 5).  The local minimum turning point was detected exactly at 3, and nearly exact at 4 and 5. The only loss out of the 5 trades occurred at 2, but this was more the fault of the long unexpected fall in the share price of Apple in October. Fortunately we were able to make up for those losses (and then some) at the next trade exactly at the moment a big turning point came (3).  Compare this to the non optimized time-shift constrained signal (Figure 3), and how the second and third turning points are a bit too late and too early, respectively. And remember, this performance is all out-of-sample, no adjustments to the filter have been made, nothing adaptive. To see even more clearly how the two signals compare, here are gains and losses of the 5 actual trades performed out-of-sample (all numbers are in percentage according to gains and losses in the trading account governed only by the signal. Positive number is a gain, negative a loss)

Without Time-Shift Optimization              With Time-Shift Optimization

Trade 1:                              29.1 -> 38.7 =  9.6                          14.1 -> 22.3  =  8.2
Trade 2:                              38.7 -> 32.0  = -6.7                         22.3 -> 17.1  = -5.2
Trade 3:                              32.0 -> 40.7  =  8.7                         17.1  -> 30.5  = 13.4
Trade 4:                              40.7 -> 48.2  =  7.5                         30.5 -> 41.2   = 10.7
Trade 5:                              48.2 -> 60.2  = 12.0                        41.2 -> 53.2   = 12.0

The optimized time-shift signal is clearly better, with an ROI of nearly 40 percent in 3 months of trading. Compare this to roughly 30 percent ROI in the non-constrained signal. I’ll take the optimized time-shift constrained signal any day. I can sleep at night with this type of trading signal. Notice that this trading was applied over a period in which Apple Inc. lost nearly 20 percent of its share price.

Another nice aspect of this trading frequency interval that I used is that trading costs aren’t much of an issue since only 10 transactions (2 transaction each trade) were made in the span of 3 months, even though I did set them to be .01 percent for each transaction nonetheless.

To dig a bit deeper into plausible reasons as to why the optimization of the time-shift constraint matters (if only even just a little bit), let’s take a look at the plots of the coefficients of each respective filter. Figure 5 depicts the filter coefficients with the optimized time-shift constraint, and Figure 6 shows the coefficients without it.  Notice how in the filter for the AAPL log-return data (blue-ish tinted line) the filter privileges the latest observation much more, while slowly modifying the others less. In the non optimized time-shift filter, the most recent observation has much less importance, and in fact, privileges a larger lag more. For timely turning point detection, this is (probably) not a good thing.  Another interesting observation is that the optimized time-shift filter completely disregards the latest observation in the log-return data of GOOG (purplish-line) in order to determine the turning points. Maybe a “better” financial asset could be used for trading AAPL? Hmmm…. well in any case I’m quite ecstatic with these results so far.  I just need to hack my way into writing a better time-shift optimization routine, it’s a bit slow at this point.  Until next time, happy extracting. And feel free to contact me with any questions.

Figure 5: The filter coefficients with time-shift optimization.

Figure 6: The filter coefficients without the time-shift optimization.

¹ I won’t disclose quite yet how I found these optimal parameters and frequency interval or reveal what they are as I need to keep some sort of competitive advantage as I presently look for consulting opportunities 😉 .

# Dream within a dream: How science fiction concepts from the movie Inception can be accomplished in real life (via MDFA)

“Careful, we may be in a model…within a model.” (From an Inception movie poster.)

Have you ever seen the movie Inception and wondered, “Gee, wouldn’t it be neat if I could do all that fancy subconscious dream within a dream manipulation stuff”? Well now you can (in a metaphorical way) using MDFA and iMetrica. I explain how in this article.

Before I begin, may I first draw your attention to a brief introduction of the context in which I am speaking, and that is real-time signal extraction in (nonlinear, nonstationary) information flow. The principle goal of filtering and signal extraction in real-time data analysis for whatever purpose necessary (financial trading, risk analysis, real-time trend detection, seasonal adjustment) is to detect and pinpoint as timely as possible a desired sequence of events in an incoming flow of data observations. We emphasize that this detection should be fast, in that the desired signal, or sequence of events, should be so robust in its timeliness and accuracy so as to detect turning points or actions in targeted events as they happen, or even become so awesome that it manages to anticipate what will happen in the future. Of course, this is never an exact science nor even always possible (otherwise we’d all be billionaires right?) and thus we rely on creative ways to cope with the unknown.

We can also think of signal extraction in more abstract terms. Real-time signal extraction entails the construction of a ‘smart’ illusion, an alternative to reality, where reality in this context is a time series, the information flow, the raw data. This ‘smart’ illusion that is being constructed is the signal, the vital information that has been extracted from an abundance of “noise” embedded in the reality. And the signal must produce important underlying secrets to satisfy the needs of the user, the signal extractor. How these signals are extracted from reality is the grand challenge. How are they produced in a robust, fast, and feasible manner so as to be effective in the real-time flow of information? The answer is in MDFA, or in other words as I’ll describe in this article, penetrating the subconscious state of reality to gain access to hidden treasures.

After recently re-watching the Christopher Nolan opus entitled Inception starring Leonardo DiCaprio and what seems like most of the cast from the Dark Knight Trilogy, I began to see some similarities between the main concepts entertainingly presented in the movie (using some pimped-up CGI), and the mathematics of  signal extraction using the multivariate direct filtering approach (MDFA). In this article I present some of these interesting parallels that I’ve managed to weave together.  My ultimate goal with this article is to hopefully paint a vivid picture of some interesting details stemming from the mathematics of the direct filtering approach by using the parallels that I’ve contrived between the two. Afterwards, hopefully you’ll be on your way to entering the realm of ‘dreaming within dreaming’, and extracting pertinent hidden secrets embedded in a flurry of noise.

The film introduces a slick con man by the name of Cobb (played by DiCaprio), and his team of super well-dressed con artists with leather jackets and slicked back hair (the classic con man look right?). The catchy idea that resides in the premise of the film is that these aren’t ordinary con men: they have a unique way of manipulating reality: by entering the dreams (subconscious ) of their targets (or marks as they call them in the film) and manipulate their subconscious dream state under the goal of extracting a desired idea or hidden secret. Like any group of con men, they attempt to construct a false reality by creating a certain architecture and environment in the target’s dream. The effectiveness of this ‘heist’ to capture the desired signals in the dream relies on the quality of the architecture and environment of the dream.

So how does all this relate to the mathematics of the MDFA for signal extraction. My vision can be seen as follows. In manipulating the target’s subconscious , Cobb’s group basically involves a collection of four components. Each one can be associated with a mathematical concept embedded in the MDFA.

The Target – At the highest level, we have reality. The real world in which the characters, and the target (victim), live. The target victim has an abundance of hidden information among the large capacity of mostly noise, from which Cobb’s group wish to manipulate and extract a hidden secret, the signal. In the MDFA world, we can associate or represent the information flow, the time series on which we perform the signal extraction process as the target victim in the real world. This is the data that we see, the reality. This data of course is non-deterministic,  namely we have no idea what the target victim has in mind for the future. The process of extracting the hidden thoughts or ideas from this target victim is akin to, in the MDFA world, the signal extraction process. The tools used to do the extracting are as follows.

The Extractor – The extractor is depicted in Inception as a master con man, a person who knows how to manipulate a subject (the target) in their subconscious dreaming world into revealing their deepest mental secrets. As the extractor’s goal is manipulation of the subconscious of a target to reveal a certain signal buried within reality, the extractor must transform the real-world conscious mental state of the target from reality into the dreaming subconscious world, by inducing a dream state. The multivariate direct filtering process of transforming the data (reality) into spectral frequency space (the subconscious ) via the Fourier transform to reveal the signal given the desired target data is metaphorically very similar to this process. The Inception extractor can be seen as being parallel to the process of transforming the data from reality into a subconscious world, the spectral frequency domain. It’s in this dreaming subconscious world, the frequency domain, where the real manipulation begins, using an architect.

The Architect – The Inception architect is the designer of the dream who constructs and builds the subconscious world into which the extractor brings the subject, or target. Just as the architect manipulates real world architecture and physics in order to create paradoxes like an endless staircase, folding buildings, smooth transitions from one place to another and other various phenomena otherwise impossible in the real world, the architect in the filtering world is the toolkit of filtering parameters that render the finite-dimensional metric space in which one constructs the filter coefficients to produce the desired signal. This includes the extraction rules (namely the symmetric target filter), customization for timeliness and speed, and regularization to warp and bend the finite dimensional filter metric space. Just as many different paths in the subconscious world toward the manipulation of the target subject exist and it is the architect’s job to create the optimal environment for extracting the desired signal, the architect in the direct filtering world uses the wide ranging set of filter parameters to bend and manipulate the metric space from which the filter coefficients are built and then used in the signal extraction process. Just as changing dynamics in the Inception real world (like the state of free-falling) will change the physics of the dreamt subconscious world (like floating in hotel elevator shafts while engaging in physical combat, Matrix style),  changing dynamics in the information flow will alter the geometry of the consequent architecture being built for the filter. And furthermore, just as the dream architect must be highly skilled in order to manipulate correctly, the MDFA architect must be highly skilled in order to construct the appropriate space in which the optimal signal is extracted (hint hint, call me or Marc, we’re the extractors and architects).

Dream within a dream – As one of the more fascinating concepts introduced in Inception, the concept of the dream within a dream was also the main trick to their success in dream manipulation. Starting from reality, each level of the dreaming subconscious state can be further transposed into another level of subconscious , namely dreaming within a dream. The dream within a dream process puts you into a deeper state of dreaming. The deeper you go, the further one’s mind is removed from reality. This is where the subject of dynamic adaptive filtering comes into play (see my previous article here for an intro and basics to dynamic adaptive filtering in iMetrica). In the direct filtering world, dynamic adaptive filtering is akin to the dream within a dream concept: Once in a level of subconscious (the spectral frequency space in MDFA), and the architect has created the dream used for manipulation (the metric space for the filter coefficients), a new level of subconscious can then be entered by introducing a newly adapted metric space based on the information extracted from the first level of subconscious.

In the dream within a dream, time is the other factor. The deeper you go into a dream state, the faster your mind is able to imagine and perceive things within that dream state. For example, one minute in reality can seem like one hour in the dream state. At the next level of subconscious, at each level in the subconscious , the element of time speeds up exponentially. A similar analogy can be extracted (no pun intended) in the concept of dynamic adaptive filtering. In dynamic adaptive filtering, we first begin by extracting a signal with the desired filter architecture at the first level transformation from reality to the spectral frequency space. When new information is received and our extracted signal is not behaving how we desire, we can build a new filter architecture for manipulating the signal with the newly provided information, with all the filter parameters available to control the desired filter properties. We are inherently building a new updated filter architecture on top of the old filter architecture, and consequently building a new signal from the output of the old signal by correcting (manipulating) this old signal toward our desired goals. This is akin to the dream within a dream concept. And just like the idea of time passing much faster at each subconscious level, the effects of filter parameters for controlling regularization and speed occur at a much faster rate since we are dealing with less information, a much shorter time frame (namely the newly arrived information) at each subsequent filtering level. One can even continue down the levels of subconscious, building a new architecture on top of the previous architecture, continuously using the newly provided information at each level to build the next level of subconsciousness; dream within a dream within a dream.

To summarize these analogies, I’ll be adding a graphic soon to this article that explains in a more succinct manner these parallels described above between Inception and MDFA. In the meantime, here are the temporary replacements.

Haters gonna hate… extractors gonna extract.

Nolan, why you leavin’ Leo out?