# Forecasting, Seasonal Adjustment, and Signal Extraction with uSimX13 in iMetrica

The uSimX13-SEATS (uSimX13) module in the iMetrica software suite is an interactive, GUI-based time series modeling and simulation environment. Its main attraction is that it features computational modeling routines from the X-13ARIMA-SEATS (X-13A-S) software developed and published by the Census Bureau of the US Department of Commerce. The primary goal of uSimX13 is the analysis of economic time series data using the most commonly used features of X-13ARIMA-SEATS. It also provides a large array of classical and modern goodness-of-fit tests to assess different model fits, many different graphical representations of the time series data, adaptive time series decomposition capabilities, and much more. All of this is meant to be accessible both to beginners in econometrics wanting to visualize frequently used tools and to practitioners wanting to obtain forecasts, seasonal/trend adjustments, and/or test and apply regression components to their data.

While there are other X-13A-S “engines” and interfaces in existence, including the original Fortran program and the excellent R package seasonal, uSimX13 provides most of the commonly used and important features of X-13ARIMA-SEATS without any programming interface: one simply loads the data into the module (I will show how in this article), and then all aspects of the modeling, including forecasting, seasonal adjustment, auto-model, and model selection, can be done using just the iMetrica user interface. In addition, several interactive features are available to aid in selecting the best model fit for one’s data, some of which are available in neither the original Fortran program nor the R package.

To get started, once iMetrica has been launched, the easiest way to get data into the uSimX13 module is to click on the uSimX13 tab, and then access the menu for the uSimX13 tab at the top in the menu bar, as shown in Figure 1.

For a single data file, click ‘Open Data File’ and a file selection dialog box will appear to choose your data file. I have included a few dozen example real economic time series in the folder called ‘data’ that comes with the iMetrica distribution on my Github. The data files accepted by uSimX13 are very simple: they contain one numerical value per row, one row for each time period. If there is a date associated with each observation, the date and the observation must be separated by either a comma or a space. There is also the option of loading many time series at once by selecting ‘Open Metafile’ and choosing a file that lists all the time series files to be loaded. It is assumed that all the files are found in the same folder as the Metafile. An example is also given in the ‘data’ directory. Scrolling through the different series that were loaded can be achieved by accessing the menu, then selecting ‘Simulator Panel’. This will bring up a satellite panel, where on the bottom right you will see a scroll bar with all the loaded time series.
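To check a data file before loading it, the row layout described above can be parsed in a few lines of R. This is only a minimal sketch of the format as I read it from the description (one observation per row, optionally preceded by a date and a comma or space); `read_series` is a hypothetical helper for illustration and is not part of iMetrica.

```r
# Hypothetical helper (not part of iMetrica): parse rows of the form
# "value", "date,value" or "date value".
read_series <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]                 # drop blank rows
  parts <- strsplit(lines, "[,[:space:]]+")     # split on commas and/or spaces
  # Assumption: the observation is the last field, the date (if any) the first.
  value <- as.numeric(vapply(parts, function(p) p[length(p)], character(1)))
  date  <- vapply(parts,
                  function(p) if (length(p) > 1) p[1] else NA_character_,
                  character(1))
  data.frame(date = date, value = value, stringsAsFactors = FALSE)
}

# Three rows, one in each accepted layout
example_lines <- c("2013-01,101.5", "2013-02 102.25", "103")
series <- read_series(example_lines)
```

For a file on disk, the same function can be applied to `readLines("myfile.dat")`, where the file name is a placeholder.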

Once the data has been loaded in uSimX13, you will see a plot of the data automatically on the main plotting canvas. The plot should be gray at this point. To turn on the automatic features of the module and begin analyzing, click on ‘Activate uSimX13 engine’ as shown in Figure 1.

Activating the uSimX13 engine essentially turns on all the automatic estimation components of X-13A-S. The original data will be plotted in cyan, while all the extracted signals are accessible by clicking on the checkboxes in the control panel. One can change the SARIMA model dimensions, add outlier or other regression detection components, and automatically visualize the changes in the extracted components. All signal extraction and goodness-of-fit diagnostics are shown at the bottom of the control panel.

Once the data has been loaded in, one exploratory feature on the module is the ‘Sliding Span Activate’ that offers a unique approach to model selection and goodness-of-fit by addressing multi-step ahead forecasting error on ‘test’ portions of the data. Such analysis can be readily achieved by using the ‘Sliding Span Activate’ component along with the ‘Sweep Time Series Control Panel’. Further details of this interactive model selection feature can be found in one of my previous articles on model selection here.
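The idea of scoring models by multi-step-ahead forecast error on held-out ‘test’ portions can be sketched outside the GUI with base R. The function below is a rough illustration only, assuming a fixed-length window that slides through the series and a plain `arima` fit from the `stats` package. It is not the uSimX13 implementation, and `sliding_forecast_rmse` is a name I made up for this example.

```r
# Illustrative sketch (not the uSimX13 code): refit a SARIMA model on a sliding
# window and score h-step-ahead forecasts on the observations that follow it.
sliding_forecast_rmse <- function(y, order, seasonal, window, h, step = h) {
  stopifnot(is.ts(y), length(y) >= window + h)
  starts <- seq(1, length(y) - window - h + 1, by = step)
  errs <- lapply(starts, function(s) {
    train <- ts(y[s:(s + window - 1)], frequency = frequency(y))
    test  <- y[(s + window):(s + window + h - 1)]
    fit   <- arima(train, order = order,
                   seasonal = list(order = seasonal, period = frequency(y)))
    as.numeric(predict(fit, n.ahead = h)$pred) - test
  })
  sqrt(mean(unlist(errs)^2))   # root mean-square error over all test spans
}

# Compare two candidate models on the built-in AirPassengers series
y <- log(AirPassengers)
rmse_airline     <- sliding_forecast_rmse(y, c(0, 1, 1), c(0, 1, 1), window = 96, h = 12)
rmse_nonseasonal <- sliding_forecast_rmse(y, c(0, 1, 1), c(0, 0, 0), window = 96, h = 12)
```

The model with the smaller error over the test spans is the one this criterion favors. On this strongly seasonal series the seasonal (airline) specification should come out ahead.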

That will get you started with the basic features for loading data into the module. To learn more, click on the .pdf file usimx13 for a full guide on how to use all the components in uSimX13, including a quick guide on the inference of model selection with signal extraction goodness-of-fit diagnostics that was featured in a recent paper by myself and colleague Tucker McElroy here.

Coming next week: Interactive modeling with State Space and RegComponent models using the State Space Modeling module in iMetrica.

# iMetrica for Linux Ubuntu 64 now available

My first open-source release of iMetrica for Linux Ubuntu 64 can now be downloaded at my Github, with a Windows 64 version soon to follow. iMetrica is a fast, interactive, GUI-oriented software suite for predictive modeling, multivariate time series analysis, real-time signal extraction, Bayesian financial econometrics, and much more.

The principal use of iMetrica is to provide an interactive environment for the numerical and visual analysis of (multivariate) time series modeling, real-time filtering, and signal extraction. The interactive features in iMetrica boast a modeling and graphics environment for analysts, practitioners, and students of econometrics, finance, and real-time data analysis where no coding or modeling experience is necessary. All the system needs is data, which can be piped into the system in many forms, including .csv, .txt, Google/Yahoo Finance, Quandl, .RData, and more. A module for connecting to MySQL databases is currently being developed. One can also simulate their own data from one or a combination of several different popular data generating models.

With the design intending to be interactive and self-enclosed, one can change modeling data/parameter inputs and see the effects in both graphical and numerical form automatically. This feature is designed to help understand the underlying mechanics of the modeling or filtering process. One can test many attributes of the modeling or filtering process this way both visually and numerically such as sensitivity, nonlinearity, goodness-of-fit, any overfitting issues, stability, etc.

All the computational libraries were written in GNU C and/or Fortran and have been provided as Native libraries to the Java platform via JNI, where Java provides the user-interface, control, graphics, and several other components in a module format, and where each module specializes in a different data analysis paradigm. The modules available in this open-source version of iMetrica are as follows:

**1) Data simulation, modeling and fitting using several popular econometric models**

- (S)ARIMA, (E)GARCH, (Multivariate) Factor models, Stochastic Volatility, High-frequency volatility models, Cycles/Trends, and more
- Random number generators from several different types of parameterized distributions to create shocks, outliers, regression components, etc.
- Visualize in real-time all components of the modeling process

**2) An interactive GUI for multivariate real-time signal extraction using the multivariate direct filter approach (MDFA)**

- Construct multivariate MA filter designs, classical ARMA ZPA filtering designs, or hybrid filtering designs.
- Analyze all components of the filtering and signal extraction process, from time-delay and smoothing control, to regularization.
- Adaptive real-time filtering
- Construct financial trading signals and forecasts
- Includes a real-time/frequency analysis module using MDFA

**3) An interactive GUI for X-13ARIMA-SEATS called uSimX13**

- Perform automatic seasonal adjustment on thousands of economic time series
- Compare SARIMA model choices using several different novel signal extraction diagnostics and tools available only in iMetrica
- Visualize in real-time several components of modeling process
- Analyze forecasts and compare with other models
- All of the most important features of X-13ARIMA-SEATS included

**4) An interactive GUI for RegComponent (State Space and Unobserved Component Models)**

- Construct unobserved signal components and time-varying regression components
- Obtain forecasts automatically and compare with other forecasting models

**5) Empirical Mode Decomposition**

- Applies a fast adaptive EMD algorithm to decompose nonlinear, nonstationary data into a trend and intrinsic modes.
- Visualize all time-frequency components with automatically generated 2D heat maps.

**6) Bayesian Time Series Modeling of ARIMA, (E)GARCH, Multivariate Stochastic Volatility, HEAVY models**

- Compute and visualize posterior distributions for all modeling parameters
- Easily compare different model dimensions

**7) Financial Trading Strategy Engineering with MDFA**

- Construct financial trading signals in the MDFA module and backtest the strategies on any frequency of data
- Perform analysis of the strategies using forward-walk schemes
- Automatically optimize certain components of the signal extraction on in-sample data.
- Features a toolkit for minimizing probability of backtest overfit

Tutorials on how to use iMetrica can be found on this blog and will be added on a weekly basis, with new tools, features, and modules being added and improved on a consistent basis.

Please send any bug reports, comments, complaints, to clisztian@gmail.com.

# TWS-iMetrica: The Automated Intraday Financial Trading Interface Using Adaptive Multivariate Direct Filtering

### Introduction

I realize that I’ve been MIA (*missing in action* for non-anglophones) the past three months on this blog, but I assure you there has been good reason for my long absence. Not only have I developed a large slew of various optimization, analysis, and statistical tools in iMetrica for constructing high-performance financial trading signals geared towards intraday trading which I will (slowly) be sharing over the next several months (with some of the secret-sauce-recipes kept to myself and my current clients of course), but I have also built, engineered, tested, and finally put into commission on a daily basis the planet’s first automated financial trading platform completely based on the recently developed FT-AMDFA (adaptive multivariate direct filtering approach for financial trading). I introduce to you iMetrica’s little sister, TWS-iMetrica.

Coupled with iMetrica, the original software I developed for hybrid econometrics, time series analysis, signal extraction, and multivariate direct filter engineering, the TWS-iMetrica platform was built to provide an easy-to-use yet powerful, adaptive, versatile, and automated trading machine for intraday financial trading, with a variety of options for building your own day trading strategies using MDFA based on your own financial priorities. Written completely in Java and GNU C, the TWS-iMetrica system currently uses the *Interactive Brokers* (IB) trading workstation (TWS) Java API in order to construct the automated trades, connect to the necessary historical data feeds, and provide a variety of tick data. Thus in order to run, the system requires an activated IB trading account. However, as I discuss in the conclusion of this article, the software was written in a way to be seamlessly adapted to any other brokerage/trading platform API, as long as the API is available in Java or has Java wrappers available.

The steps for setting up and building an intraday financial trading environment using iMetrica + TWS-iMetrica are easy. There are four of them. No technical analysis indicator garbage is used here, no time domain methodologies, and no stochastic calculus. TWS-iMetrica is based completely on the frequency domain approach to building robust real-time multivariate filters that are designed to extract signals from tradable financial assets at any fixed observation frequency (the most commonly used in my trading experience with FT-AMDFA being 5, 15, 30, or 60 minute intervals). What makes this paradigm of financial trading versatile is the ability to construct trading signals based on your own trading priorities, with each filter designed uniquely for a targeted asset to be traded. With that being said, the four main steps using both iMetrica and TWS-iMetrica are as follows:

- The first step to building an intraday trading environment is to construct what I call an MDFA portfolio (which I’ll define in a brief moment). This is achieved in the TWS-iMetrica interface, which is endowed with a user-friendly portfolio construction panel shown below in Figure 4.
- With the desired MDFA portfolio selected, one then proceeds to connect TWS-iMetrica to IB by simply pressing the *Connect* button on the interface in order to download the historical data (see Figure 3).
- With the historical data saved, the iMetrica software is then used to upload the saved historical data and build the filters for the given portfolio using the MDFA module in iMetrica (see Figure 2). The filters are constructed using a sequence of proprietary MDFA optimization and analysis tools. Within the iMetrica MDFA module, three different types of filters can be built: 1) a trend filter that extracts a *fast* moving trend, 2) a band-pass filter for extracting local cycles, and 3) a multi-bandpass filter that extracts both a *slow* moving trend and local cycles simultaneously.
- Once the filters are constructed and saved in a file (a .cft file), the TWS-iMetrica system is ready to be used for intraday trading using the newly constructed and optimized filters (see Figure 6).

In the remaining part of this article, I give an overview of the main features of the TWS-iMetrica software and how easily one can create a high-performing automated trading strategy that fits the needs of the user.

### The TWS-iMetrica Interface

The main TWS-iMetrica graphical user interface is composed of several components that allow for constructing a multitude of various MDFA intraday trading strategies, depending on one’s trading priorities. Figure 3 shows the layout of the GUI after first being launched. The first component is the top menu featuring **TWS System**, some basic TWS connection variables (which in most cases are left in their default mode), and the **Portfolio** menu. To access the main menu for setting up the MDFA trading environment, click *Setup MDFA Portfolio* under the **Portfolio** menu. Once this is clicked, a panel is displayed (shown in Figure 4) featuring the required *a priori* parameters for building the MDFA trading environment, all of which should be filled in before MDFA filter construction and trading take place. The parameters and their possible values are given below Figure 4.

- **Portfolio** – The portfolio is the basis for the MDFA trading platform and consists of two types of assets: 1) the target asset, from which we construct the trading signal, engineer the trades, and build the MDFA filter, and 2) the explanatory assets, which provide the explanatory data for the target asset in the multivariate filter construction. Here, one can select up to *four* explanatory assets.
- **Exchange** – The exchange on which the assets are traded (according to IB).
- **Asset Type** – Whether the input portfolio is a selection of Stocks or Futures (Currencies and Options soon to be available).
- **Expiration** – If trading *Futures*, the *expiration* date of the contract, given as a six-digit number of year then month (e.g. 201306 for June 2013).
- **Shares/Contracts** – The number of shares/contracts to trade (this number can also be changed throughout the trading day through the main panel).
- **Observation frequency** – In the MDFA financial trading method, we consider uniformly sampled observations of market data on which to do the trading (in seconds). The options are 1, 2, 3, 5, 15, 30, and 60 minute data. The default is 5 minutes.
- **Data** – For the intraday observations, determines the nature of data being extracted. Valid values include TRADES, MIDPOINT, BID, ASK, and BID_ASK. The default is MIDPOINT.
- **Historical Data** – Selects how many days are used for downloading the historical data to compute the initial MDFA filters. The historical data will of course come in the intervals chosen in the *observation frequency*.

Once all the values have been set for the MDFA portfolio, click the *Set and Build* button, which will first check whether the values entered are valid and, if so, create the necessary data sets for TWS-iMetrica to initialize trading. This all must be done while TWS-iMetrica is connected to IB (not set in trading mode however). If the build was successful, the historical data of the desired target financial asset up to the most recent observation in regular market trading hours will be plotted on the graphics canvas. The historical data will be saved to a file named (by default) “lastSeriesData.dat” and the data will come in columns, where the first column is the date/time of the observation, the second column is the price of the target asset, and the remaining columns are log-returns of the target and explanatory data. And that’s it, the system is now set up to be used for financial trading. The values entered in the Setup MDFA Portfolio panel will never have to be set again (unless changes to the MDFA portfolio are needed of course).

Continuing on to the other controls and features of TWS-iMetrica, once the portfolio has been set, one can proceed to change any of the settings in the main trading *control panel*. All these controls can be used/modified intraday while in automated MDFA trading mode. The leftmost side of the main *control panel* (Figure 5) includes a set of options for the following features:

- In the **contracts/shares** text field, one enters the number of shares (for stocks) or contracts (for futures) that one will trade throughout the day. This can be adjusted during the day while automated trading is activated; however, one must be certain that at the end of the day the balance between bought and shorted contracts is zero, otherwise you risk keeping contracts or shares overnight into the next trading day. Typically, this is set at the beginning before automated trading takes place and left alone.
- The **data input file** for loading historical data. The name of this file determines where the historical data associated with the constructed MDFA portfolio will be stored. This historical data will be needed in order to build the MDFA filters. By default this is “lastSeriesData.dat”. Usually this doesn’t need to be modified.
- With the **stop-loss activation** and **stop-loss** slider bar, one can turn the stop-loss on/off and set the stop-loss amount. This value determines how/where a stop-loss will be triggered relative to the price being bought/sold at and is completely dependent on the asset being traded.
- The **interval search** determines how and when the trades will be made when the selected MDFA signal triggers a buy/sell transaction. If turned *off*, the transaction (a limit order determined by the bid/ask) will be made at the *exact* time that the buy/sell signal is triggered by the filter. If turned on, the value in the text field next to it gives how often (in seconds) the trade looks for a better price to make the transaction. This search runs until the next observation for the MDFA filter. For example, if 5 minute return data is being used to do the trading, the search looks every *n* seconds for 5 minutes for a better price to make the given transaction. If at the end of the 5 minute period no better price has been found, the transaction is made at the current ask/bid price. This feature has been shown to be quite useful during sideways or highly volatile markets.
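To make the interval search concrete, here is a small R sketch of the decision for a single transaction. It is a simplified illustration and not the TWS-iMetrica code (which is written in Java): I assume the quotes polled every *n* seconds are already collected in a vector, and that the trade is made at the first price that beats the price seen when the signal fired. The function name `interval_search` is made up for this example.

```r
# Hypothetical sketch of the interval search for one transaction.
# quotes: ask prices (for a buy) or bid prices (for a sell) polled every n
#         seconds from the signal time until the next filter observation;
#         quotes[1] is the price at the moment the signal fired.
interval_search <- function(quotes, side = c("buy", "sell")) {
  side  <- match.arg(side)
  ref   <- quotes[1]
  later <- quotes[-1]
  # A better price is a lower ask when buying, a higher bid when selling.
  better <- if (side == "buy") later < ref else later > ref
  if (any(better)) {
    k <- which(better)[1]          # assumption: take the first improvement
    list(price = later[k], polls_waited = k)
  } else {
    # No improvement found: transact at the price current when the interval ends.
    list(price = quotes[length(quotes)], polls_waited = length(later))
  }
}

buy_found   <- interval_search(c(100.00, 100.02, 99.98, 99.95), "buy")
buy_missed  <- interval_search(c(100.00, 100.01, 100.03), "buy")
sell_found  <- interval_search(c(50.00, 49.90, 50.20), "sell")
```

In the first call the buy waits two polls and fills at 99.98. In the second no lower ask appears, so the order goes through at the last quote of the interval.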

The middle of the main *control panel* features the main buttons for connecting to and disconnecting from Interactive Brokers and for initiating the MDFA automated trading environment, as well as convenient buttons used for instantaneous buy/sell triggers that supplement the automated system. It also features an on/off toggle button for activating the trades given in the MDFA automated trading environment. When checked on, transactions according to the automated MDFA environment will proceed and go through to the IB account. If turned off, the real-time market data feeds and historical data will continue to be read into the TWS-iMetrica system and the signals according to the filters will be automatically computed, but no actual transactions/trades into the IB account will be made.

Finally, the right-hand side of the main *control panel* features the filter uploading and selection boxes. These are the MDFA filters that are constructed using the MDFA module in iMetrica. One convenient and useful feature of TWS-iMetrica is the ability to utilize up to three direct real-time filters in parallel and to switch at any given moment during market hours between the filters. (Such a feature enhances the adaptability of the trading using MDFA filters. I’ll discuss this in further detail shortly.) In order to select up to three filters simultaneously, there is a filter selection panel (shown in the bottom right corner of Figure 6 below) displaying three separate file choosers and a radio button corresponding to each filter. Clicking on the filter load button produces a file dialog box from which one selects a filter (a .cft file produced by iMetrica). Once the filter has loaded successfully, the name of the filter is displayed in the text box to the right, and the radio button to the left is enabled. With multiple filters loaded, to select between any of them, simply click on their respective radio button and the corresponding signal will be plotted on the plot canvas (assuming market data has been loaded into TWS-iMetrica using the market data file upload and/or the system has been connected to the IB TWS for live market data feeds). This is shown in Figure 6.

And finally, once the historical data file for the MDFA portfolio has been created, up to three filters have been created for the portfolio and entered in the filter selection boxes, and the system is connected to Interactive Brokers by pressing the Connect button, the market and signal plot panel can then be used for visualizing the different components that one will need for analyzing the market, signal, and performance of the automated trading environment. The panel just below the plot canvas features an array of checkboxes and radio buttons. When the system is connected to IB and *Start MDFA Trading* has been pressed, all the data and plots are updated in real-time automatically at the specific observation frequency selected in the MDFA Portfolio setup. The currently available plots are as follows:

- **Price** – Plots in real-time the price of the asset being traded, at the specific observation frequency selected for the MDFA portfolio.
- **Log-returns** – Plots in real-time the log-returns of the price, which is the data that is being filtered to construct the trading signal.
- **Account** – Shows the cumulative returns produced by the currently chosen MDFA filter over the current and historical data period (note that this does not necessarily reflect the actual returns made by the strategy in IB, just the theoretical returns over time if this particular filter had been used).
- **Buy/Sell lines** – Shows dashed lines where the MDFA trading signal has produced a buy/sell transaction. The green lines are the buy signals (entered a long position) and the magenta lines are the sell signals (entered a short position).
- **Signal** – The plot of the signal in real-time. When new data becomes available, the signal is automatically computed and replotted in real-time. This gives one the ability to closely monitor how the current filter is reacting to the incoming data.
- **Aux Signal 1/2** – (If available) Plots of the other available signals produced by the (up to two) other filters constructed and entered in the system. To make either of these auxiliary signals the main trading signal, simply select the filter associated with the signal using the radio buttons in the filter selection panel.
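The relationship between the Signal, Buy/Sell lines, and Account plots can be illustrated with a short R sketch. It rests on assumptions of my own for the example: the position is simply the sign of the signal (long when positive, short when negative), the position held at one observation earns the next observation's log-return, and transaction costs are ignored. The function name `theoretical_account` is hypothetical and the numbers are made up.

```r
# Hypothetical sketch: theoretical cumulative log-return of a sign-based rule.
theoretical_account <- function(signal, log_returns) {
  stopifnot(length(signal) == length(log_returns))
  n <- length(signal)
  position <- sign(signal)                     # +1 long, -1 short, 0 flat
  # The position taken at time t earns the log-return realized at t + 1.
  pnl <- c(0, position[-n] * log_returns[-1])
  list(account  = cumsum(pnl),                 # the "Account" curve
       trade_at = which(diff(position) != 0) + 1)  # where buy/sell lines fall
}

signal      <- c(0.5, 0.2, -0.1, -0.3, 0.4)          # made-up signal values
log_returns <- c(0.000, 0.010, -0.020, -0.005, 0.015) # made-up log-returns
acct <- theoretical_account(signal, log_returns)
```

Here the signal changes sign at observations 3 and 5, so those are the two points where a buy/sell line would be drawn.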

Along with these plots, to track specific values of any of these plots at any time, select the desired plot in the *Track Plot* region of the panel bar. Once selected, specific values and their respective times/dates are displayed in the upper left corner of the plot panel by simply placing the mouse cursor over the plot panel. A small tracking ball will then be moved along the specific plot in accordance with movements of the mouse cursor.

With the graphics panel displaying the performance in real-time of each filter, one can seamlessly switch between a band-pass filter and a timely trend (low-pass) filter according to the changing intraday market conditions. To give an example, suppose that in early morning trading hours there is an unusually high amount of volume pushing an uptrend or pulling a downtrend. In such conditions a trend filter is much more appropriate, being able to follow the large variation in log-returns much better than a band-pass filter can. One can glean this from the effects of the trend filter on the morning hours of the market. After automated trading using the trend filter, with the volume diffusing into the noon hour, the band-pass filter can then be applied in order to extract and trade at certain low frequency cycles in the log-return data. Towards the end of the day, with volume continuously picking up, the trend filter can then be selected again in order to track and trade any trending movement automatically.

I am currently in the process of building an automated algorithm to “intelligently” switch between the uploaded filters according to the instantaneous market conditions (with triggering of the switching being set by volume and volatility). For the time being, the user must manually switch between filters, if such switching is desired at all. In most cases, I prefer to leave one filter running all day: since the process is automated, I prefer to have minimal (if any) interaction with the software while it’s in automated trading mode.

### Conclusion

As I mentioned earlier, the main components of TWS-iMetrica were written in a way to be adaptable to other brokerage/trading APIs. The only major condition is that the API either be available in Java, or at least have (possibly third-party?) wrappers available in Java. That being said, there are only three main types of calls that are made automatically to the connected broker: 1) retrieve historical data for any asset(s), at any given time, at the most commonly used observation frequencies (e.g. 1 min, 5 min, 10 min, etc.), 2) subscribe to an automatic feed of bar/tick data to retrieve the latest OHLC and bid/ask data, and finally 3) place an order (buy/sell) with the broker under any of the various order conditions (limit, stop-loss, market order, etc.) for any given asset.

If you are interested in having TWS-iMetrica built for your particular brokerage/trading platform (other than IB of course) and the above conditions for the API are met, I am more than happy to be hired at certain fixed compensation; simply get in contact with me. If you are interested in seeing how well the automated system has performed thus far, in future collaboration, or in becoming a client in order to use the TWS-iMetrica platform, feel free to contact me as well.

Happy extracting!

# High-Frequency Financial Trading on Index Futures with MDFA and R: An Example with EURO STOXX50

In this second tutorial on building high-frequency financial trading signals using the multivariate direct filter approach in R, I focus on the first example of my previous article on signal engineering in high-frequency trading of financial index futures, where I consider 15-minute log-returns of the Euro STOXX50 index futures with expiration on March 18th, 2013 (STXE H3). As I mentioned in the introduction, I added a slightly new step in my approach to constructing the signals for intraday observations, as I had been studying the problem of close-to-open variations in the frequency domain. With 15-minute log-return data, I look at the frequency structure related to the close-to-open variation in the price, namely when the price at close of market hours significantly differs from the price at open, an effect I’ve mentioned in my previous two articles dealing with intraday log-return data. I will show (this time in R) how MDFA can take advantage of this variation in price and profit from each one by ‘predicting’ with the extracted signal the jump or drop in the price at the open of the next trading day. Seems too good to be true, right? I demonstrate in this article how it’s possible.

The first step after looking at the log-price and the log-return data of the asset being traded is to construct the periodogram of the in-sample data being traded on. In this example, I work with the same time frame I did with my previous R tutorial by considering the in-sample portion of my data to be from 1-4-2013 to 1-23-2013, with my out-of-sample data span being from 1-23-2013 to 2-1-2013, which will be used to analyze the true performance of the trading signal. The STXE data and the accompanying explanatory series of the EURO STOXX50 are first loaded into R and then the periodogram is computed as follows.

```r
# path.pgm (the directory holding the .RData files) is assumed to be defined beforehand

# load the log-return and log-price STXE data in-sample
load(paste(path.pgm, "stxe_insamp15min.RData", sep = ""))
load(paste(path.pgm, "stxe_priceinsamp15min.RData", sep = ""))
# load the log-return and log-price STXE data out-of-sample
load(paste(path.pgm, "stxe_outsamp15min.RData", sep = ""))
load(paste(path.pgm, "stxe_priceoutsamp15min.RData", sep = ""))

len_price <- 557
out_samp_len <- 210
in_samp_len <- 347

price_insample <- stxeprice_insamp
price_outsample <- stxeprice_outsamp

# some mdfa definitions
x <- stxe_insamp
len <- length(x[, 1])

# my range for the 15-min close-to-open cycle
cutoff <- .32
ub <- .32
lb <- .23

#------------ Compute DFTs ---------------------------
spec_obj <- spec_comp(len, x, 0)
weight_func <- spec_obj$weight_func
stxe_periodogram <- abs(spec_obj$weight_func[, 1])^2
K <- length(weight_func[, 1]) - 1

#----------- compute Gamma ----------------------------
# indicator of the frequency-grid points lying strictly between lb and ub
Gamma <- ((0:K) < (K * ub / pi)) & ((0:K) > (K * lb / pi))

# plot the periodogram and mark the band edges with dashed lines
colo <- rainbow(6)
xaxis <- 0:K * (pi / (K + 1))
plot(xaxis, stxe_periodogram, main = "Periodogram of STXE",
     xlab = "Frequency", ylab = "Periodogram",
     xlim = c(0, 3.14),
     ylim = c(min(stxe_periodogram), max(stxe_periodogram)),
     col = colo[1], type = "l")
abline(v = c(ub, lb), col = 4, lty = 3)
```

You’ll notice in the periodogram of the in-sample STXE log-returns that I’ve pinpointed a spectral peak between two blue dashed lines. This peak corresponds to an intrinsically important cycle in the 15-minute log-returns of index futures that gives access to predicting the close-to-open variation in the price. As you’ll see, the cycle flows fluidly through the 26 15-minute intervals during each trading day and will cross zero at (usually) one to two points during each trading day to signal whether to go long or go short on the index for the next day. I’ve deduced this optimal frequency range in a prior analysis of this data that I did using my target filter toolkit in iMetrica (see previous article). This frequency range will depend on the frequency of intraday observations, and can also depend on the index (but in my experiments, this range is typically consistent to be between .23 and .32 for most index futures using 15min observations). Thus in the R code above, I’ve defined a frequency cutoff at .32 and upper and lower bandpass cutoffs at .32 and .23, respectively.
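As a quick sanity check on this band, one can convert the frequencies (in radians per observation) into cycle lengths. The arithmetic below is my own back-of-the-envelope reading, assuming the 26 fifteen-minute observations per trading day mentioned above; the variable names are only for illustration.

```r
obs_per_day <- 26        # 15-minute observations per trading day (from the text)
lb <- 0.23               # lower band edge, radians per observation
ub <- 0.32               # upper band edge, radians per observation

# A cycle at frequency w has period 2*pi/w observations.
period_obs  <- 2 * pi / c(ub, lb)   # shortest and longest period in the band
period_days <- period_obs / obs_per_day

# Frequency of a cycle that lasts exactly one trading day
daily_freq <- 2 * pi / obs_per_day

round(period_obs, 2)   # roughly 19.6 to 27.3 observations
round(daily_freq, 3)   # roughly 0.242
```

So the band from .23 to .32 covers cycles of roughly 20 to 27 observations, which includes the once-per-trading-day cycle at about .242. A cycle of that length crosses zero about twice per day, in line with the one to two crossings mentioned above.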

In this first part of the tutorial, I extract this cycle responsible for marking the close-to-open variations and show how well it can perform. As I’ve mentioned in my previous articles on trading signal extraction, I like to begin with the mean-square solution (i.e. no customization or regularization) to the extraction problem to see exactly what kind of parameterization I might need. To produce the plain vanilla mean-square solution, I set all the parameters to 0.0 and then compute the filter by calling the main MDFA function (shown below). The function IMDFA returns an object with the filter coefficients and the in-sample signal. It also plots the concurrent transfer function for both of the filters along with the filter coefficients for increasing lag, shown in Figure 3.

    L<-86
    lambda_smooth<-0.0
    lambda_cross<-0.0
    lambda_decay<-c(0.00,0.0)
    i1<-F
    i2<-F
    lambda<-0
    expweight<-0
    i_mdfa_obj<-IMDFA(L,i1,i2,cutoff,lambda,expweight,lambda_cross,lambda_decay,lambda_smooth,weight_func,Gamma,x)

Notice the noise leakage past the stopband in the concurrent filter and the roughness of both sets of filter coefficients (due to overfitting). We would like to smooth both of these out, along with allowing the filter coefficients to decay as the lag increases. This ensures more consistent in-sample and out-of-sample properties of the filter. I first apply some smoothing to the stopband by applying an expweight parameter of 16, and to compensate slightly for this improved smoothness, I improve the timeliness by setting the lambda parameter to 1. After noticing the improvement in the smoothness of filter coefficients, I then proceed with the regularization and conclude with the following parameters.

    lambda_smooth<-0.90
    lambda_decay<-c(0.08,0.11)
    lambda<-1
    expweight<-16

A vast improvement over the mean-square solution. Virtually no noise leaks past the stopband, and the coefficients decay beautifully with perfect smoothness. Notice the two transfer functions perfectly picking out the spectral peak intrinsic to the close-to-open cycle, which I mentioned lies between .23 and .32. To verify that these filter coefficients extract the close-to-open cycle, I compute the trading signal from the IMDFA object and then plot it against the log-returns of STXE. I then compute the in-sample trades using the signal and the log-price of STXE. The R code is below and the plots are shown in Figures 5 and 6.

    bn<-i_mdfa_obj$i_mdfa$b
    trading_signal<-i_mdfa_obj$xff[,1] + i_mdfa_obj$xff[,2]
    plot(x[L:len,1],col=colo[1],type="l")
    lines(trading_signal[L:len],col=colo[4])
    trade<-trading_logdiff(trading_signal[L:len],price_insample[L:len],0)

Figure 5 shows the log-return data and the trading signal extracted from the data. The spikes in the log-return data represent the close-to-open jumps in the STOXX Europe 50 index futures contract, occurring every 27 observations. But notice how regular the signal is, and how consistently this frequency range is present in the log-return data, almost like a perfect sinusoidal wave, with one complete cycle occurring nearly every 27 observations. This signal triggers the trades shown in Figure 6, where the black dotted lines are buys/longs and the blue dotted lines are sells/shorts. The signal is extremely consistent in finding the opportune times to buy and sell at near-optimal peaks, such as at observations 140, 197, and 240. It also ‘predicts’ the jump or fall of the EuroStoxx50 index future for the next trading day by triggering the necessary buy/sell signal, such as at observations 19, 40, 51, 99, 121, 156, and 250. The in-sample performance of this trading is shown in Figure 7.
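For readers without access to `trading_logdiff`, the basic mechanics of trading on zero crossings can be sketched in a few lines. The helper `sign_trades` below is hypothetical and is not the function used in this tutorial; it simply goes long while the signal is non-negative and short otherwise, and it runs on toy data constructed so that the price follows the cycle.

```r
# Hypothetical helper, not the author's trading_logdiff():
# hold +1 (long) while the signal is non-negative and -1 (short) otherwise.
sign_trades <- function(signal, log_price) {
  stopifnot(length(signal) == length(log_price))
  n <- length(signal)
  position <- ifelse(signal >= 0, 1, -1)
  # A trade is triggered wherever the position flips (signal crosses zero)
  trade_times <- which(diff(position) != 0) + 1
  # The position held at time t earns the log-price change from t to t+1
  pnl <- position[-n] * diff(log_price)
  list(trade_times = trade_times, cum_return = c(0, cumsum(pnl)))
}

# Toy data: a clean 27-observation cycle and a log-price whose increments follow it
t_idx <- 1:108
toy_signal <- sin(2 * pi * (t_idx - 0.25) / 27)
toy_log_price <- cumsum(0.001 * toy_signal)

res <- sign_trades(toy_signal, toy_log_price)
res$trade_times            # observations where the position flips
tail(res$cum_return, 1)    # cumulative log-return of the toy strategy
```

On this idealized series the position flips about twice per 27-observation cycle, which is the same rhythm as the one or two zero crossings per trading day described earlier. Real log-returns are of course far noisier than this toy example.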

Now for the real litmus test of this extracted signal's performance, we need to apply the filter out-of-sample to check for consistency not only in performance, but also in trading characteristics. To do this in R, we bind the in-sample and out-of-sample data together and then apply the filter to the out-of-sample set (needing the final L-1 observations from the in-sample portion). The resulting signal is shown in Figure 8.

    x_out<-rbind(stxe_insamp,stxe_outsamp)
    xff<-matrix(nrow=out_samp_len,ncol=2)
    for(i in 1:out_samp_len)
    {
      xff[i,]<-0
      for(j in 2:3)
      {
        #':' binds tighter than '+', so the rows used are (in_samp_len+i) down to (in_samp_len+i-L+1)
        xff[i,j-1]<-xff[i,j-1]+bn[,j-1]%*%x_out[in_samp_len+i:(i-L+1),j]
      }
    }
    trading_signal_outsamp<-xff[,1] + xff[,2]
    plot(stxe_outsamp[,1],col=colo[1],type="l")
    lines(trading_signal_outsamp,col=colo[4])
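The loop above applies each set of coefficients as a one-sided convolution: lag 0 multiplies the current observation, lag 1 the previous one, and so on. As a sketch on toy data (the coefficients `b_toy` and series `x_toy` are made up, not taken from the STXE filter), the same operation can be cross-checked against base R's `stats::filter`:

```r
set.seed(1)
L_toy <- 5
b_toy <- c(0.4, 0.3, 0.15, 0.1, 0.05)   # hypothetical coefficients, lag 0 first
x_toy <- rnorm(40)                      # toy input series

# Explicit loop, mirroring the structure of the out-of-sample loop
y_loop <- rep(NA_real_, length(x_toy))
for (i in L_toy:length(x_toy)) {
  y_loop[i] <- sum(b_toy * x_toy[i:(i - L_toy + 1)])
}

# Same one-sided convolution using stats::filter (first L_toy - 1 values are NA)
y_filt <- as.numeric(stats::filter(x_toy, b_toy, method = "convolution", sides = 1))

all.equal(y_loop, y_filt)
```

The first `L_toy - 1` outputs are undefined in both versions, which is why the out-of-sample filter needs the final L-1 in-sample observations.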

Notice that the signal performs consistently out-of-sample until right around observation 170, when the log-returns become increasingly volatile. The intrinsic cycle between frequencies .23 and .32 has been slowed down by this increased volatility, which might affect the trading performance.

The total in-sample plus out-of-sample trading performance is shown in Figures 9 and 10, with the final 210 points being out-of-sample. The out-of-sample performance is very much akin to the in-sample performance, with clear systematic trading that ‘predicts’ the next-day close-to-open jump or fall in a consistent manner by triggering the necessary buy/sell signal, such as at observations 310, 363, 383, and 413, with only one loss up until the final day of trading. The higher volatility during the final day of the out-of-sample period damages the cyclical signal, which then fails to trade as systematically as it had during the first 420 observations.

With this kind of performance both in-sample and out-of-sample, and the beautifully consistent yet methodical trading patterns this signal provides, it would seem that attempting to improve upon it would be a pointless task. Why attempt to fix what’s not “broken”? But being the perfectionist that I am, I strive for an even “smarter” filter. If only there were a way to 1) keep the consistent cyclical trading effects as before, 2) ‘predict’ the next day's close-to-open jump/fall in the Euro Stoxx50 index future as before, and 3) avoid volatile periods, where the signal performed worse, to eliminate erroneous trading. After hours spent in iMetrica, I figured out how to do it. This is where advanced trading signal engineering comes into play.

The first step was to include all the lower frequencies below .23, which were not included in my previous trading signal. Due to the low amount of activity in these lower frequencies, this should only provide the effect of a ‘lift’ or a ‘push’ of the signal locally, while still retaining the cyclical component. So after changing my target `Gamma` to a low-pass filter with cutoff set at .32, I then computed the filter with the new low-pass design. The transfer functions for the filter coefficients are shown below in Figure 11, with the red plot being the transfer function for the STXE. Notice that the transfer function for the explanatory series still privileges the spectral peak between .23 and .32, with only a slight lift at frequency zero (compare this with the bandpass design in Figure 4; not much has changed). The problem is that the peak exceeds 1.0 in the passband, and this will amplify the cyclical component extracted from the log-return. It might be okay, trading-wise, but it is not what I’m looking to do. For the STXE filter, we get slightly more of a lift at frequency zero; however, this has been offset by a decreased cycle extraction between frequencies .23 and .32. Also, a slight amount of noise has entered the stopband, another factor we must mollify.

    #---- set Gamma to low-pass
    cutoff<-.32
    Gamma<-((0:K)<(cutoff*K/pi))
    #---- compute new filter ----------
    i_mdfa_obj<-IMDFA(L,i1,i2,cutoff,lambda,expweight,lambda_cross,lambda_decay,lambda_smooth,weight_func,Gamma,x)
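To make the difference between the two targets concrete, the sketch below builds both indicator vectors on a small frequency grid. The value `K_toy <- 173` is purely an assumption for illustration (the actual `K` comes from `spec_comp`), and the grid spacing follows the `K*ub/pi` convention used in the `Gamma` definitions.

```r
K_toy <- 173                       # hypothetical stand-in for K
lb <- 0.23
ub <- 0.32
freq_index <- 0:K_toy
omega <- freq_index * pi / K_toy   # frequency grid on [0, pi]

# Bandpass target: only ordinates strictly between lb and ub
Gamma_band <- (freq_index < K_toy * ub / pi) & (freq_index > K_toy * lb / pi)
# Low-pass target: every ordinate below ub, including frequency zero
Gamma_low <- freq_index < K_toy * ub / pi

range(omega[Gamma_band])           # frequencies kept by the bandpass
sum(Gamma_low) - sum(Gamma_band)   # ordinates added by opening the band down to zero
```

With this toy grid the bandpass keeps only a handful of ordinates around the peak, and the low-pass design adds everything from frequency zero up to the lower band edge.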

To improve the concurrent filter properties for both, I increase the smoothing expweight to 26, which will in turn affect the lambda_smooth, so I decrease it to .70. This gives me a much better transfer function pair, shown in Figure 12. Notice the peak in the explanatory series transfer function is now much closer to 1.0, exactly what we want.

I’m still not satisfied with the lift at frequency zero for the STXE series. At roughly .5 at frequency zero, the filter might not provide the push or pull that I need. The only way to guarantee a lift in the STXE log-return series is to constrain the filter coefficients so that the transfer function is one at frequency zero. This can be achieved by setting i1 to true in the IMDFA function call, which effectively ensures that the sum of the filter coefficients is one. After doing this, I get the following transfer functions and the respective filter coefficients.

    #---- Update the regularization parameters
    lambda_smooth<-0.68
    lambda_cross<-0.0
    lambda_decay<-c(0.083,0.11)
    #---- update customization parameters
    lambda<-0
    expweight<-28
    #---- set filter constraint -------
    i1<-T
    weight_constraint[1]<-1
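The link between the coefficient sum and the transfer function at frequency zero follows from the definition: for coefficients b_0, ..., b_{L-1}, the transfer function at frequency ω is the sum of b_j·exp(-iωj), which at ω = 0 collapses to the plain sum of the coefficients. The sketch below checks this numerically. The helper `transfer` and the coefficients are illustrative and are not the IMDFA output; rescaling is used here only to produce an example that satisfies the constraint, however IMDFA enforces it internally.

```r
# Transfer function of a one-sided filter b_0..b_{L-1} evaluated at frequencies omega
transfer <- function(b, omega) {
  lags <- seq_along(b) - 1
  sapply(omega, function(w) sum(b * exp(-1i * w * lags)))
}

# Hypothetical smooth, decaying coefficients (not the IMDFA output)
b_raw <- exp(-(0:19) / 5)
b_one <- b_raw / sum(b_raw)        # rescaled so the coefficients sum to one

amp_raw_zero <- Mod(transfer(b_raw, 0))   # equals sum(b_raw)
amp_one_zero <- Mod(transfer(b_one, 0))   # equals 1

c(amp_raw_zero, sum(b_raw), amp_one_zero)
```

The same `transfer` helper can be evaluated on a grid such as `seq(0, pi, length.out = 200)` to plot the amplitude of any coefficient vector.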

Now this is exactly what I was looking for. Not only does the transfer function for the explanatory series keep the important close-to-open cycle intact, but I have also enforced the lift I need for the STXE series. The coefficients still remain smooth with a nice decaying property at the end. I then applied the new filter coefficients to the data both in-sample and out-of-sample, yielding the trading signal shown in Figure 14. It possesses exactly the properties that I was seeking. The close-to-open cyclical component is still being extracted (thanks in part to the explanatory series), and is still relatively consistent, although not as much as with the pure bandpass design. The feature that I like is the following: when the log-return data diverges from the cyclical component, with increasing volatility, the STXE filter reacts by pushing the signal down to avoid any erroneous trading. This can be seen in observations 100 through 120 and then from observation 390 through the end of trading. Figure 15 (same as Figure 1 at the top of the article) shows the resulting trades and performance produced in-sample and out-of-sample by this signal. This is the art of meticulous signal engineering, folks.

With only two losses suffered out-of-sample during the roughly 9 days of trading, the filter performs much more methodically than before. Notice that during the final two days of trading, when volatility picked up, the signal ceases to trade as it is being pushed down. It even continues to ‘predict’ the close-to-open jump/fall correctly, such as at observations 288, 321, and 391. The last trade made was a sell/short position, with the signal trending down at the end. The filter is in position to make a huge gain from this timely signaling of a short position at 391, correctly determining a large fall the next trading day, and then waiting out the volatile trading. The gain should be large no matter what happens.

One thing I should mention before concluding is that I made a slight adjustment to my filter design after employing the i1 constraint to get the results shown in Figures 13-15. I’ll leave it as an exercise for the reader to deduce what I have done. Hint: Look at the frozen degrees of freedom before and after applying the i1 constraint. If you still have trouble finding what I’ve done, email me and I’ll give you further hints.

### Conclusion

The overall performance of the first filter built, with regard to total return on investment out-of-sample, was superior to the second. However, this superior performance rests on the assumption that the cycle component defined between frequencies .23 and .32 will continue to be present in future observations of STXE up until the expiration. If volatility increases and this intrinsic cycle ceases to exist in the log-return data, the performance will deteriorate.

For a more comfortable approach that deals with changing, volatile index conditions, I would opt for ensuring that the local bias is present in the signal. This will effectively push or pull the signal down or up when the intrinsic cycle is weak during increasing volatility, resulting in a pullback in trading activity.

As before, you can acquire the high-freq data used in this tutorial by requesting it via email.

Happy extracting!

# High-Frequency Financial Trading with Multivariate Direct Filtering Part Deux: Index Futures

Continuing along the trend of my previous installment on strategies and performances of high-frequency trading using multivariate direct filtering, I take on building trading signals for high-frequency index futures, where I will focus on the STOXX Europe 50 Index, S&P 500, and the Australian Stock Exchange Index. As you will discover in this article, these filters that I build using MDFA in iMetrica have yielded some of the best performing trading signals that I have seen using any trading methodology. My strategy, as I’ve been developing it throughout my previous articles on MDFA, has not changed much, except for one detail that I will discuss throughout and that will be a major theme of this article. It relates to an interesting structure found in index futures series for intraday returns. The structure is related to the close-to-open variation in the price, namely when the price at close of market hours differs significantly from the price at open, an effect I’ve mentioned in my previous two articles dealing with high(er)-frequency (or intraday) log-return data. I will show how MDFA can take advantage of this variation in price and profit from each one by ‘predicting’ with the extracted signal the jump or drop in the price at the open of the next trading day.

The frequency of observations on the index that are to be considered for building trading filters using MDFA is typically only a question of taste and priorities. The beauty of MDFA lies not only in its versatility and strength in building trading signals for virtually any financial trading priorities, but also in its independence from the underlying observation frequency of the data. In my previous articles, I’ve considered and built high-performing strategies for daily, hourly, and 15-minute log-returns, where the strategy for building the signal began with viewing the periodogram as the main barometer in searching for optimal frequencies on which one should set the low-pass cutoff for the extracting target filter function.

Index futures (futures contracts on a financial index), as we will see in this article, present no new challenges. With the success I had on the 15-minute return observation frequency that I utilized in my previous article in building a signal for the Japanese Yen, I will continue to use 15-minute intervals for the index futures, where I hope to shed some more light on the filter selection process. This includes deducing properties of the intrinsically optimal spectral peaks to trade on. To do this, I take a simple approach in these examples: I first set up a bandpass filter over the spectral peak in the periodogram and then study the in-sample and out-of-sample characteristics of this signal, both in performance and consistency. So without further ado, I present my experiments with financial trading on index futures using MDFA, in iMetrica.

### STOXX Europe 50 Index (STXE H3, Expiration March 18 2013)

The STOXX Europe 50 Index, Europe’s leading blue-chip index, provides a representation of sector leaders in Europe. The index covers 50 stocks from 18 European countries and has the highest trading volume of any European index. One of the first things to notice with the 15-minute log-returns of STXE is the frequent large spikes. These spikes occur every 27 observations at 13:30 (UTC time zone) due to the fact that there are 26 15-minute periods during the trading hours. These spikes represent the close-to-open jumps that the STOXX Europe 50 index has been subjected to, which are then reflected in the price of the futures contract. With this ‘seasonal’ pattern so obviously present in the log-return data, the frequency effects of this pattern should be clearly visible in the periodogram of the data. The beauty of MDFA (and iMetrica) is that we have the ability to explicitly engineer a trading signal to take advantage of this ‘seasonal’ pattern by building an appropriate extractor.

Regarding the periodogram of the data, Figure 2 depicts the periodograms for the 15 minute log-returns of STXE (red) and the explanatory series (pink) together on the same discretized frequency domain. Notice that in both log-return series, there is a principal spectral peak found between .23 and .32. The trick is to localize the spectral peak that accounts for the cyclical pattern that is brought about by the close-to-open variation between 20:00 and 13:30 UTC.
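To see why a pattern that repeats every 27 observations should show up in this frequency range, here is a small sketch on synthetic data. Everything in it is an assumption for illustration: the series is a noisy sinusoid with a 27-observation period rather than real log-returns (actual spikes would also produce harmonics at multiples of the base frequency), and the periodogram is computed with base R's `fft` rather than `spec_comp`.

```r
set.seed(42)
n <- 27 * 13                                   # 351 observations, i.e. 13 toy "trading days"
t_idx <- 1:n
x_toy <- sin(2 * pi * t_idx / 27) + rnorm(n, sd = 0.5)

# Raw periodogram on the Fourier frequencies in (0, pi]
dft <- fft(x_toy)
k <- 1:floor(n / 2)
omega <- 2 * pi * k / n
pgram <- Mod(dft[k + 1])^2 / n                 # dft[1] is frequency zero, so shift by one

peak_omega <- omega[which.max(pgram)]
peak_omega                                     # compare with 2 * pi / 27
```

The largest ordinate lands at 2π/27 ≈ .233, inside the .23 to .32 band discussed in the text.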

In order to see the effects of the MDFA filter when localizing this spectral peak, I use my target builder interface in iMetrica to set the necessary cutoffs for the bandpass filter directly covering both spectral peaks of the log-returns, which are found between .23 and .32. This is shown in Figure 3, where the two dashed red lines indicate the cutoffs and the spectral peak is clearly inside these two cutoffs, with the spectral peak for both series occurring in the vicinity of . Once the bandpass target was fixed on this small frequency range, I set the regularization parameters for the filter coefficients to be , , and .

Pinpointing this frequency range that zooms in on the largest spectral peak generates a filter that acts on the intrinsic cycles found in the 15-minute log-returns of the STXE futures index. The resulting trading signal produced by this spectral peak extraction is shown in Figure 4, with the returns (blue to pink line) generated from the trading signal (green), and the price of the STXE futures index in gray. The cyclical effects in the signal include the close-to-open variations in the data. Notice how the signal predicts the variation of the close-to-open price in the index quite well, seen from the large jumps or falls in price every 27 observations. This simple design for extracting the spectral peak of STXE yields a **4 percent ROI on 200 observations out-of-sample** with only 3 losses out of 20 total trades (**85 percent trade success rate**), with two of them occurring towards the very end of the out-of-sample observations in an uncharacteristically volatile period on January 31st 2013.

The two concurrent frequency response (transfer) functions for the two filters acting on the STXE log-return data (purple) and the explanatory series (blue), respectively, are plotted below in Figure 5. Notice the presence of the spectral peaks for both series being accounted for in the vicinity of the frequency , with mild damping at the peak. Slow damping of the noise in the higher frequencies is aided by the addition of a smoothing *expweight* parameter that was set at .

With the ideal characteristics of a trading signal quite present in this simple bandpass filter, namely smooth decaying filter coefficients, identical in-sample and out-of-sample performance properties, and accurate, consistent trading patterns, it would be hard to imagine improving the trading signal for this European futures index even more. But we can. We simply keep the spectral peak frequencies intact, but also account for the local bias in the log-return data by extending the lower cutoff to frequency zero. This will provide improved systematic trading characteristics by not only predicting the close-to-open variation and jumps, but also handling upswings, downswings, and highly volatile periods much better.

In this new design, I created a low-pass filter by keeping the upper cutoff from the band-pass design and setting the lower cutoff to 0. I also increased the smoothing parameter to $\alpha = 32$. In this newly designed filter, we see a vast improvement in the trading structure. As before, the filter was able to deduce the direction of every single close-to-open jump during the 200 out-of-sample observations, but notice that it was also able to become much more flexible in the trading during any upswing/downswing and volatile period. This is seen in more detail in Figure 7, where I added the letter ‘D’ to each of the 5 major buy/sell signals occurring before close.

Notice that the signal predicted the direction correctly for each of these major jumps, resulting in large returns. For example, at the first “D” indicator, the signal indicated sell/short (magenta dashed line) on the STXE future index 5 observations before close, namely at 18:45 UTC, before market close at 20:00 UTC. Sure enough, the price of the STXE contract went down during overnight trading hours and opened far below the previous day’s close, with the filter signaling a buy (green dashed line) 45 minutes into trading. At the mark of the second “D”, we see that on the final observation before market close, the signal produced a buy/long indication, and indeed, the next day the price of the future jumped significantly.

Only two very small losses of less than .08 percent were accounted for. One advantage of including the frequency zero along with the spectral peak frequency of STXE is that the local bias can help push up or pull down the signal, resulting in a more ‘patient’ or ‘diligent’ filter that can better handle long upswings/downswings or volatile periods. This is seen in the improvement of the performance towards the end of the 200 observations out-of-sample, where the filter is more patient in signaling a sell/short after the previous buy. Compare this with the end of the performance from the band-pass filter, Figure 4. With this trading signal out-of-sample, I computed a **5 percent ROI on the 200 observations out-of-sample** with only 2 small losses. The trading statistics for the entire in-sample combined with out-of-sample are shown in Figure 8.

### S&P 500 Futures Index (ES H3, Expiration March 18 2013)

In this experiment trading S&P 500 future contracts (E-mini) on observations at 15-minute intervals from Jan 4th to Feb 1st 2013, I apply the same regimented approach as before. In looking at the log-returns of ESH3 shown in Figure 10, the effect of close-to-open variation seems to be much less prominent here compared to that on the STXE future index. Because of this, the log-returns seem to be much closer to ‘white noise’ on this index. Let that not perturb our pursuit of a high-performing trading signal, however. The approach I take for extracting the trading signal, as always, begins with the periodogram.

As the large variations in the close-to-open price are not nearly as prominent, it would make sense that the same spectral peak found before at near is not nearly as prominent either. We can clearly see this in the periodogram plotted below in Figure 11. In fact, the spectral peak at is slightly larger in the explanatory series (pink), thus we should still be able to take advantage of any sort of close-to-open variation that exists in the E-mini future index.

With this spectral peak extracted from the series, the resulting trading signal is shown in Figure 12 with the performance of the bandpass signal shown in Figure 13.

One can clearly see that the trading signal performs very well during the consistent cyclical behavior in the ESH3 price. However, when this stochastic structure breaks down and the price follows another frequency more prominently, the trading signal dies and no longer trades systematically taking advantage of the intrinsic cycle found near . This can be seen in the middle 90 or so observations. The price can be seen to follow more closely a random walk and the trading becomes inconsistent. After this period of 90 or so observations, however, just after the beginning of the out-of-sample period, the trajectory of the ESH3 returns to its consistent course with the cyclical component it had before.

Now to improve on these results, we include the frequency zero by moving the lower cutoff of the previous band-pass filter to $\omega_0 = 0$. As I mentioned before, this lifts or pushes down the signal from the local bias, so that it trades much more systematically. I then lessened the amount of smoothing in the expweight function to , down from as I had on the band-pass filter. This allows for slightly higher frequencies than to be traded on. I then proceeded to adjust the regularization parameters to obtain a healthy dosage of smoothness and decay in the coefficients. The result of this new low-pass filter design is shown in Figure 14.

The improvement in the overall systematic trading and performance is clear. Most of the major improvements came from the middle 90 points where the trading became less cyclical. With 6 losses in the band-pass design during this period alone, I was able to turn those losses into two large gains and no losses. Only one major loss was accounted for during the 200-observation out-of-sample testing of the filter from January 18th to February 1st, with an **ROI of nearly 4 percent during the 9 trading days**. As with the STXE filter in the previous example, I was able to successfully build a filter that correctly predicts close-to-open variations, despite the added difficulty that such variations were much smaller. Both in-sample and out-of-sample, the filter performs consistently, which is exactly what one strives for thanks to regularization.

### ASX Futures (YAPH3, Expiration March 18, 2013)

In the final experiment, I build a trading signal for the Australian Stock Exchange futures, during the same period of the previous two experiments. The log-returns show moderately large jumps/drops in price during the entire sample from Jan 4th to Feb 1st, but not quite as large as in the STXE index. We still should be able to take advantage of these close-to-open variations.

In looking at the periodograms for both the YAPH3 15-minute log-returns (red) and the explanatory series (pink), it is clear that the spectral peaks don’t align like they did in the previous two examples. In fact, there hardly exists a dominant spectral peak in the explanatory series, whereas the spectral peak in YAPH3 is very prominent. This ultimately might affect the performance of the filter, and consequently the trades. After building the low-pass filter and setting a high smoothing expweight parameter , I then set the regularization parameters to be , , and (same as first example).

The performance of the filter in-sample and out-of-sample is shown in Figure 18. This was one of the more challenging index futures series to work with as I struggled to find an appropriate explanatory series (likely because I was lazy since it was late at night and I was getting tired). Nonetheless, the filter still seems to predict the close-to-open variation on the Australian stock exchange index fairly well. All the major jumps in price are accounted for if you look closely at the trades (green dashed lines are buys/long and magenta lines are sells/shorts) and the corresponding action on the price of the futures contract. Five losses out-of-sample for a **trade success ratio of 72 percent** and an **ROI out-of-sample on 200 observations of 4.2 percent**. As with all other experiments in building trading signals with MDFA, we check the consistency of the in-sample and out-of-sample performance, and these seem to match up nicely.

The filter coefficients for the YAPH3 log-difference series are shown in Figure 19. Notice the perfectly smooth, undulating yet decaying structure of the coefficients as the lag increases. What a beauty.

### Conclusion

Studying the trading performance of spectral peaks by first constructing band-pass filters to extract the signal corresponding to the peak in these index futures enabled me to understand how I can better construct the low-pass filter to yield even better performance. In these examples, I demonstrated that the close-to-open variation in the index futures price can be seen in the periodogram and thus be controlled for in the MDFA trading signal construction. This trading frequency corresponds to roughly in the 15-minute observation data that I had from Jan 4th to Feb 1st. As I witnessed in my empirical studies using iMetrica, this peak is more prominent when the close-to-open variations are larger and more frequent, promoting a very cyclical structure in the log-return data. As I look deeper and deeper into the effects of extracting spectral peaks in the periodogram of financial log-returns and the resulting trading performance, my results keep improving and building the trading signals becomes even easier.

Stay tuned very soon for a tutorial using R (and MDFA) for one of these examples on high-frequency trading on index futures. If you have any questions or would like to request a certain index future (out of one of the above examples or another) to be dissected in my second and upcoming R tutorial, feel free to shoot me an email.

Happy extracting!

# High-Frequency Financial Trading on FOREX with MDFA and R: An Example with the Japanese Yen

In my previous article on high-frequency trading in iMetrica on the FOREX/GLOBEX, I introduced some robust signal extraction strategies in iMetrica using the multidimensional direct filter approach (MDFA) to generate high-performance signals for trading on the foreign exchange and Futures market. In this article I take a brief leave-of-absence from my world of developing financial trading signals in iMetrica and migrate into an uber-popular language used in finance due to its exuberant array of packages, quick data management and graphics handling, and of course the fact that it’s free (as in speech and beer) on nearly any computing platform in the world.

This article gives an intro tutorial on using R for high-frequency trading on the FOREX market using the R package for MDFA (offered by Herr Doktor Marc Wildi von Bern) and some strategies that I’ve developed for generating financially robust trading signals. For this tutorial, I consider the second example given in my previous article where I engineered a trading signal for 15-minute log-returns of the Japanese Yen (from opening bell to market close EST). This presented slightly new challenges than before as the close-to-open jump variations are much larger than those generated by hourly or daily returns. But as I demonstrated, these larger variations on close-to-open price posed no problems for the MDFA. In fact, it exploited these jumps and made large profits by predicting the direction of the jump. Figure 1 at the top of this article shows the in-sample (observations 1-250) and out-of-sample (observations 251 onward) performance of the filter I will be building in the first part of this tutorial.

Throughout this tutorial, I attempt to replicate the results that I built in iMetrica and expand on them a bit using the R language and the implementation of the MDFA available here. The data that we consider are 15-minute log-returns of the Yen from January 4th to January 17th, which I have saved as an .RData file given by `ld_fxy_insamp`. I have an additional explanatory series embedded in the .RData file that I’m using to predict the price of the Yen. Additionally, I will be using `price_fxy_insamp`, which is the log price of the Yen, used to compute the performance (buys/sells) of the trading signal. The `ld_fxy_insamp` data will be used as the in-sample data to construct the filter and trading signal for FXY. To obtain this data so you can perform these examples at home, email me and I’ll send you all the necessary .RData files (the in-sample and out-of-sample data) in a .zip file. Taking a quick glance at the `ld_fxy_insamp` data, we see log-returns of the Yen at every 15 minutes starting at market open (time zone UTC). The target data (Yen) is in the first column, followed by the two explanatory series (Yen and another asset co-integrated with the movement of the Yen).

    > head(ld_fxy_insamp)
                                 [,1]          [,2]          [,3]
    2013-01-04 13:30:00  0.000000e+00  0.000000e+00  0.0000000000
    2013-01-04 13:45:00  4.763412e-03  4.763412e-03  0.0033465833
    2013-01-04 14:00:00 -8.966599e-05 -8.966599e-05  0.0040635638
    2013-01-04 14:15:00  2.597055e-03  2.597055e-03 -0.0008322064
    2013-01-04 14:30:00 -7.157556e-04 -7.157556e-04  0.0020792190
    2013-01-04 14:45:00 -4.476075e-04 -4.476075e-04 -0.0014685198

Moving on, to begin constructing the first trading signal for the Yen, we begin by uploading the data into our R environment, define some initial parameters for the MDFA function call, and then compute the DFTs and periodogram for the Yen.

    #path.pgm is assumed to hold the directory containing the .RData files
    load(paste(path.pgm,"ld_fxy_in15min.RData",sep=""))     #load in-sample log-returns of Yen
    load(paste(path.pgm,"price_fxy_in15min.RData",sep=""))  #load in-sample log-price of Yen
    in_samp_len<-335                   #in-sample length (the value quoted in the text)
    price_insample<-price_fxy_insamp

    #setup some MDFA variables
    x<-ld_fxy_insamp
    len<-length(x[,1])
    shift_constraint<-rep(0,length(x[1,])-1)
    weight_constraint<-rep(0,length(x[1,])-1)
    d<-0
    plots<-T
    lin_expweight<-F

    # Compute DFTs and periodogram for initial analysis
    spec_obj<-spec_comp(len,x,d)
    weight_func<-spec_obj$weight_func
    K<-length(weight_func[,1])-1
    fxy_periodogram<-abs(spec_obj$weight_func[,1])^2

As I’ve mentioned in my previous articles, my step-by-step strategy for building trading signals always begins with a quick analysis of the periodogram of the asset being traded. Holding the key to insight into how the asset trades, the periodogram is an essential tool for choosing the extractor $\Gamma$. Here, I look for principal spectral peaks that correspond in the time domain to how and where my signal will trigger buy/sell trades. Figure 2 shows the periodogram of the 15-minute log-returns of the Japanese Yen during the in-sample period from January 4 to January 17, 2013. The arrows point to the main spectral peaks that I look for and provide a guide to how I will define my $\Gamma$ function. The black dotted lines indicate the two frequency cutoffs that I will consider in this example, the larger of which, $\pi/6$, is used first. Notice that both cutoffs are set directly after a spectral peak, something that I highly recommend. In high-frequency trading on the FOREX using MDFA, as we’ll see, the trick is to seek out the spectral peak which accounts for the close-to-open variation in the price of the foreign currency. We want to take advantage of this spectral peak, as this is where the big gains in foreign currency trading using MDFA will occur.
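For readers who want to experiment with this step without the MDFA package, here is a minimal base-R sketch of the idea: compute a periodogram with `fft` and pick out the frequencies that stand out. The series is synthetic (two cycles plus noise, not the Yen data), and the peak rule (a local maximum above 20 times the median ordinate) is simply an assumption I chose for illustration; it is not what `spec_comp` does internally.

```r
# Periodogram sketch on synthetic data (a stand-in for real log-returns)
set.seed(1)
n <- 336                                   # hypothetical sample length
tt <- 1:n
x <- 0.8 * cos(2 * pi * tt / 24) + 0.5 * cos(2 * pi * tt / 8) + rnorm(n, sd = 0.3)

K <- floor(n / 2)                          # number of positive Fourier frequencies
dft <- fft(x - mean(x))[2:(K + 1)]         # DFT at frequencies 2*pi*k/n, k = 1..K
pgram <- Mod(dft)^2 / (2 * pi * n)         # periodogram ordinates
omega <- 2 * pi * (1:K) / n                # frequency grid in (0, pi]

# A "peak" here: larger than both neighbours and well above the typical ordinate
is_local_max <- c(FALSE, diff(sign(diff(pgram))) == -2, FALSE)
is_peak <- is_local_max & pgram > 20 * median(pgram)

peak_freqs <- omega[is_peak]
peak_periods <- 2 * pi / peak_freqs        # cycle length in observations
print(round(peak_periods, 1))
```

In this toy series the two planted cycles (24 and 8 observations long) are the frequencies that get flagged; a cutoff placed just to the right of a flagged frequency keeps that cycle in the passband.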

In our first example we consider the larger frequency as the cutoff for $\Gamma$ by setting it to $\pi/6$ (the rightmost line in the figure of the periodogram). I then initially set the timeliness and smoothness parameters, `lambda` and `expweight`, to 0, along with setting all the regularization parameters to 0 as well. This will give me a barometer for where and how much to adjust the filter parameters. In selecting the filter length $L$, my empirical studies over numerous experiments in building trading signals using iMetrica have demonstrated that a ‘good’ choice is anywhere between 1/4 and 1/5 of the total in-sample length of the time series data. Of course, the length depends on the frequency of the data observations (i.e. 15 minute, hourly, daily, etc.), but in general you will most likely never need $L$ to be greater than 1/4 of the in-sample size. Otherwise, regularization can become too cumbersome to handle effectively. In this example, the total in-sample length is 335 and thus I set $L = 82$, which I’ll stick to for the remainder of this tutorial. In any case, the length of the filter is not the most crucial parameter to consider in building good trading signals. For a good robust selection of the filter parameters coupled with appropriate explanatory series, the results of the trading signal with $L = 82$ compared with, say, a somewhat different filter length should hardly differ. If they do, then the parameterization is not robust enough.
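As a quick check of the rule of thumb above, the following sketch computes the range of filter lengths between 1/5 and 1/4 of the in-sample length. The helper name `filter_length_range` is hypothetical and only serves this illustration.

```r
# Rule-of-thumb range for the filter length L: between 1/5 and 1/4 of the in-sample length
filter_length_range <- function(in_samp_len) {
  c(lower = ceiling(in_samp_len / 5), upper = floor(in_samp_len / 4))
}

in_samp_len <- 335                      # in-sample length quoted in the text
L <- 82                                 # filter length used in this tutorial
rng <- filter_length_range(in_samp_len)
print(rng)

# TRUE when the chosen L falls inside the rule-of-thumb range
L_ok <- (L >= rng[["lower"]]) && (L <= rng[["upper"]])
```

For 335 observations the range works out to 67 through 83, so the choice of 82 sits near the upper end.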

After uploading both the in-sample log-return data and the corresponding log price of the Yen for computing the trading performance, we then proceed in R to set the initial filter settings for the MDFA routine and then compute the filter using the `IMDFA_comp` function. This returns both the `i_mdfa` object holding the coefficients, frequency response functions, and statistics of the filter, along with the signal produced for each explanatory series. We combine these signals to get the final trading signal in-sample. All of this is done in R as follows:

    cutoff<-pi/6                      #set frequency cutoff
    Gamma<-((0:K)<(cutoff*K/pi))      #define Gamma
    grand_mean<-F
    Lag<-0
    L<-82
    lambda_smooth<-0
    lambda_cross<-0
    lambda_decay<-c(0.,0.)            #regularization - decay
    lambda<-0
    expweight<-0
    i1<-F
    i2<-F

    # compute the filter for the given parameter definitions
    i_mdfa_obj<-IMDFA_comp(Lag,K,L,lambda,weight_func,Gamma,expweight,cutoff,i1,i2,weight_constraint,
                           lambda_cross,lambda_decay,lambda_smooth,x,plots,lin_expweight,shift_constraint,grand_mean)

    # after computing filter, we save coefficients
    bn<-i_mdfa_obj$i_mdfa$b

    # now we build trading signal
    trading_signal<-i_mdfa_obj$xff[,1] + i_mdfa_obj$xff[,2]

The resulting frequency response functions of the filter and the coefficients are plotted in the figure below.

Notice the abundance of noise still present past the cutoff frequency. This is mollified by increasing the expweight smoothness parameter. The coefficients for each explanatory series show some correlation in their movement as the lags increase. However, the smoothness and decay of the coefficients leave much to be desired. We will remedy this by introducing regularization parameters. Plots of the in-sample trading signal and the in-sample performance of the signal are shown in the two figures below. Notice that the trading signal behaves quite nicely in-sample. However, looks can be deceiving. This stellar performance is due in large part to a filtering phenomenon called overfitting. One can deduce that overfitting is the culprit here by simply looking at the nonsmoothness of the coefficients along with the number of freezed degrees of freedom, which in this example is roughly 174 (out of 174), way too high. We would like to get this number to around half the total number of degrees of freedom (number of explanatory series x L).
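To make "noise past the cutoff" concrete, here is a small self-contained sketch that evaluates the amplitude of a one-sided filter on a frequency grid and reports the largest amplitude in the stopband. The coefficients are a plain equally weighted moving average (a stand-in, not the MDFA output), and both `K <- 167` and the grid $\omega_k = k\pi/K$ are assumptions made for the illustration; `Gamma` is built with the same expression as in the tutorial code.

```r
# Amplitude of a one-sided filter b_0, ..., b_{L-1} at frequency w:
# A(w) = | sum_j b_j * exp(-i * j * w) |
amplitude <- function(b, omega) {
  lags <- seq_along(b) - 1
  sapply(omega, function(w) Mod(sum(b * exp(-1i * lags * w))))
}

K <- 167                                # assumed number of grid points minus one
cutoff <- pi / 6
omega <- (0:K) * pi / K                 # assumed frequency grid on [0, pi]
Gamma <- (0:K) < (cutoff * K / pi)      # TRUE in the passband, FALSE in the stopband

L <- 82
b_ma <- rep(1 / L, L)                   # hypothetical coefficients: simple moving average
amp <- amplitude(b_ma, omega)

pass_gain <- amp[1]                     # amplitude at frequency zero
stop_leak <- max(amp[!Gamma])           # largest amplitude past the cutoff
```

Replacing `b_ma` with a column of the estimated coefficients (for instance `bn[,1]`, assuming one column per explanatory series) gives a quick numerical read on how much stopband noise a given parameterization lets through.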

The in-sample performance of this filter demonstrates the type of results we would like to see after regularization is applied. But now come the sobering effects of overfitting. We apply these filter coefficients to 200 15-minute observations of the Yen and the explanatory series from January 18 to February 1, 2013 and compare with the in-sample characteristics. To do this in R, we first load the out-of-sample data into the R environment, and then apply the filter to the out-of-sample data, which I define as x_out.

    load(paste(path.pgm,"ld_fxy_out15min.RData",sep=""))
    load(paste(path.pgm,"price_fxy_out15min.RData",sep=""))
    out_samp_len<-length(ld_fxy_outsamp[,1])     #number of out-of-sample observations
    price_outsample<-price_fxy_outsamp           #assumed object name, by analogy with price_fxy_insamp
    x_out<-rbind(ld_fxy_insamp,ld_fxy_outsamp)   #bind the in-sample with out-of-sample data

    xff<-matrix(nrow=out_samp_len,ncol=2)
    #apply filter built in-sample
    for(i in 1:out_samp_len)
    {
      xff[i,]<-0
      for(j in 2:3)
      {
        #uses observations 335+i, 335+i-1, ..., 335+i-L+1 of explanatory series j
        xff[i,j-1]<-xff[i,j-1]+bn[,j-1]%*%x_out[(335+i):(335+i-L+1),j]
      }
    }
    trading_signal_outsamp<-xff[,1] + xff[,2]    #assemble the trading signal out-of-sample
    trade_outsamp<-trading_logdiff(trading_signal_outsamp,price_outsample,.0005)  #compute the performance
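The loop above is just a one-sided convolution of each explanatory series with its column of coefficients. As a sanity check, here is a self-contained sketch on synthetic data showing that the explicit loop and base R's `stats::filter` (with `sides = 1`) give the same signal. The data, the short filter length, and the helper names `apply_filter_loop` and `apply_filter_conv` are all hypothetical.

```r
set.seed(42)
L <- 5                                     # short hypothetical filter length
n <- 40                                    # hypothetical number of observations
x <- matrix(rnorm(n * 2), ncol = 2)        # two synthetic explanatory series
bn <- matrix(rnorm(L * 2), ncol = 2)       # hypothetical coefficients, one column per series

# Explicit loop: the output at time i uses observations i, i-1, ..., i-L+1
apply_filter_loop <- function(x, bn) {
  L <- nrow(bn)
  out <- rep(NA_real_, nrow(x))
  for (i in L:nrow(x)) {
    out[i] <- sum(sapply(seq_len(ncol(bn)), function(j) sum(bn[, j] * x[i:(i - L + 1), j])))
  }
  out
}

# Same computation with a one-sided convolution, summed across the series
apply_filter_conv <- function(x, bn) {
  parts <- sapply(seq_len(ncol(bn)), function(j) {
    as.numeric(stats::filter(x[, j], bn[, j], method = "convolution", sides = 1))
  })
  rowSums(parts)
}

sig_loop <- apply_filter_loop(x, bn)
sig_conv <- apply_filter_conv(x, bn)
print(all.equal(sig_loop[L:n], sig_conv[L:n]))
```

The first `L - 1` values are `NA` in both versions because the filter needs `L` past observations, which is why the tutorial code binds the in-sample data in front of the out-of-sample data before filtering.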

The plot in Figure 5 shows the out-of-sample trading signal. Notice that the signal is not nearly as smooth as it was in-sample. Overshooting of the data in some areas is also obviously present. Although the out-of-sample overfitting characteristics of the signal are not horribly suspicious, I would not trust this filter to produce stellar returns in the long run.

Following the previous analysis of the mean-squared solution (no customization or regularization), we now proceed to clean up the problem of overfitting that was apparent in the coefficients, along with mollifying the noise in the stopband (frequencies after the cutoff $\pi/6$). In order to choose the parameters for smoothing and regularization, one approach is to apply the smoothness parameter first, as this will generally smooth the coefficients while acting as a ‘pre’-regularizer, and then advance to selecting appropriate regularization controls. In looking at the coefficients (Figure 3), we can see that a fair amount of smoothing is necessary, with only a slight touch of decay. To select these two parameters in R, one option is to use the Troikaner optimizer (found here) to find a suitable combination (I have a secret sauce algorithmic approach I developed for iMetrica for choosing optimal combinations of parameters given an extractor and a performance indicator, although it’s lengthy (even in GNU C) and cumbersome to use, so I typically prefer the strategy discussed in this tutorial). In this example, I began by setting lambda_smooth to .5 and the decay to (.1,.1), along with an expweight smoothness parameter set to 8.5. After viewing the coefficients, there still wasn’t enough smoothness, so I proceeded to add more, finally reaching .63, which did the trick. I then chose lambda to balance the effects of the smoothing expweight (lambda is always the last resort tweaking parameter).

    lambda_smooth<-0.63
    lambda_cross<-0.
    lambda_decay<-c(0.119,0.099)
    lambda<-9
    expweight<-8.5

    i_mdfa_obj<-IMDFA_comp(Lag,K,L,lambda,weight_func,Gamma,expweight,cutoff,i1,i2,weight_constraint,
                           lambda_cross,lambda_decay,lambda_smooth,x,plots,lin_expweight,shift_constraint,grand_mean)

    bn<-i_mdfa_obj$i_mdfa$b                                                #save the filter coefficients
    trading_signal<-i_mdfa_obj$xff[,1] + i_mdfa_obj$xff[,2]                #compute the trading signal
    trade<-trading_logdiff(trading_signal[L:len],price_insample[L:len],0)  #compute the in-sample performance

Figure 6 shows the resulting frequency response function for both explanatory series (Yen in red). Notice that the largest spectral peak, found directly before the frequency cutoff at $\pi/6$, is being emphasized and slightly mollified (value near .8 instead of 1.0). The other spectral peaks below the cutoff are also present. For the coefficients, just enough smoothing and decay was applied to keep the lag, cyclical, and correlated structure of the coefficients intact, but now they look much nicer in their smoothed form. The number of freezed degrees of freedom has been reduced to approximately 102.
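Judging smoothness by eye works, but a number helps when comparing parameter settings. Below is a rough diagnostic of my own choosing for this sketch (not a statistic returned by the MDFA package): the sum of squared second differences of a coefficient vector. The two coefficient vectors are synthetic, one smooth and decaying, the other the same shape with noise added.

```r
# Roughness of a coefficient vector: sum of squared second differences
roughness <- function(b) sum(diff(b, differences = 2)^2)

set.seed(7)
L <- 82
lags <- 0:(L - 1)
b_smooth <- exp(-lags / 20) * cos(2 * pi * lags / 24)  # hypothetical smooth, decaying coefficients
b_rough <- b_smooth + rnorm(L, sd = 0.2)               # same shape with noise added

r_smooth <- roughness(b_smooth)
r_rough <- roughness(b_rough)
print(c(smooth = r_smooth, rough = r_rough))
```

Applied to each column of `bn` before and after regularization, a drop in this number would be consistent with the visual improvement described above, though it is only a crude proxy for the freezed degrees of freedom.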

Along with an improved number of freezed degrees of freedom and no apparent havoc from overfitting, we apply this filter to the 200 out-of-sample observations in order to verify the improvement in the structure of the filter coefficients (shown below in Figure 7). Notice the tremendous improvement in the properties of the trading signal (compared with Figure 5). The overshooting of the data has been eliminated and the overall smoothness of the signal has significantly improved. This is due to the fact that we’ve eradicated the presence of overfitting.

With all indications of a filter endowed with exactly the characteristics we need for robustness, we now apply the trading signal both in-sample and out of sample to activate the buy/sell trades and see the performance of the trading account in cash value. When the signal crosses below zero, we sell (enter short position) and when the signal rises above zero, we buy (enter long position).

The top plot of Figure 8 is the log price of the Yen for the 15 minute intervals and the dotted lines represent exactly where the trading signal generated trades (crossing zero). The black dotted lines represent a buy (long position) and the blue lines indicate a sell (and short position). Notice that the signal predicted all the close-to-open jumps for the Yen (in part thanks to the explanatory series). This is exactly what we will be striving for when we add regularization and customization to the filter. The cash account of the trades over the in-sample period is shown below, where transaction costs were set at .05 percent. In-sample, the signal earned roughly 6 percent in 9 trading days and a 76 percent trading success ratio.
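The zero-crossing rule and the cash account can be sketched in a few lines. The helper below, `zero_cross_account`, is hypothetical and is not the `trading_logdiff` function used in this tutorial (whose internals are not shown here). It assumes we are long whenever the signal is positive and short otherwise, that the position is held over the next interval, and that each position change costs a fixed amount in log-return units. The toy signal and prices are made up.

```r
# Hypothetical zero-crossing trading rule on a log-price series
zero_cross_account <- function(signal, log_price, cost = 0) {
  stopifnot(length(signal) == length(log_price))
  n <- length(signal)
  pos <- ifelse(signal > 0, 1, -1)          # +1 long, -1 short (zero counts as short)
  ret <- diff(log_price)                    # log-return from t to t+1
  pnl <- pos[-n] * ret                      # position at t is held over (t, t+1]
  trades <- c(FALSE, diff(pos) != 0)        # TRUE where the signal crosses zero
  pnl <- pnl - cost * trades[-n]            # charge the cost when the position flips
  list(account = c(0, cumsum(pnl)), n_trades = sum(trades))
}

signal <- c(0.2, 0.1, -0.3, -0.1, 0.4, 0.5)          # toy trading signal
log_price <- c(0, 0.01, 0.02, 0.00, -0.01, 0.02)     # toy log prices
res <- zero_cross_account(signal, log_price, cost = 0.0005)
print(res$n_trades)
print(round(res$account, 4))
```

With a cost of 0.0005 (the .05 percent used in the text), the two zero crossings in the toy signal each shave a little off the account, which is why a filter that trades less often can suit a trader who is sensitive to transaction costs.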

Now for the ultimate test to see how well the filter performs in producing a winning trading signal: we apply the filter to the 200 15-minute out-of-sample observations of the Yen and the explanatory series from January 18th to February 1st and make trades based on the zero crossings. The results are shown below in Figure 9. The black lines represent the buys and the blue lines the sells (shorts). Notice the filter is still able to predict the close-to-open jumps even out-of-sample, thanks to the regularization. The filter succumbs to only three tiny losses at less than .08 percent each between observations 160 and 180 and one small loss at the beginning, with an out-of-sample trade success ratio hitting 82 percent and an ROI of just over 4 percent over the 9 day interval.

Compare this with the results achieved in iMetrica using the same MDFA parameter settings. In Figure 10, both the in-sample and out-of-sample performance are shown. The performance is nearly identical.

#### Example 2

Now we take a stab at producing another trading filter for the Yen, only this time we wish to identify only the lowest frequencies to generate a trading signal that trades less often, seeking only the largest cycles. As with the previous filter, we still wish to target the frequencies that might be responsible for the large close-to-open variations in the price of the Yen. To do this, we select the smaller of the two cutoffs marked in the periodogram, which will effectively keep the largest three spectral peaks intact in the low-pass band of $\Gamma$.

For this new filter, we keep things simple by continuing to use the same regularization parameters chosen for the previous filter, as they seemed to produce good results out-of-sample. The `lambda` and `expweight` customization parameters, however, need to be adjusted to account for the new noise suppression requirements in the stopband and the phase properties in the smaller passband. Thus I increased the smoothing parameter and decreased the timeliness parameter (which only affects the passband) to account for this change. The new frequency response functions and filter coefficients for this smaller lowpass design are shown below in Figure 11. Notice that the second spectral peak is accounted for and only slightly mollified under the new changes. The coefficients still have the noticeable smoothness and decay at the largest lags.

To test the effectiveness of this new lower trading frequency design, we apply the filter coefficients to the 200 out-of-sample observations of the 15-minute Yen log-returns. The performance is shown below in Figure 12. Here we clearly see that the filter still succeeds in correctly predicting the large close-to-open jumps in the price of the Yen. Only three total losses are observed during the 9 day period. The overall performance is not as appealing as that of the previous filter design, as fewer trades are made, with a near 2 percent ROI and a 76 percent trade success ratio. However, this design could fit the priorities of a trader much more sensitive to transaction costs.

#### Conclusion

The point of this tutorial was to show some of the main concepts and strategies that I follow when approaching the problem of building a robust and highly efficient trading signal for any given asset at any frequency. I also wanted to see if I could achieve similar results with the R MDFA package as with my iMetrica software package. The results ended up being nearly parallel except for some minor differences. The main points I was attempting to highlight were first analyzing the periodogram to seek out the important spectral peaks (such as the ones associated with close-to-open variations) and demonstrating how the choice of the cutoff affects the systematic trading. Here’s a quick recap of good strategies and hacks to keep in mind.

Summary of strategies for building trading signal using MDFA in R:

- As I mentioned before, the periodogram is your best friend. Apply the cutoff directly after any range of spectral peaks that you want to consider. These peaks are what generate the trades.
- Utilize a choice of filter length no greater than 1/4 of the in-sample length. Anything larger is unnecessary.
- Begin by computing the filter in the mean-square sense, namely without using any customization or regularization, and see exactly what needs to be improved upon by viewing the frequency response functions and coefficients for each explanatory series. Good performance of the trading signal in-sample (and even out-of-sample in most cases) is meaningless unless the coefficients have solid robust characteristics in both the frequency domain and the lag domain.
- I recommend beginning with tweaking the smoothness customization parameter expweight and the lambda_smooth regularization parameters first. Then proceed with only slight adjustments to the lambda_decay parameters. Finally, as a last resort, the lambda customization. I really never bother to look at lambda_cross. It has seldom helped in any significant manner. Since the data we are using to target and build trading signals are log-returns, no need to ever bother with i1 and i2. Those are for the truly advanced and patient signal extractors, and should only be left for those endowed with iMetrica 😉

If you have any questions, or would like the high-frequency Yen data I used in these examples, feel free to contact me and I’ll send them to you. Until next time, happy extracting!