The Scrapbook: An index for the MJO using complex Empirical Orthogonal Functions

Elio Campitelli

The Madden-Julian Oscillation is a tropical oscillation located mainly over the Indian and western Pacific oceans. It’s not a standing oscillation, but instead it’s more like a propagating wave of enhanced and reduced convection that moves eastward.

Figure 1: MJO schematic. From https://www.climate.gov/news-features/blogs/enso/what-mjo-and-why-do-we-care.

Due to its propagating nature, it cannot be reproduced by a single EOF, so MJO indices use two EOFs. The Real-time Multivariate MJO (RMM) index is made up of the two leading EOFs of Outgoing Longwave Radiation, 200 hPa zonal wind and 850 hPa zonal wind anomalies in the tropics. The time series are also filtered to remove the seasonal cycle, short-scale fluctuations and the impact of El Niño-Southern Oscillation.

I think complex Empirical Orthogonal Functions (Horel 1984) might be a great fit for this kind of index, because they can naturally represent propagating patterns.

The methods used to derive the RMM index are listed in Wheeler and Hendon (2004). In short they are, for each variable averaged between 15ºS and 15ºN,

1. remove the annual cycle, estimated as the waves 0 through 3 of daily means,
2. remove the linear effect of ENSO using the monthly ONI interpolated to daily values,
3. remove the 120-day running mean, and
4. normalise by the global standard deviation.
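
The four steps above can be sketched in Python with NumPy. This is a minimal illustration and not the code used for this post. It assumes the data is already a 15ºS–15ºN average arranged as a (time, longitude) array, that the annual cycle is fitted by least squares as a mean plus three harmonics, and that the running mean is a trailing one. All function names are hypothetical.

```python
import numpy as np

def remove_annual_harmonics(x, day_of_year, n_harmonics=3, period=365.25):
    """Subtract the mean (wave 0) and the first annual harmonics, fitted by least squares."""
    t = 2 * np.pi * np.asarray(day_of_year, dtype=float) / period
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        columns += [np.cos(k * t), np.sin(k * t)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

def remove_linear_effect(x, predictor):
    """Subtract the part of x that is linearly explained by a daily predictor (e.g. ONI)."""
    predictor = np.asarray(predictor, dtype=float)
    design = np.column_stack([np.ones_like(predictor), predictor])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

def remove_running_mean(x, window=120):
    """Subtract a trailing running mean (assumption: shorter windows at the record start)."""
    n = x.shape[0]
    csum = np.vstack([np.zeros((1, x.shape[1])), np.cumsum(x, axis=0)])
    end = np.arange(1, n + 1)            # exclusive end row of each window
    start = np.maximum(end - window, 0)  # first row of each window
    means = (csum[end] - csum[start]) / (end - start)[:, None]
    return x - means

def preprocess(x, day_of_year, enso, period=365.25):
    """Apply the four steps in order to a (time, longitude) array."""
    x = remove_annual_harmonics(x, day_of_year, period=period)
    x = remove_linear_effect(x, enso)
    x = remove_running_mean(x, window=120)
    return x / x.std()  # one global standard deviation for the whole field
```

Each variable (OLR and the two zonal winds) would go through `preprocess` separately, so that the three fields end up with comparable variance before being combined.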

Here I reproduce the method to the best of my ability. The only difference is that, because of the data readily available to me, I use the 1994–2024 period instead of 1979–2001 to define the annual cycle, remove the (linear) effect of ENSO and compute the EOFs. The dataset has a few missing values, which I impute using DINEOF (Alvera-Azcárate et al. 2011).

To compute the cEOF, an extra fifth step is to “enrich” the original data by applying the Hilbert transform. The literature usually applies this step in the time domain, treating the signal as an oscillation in time. Here, because the MJO is a longitudinally propagating wave, I compute it in the zonal domain, treating the signal as an oscillation in space.
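
As a rough sketch of this step, the snippet below builds the analytic signal along longitude with the FFT (the same construction that `scipy.signal.hilbert` uses) and then takes the SVD of the resulting complex matrix. It assumes a single (time, longitude) anomaly field on a full, evenly spaced longitude circle. The function names are hypothetical, and combining the three variables (for instance by concatenating them along the spatial axis after transforming each one) is left out.

```python
import numpy as np

def analytic_signal(x, axis=-1):
    """Return x + i*H(x) along `axis`, where H is the Hilbert transform (FFT method)."""
    n = x.shape[axis]
    spectrum = np.fft.fft(x, axis=axis)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the mean
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist wavenumber
        weights[1:n // 2] = 2.0           # double positive wavenumbers
    else:
        weights[1:(n + 1) // 2] = 2.0
    shape = [1] * x.ndim
    shape[axis] = n
    # Negative wavenumbers are zeroed, which makes the result complex
    return np.fft.ifft(spectrum * weights.reshape(shape), axis=axis)

def complex_eof(anomalies, n_modes=1):
    """Complex EOF of a real (time, longitude) field, Hilbert-transformed in longitude.

    Returns complex time series (time, n_modes), complex spatial patterns
    (n_modes, longitude) and the fraction of variance explained by each mode.
    The real part of `pcs @ patterns` approximates the original field.
    """
    z = analytic_signal(anomalies, axis=1)
    u, s, vh = np.linalg.svd(z, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    patterns = vh[:n_modes]
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return pcs, patterns, explained
```

For a pure propagating wave such as `cos(2 * lon - 0.3 * t)`, a single complex mode captures essentially all the variance, whereas an ordinary EOF analysis needs two modes in quadrature.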

The same way that EOFs are only defined up to a change in sign, cEOFs are only defined up to a rotation in the complex plane. Any rotation is equally “real” but to make it comparable with the RMM index, I will rotate the leading cEOF so that both indices are maximally correlated. To compute the correlation between two bivariate indices I treat them as vectors, and compute their correlation as the mean cosine of the difference between their phases weighted by the product of their amplitudes.
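
A small sketch of this correlation and of the rotation that maximises it, treating each bivariate index as a complex series (for example RMM1 + i·RMM2). Normalising by the sum of the amplitude products is one reading of the definition above and is an assumption, as are the function names. The optimal angle does not depend on that choice, because the normalisation is unchanged by rotation.

```python
import numpy as np

def complex_correlation(a, b):
    """Mean cosine of the phase difference, weighted by the product of amplitudes."""
    # |a||b|cos(phase_a - phase_b) is the real part of a * conj(b)
    weights = np.abs(a) * np.abs(b)
    return np.sum(np.real(a * np.conj(b))) / np.sum(weights)

def best_rotation(index, reference):
    """Angle (radians) by which to rotate `index` to maximise its correlation with `reference`."""
    # Re(sum(reference * conj(index)) * exp(-i*angle)) is largest when
    # `angle` equals the argument of the complex sum.
    return np.angle(np.sum(reference * np.conj(index)))

def rotate(index, angle):
    """Rotate a complex index counterclockwise by `angle` radians."""
    return index * np.exp(1j * angle)
```

With these helpers, `rotate(ceof, best_rotation(ceof, rmm))` would give the rotated cEOF series, whose real and imaginary parts are the 0º and 90º phases.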

Instead of labelling each component as the “Real” and “Imaginary” part, I use the angle between each and the positive real axis. So the real part is the 0º phase and the imaginary part is the 90º phase. Here, the RMM1 index is aligned with the 0º phase and the RMM2 with the 90º phase.

The spatial pattern of the cEOF is shown in Figure 2, in which the map is shown only for reference, the vertical coordinates are arbitrary and the dark band indicates the area in which the variables were averaged.

Figure 2: Spatial patterns of the leading cEOF of OLR, 850 hPa zonal wind and 200 hPa zonal wind anomalies.

Compare this figure with Figure 1 of Wheeler and Hendon (2004). In its 0º phase, the MJO is characterised by increased convection over Indonesia (around 120ºE), evident in the OLR minimum, convergence at lower levels (negative slope in the 850 hPa zonal wind) and divergence at upper levels (positive slope in the 200 hPa zonal wind). Convection is reduced over the western Indian Ocean and Africa, as shown by the inverse signal there. This is equivalent to phases 4 and 5 in the RMM diagram. In its 90º phase, the enhanced convection is over the Indian Ocean, with drier conditions east of Indonesia.

The amplitude of the signal, particularly for OLR, is maximum over the Indian Ocean and Indonesia, with little to no signal in the eastern Pacific, South America and the Atlantic Ocean (Fig. 3).

Figure 3: Amplitude of the OLR component of the cEOF.

A buttery smooth and oddly satisfying animation of the evolution of the cEOF shows how the wet and dry sections travel around the tropics (Fig. 4).

Figure 4: Animation showing all the phases of the cEOF.

The correlation between the RMM index and the cEOF is 0.96, so they are essentially the same index. Figure 5 shows the trajectory of the two indices between March 6th 2024 and March 31st 2024, with arrows showing the difference.

Figure 5: Sample trajectory in the RMM/cEOF phase space between March 6th 2024 and March 31st 2024. Black arrows indicate the difference between the two indices.

They are almost identical up to an arbitrary scale factor. This scale factor comes up because the RMM index is scaled so that each component has unit standard deviation in the climatological period, whereas I scaled the cEOF so that its complex amplitude has unit standard deviation¹.
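
To make the two scalings concrete, here is a small sketch with synthetic data (not the actual indices). The variance of a complex series is the sum of the variances of its real and imaginary parts, so a series whose components each have unit standard deviation has a complex standard deviation of √2. If the two indices were otherwise identical, this would suggest a scale factor near 1/√2 ≈ 0.71 between them. The variable names are hypothetical.

```python
import numpy as np

def complex_variance(z):
    """Mean squared distance between each point and the average point."""
    return np.mean(np.abs(z - z.mean()) ** 2)

rng = np.random.default_rng(42)
raw = rng.standard_normal(10_000) + 1j * rng.standard_normal(10_000)

# RMM-like scaling: each component has unit standard deviation
rmm_like = raw.real / raw.real.std() + 1j * raw.imag / raw.imag.std()

# cEOF-like scaling: the complex series as a whole has unit standard deviation
ceof_like = rmm_like / np.sqrt(complex_variance(rmm_like))

# Ratio between the amplitudes of the two scalings
scale_factor = np.abs(ceof_like).mean() / np.abs(rmm_like).mean()
```

Here `complex_variance(rmm_like)` is 2 and `scale_factor` is 1/√2, purely as a consequence of the two normalisations.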

The arbitrary constant makes it hard to compare the amplitude of each index. The Australian Bureau of Meteorology (BOM) uses an amplitude of 1 to more or less define when the MJO is active, but that same cut-off is not necessarily useful for the cEOF index. There are a few ways of getting an equivalent threshold for the cEOF index.

1. Run a linear regression of the cEOF amplitude as a function of RMM amplitude and use the cEOF amplitude that corresponds to 1 RMM amplitude.
2. Do the same but using orthogonal regression, which might be more appropriate in this case because this procedure should be symmetrical.
3. Compute the quantile corresponding to the RMM threshold and use the value of that quantile in the cEOF amplitude series.
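
The three options above could be sketched as follows, given two arrays of daily amplitudes. This is an illustration under simple assumptions (ordinary least squares via `np.polyfit`, orthogonal regression taken as the first principal axis of the centred data), and the function names are hypothetical.

```python
import numpy as np

def threshold_ols(rmm_amp, ceof_amp, rmm_threshold=1.0):
    """cEOF amplitude predicted at the RMM threshold by ordinary linear regression."""
    slope, intercept = np.polyfit(rmm_amp, ceof_amp, 1)
    return slope * rmm_threshold + intercept

def threshold_orthogonal(rmm_amp, ceof_amp, rmm_threshold=1.0):
    """Same, but with orthogonal regression (first principal axis of the two series)."""
    centre = np.array([rmm_amp.mean(), ceof_amp.mean()])
    centred = np.column_stack([rmm_amp, ceof_amp]) - centre
    _, _, vh = np.linalg.svd(centred, full_matrices=False)
    dx, dy = vh[0]  # direction of largest variance; its sign does not matter
    return centre[1] + (dy / dx) * (rmm_threshold - centre[0])

def threshold_quantile(rmm_amp, ceof_amp, rmm_threshold=1.0):
    """cEOF amplitude at the quantile that the RMM threshold occupies in the RMM series."""
    quantile = np.mean(rmm_amp < rmm_threshold)
    return np.quantile(ceof_amp, quantile)
```

The orthogonal version treats both series symmetrically, and the quantile version makes no assumption about the shape of the relationship. It only guarantees that the same fraction of days falls below the threshold in both indices.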

Thankfully the three approaches give you basically the same answer (this might not be a coincidence), but I kind of like the third one better. For the record, the quantile associated with 1 RMM amplitude is 0.38, which translates to a threshold value of 0.69 in cEOF amplitude.

Figure 6 shows the relationship between the amplitudes of the two indices for each month, together with a line of best fit computed using orthogonal regression and the variance explained by that line (this is actually the first Principal Component of the two series).

Figure 6: Relationship between RMM amplitude and cEOF amplitude for each month. The blue line shows the orthogonal regression line, whose explained variance is shown as text, and the horizontal and vertical black lines indicate the 0.38 quantile of each series (computed for the whole period and not for each month).

Although both variables are clearly highly correlated, there are some differences which make a sizeable proportion of days “active” by the cEOF definition but not “active” by the RMM definition and vice versa, especially in the boreal winter months.

An important characteristic of the MJO is its intraseasonal timescale of between 30 and 80 days. Wheeler and Hendon (2004) compute the spectrum of each RMM index separately, but with the cEOF I can just compute the spectrum of the complex index. The three spectra are very similar (Fig. 7).

Figure 7: Smoothed power spectra of the RMM1, RMM2 and cEOF indices scaled to unit area.

The main difference is that the cEOF spectrum is a bit more concentrated in the intraseasonal range and doesn’t have that “bump” around 200 days that is visible in both the RMM1 and RMM2 indices. I don’t know if that’s something to do with the method or with the change in climatological period. In any case, I think this also highlights one advantage of treating a bivariate index as a complex signal, since it allows you to study a single spectrum for the whole signal.
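
As an illustration of that last point, a raw periodogram of a complex daily index can be computed directly with the FFT. This is a minimal sketch without the smoothing used in Figure 7, scaled to unit area, and the function name is hypothetical. Because the input is complex, the spectrum is two-sided, and the sign of the frequency distinguishes the two directions of rotation in the phase space.

```python
import numpy as np

def complex_power_spectrum(z, dt=1.0):
    """Two-sided periodogram of a complex series, scaled to unit area.

    `dt` is the sampling interval (1 day for a daily index), so the returned
    frequencies are in cycles per day.
    """
    z = np.asarray(z, dtype=complex)
    z = z - z.mean()
    n = z.size
    freq = np.fft.fftshift(np.fft.fftfreq(n, d=dt))
    power = np.fft.fftshift(np.abs(np.fft.fft(z)) ** 2)
    df = 1.0 / (n * dt)  # frequency resolution
    return freq, power / (power.sum() * df)
```

For a series that rotates counterclockwise once every 50 days, such as `np.exp(2j * np.pi * np.arange(500) / 50)`, the peak sits at +0.02 cycles per day.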

Finally, Figure 8 shows the regression between OLR and the MJO using three different methods. Both the “RMM” and “cEOF” panels are the regression of OLR anomalies on the respective index amplitude for each “pizza slice” phase (i.e. phase 1 encompasses angles from -180º to -135º). A problem with this method is that each panel discards a lot of information, since only around 12% of observations fall into each slice.
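
A sketch of the “pizza slice” approach, assuming the index is stored as a complex series and OLR as a (time, space) array. The slice numbering follows the convention stated above (phase 1 spans -180º to -135º). Fitting an intercept within each slice is an assumption, and the function names are hypothetical.

```python
import numpy as np

def phase_number(index):
    """Assign each complex value to one of eight 45º slices, numbered 1 to 8."""
    angle = np.degrees(np.angle(index))  # in (-180, 180]
    # Slice 1 starts at -180º; an angle of exactly 180º wraps back to slice 1
    return np.floor((angle + 180.0) / 45.0).astype(int) % 8 + 1

def slice_regression(olr, index):
    """Regress OLR on index amplitude separately within each slice.

    Returns an (8, space) array of regression slopes, NaN for slices with
    fewer than two observations.
    """
    phases = phase_number(index)
    amplitude = np.abs(index)
    slopes = np.full((8, olr.shape[1]), np.nan)
    for p in range(1, 9):
        selected = phases == p
        if selected.sum() < 2:
            continue
        design = np.column_stack([np.ones(selected.sum()), amplitude[selected]])
        coef, *_ = np.linalg.lstsq(design, olr[selected], rcond=None)
        slopes[p - 1] = coef[1]
    return slopes
```

Each of the eight regressions only sees the days that fall inside its slice, which is where the loss of information comes from.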

The “cEOF - linear” panels are obtained by first computing the multivariate linear regression of OLR on the 0º and 90º phases of the complex series and then using a linear combination to compute the regression associated with any other phase. This method assumes that the relationship between OLR and the MJO is linear in every phase and that the total effect of the MJO can be linearly divided into the effect of two orthogonal phases (i.e. that the 0º phase is equal and opposite to the 180º phase and that the 45º phase is the combined effect of the 0º phase and the 90º phase in equal measure). The big advantage of this method is that it uses all the information available.
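
A minimal sketch of this linear method, under the same assumptions about data layout (complex index, OLR as a (time, space) array) and with a hypothetical function name. The pattern for a phase φ is taken as cos(φ) times the 0º coefficient plus sin(φ) times the 90º coefficient.

```python
import numpy as np

def phase_regression(olr, index, phases_deg):
    """Regression pattern of OLR for arbitrary phases of a complex index.

    Fits OLR on the real (0º) and imaginary (90º) parts jointly, using every
    day in the record, then combines the two coefficient maps linearly.
    Returns an array of shape (n_phases, space).
    """
    design = np.column_stack([np.ones(index.size), index.real, index.imag])
    coef, *_ = np.linalg.lstsq(design, olr, rcond=None)
    b0, b90 = coef[1], coef[2]
    phi = np.deg2rad(np.asarray(phases_deg, dtype=float))
    return np.cos(phi)[:, None] * b0 + np.sin(phi)[:, None] * b90
```

By construction, `phase_regression(olr, index, [180])` is the negative of the 0º pattern, which is the equal-and-opposite assumption described above.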

Figure 8: Linear regression between each index and OLR for different phases. The “pizza slice” used to construct each panel or the phase represented by each panel is shown by the small inset in the lower-left corner of each panel.

All three methods give more or less the same results, which are (naturally) consistent with the cEOF description above: the “wet patch” moves east, reaching its maximum intensity over the Indian Ocean, and is then replaced by a “dry patch” with similar behaviour. There are some differences in the small details, but it’s not trivial to know how much of that small-scale structure is just sampling noise. The linear method has the advantage of resulting in smoother patterns with less noise and more large-scale signal. On the other hand, the linear method seems to exaggerate the intensity of the dry patch in phases 6-7, since this has to be equal and opposite to the wet patch in phases 2-3 by construction.

The linear method also creates buttery smooth and oddly satisfying animations.

Figure 9: Animation of OLR regression on the different phases of the cEOF.

Alvera-Azcárate, A., A. Barth, D. Sirjacobs, F. Lenartz, and J. M. Beckers. 2011. “Data Interpolating Empirical Orthogonal Functions (DINEOF): A Tool for Geophysical Data Analyses.” Mediterranean Marine Science 12 (3): 5. https://doi.org/10.12681/mms.64.

Horel, J. D. 1984. “Complex Principal Component Analysis: Theory and Examples.” Journal of Applied Meteorology and Climatology 23 (12): 1660–73. https://doi.org/10.1175/1520-0450(1984)023<1660:CPCATA>2.0.CO;2.

Wheeler, Matthew C., and Harry H. Hendon. 2004. “An All-Season Real-Time Multivariate MJO Index: Development of an Index for Monitoring and Prediction.” Monthly Weather Review 132 (8): 1917–32. https://doi.org/10.1175/1520-0493(2004)132<1917:aarmmi>2.0.co;2.

¹ The variance can be generalised to complex signals naturally by considering it as the mean squared distance between each point and the average point.
