## 1. Introduction

Weather prediction is inherently associated with uncertainty because of our limited knowledge about the current state of the atmosphere and the limitations arising from representing the atmospheric dynamics and physical processes with numerical algorithms. Over the past decade considerable progress has been made in quantifying this uncertainty in the prediction with the use of ensembles. A skillful ensemble prediction system should bound the actual evolution of the atmosphere most of the time. This can be achieved only if the spread of the ensemble matches the magnitude of the forecast error in an average over many cases.

Techniques that identify optimal linear perturbations about a given forecast trajectory have proven to be an effective way to generate initial condition perturbations for ensembles. Let 𝗠 denote the propagator, that is, the linear operator that maps a perturbation of the initial condition at time *t*_{0} to the perturbation at a later time *t*_{1} = *t*_{0} + *τ*. Then, optimal perturbations **x** can be defined as those **x** that maximize ||𝗠**x**||_{e}/||**x**||_{c}, where || ||_{e} and || ||_{c} denote suitable norms at *t*_{1} and *t*_{0}, respectively. These optimal perturbations are referred to as singular vectors as they arise from the singular value decomposition of the propagator [see, e.g., Buizza and Palmer (1995) for an introduction to singular vectors]. The time *τ* is called the optimization time.
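The link between this maximization and the singular value decomposition can be made explicit with a short derivation. It is a sketch under the assumption that both norms derive from inner products defined by symmetric positive definite matrices; the symbol $\mathsf{C}$ for the initial-time metric and the singular values $\sigma_j$ are notation introduced here for illustration only.

```latex
\begin{aligned}
\|\mathbf{x}\|_c^2 &= \mathbf{x}^{\mathrm T}\mathsf{C}\,\mathbf{x}, \qquad
\|\mathbf{x}\|_e^2 = \mathbf{x}^{\mathrm T}\mathsf{E}\,\mathbf{x}, \\
\frac{\|\mathsf{M}\mathbf{x}\|_e^2}{\|\mathbf{x}\|_c^2}
  &= \frac{\mathbf{x}^{\mathrm T}\mathsf{M}^{\mathrm T}\mathsf{E}\,\mathsf{M}\,\mathbf{x}}
          {\mathbf{x}^{\mathrm T}\mathsf{C}\,\mathbf{x}}, \\
\mathsf{M}^{\mathrm T}\mathsf{E}\,\mathsf{M}\,\mathbf{x}_j
  &= \sigma_j^2\,\mathsf{C}\,\mathbf{x}_j .
\end{aligned}
```

The last line is the generalized eigenproblem satisfied by the stationary points of the ratio. Under the stated assumption, its solutions correspond to the singular vectors of $\mathsf{E}^{1/2}\mathsf{M}\,\mathsf{C}^{-1/2}$, and the leading singular vectors are those with the largest $\sigma_j$.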

In realistic applications, the propagator 𝗠 is obtained from a first-order Taylor expansion of a lower-resolution version of the forecast model with no or simplified parameterizations of physical processes. The linearization between *t*_{0} and *t*_{1} is usually performed about a nonlinear model trajectory started from an analysis valid at *t*_{0}. Such singular vectors will be called analysis singular vectors (AN-SVs) in the following. Alternatively, singular vectors can be computed from a trajectory starting from a short-range forecast, which has been initialized prior to *t*_{0}. The latter singular vectors will be referred to here as forecast singular vectors (FC-SVs). The lead time of the forecast used to start the trajectory at *t*_{0} will be called the FC-SV lead time. First, this study examines the impact of the choice of the trajectory initial condition on the singular vector structure. Second, the skill of ensemble forecasts using initial condition perturbations based on AN-SVs and based on FC-SVs is compared. There are two motivations for this work: (i) considerably more computing resources could be allocated to FC-SVs than to AN-SVs in an operational ensemble prediction system, (ii) the nonlinear forecasts can be completed earlier or more resources can be devoted to them.

In the context of observation targeting studies, Buizza and Montani (1999) and Gelaro et al. (1999) compared AN-SVs with FC-SVs. These studies concluded that the geographical location of the leading singular vectors could be obtained to reasonable accuracy with FC-SVs using lead times of up to 2 days. However, the detailed structure of FC-SVs with lead time ≥1 day can differ considerably from the structure of the corresponding AN-SVs. Gelaro et al. (1999) show that in the Fronts and Atlantic Storm Track Experiment (FASTEX) eleventh intensive observing period (IOP 11) the similarity index for subspaces spanned by the leading three AN-SVs and the leading three FC-SVs drops to 0.4 at a lead time of 1 day. In ensemble prediction, FC-SV lead times of less than 24 h are sufficient in order to achieve the above-mentioned objectives. In the experiments that are summarized here, the similarity between AN-SVs and 12-h lead FC-SVs will be determined. The unknown sensitivity of the ensemble to moderate structural changes of the singular vectors precludes us from drawing further conclusions from the observation targeting studies about the impact on ensemble forecasts.

This study is based on the European Centre for Medium-Range Weather Forecasts (ECMWF) medium-range Ensemble Prediction System (EPS). It is briefly described in section 2 together with the methodology of the experimentation. The similarity between FC-SVs and AN-SVs is examined in section 3. Section 4 compares the skill of ensemble forecasts using these singular vectors for the initial condition perturbations. Discussion and conclusions follow in section 5.

## 2. Methodology

### a. The ECMWF Ensemble Prediction System

At the time of writing, the medium-range Ensemble Prediction System at ECMWF consists of 1 unperturbed and 50 perturbed forecasts. The model spatial resolution is T_{L}255 with 40 levels in the vertical (Buizza et al. 2003). The model has semi-Lagrangian semi-implicit advection, and it is integrated with a 45-min time step. Multiplicative noise is used to perturb the tendencies of physical parameterization schemes in the perturbed forecasts with the aim of representing model errors (stochastic physics; Buizza et al. 1999).

Initial condition perturbations are constructed from linear combinations of singular vectors that have been optimized for a 48-h period using a T42 tangent-linear model and the total energy metric at both initial and final time. For the extratropics (30°–90°), the leading ∼35 singular vectors are computed for each hemisphere. A selection, rotation, and scaling procedure yields 25 extratropical perturbations from the initial and evolved singular vectors (Molteni et al. 1996; Barkmeijer et al. 1999; Buizza et al. 2000). Fifty perturbed initial conditions are determined by adding and subtracting these perturbations to the unperturbed analysis.
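The final step, obtaining 50 perturbed initial conditions from 25 perturbations, can be sketched as follows. This is a minimal illustration rather than the operational code: model states are represented as plain lists of floats, and the function name `perturbed_initial_conditions` and the ordering of the plus and minus members are assumptions made here.

```python
def perturbed_initial_conditions(analysis, perturbations):
    """Return 2 * len(perturbations) states: analysis + p and analysis - p.

    `analysis` is the unperturbed state (list of floats); `perturbations`
    is a list of perturbation vectors of the same length. The member
    ordering (plus, minus, plus, minus, ...) is an assumption.
    """
    members = []
    for p in perturbations:
        if len(p) != len(analysis):
            raise ValueError("perturbation and analysis differ in length")
        members.append([a + x for a, x in zip(analysis, p)])  # perturbation added
        members.append([a - x for a, x in zip(analysis, p)])  # perturbation subtracted
    return members


if __name__ == "__main__":
    analysis = [1.0, 2.0, 3.0]
    # 25 toy perturbations stand in for the extratropical perturbations.
    perturbations = [[0.01 * (i + 1), 0.0, -0.01 * (i + 1)] for i in range(25)]
    print(len(perturbed_initial_conditions(analysis, perturbations)))
```

Because each perturbation enters with both signs, the mean of the perturbed initial conditions equals the unperturbed analysis.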

Perturbations are added also in the Caribbean region and in the vicinity of reported active tropical cyclones between 25°S and 25°N. At a given time, there can be up to four tropical optimization regions. For each region, the perturbations are based on the leading five singular vectors computed with a diabatic tangent-linear model (Barkmeijer et al. 2001) and an optimization region centered on the tropical cyclone. Puri et al. (2001) show that these perturbations are efficient in generating realistic spread of tropical cyclone tracks.

### b. The early-delivery assimilation system

The operational assimilation system at ECMWF was reconfigured in June 2004 in order to be able to deliver the deterministic and ensemble forecasts about 4 h earlier without degrading the forecast skill (Haseler 2005). For instance, the 0000 UTC ensemble is disseminated at 1015 UTC now. The new assimilation configuration, which is referred to as the early-delivery suite, comprises two data assimilation streams. A 12-h four-dimensional variational data assimilation (4DVAR) stream with delayed observation cutoff cycles the information and provides accurate first-guess fields. The 12-h 4DVAR windows begin at 0900 and 2100 UTC. The second data assimilation stream consists of 6-h 4DVAR with an early observation cutoff time and windows centered at 0000 and 1200 UTC. The first-guess fields are provided by the 12-h 4DVAR stream. The operational forecasts are started from the 6-h 4DVAR analyses. Extensive experimentation has shown that the early-delivery suite produces forecasts that are as skillful as those from the previous operational system (delayed-cutoff 12-h 4DVAR with windows beginning at 0300 and 1500 UTC).

### c. Ensemble configurations

The numerical experiments in this study are based on the ensemble configuration described in section 2a. The initial conditions for singular vectors and forecasts are taken from early-delivery assimilation experiments.^{1} In experiment 1 (AN-SV ensemble), the singular vectors are computed from a trajectory starting from the 6-h short-cutoff 4DVAR analysis. In experiment 2 (FC-SV ensemble), the trajectory for the singular vector computation starts from the first-guess field of this analysis, which is the 12-h forecast starting from the previous 12-h 4DVAR analysis. In both experiments, the unperturbed initial conditions for the forecasts are based on the 6-h short-cutoff 4DVAR analysis and the same scaling is used to set the amplitude of the initial condition perturbations. The experiment sample consists of 62 cases in total, 34 days in November–December 1999 and 28 days in September 2003.

## 3. Comparison of AN-SVs and FC-SVs

A subjective comparison of the singular vector structure suggests that FC-SVs with 12-h lead time are very similar to the corresponding AN-SVs. To illustrate this, the temperature field at about 700 hPa is plotted for the leading five singular vectors localized in the Atlantic sector for a case in December 1999 in Fig. 1. Each of the AN-SVs can be matched to a very similar FC-SV.^{2} The corresponding ensemble predictions will be discussed in section 4c.

The similarity is quantified with the similarity index *S*, introduced by Buizza (1994). It measures the degree of parallelism of singular vector subspaces. The index varies between 0 (orthogonal subspaces) and 1 (identical subspaces). It is computed as the average square norm of the projection of singular vectors in one subspace on the other subspace:

$$
S = \frac{1}{N}\sum_{j=1}^{N}\sum_{k=1}^{N}\left(\mathbf{v}_j^{\mathrm T}\,\mathsf{E}\,\mathbf{w}_k\right)^2 ,
$$

where **v**_{1}, . . . , **v**_{N} and **w**_{1}, . . . , **w**_{N} are orthonormal bases of the two subspaces and 𝗘 is the matrix that defines the total energy metric. The index is a very sensitive measure of structural differences. The similarity index between the leading 25 extratropical SVs from 0000 UTC and the leading 25 SVs from 1200 UTC on the same day ranges typically between 0.2 and 0.4. The average similarity index between AN-SVs and FC-SVs has been computed for corresponding pairs of initial singular vectors. The distribution of similarity indices for the available sample of two hemispheres × 66 dates^{3} is presented in Table 1. The similarity index of the leading 25 extratropical SVs is larger than 0.8 in all 132 cases. In 95% of the cases, the similarity index exceeds 0.9.
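As a hedged illustration of the similarity index, the sketch below computes the mean squared projection of one orthonormal basis onto another. It assumes, for simplicity, that the total energy metric is diagonal and can be passed as a list of weights; the function names and the toy vectors are hypothetical and not taken from the ECMWF code.

```python
def energy_inner(u, w, energy_weights):
    """Inner product u^T E w for a diagonal metric E (weights on the diagonal)."""
    return sum(e * a * b for e, a, b in zip(energy_weights, u, w))


def similarity_index(v_basis, w_basis, energy_weights):
    """Mean squared projection of one orthonormal basis onto the other.

    Both bases must be orthonormal with respect to the same metric and
    contain the same number N of vectors. The result lies in [0, 1].
    """
    n = len(v_basis)
    if n == 0 or n != len(w_basis):
        raise ValueError("bases must be non-empty and of equal size")
    total = 0.0
    for v in v_basis:
        for w in w_basis:
            total += energy_inner(v, w, energy_weights) ** 2
    return total / n


if __name__ == "__main__":
    weights = [1.0, 1.0, 1.0, 1.0]  # toy metric: identity
    e1, e2, e3, e4 = ([1.0 if i == j else 0.0 for i in range(4)] for j in range(4))
    print(similarity_index([e1, e2], [e1, e2], weights))  # identical subspaces
    print(similarity_index([e1, e2], [e3, e4], weights))  # orthogonal subspaces
    print(similarity_index([e1, e2], [e1, e3], weights))  # one shared direction
```

The index depends only on the subspaces, not on the particular orthonormal bases, so a sign flip or reordering of individual singular vectors does not change it.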

The tropical SVs comprise SVs computed for the Caribbean region and SVs targeted on active tropical cyclones (TCs). The similarity index is computed for the subspace of the leading five SVs that are used in the ensemble. For 57% of the cases, the similarity index exceeds 0.7. The fraction of cases with *S* ≥ 0.7 increases to 80% if only those sets are considered that are targeted on active tropical cyclones. The fact that the similarity index for the tropical SVs is generally lower than the index for the extratropical SVs arises only partly from considering a lower-dimensional subspace. For the extratropical SVs, the similarity index based on the leading 5 SVs is lower than the index based on the leading 25 SVs but still larger than 0.7 in all 132 pairs. Differences in the dynamics are expected to also contribute to the generally lower similarity indices in the Tropics. The SVs in the Tropics are thought to be sensitive to smaller-scale, and thus less predictable, features in the trajectory forecast than the extratropical SVs. The latter depend on the evolution of the synoptic-scale baroclinic regions whereas the former depend on the evolution of a cyclonic vortex that has scales close to the truncation scale of the T42 trajectory forecast. A third aspect that might contribute to lower similarity indices in the Tropics is the fact that the tropical SVs are computed with a lower accuracy than the extratropical ones in the iterative Lanczos algorithm. The spectrum of singular values is shallower in the Tropics and fewer iterations of the Lanczos algorithm are used.

However, even including the results for the Tropics, the similarity between AN-SVs and FC-SVs is generally high. This suggests that the AN-SV ensemble and the FC-SV ensemble should have about the same skill. The next section describes to what extent this is the case.

## 4. Impact of FC-SVs on ensemble forecasts

First, the FC-SV ensemble and the AN-SV ensemble are compared in terms of extratropical geopotential height predictions. Then, the impact on tropical wind scores and tropical cyclone tracks is presented. Results are based on the entire sample of 62 dates unless stated otherwise. Finally, the impact of FC-SVs on the ensemble forecasts of storms Lothar and Martin in December 1999 is described.

### a. Extratropics

The divergence of the ensemble forecasts can be quantified by the rms difference between the perturbed forecasts and the unperturbed forecast, where the mean is taken over a region, all perturbed members, and a set of dates. This rms difference is referred to as the spread. The spread has been computed for 500- and 1000-hPa geopotential height for the extratropics of each hemisphere.^{4} The spread in the FC-SV ensemble is almost identical to the spread in the AN-SV ensemble. Relative differences in spread are below 1%.
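The definition of the spread can be illustrated with a minimal sketch. It assumes that the fields are stored as nested lists and that all grid points carry equal weight; a real computation over a latitude band would normally use area weights, which are omitted here. The function name `ensemble_spread` is hypothetical.

```python
import math


def ensemble_spread(perturbed, control):
    """RMS difference between perturbed forecasts and the unperturbed forecast.

    perturbed[d][k][i]: value at grid point i of member k for date d.
    control[d][i]: unperturbed forecast at grid point i for date d.
    Grid points are weighted equally (an assumption of this sketch).
    """
    total, count = 0.0, 0
    for members, ctrl in zip(perturbed, control):
        for member in members:
            for value, ref in zip(member, ctrl):
                total += (value - ref) ** 2
                count += 1
    if count == 0:
        raise ValueError("no data")
    # The mean is taken over region, members, and dates before the square root.
    return math.sqrt(total / count)


if __name__ == "__main__":
    control = [[0.0, 2.0]]                      # one date, two grid points
    perturbed = [[[1.0, 3.0], [-1.0, 1.0]]]     # two members for that date
    print(ensemble_spread(perturbed, control))
```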

Brier skill scores (BSS) and areas under the relative operating characteristic (ROC) have been computed for 500- and 1000-hPa geopotential height anomaly events for Northern Hemisphere extratropics and Southern Hemisphere extratropics. The relative score differences are summarized in Table 2 for 500 hPa. A *positive* sign has been chosen to indicate that the FC-SV ensemble is *worse* than the AN-SV ensemble. Overall, the impact of changing the trajectory for the singular vector computation is very close to neutral. Minor improvements due to using FC-SVs prevail until day 7; thereafter a minor degradation occurs. The largest relative degradations of the Brier skill scores of about 3% at day 10 correspond still to very small absolute degradations of about 0.01. Results for 1000-hPa anomaly events are very similar (not shown).
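For readers less familiar with these scores, the following sketch shows one common way to compute a Brier skill score and the relative degradation used in the tables. It assumes a constant reference probability (for example, a climatological event frequency); the reference actually used in the paper is not specified in this section, and the function names are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need equally long, non-empty sequences")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, outcomes, reference_probability):
    """1 - BS / BS_ref, with a constant reference probability (an assumption)."""
    bs = brier_score(probabilities, outcomes)
    bs_ref = brier_score([reference_probability] * len(outcomes), outcomes)
    if bs_ref == 0.0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - bs / bs_ref


def relative_degradation_percent(score_an, score_fc):
    """100 * [s(AN-SVs) - s(FC-SVs)] / s(AN-SVs); positive means FC-SVs are worse."""
    return 100.0 * (score_an - score_fc) / score_an


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper's tables.
    print(relative_degradation_percent(0.33, 0.32))
```

With the illustrative values above, an absolute difference of 0.01 in the skill score amounts to a relative degradation of roughly 3%, which shows how a small absolute change can appear as a larger relative one when the score itself is small.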

The sign test and the *t* test have been applied to the daily series of global score differences to estimate whether the small differences are statistically significant (Barlow 1989). The same test is applied routinely to judge score differences of deterministic forecast scores (M. Fisher 2001, personal communication). The “run test” is used to detect temporal correlation of the score differences. Let us consider two significance categories. Differences will be considered *moderately significant* if the null hypothesis of equal scores is rejected by the *t* test *or* the sign test with a probability of at least 0.9. They will be considered as *significant* if both the *t* test *and* the sign test reject the null hypothesis with a probability of at least 0.95. Furthermore, both categories require that the observed number of runs (a run is a sequence of consecutive cases in which the score difference has the same sign) has a probability larger than 0.1. The probability is computed under the hypothesis that any permutation of the sequence of score differences is equally likely. The results of the significance tests are presented in Table 2 as symbols. Only one of the degradations (positive entries) falls into the category “significant,” and only three entries are “moderately significant.”
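Two of the ingredients of this procedure, the sign test and the counting of runs, can be sketched with the Python standard library alone. This is not the significance test code used in the study; it is an illustration under stated assumptions. The sign test is taken to be two-sided, zero differences are dropped, and `runs_probability` returns the probability of exactly the observed number of runs when all orderings of the signs are equally likely. Whether the paper uses this point probability or a tail probability is not stated. The *t* test is omitted because it needs the Student *t* distribution, which the standard library does not provide.

```python
from math import comb


def sign_test_p_value(differences):
    """Two-sided sign test p-value; zero differences are dropped."""
    pos = sum(1 for d in differences if d > 0)
    neg = sum(1 for d in differences if d < 0)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def count_runs(differences):
    """Number of runs of consecutive cases with the same sign (zeros dropped)."""
    signs = [d > 0 for d in differences if d != 0]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def runs_probability(n_pos, n_neg, runs):
    """P(exactly `runs` runs) if every ordering of the signs is equally likely."""
    if n_pos == 0 or n_neg == 0:
        return 1.0 if runs == (1 if n_pos + n_neg > 0 else 0) else 0.0
    if runs < 2:
        return 0.0
    total = comb(n_pos + n_neg, n_pos)
    k, odd = divmod(runs, 2)
    if not odd:  # runs = 2k: k runs of each sign
        ways = 2 * comb(n_pos - 1, k - 1) * comb(n_neg - 1, k - 1)
    else:        # runs = 2k + 1: k + 1 runs of one sign, k of the other
        ways = (comb(n_pos - 1, k) * comb(n_neg - 1, k - 1)
                + comb(n_pos - 1, k - 1) * comb(n_neg - 1, k))
    return ways / total


if __name__ == "__main__":
    diffs = [0.2, 0.1, -0.3, -0.1, 0.4]  # toy score differences
    print(sign_test_p_value(diffs), count_runs(diffs))
```

A small number of long runs indicates temporally correlated score differences, in which case the effective sample size is smaller than the number of cases and the sign test and *t* test are too optimistic.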

A similar conclusion is reached by looking at the samples of November–December 1999 and September 2003 separately (cf. Table 3). The impact of using FC-SVs is close to neutral in both periods. There is no systematic degradation that reaches the above-defined category of statistical significance. In summary, the impact of using FC-SVs in the EPS on probabilistic extratropical height anomaly scores is neutral.

### b. Tropics

The impact in the Tropics (30°S–30°N) has been evaluated by looking at probabilistic scores of 850-hPa wind anomaly events. Relative differences of Brier skill scores for positive anomalies of the wind components are smaller in modulus than 1.5% for all forecast ranges. The sign of the impact changes according to the variable (*u* or *υ* component) and forecast range. The small differences are not statistically significant according to the sign test and the *t* test. Thus, the AN-SV ensemble and the FC-SV ensemble are statistically indistinguishable in terms of the tropical wind scores.

Tropical SVs had been introduced in the EPS because they generate a realistic spread of tropical cyclone (TC) tracks. The diagnosis based on similarity indices suggests that the tropical SVs are more sensitive to the change of the trajectory than the extratropical SVs, which may have a detrimental impact on TC track forecasts. Therefore, we now have a look at the spread of TC tracks in the two ensembles.

The mean spread 〈*s*〉 at forecast range *t* is computed as

$$
\langle s\rangle(t) = \frac{1}{N_D}\sum_{d\in D}\frac{1}{N_C}\sum_{c\in C(d,t)}\frac{1}{N_M}\sum_{k\in M(d,c,t)} s(k,d,c,t),
$$

where *s*(*k, d, c, t*) denotes the great circle distance between the position of TC *c* in perturbed forecast number *k* and the mean position of TC *c* in the ensemble. The symbols *d* and *t* refer to the start date and forecast range, respectively. Furthermore, *D* is the set of dates with at least one TC, *C*(*d, t*) is the set of TCs considered for this date and forecast range, and *M*(*d, c, t*) is the subset of perturbed forecasts in which TC *c* could be tracked. The numbers of elements in the sets *D*, *C*(*d, t*), and *M*(*d, c, t*) are denoted by *N*_{D}, *N*_{C}, and *N*_{M}, respectively. In the sample of 62 start dates, 41 TCs could be identified that could be tracked in the ensembles to a forecast range of a few days. The mean spread has been computed for this sample of 41 TCs (Table 4). The spread of TC track positions is very similar in both ensembles; the spread in the FC-SV ensemble is marginally smaller than the spread in the AN-SV ensemble. Thus, we conclude that the moderate structural differences between AN-SVs and FC-SVs do not affect the spread in TC tracks.
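The track spread diagnostic can be sketched as follows, assuming that the mean spread is the nested average over tracked members, TCs, and dates suggested by the definitions of *N*_{M}, *N*_{C}, and *N*_{D}. The haversine formula, the Earth radius value, the dictionary layout, and the function names are assumptions of this sketch; the computation of the ensemble-mean TC position is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_track_spread(distances):
    """Nested average of s(k, d, c, t) for one forecast range t.

    distances[d][c] is the list of distances s(k, d, c, t) over the
    members k in which TC c could be tracked; d runs over the start
    dates with at least one TC and c over the TCs for that date.
    """
    date_means = []
    for cyclones in distances.values():
        tc_means = [sum(s) / len(s) for s in cyclones.values() if s]
        if tc_means:
            date_means.append(sum(tc_means) / len(tc_means))
    if not date_means:
        raise ValueError("no tracked tropical cyclones")
    return sum(date_means) / len(date_means)


if __name__ == "__main__":
    # Toy distances in km for two dates; names are placeholders.
    toy = {"date1": {"tc_a": [100.0, 200.0]},
           "date2": {"tc_b": [300.0], "tc_c": [500.0]}}
    print(mean_track_spread(toy))
```

Averaging first over members, then over TCs, then over dates gives each date equal weight regardless of how many TCs were active or how many members tracked them.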

### c. European storms of December 1999

In addition to the classical scores presented above, the AN-SV and FC-SV ensembles are compared in terms of synoptic maps of mean sea level pressure. This comparison is focused on the European storms of 26 and 28 December 1999, also referred to as Lothar and Martin, respectively. A subjective comparison of the stamp maps indicates that human forecasters would gauge the FC-SV ensemble as skillful as the AN-SV ensemble in terms of predicting these two storms.

As an example, the 36-h forecast for Martin is presented. Figure 2 shows the verifying analysis, the unperturbed control forecasts of the AN-SV and FC-SV ensembles, and, for reference, the control forecast operational at the time. The forecast by the ensemble operational at the time was very poor at this forecast range; no member predicted a low with central pressure below 980 hPa over France. In contrast, both the AN-SV ensemble and the FC-SV ensemble have 10 members with a low deeper than 976 hPa over France (Fig. 3). Both ensembles appear to have equal skill from a forecaster's perspective. The differences between the forecasts of this study and the operational ones are attributed to changes in the analysis/forecast system, which comprise changes in observation usage, spatial resolution of forecast model and analysis increments, as well as physical parameterizations.

Note that a particular member, say 31, can be very different in the two ensembles despite the high similarity between the AN-SVs and FC-SVs (Fig. 1). This is partly due to the fact that the coefficients for the linear combination of the SVs in member 31 of the AN-SV ensemble are very different from the coefficients of member 31 in the FC-SV ensemble. The different coefficients are generated by the rotation algorithm, which acts like a random number generator that depends sensitively on the structure of the singular vectors. Additionally, sometimes an AN-SV and its corresponding FC-SV have opposite signs or different indices (see Fig. 1).

## 5. Discussion and conclusions

Over recent years the skill of global numerical weather prediction systems has improved considerably. Simmons and Hollingsworth (2002) discuss improvements in skill of three numerical weather prediction systems over two decades. They also estimate the magnitude of 500-hPa geopotential height errors of the analyses and 1-day forecasts. For a sample in winter 2001 and the ECMWF system, they estimate an error of 7–8 m for the analysis and about 10 m for the 1-day forecast. On the large scales represented in the singular vector computation, the 12-h forecast appears to be almost as accurate as the analysis itself. This explains why the 12-h lead time FC-SVs are very similar to the AN-SVs.

Using a trajectory started from a forecast is considered to be a minor approximation compared to other uncertainties involved in the singular vector approach. Approximations are made in the tangent-linear model by using a low spatial resolution and by representing physical processes in a simplified manner. Furthermore, uncertainties enter through the choice of the initial time norm and the choice of a particular optimization time. The neutral impact of FC-SVs on ensemble scores for the 62 cases of this study supports this conjecture. The potential advantages of using FC-SVs are deemed to outweigh any possible very small degradation in skill that might occur in a different sample of cases. Noticeable degradations due to using FC-SVs are not expected in the extratropics as the similarity between AN-SVs and FC-SVs is very high in every individual case. Based on the results discussed here, it was decided to implement FC-SVs in the ECMWF Ensemble Prediction System as part of the early-delivery suite in June 2004.

## Acknowledgments

I would like to thank Jan Haseler for providing the early-delivery assimilation experiments, Mike Fisher for providing the significance test code, and Philippe Bougeault, Roberto Buizza, and Tim Palmer for their comments on earlier versions of the manuscript. Comments by two anonymous reviewers further improved the presentation of this work.

## REFERENCES

Barkmeijer, J., R. Buizza, and T. N. Palmer, 1999: 3D-Var Hessian singular vectors and their potential use in the ECMWF Ensemble Prediction System. *Quart. J. Roy. Meteor. Soc.*, **125**, 2333–2351.

Barkmeijer, J., R. Buizza, T. N. Palmer, K. Puri, and J-F. Mahfouf, 2001: Tropical singular vectors computed with linearized diabatic physics. *Quart. J. Roy. Meteor. Soc.*, **127**, 685–708.

Barlow, R. J., 1989: *A Guide to the Use of Statistical Methods in the Physical Sciences*. Wiley, 204 pp.

Buizza, R., 1994: Sensitivity of optimal unstable structures. *Quart. J. Roy. Meteor. Soc.*, **120**, 429–451.

Buizza, R., and T. N. Palmer, 1995: The singular-vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456.

Buizza, R., and A. Montani, 1999: Targeting observations using singular vectors. *J. Atmos. Sci.*, **56**, 2965–2985.

Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. *Quart. J. Roy. Meteor. Soc.*, **125**, 2887–2908.

Buizza, R., J. Barkmeijer, T. N. Palmer, and D. S. Richardson, 2000: Current status and future developments of the ECMWF Ensemble Prediction System. *Meteor. Appl.*, **7**, 163–176.

Buizza, R., D. S. Richardson, and T. N. Palmer, 2003: Benefits of increased resolution in the ECMWF ensemble system and comparison with poor-man’s ensembles. *Quart. J. Roy. Meteor. Soc.*, **129**, 1269–1288.

Gelaro, R., R. H. Langland, G. D. Rohaly, and T. E. Rosmond, 1999: An assessment of the singular-vector approach to targeted observing using the FASTEX data set. *Quart. J. Roy. Meteor. Soc.*, **125**, 3299–3327.

Haseler, J., 2005: Early-delivery suite. Tech. Rep. ECMWF TM 454, 35 pp. [Available online at http://www.ecmwf.int/publications/library/ecpublications/_pdf/tm/401-500/tm454.pdf.]

Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF Ensemble Prediction System: Methodology and validation. *Quart. J. Roy. Meteor. Soc.*, **122**, 73–119.

Puri, K., J. Barkmeijer, and T. N. Palmer, 2001: Ensemble prediction of tropical cyclones using targeted diabatic singular vectors. *Quart. J. Roy. Meteor. Soc.*, **127**, 709–731.

Simmons, A. J., and A. Hollingsworth, 2002: Some aspects of the improvement in skill of numerical weather prediction. *Quart. J. Roy. Meteor. Soc.*, **128**, 647–677.

Table 1. Distribution of similarity index *S* between subspaces of initial time AN-SVs and FC-SVs. Results are grouped by SV type: extratropics (ET), tropical SVs including Caribbean (TR), and subset of TR targeted on active tropical cyclones (TCs). Subspace dimension (Dim) and number of pairs of SV subspaces (No. of sets).

Table 2. Relative degradation (percent) of extratropical 500-hPa geopotential height anomaly scores due to using FC-SVs, [*s*(AN-SVs) − *s*(FC-SVs)]/*s*(AN-SVs), where *s* is one of the scores. NH: Northern Hemisphere; SH: Southern Hemisphere. Entries marked with ° are moderately significant, and entries with ^{•} are significant (see text for definitions of the significance categories). There is no entry if the modulus of the relative difference is less than 0.1%.

Table 3. Relative degradation (percent) of Brier skill scores for positive anomalies of 500-hPa geopotential height due to using FC-SVs, [*s*(AN-SVs) − *s*(FC-SVs)]/*s*(AN-SVs), by period. ND99: 34 cases in Nov–Dec 1999; S03: 28 cases in Sep 2003. See Table 2 caption for further details.

Table 4. Average spread of tropical cyclone tracks about mean position (km). Mean over 41 cases (39 for 120-h forecast range).

^{1} All experimentation was performed with ECMWF’s Integrated Forecast System model cycle 26r3.

^{2} Note that the choice of sign for a singular vector is arbitrary.

^{3} There are four more dates than for the forecasts because the singular vector computation is started two days before the first forecast to generate the evolved singular vectors.

^{4} 30°–90°N and 30°–90°S.