Econometrics of Model Selection

Contents

10 The Econometrics of Model Selection
  10.1 Introduction
  10.2 The selection stages of PcGets
  10.2.1 Formulating the GUM
  10.2.1.1 Integrated variables
  10.2.2 Mis-specification tests
  10.2.2.1 Significant mis-specification tests
  10.2.2.2 Integrated variables
  10.2.3 Pre-search reductions
  10.2.4 Multiple search paths
  10.2.5 Encompassing
  10.2.6 Information criteria
  10.2.7 Sub-sample reliability
  10.2.8 Type I and type II errors
  10.3 Analyzing the algorithm
  10.3.1 Costs of inference and costs of search
  10.4 Selection probabilities
  10.5 Deletion probabilities
  10.6 Monte Carlo evidence on PcGets
List of tables
  Table:10.1 Selected Hoover--Perez DGPs
  Table:10.2 Simulation results for Hoover--Perez experiments

Chapter 10 The Econometrics of Model Selection

10.1 Introduction

The practical embodiment of the theory of reduction is general-to-specific (Gets) modelling. The former explains how the DGP is reduced to the `local' DGP (LDGP), namely the joint distribution of the subset of variables under analysis. Chapter 9 discussed the theory of reduction and explained the origins and properties of the LDGP. Then a general unrestricted model (GUM) is formulated to provide a congruent approximation to the LDGP, given the theoretical and empirical background. The empirical analysis commences from this GUM, after testing for mis-specifications, and if none are apparent, it is simplified to a parsimonious, congruent model, each simplification step being checked by diagnostic testing. Simplification can be done in many ways: and although the goodness of a model is intrinsic to it, and not a property of the selection route, poor routes seem unlikely to deliver useful models. Below, we investigate the impact of selection rules on the properties of the resulting models, and show that a solution is to explore many simplification paths.

When the prior specification of a possible relationship is not known for certain, data evidence is essential to delineate the relevant from the irrelevant variables. Thus, selection is inevitable in practice. Some economists insist on imposing a priori specifications: but such claims assume knowledge of the answer before the investigation starts, so deny empirical modelling any useful role -- and in practice, it has rarely contributed in that approach (see e.g., Summers, 1991).

PcGets embodies all the principles discussed in Hendry (1995a). First, the initial general statistical model is tested for congruence, which is maintained throughout the selection process by diagnostic checks, thereby ensuring a congruent final model. Next statistically-insignificant variables are eliminated by selection tests, both in blocks and individually. Many reduction paths are searched, to prevent the algorithm from becoming stuck in a sequence that inadvertently eliminates a variable that matters, and thereby retains other variables as proxies. If several models are selected, encompassing tests resolve the choice; and if more than one congruent, mutually-encompassing choice remains, model-selection criteria are the final arbiter. Lastly, sub-sample significance helps identify `spuriously significant' regressors.

In Monte Carlo experiments, PcGets recovers the data generation process (DGP) with an accuracy close to what one would expect if the DGP specification were known, but nevertheless coefficient tests were conducted. Empirically, on the DHSY and UK M1 data sets analyzed by Davidson, Hendry, Srba and Yeo (1978) and Hendry and Ericsson (1991b) respectively, PcGets selects (in seconds!) models at least as good as those developed over several years by their authors: see Chapter 5. Computer automation of model selection is in its infancy, yet already exceptional progress has been achieved, setting a high `lower bound' on future performance. Moreover, there is a burgeoning symbiosis between the implementation and the theory -- developments in either stimulate advances in the other.

The following discussion draws on Hendry and Krolzig (1999b), Hendry (2000b) and Krolzig and Hendry (2001). Section 10.2 describes the main stages of PcGets' automatic model-selection procedures from general to specific. Then section 10.3 explains how PcGets implements the methodology of reduction, discriminating between the costs of inference and the costs of search, and describes the steps in the algorithm. Section 10.4 discusses the factors affecting the probability of selecting relevant variables, whereas section 10.5 considers the determinants of the probability of deleting irrelevant variables. Putting these together enables us to explain why PcGets performs so well. Section 10.6 reports the outcomes of applying PcGets to the Monte Carlo experiments in Lovell (1983) as re-analyzed by Hoover and Perez (1999).

10.2 The selection stages of PcGets

10.2.1 Formulating the GUM

Naturally, the specification of the initial general model must depend on the type of data (time series, cross section etc.), the size of sample, number of different potential variables, previous empirical and theoretical findings, likely functional-form transformations (e.g., logs) and appropriate parameterizations, known anomalies (such as measurement changes, breaks etc.) and data availability. The aim is to achieve a congruent starting point, so the specification should be sufficiently general that if a more general model is required, the investigator would be surprised, and therefore already have acquired useful information. Data may prove inadequate for the task, but even if a smaller GUM is enforced by pre-simplification, knowing at the outset what model ought to be postulated remains important. The larger the initial regressor set, the more likely adventitious effects are to be retained; but the smaller the GUM, the more likely key variables will be omitted. Further, the less orthogonality between variables, the more `confusion' the algorithm faces, possibly leading to a proliferation of mutual-encompassing models, where final choices may only differ marginally (e.g., lag 2 versus 1).

Davidson and Hendry (1981, p.257) noted four possible problems facing Gets: (i) the chosen `general' model may be inadequate, by being too special a case of the LDGP; (ii) data limitations may preclude specifying the desired relationship; (iii) the non-existence of an optimal sequence for simplification leaves open the choice of reduction path; and (iv) potentially-large type-II error probabilities of the individual tests may be needed to avoid a high type-I error of the overall sequence. By adopting the `multiple path' development of Hoover and Perez (1999), and implementing a range of important improvements, PcGets overcomes the problems associated with points (iii) and (iv). However, the empirical success of PcGets depends crucially on the creativity of each researcher in specifying the general model, and the feasibility of estimating it from the available data -- aspects beyond the capabilities of the program, other than the diagnostic tests serving their usual role of revealing model mis-specification.

There is a central role for economic theory in the modelling process in `prior specification', `prior simplification', and suggesting admissible data transforms. The first of these relates to the inclusion of potentially-relevant variables, the second to the exclusion of irrelevant effects, and the third to appropriate formulations in which the influences to be included are entered, such as log or ratio transforms etc., differences and cointegration vectors, and any likely linear transformations that might enhance orthogonality between regressors. The `LSE approach' argued for a close link of theory and model, and explicitly opposed `running regressions on every variable on the database' as in Lovell (1983) (see e.g., Hendry and Ericsson, 1991a). Unfortunately, economic theory rarely provides a basis for specifying the lag lengths in empirical macro-models: even when a theoretical model is dynamic, a `time period' is usually not well defined. In practice, lags are chosen either for analytical convenience (e.g., first-order differential equation systems), or to allow for some desirable features (as in the choice of a linear, second-order difference equation to replicate cycles). Therefore, it seems sensible to start from an unrestricted autoregressive-distributed lag model with a maximal lag length set according to available evidence (e.g., as four or five lags for quarterly time series, to allow for seasonal dynamics). Prior analysis also remains essential for appropriate parameterizations; functional forms; choice of variables; lag lengths; and indicator variables (including seasonals, special events, etc.). Hopefully, automating the reduction process will enable researchers to concentrate their efforts on designing the GUM, which could significantly improve the empirical success of the algorithm.

10.2.1.1 Integrated variables

PcGets conducts all inferences as if the data are I(0). Most selection tests will in fact be valid even when the data are I(1), given the results in Sims, Stock and Watson (1990). Only t- or F-tests for an effect that corresponds to a unit root require non-standard critical values. The empirical examples on I(1) data provided below do not reveal problems, but in principle it could be useful to implement cointegration tests and appropriate transformations prior to reduction. Care is then required not to `mix' variables with different degrees of integration, so our present recommendation is to specify the GUM in levels.

10.2.2 Mis-specification tests

Given the initial GUM, the next step is to conduct mis-specification tests. There must be sufficient tests to check the main attributes of congruence, but, as discussed above, not so many as to induce a large type-I error. Thus, PcGets generally tests the following null hypotheses:

  1. white-noise errors;

  2. conditionally homoscedastic errors;

  3. normally distributed errors;

  4. unconditionally homoscedastic errors;

  5. constant parameters.

Approximate F-test formulations are used (see , ). §13.7.7 describes the finite-sample behaviour of the various tests.

10.2.2.1 Significant mis-specification tests

If the initial mis-specification tests are significant at the pre-specified level, the required significance level is lowered, and search paths terminated only when that lower level is violated. Empirical investigators would probably re-specify the GUM on rejection, but as yet that relies on creativity beyond the capabilities of computer automation.

10.2.2.2 Integrated variables

Wooldridge (1999) shows that diagnostic tests on the GUM (and presumably simplifications thereof) remain valid even for integrated time series.

10.2.3 Pre-search reductions

Once congruence of the GUM is established, groups of variables are tested in the order of their absolute t-values, commencing from the smallest and continuing up towards the pre-assigned selection criterion, when deletion must become inadmissible. A non-stringent significance level is used at this step, usually 90%, since the insignificant variables are deleted permanently. Such a high value might seem surprising given the claim noted above that selection leads to over-parameterization, but confirms that such a claim is not sustainable. If no test is significant, the F-test on all variables in the GUM has been calculated, establishing that there is nothing to model.

Once $m$ variables are included in this first step, non-rejection requires that (a) $m-1$ variables did not induce rejection; (b) $|t_m| < c_\alpha$ for a critical value $c_\alpha$; and (c):

$$F_1(m, T-k) \simeq \frac{1}{m}\sum_{i=1}^{m} t_i^2 \le c_\gamma,$$
(eq:10.1)

for a critical value $c_\gamma$. Slightly more than half the coefficients will have $t_i^2$ of 0.5 or smaller. Any $t_i^2 \le 1$ reduces the mean $F_1$ statistic, and since $P(|t_i|<1)=0.68$, when $k=40$ then approximately 28 variables fall in that group, leaving an $F_1$-statistic value of less than unity after their elimination. Also, $P(|t_i| \ge 2)=0.05$ so only 2 out of 40 variables should chance to have a larger $|t_i|$ value on average. Thus, surprisingly-large values of $\gamma$, such as 0.75, can be selected for this step yet have a high probability of eliminating most irrelevant variables. Since (e.g.) $P(F_1(30,100)<1 \mid H_0) \simeq 0.48$, a first step with $\gamma=0.5$ would on average eliminate 28 variables with $t_i^2 \le 1$, when $k=40$, and some larger t-values as well -- hence the need to check that $|t_m| < c_\alpha$.
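The cumulative check in conditions (a) to (c) can be sketched in a few lines. This is only an illustration, not the PcGets implementation: the function names are hypothetical, the block F-statistic is approximated by the mean of the squared t-values (which presumes roughly orthogonal regressors), and a standard normal stands in for the t-distribution when counting how many of 40 irrelevant variables are expected to have small or large |t|.

```python
import math

def presearch_deletable(t_values, c_gamma, c_alpha):
    """Count how many of the smallest-|t| variables pass the cumulative check.

    Assumption: regressors are close to orthogonal, so the block F-statistic
    on the m smallest variables is taken to be the mean of their squared t's.
    """
    ordered = sorted(abs(t) for t in t_values)    # smallest |t| first
    sum_sq = 0.0
    deletable = 0
    for m, t_abs in enumerate(ordered, start=1):
        sum_sq += t_abs ** 2
        f_stat = sum_sq / m                       # F_1 ~ (1/m) * sum of t_i^2
        if t_abs >= c_alpha or f_stat > c_gamma:
            break                                 # deletion is no longer admissible
        deletable = m
    return deletable

def prob_abs_normal_below(x):
    """P(|Z| < x) for a standard normal Z (large-sample stand-in for t)."""
    return math.erf(x / math.sqrt(2.0))

k = 40
expected_below_one = k * prob_abs_normal_below(1.0)          # roughly 27
expected_above_two = k * (1.0 - prob_abs_normal_below(2.0))  # roughly 2

# Hypothetical t-values: the four smallest pass, the fifth fails the |t| check.
example = presearch_deletable([0.3, -0.8, 1.1, 2.4, -0.5], c_gamma=1.0, c_alpha=2.0)
```

The critical values passed in here are placeholders; in practice they would come from the F- and t-distributions at the chosen significance levels.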

Two rounds of cumulative simplification are offered, the second at a tighter level such as 25%. Optionally for time-series data, block tests of lag length are also offered.

10.2.4 Multiple search paths

All paths that commence with an insignificant t-deletion are explored. Blocks of variables constitute feasible search paths -- like the block F-tests in the preceding sub-section, but along search paths -- so these can be selected, in addition to individual-coefficient tests. The algorithmic decisions are described in Chapter A1: here we merely note that a non-null set of `terminal' models is selected -- namely all distinct minimal congruent reductions found along all the search paths -- so when more than one such model is found, a choice between these is needed, accomplished as described in the next section.

10.2.5 Encompassing

Encompassing tests select between the candidate congruent models at the end of path searches. Each contender is tested against their union, dropping those which are dominated by, and do not dominate, another contender. If a unique model results, it is selected; otherwise, if some are rejected, PcGets forms the union of the remaining models, and repeats this round till no encompassing reductions result. That union then constitutes a new starting point, and the complete path-search algorithm repeats until the union is unchanged between successive rounds.

10.2.6 Information criteria

When such a union coincides with the original GUM, or with a previous union, so no further feasible reductions can be found, PcGets selects a model by an information criterion. The preferred `final-selection' rule presently is the Schwarz criterion, or BIC, defined above. For $T=140$ and $p=40$, minimum SC corresponds approximately to the marginal regressor satisfying $|t| \ge 1.9$.
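The correspondence between the Schwarz criterion and a t-value of about 1.9 can be checked numerically. The sketch below assumes the form SC = ln(RSS/T) + p ln(T)/T (the definition itself is not reproduced in this section, so this form is an assumption) and uses the standard least-squares identity that dropping one regressor with t-statistic t inflates the residual sum of squares by the factor 1 + t^2/(T-p). The function name is hypothetical.

```python
import math

def sc_marginal_t(T, p):
    """|t| at which keeping the p-th regressor leaves SC unchanged.

    Assumes SC = ln(RSS/T) + p*ln(T)/T and
    RSS_without = RSS_with * (1 + t^2 / (T - p)) for a single regressor.
    Keeping the regressor lowers SC when ln(1 + t^2/(T-p)) > ln(T)/T.
    """
    t_squared = (T - p) * (math.exp(math.log(T) / T) - 1.0)
    return math.sqrt(t_squared)

threshold = sc_marginal_t(T=140, p=40)   # close to 1.9
```

Under these assumptions the implied threshold rises slowly with the sample size, so SC becomes a more stringent selection rule as T grows.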

10.2.7 Sub-sample reliability

For the finally-selected model, sub-sample reliability is evaluated by the Hoover--Perez overlapping split-sample criterion. PcGets then concludes that some variables are definitely excluded; some definitely included; and some have an uncertain role, varying from a reliability of (say) 0% (included in the final model, but insignificantly, and insignificant in both sub-samples), through to 100% (significant overall and in both sub-samples). Investigators are at liberty to interpret such evidence as they see fit, noting that further simplification of the selected congruent model may induce some violations of congruence or encompassing.

Recursive estimation is central to the Gets research program, but focused on parameter constancy, whereas Hoover and Perez use the split samples to help determine overall significance. A central t-test wanders around the origin, so the probability is low that an effect which is significant only by chance in the full sample will also be significant in two independent sub-samples (see e.g., the discussion in Campos and Ericsson, 1999). Conversely, a non-central t-test diverges as the sample size increases, so should be significant in sub-samples, perhaps at a lower level of significance to reflect the smaller sample size. This strategy should be particularly powerful for model selection when breaks occur in some of the marginal relations over either of the sub-samples.

10.2.8 Type I and type II errors

Whether or not Gets over or under selects is not intrinsic to it, but depends on how it is used: neither type I nor type II errors are emphasized by the methodology per se, nor by the PcGets algorithm, but reflect the choices of critical values in the search process. In the Hendry and Krolzig (1999b) analysis of the Hoover and Perez (1999) re-run of the experiments in Lovell (1983), lowering the significance levels of the diagnostic tests from (say) 0.05 to 0.01 reduced the overall selection size noticeably (due to the difference in powering up 0.95 and 0.99 repeatedly), without greatly affecting the power of the model-selection procedure. Smaller significance levels (1% versus 5%) for diagnostic tests probably have much to commend them. Tightening the significance levels of the selection t-tests also reduced the empirical size, but lowered the power more noticeably for variables with population t-values smaller than 3. This trade-off can, therefore, be selected by an investigator. The next section addresses these issues.
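The effect of `powering up' 0.95 and 0.99 can be made concrete with a short calculation. The sketch assumes that the diagnostic tests are independent and that five of them are applied once each (the five null hypotheses listed in section 10.2.2); both are simplifying assumptions, and the function name is hypothetical.

```python
def overall_null_rejection(level, n_tests):
    """P(at least one of n_tests independent tests rejects under the null)."""
    return 1.0 - (1.0 - level) ** n_tests

size_at_5pct = overall_null_rejection(0.05, 5)   # 1 - 0.95**5
size_at_1pct = overall_null_rejection(0.01, 5)   # 1 - 0.99**5
```

Under these assumptions a battery of five tests at 5% rejects a congruent model more than a fifth of the time, against about one time in twenty at 1%; repeating the battery along every search path compounds the difference.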

10.3 Analyzing the algorithm

We first distinguish between the costs of inference and the costs of search, then consider some aspects of the search process.

10.3.1 Costs of inference and costs of search

Let $p^{\mathsf{dgp}}_{\alpha,i}$ denote the probability of retaining the $i$th variable in the DGP when commencing from the DGP using a selection test procedure with significance level $\alpha$. Then:

$$\sum_{i=1}^{k}\left(1-p^{\mathsf{dgp}}_{\alpha,i}\right),$$

is a measure of the cost of inference when there are $k$ variables in the DGP.

Let $p^{\mathsf{gum}}_{\alpha,i}$ denote the probability of retaining the $i$th variable when commencing from the GUM, also using significance level $\alpha$. Let $S_r$ denote the set of relevant, and $S_0$, the set of irrelevant, variables. Then pure search costs are:

$$\sum_{i\in S_r}\left(p^{\mathsf{dgp}}_{\alpha,i}-p^{\mathsf{gum}}_{\alpha,i}\right)+\sum_{i\in S_0}p^{\mathsf{gum}}_{\alpha,i}.$$

For irrelevant variables, $p^{\mathsf{dgp}}_{\alpha,i}\equiv 0$, so the whole cost of retaining adventitiously-significant variables is attributed to search, plus any additional costs from failing to retain relevant variables. The former can be lowered by tightening the significance levels of the selection tests, but at the cost of raising the latter. However, it is feasible to lower size and raise power simultaneously by an improved search algorithm. When different selection strategies are used on the DGP and GUM (e.g., conventional t-testing on the former; pre-selection F-tests on the latter), then $p^{\mathsf{gum}}_{\alpha,i}$ could exceed $p^{\mathsf{dgp}}_{\alpha,i}$ (see, e.g., the critique of theory testing in Hendry and Mizon, 2000).
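The two cost measures are simple sums over retention probabilities, as the following sketch shows. All probabilities below are hypothetical values chosen for illustration, and the function names are not taken from PcGets.

```python
def cost_of_inference(p_dgp):
    """Sum over DGP variables of (1 - retention probability from the DGP)."""
    return sum(1.0 - p for p in p_dgp)

def cost_of_search(p_dgp_relevant, p_gum_relevant, p_gum_irrelevant):
    """Power lost on relevant variables plus retention of irrelevant ones."""
    lost_power = sum(d - g for d, g in zip(p_dgp_relevant, p_gum_relevant))
    spurious = sum(p_gum_irrelevant)
    return lost_power + spurious

# Hypothetical retention probabilities, for illustration only.
p_dgp = [0.85, 0.51]            # two relevant variables, starting from the DGP
p_gum = [0.80, 0.45]            # the same two variables, starting from the GUM
p_irrelevant = [0.05] * 10      # ten irrelevant variables in the GUM

inference = cost_of_inference(p_dgp)                  # 0.15 + 0.49
search = cost_of_search(p_dgp, p_gum, p_irrelevant)   # 0.05 + 0.06 + 10 * 0.05
```

In this made-up case the cost of inference (0.64) is about as large as the cost of search (0.61), even though the GUM contains five times as many irrelevant as relevant variables.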

10.4 Selection probabilities

When searching a large database for a DGP, an investigator might retain the relevant regressors less often than when the correct specification is known, as well as retaining irrelevant variables in the finally-selected model. We first examine the difficulty of retaining relevant variables when commencing from the DGP, then turn to any additional power losses resulting from search. Later we will consider the additional costs of retaining irrelevant variables.

Consider a t-test, denoted $t(\nu,\psi)$, of a null hypothesis $H_0$, where $\psi=0$ under the null, when for a critical value $c_\alpha$, a two-sided test is used with $P(|t(\nu,0)| \ge c_\alpha \mid H_0)=\alpha$. When the null is false, such a test will reject with a probability which varies with its non-centrality parameter $\psi$ (dependent on the sample size and the magnitude of the departure from the null), and the degrees of freedom $\nu$. To calculate the power to reject the null when $\mathsf{E}[t]=\psi>0$, we use:

$$P(t \ge c_\alpha \mid \mathsf{E}[t]=\psi) \simeq P(t-\psi \ge c_\alpha-\psi \mid H_0).$$

The following table records some of these approximate power calculations when a single null hypothesis is tested and when six are tested, in each case, precisely once for $\nu=100$ and different values of $\psi$.

t-test powers

$\psi$   $\alpha$   $c_\alpha$   $c_\alpha-\psi$   $P(|t| \ge c_\alpha)$   $P(|t| \ge c_\alpha)^6$
1        0.05       1.98          0.98              0.165                   0.000
2        0.05       1.98         -0.02              0.508                   0.017
2        0.01       2.625         0.625             0.267                   0.000
3        0.05       1.98         -1.02              0.845                   0.364
3        0.01       2.625        -0.375             0.646                   0.073
4        0.01       2.625        -1.375             0.914                   0.583
5        0.01       2.625        -2.375             0.990                   0.941
6        0.001      3.40         -2.60              0.995                   0.968

The fifth column of the table reveals that there is little chance of retaining variables with $\psi=1$, and only a 50--50 chance of retaining a single variable with a population $|t|$ of 2 when the critical value is also 2, falling to 25--75 for a critical value of 2.6. When $\psi=3$, the power of detection is sharply higher, but still leads to more than 35% mis-classifications at $\alpha=0.01$. Finally, when $\psi \ge 4$, one such variable will almost always be retained, even at stringent significance levels. These powers could be increased slightly by using a one-sided test when the sign is certain.

However, the final column shows that the probability of retaining all six relevant variables with the given non-centralities is essentially negligible when the tests are independent, except in the last three cases. Mixed cases (with different values of $\psi$) can be calculated by multiplying the probabilities in the fifth column (e.g., for $\psi=2,3,3,4,5,6$ the joint $P(\cdot) \simeq 0.10$ at $\alpha=0.01$). Such combined probabilities are highly non-linear in $\psi$, since one is almost certain to retain six variables with $\psi=6$, even at a 1% significance level. The important conclusion is that, despite `knowing' the DGP, low signal-noise variables will rarely be retained using t-tests when there is any need to test the null; and if there are many relevant variables, all of them are unlikely to be retained even when they have quite large non-centralities.
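The single and joint retention probabilities can be reproduced approximately as follows. The sketch replaces the t-distribution with 100 degrees of freedom by a standard normal, so the third decimal place can differ slightly from the tabulated values; the function names are hypothetical, and the joint calculation assumes independent tests, as in the text.

```python
from statistics import NormalDist

def approx_power(psi, c_alpha):
    """P(t >= c_alpha | E[t] = psi), with a standard normal in place of t(100)."""
    return 1.0 - NormalDist().cdf(c_alpha - psi)

def joint_power(psis, c_alpha):
    """Probability that every variable is retained when the tests are independent."""
    prob = 1.0
    for psi in psis:
        prob *= approx_power(psi, c_alpha)
    return prob

single = approx_power(2, 1.98)                    # about one half
six_equal = joint_power([3] * 6, 1.98)            # six variables, each with psi = 3, 5% level
mixed = joint_power([2, 3, 3, 4, 5, 6], 2.625)    # mixed non-centralities, 1% level
```

The mixed case is dominated by its weakest member: the variable with a non-centrality of 2 is retained only about a quarter of the time at the 1% level, which caps the joint probability regardless of how large the other non-centralities are.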

One alternative is to use an F-testing approach, after implementing (say) the first stage pre-selection filter discussed above. A joint test will falsely reject a null model $100\delta$% of the time when the critical value is $c_\delta$, and the resulting model would then be the post-selection GUM. However, the reliability statistics should help reveal such a problem. Conversely, this joint procedure has a dramatically higher probability of retaining a block of relevant variables. For example, if the 6 remaining variables all had expected t-values of two -- an essentially impossible case above -- then:

$$\mathsf{E}[F(6,100)] \simeq \frac{1}{6}\left(\sum_{i=1}^{6}\mathsf{E}[t_i^2]\right) \simeq 4.$$
(eq:10.2)

When $\delta=0.025$, $c_\delta \simeq 2.5$ so to reject we need:

$$P(F(6,100) \ge 2.5 \mid F=4),$$

which we solve by using a $\chi^2(6)$ approximation to $6F(6,100)$ under the null, with critical value $c_{\alpha,k}=14.5$, and the approximation under the alternative that:

$$\chi^2(6,24) = h\chi^2(m,0),$$

where:

$$h=\frac{6+48}{6+24}=1.8 \quad\text{and}\quad m=\frac{(6+24)^2}{6+48}\simeq 17,$$
(eq:10.3)

so using:

$$P[h\chi^2(m,0) > c_{\alpha,k}] = P[\chi^2(m,0) > h^{-1}c_{\alpha,k}] \simeq P[\chi^2(17,0) > 8] \simeq 0.97,$$

thereby almost always retaining all six relevant variables. This is in complete contrast with the near zero probability of retaining all six variables using t-tests on the DGP as above.
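The scaled central chi-squared calculation can be verified numerically. The sketch below uses only the standard library: the tail probability is obtained by Simpson's rule on the chi-squared density, which is an implementation choice made here rather than anything specified in the text, and the degrees of freedom are rounded to 17 as above. All function and variable names are hypothetical.

```python
import math

def chi2_pdf(x, df):
    """Density of a central chi-squared variate with df degrees of freedom."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((df / 2 - 1) * math.log(x) - x / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)

def chi2_sf(x, df, steps=2000):
    """P(chi2(df) > x) by Simpson's rule on [0, x]; adequate for df > 2."""
    width = x / steps                      # steps must be even for Simpson's rule
    total = chi2_pdf(0.0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * width, df)
    return 1.0 - total * width / 3.0

k, noncentrality, crit = 6, 24, 14.5
scale = (k + 2 * noncentrality) / (k + noncentrality)           # h = 1.8
df_equiv = (k + noncentrality) ** 2 / (k + 2 * noncentrality)   # m, about 16.7
retain_prob = chi2_sf(crit / scale, round(df_equiv))            # about 0.97
```

Rounding the degrees of freedom up to 17 slightly flatters the result, but the retention probability stays well above 0.9 either way.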

10.5 Deletion probabilities

One might expect low deletion probabilities to entail high search costs when many variables are included but none actually matters. That would be true in a pure t-testing strategy as we now show. The probability distribution of one or more null coefficients being significant in pure t-test selection at significance level a is given by the k+1 terms of the binomial expansion of:

$$1=\left(\alpha+(1-\alpha)\right)^{k}.$$

The following table from Hendry (2000a) illustrates by enumeration for k=3:

event                                                              probability               number retained
$P(|t_i|<c_\alpha,\ \forall i=1,\ldots,3)$                         $(1-\alpha)^3$            0
$P(|t_i|\ge c_\alpha \mid |t_j|<c_\alpha,\ \forall j\ne i)$        $3\alpha(1-\alpha)^2$     1
$P(|t_i|<c_\alpha \mid |t_j|\ge c_\alpha,\ \forall j\ne i)$        $3(1-\alpha)\alpha^2$     2
$P(|t_i|\ge c_\alpha,\ \forall i=1,\ldots,3)$                      $\alpha^3$                3

The average number of variables retained is $m=k\alpha$. When $\alpha=0.05$ and $k=40$, $m$ equals 2, falling to 0.4 for $\alpha=0.01$: so even if only t-tests are used, few spurious variables would then be retained.
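The binomial enumeration extends directly to any number of irrelevant variables. The sketch below assumes independent t-tests, as the expansion does; the function name is hypothetical.

```python
from math import comb

def retained_distribution(k, alpha):
    """P(exactly j of k irrelevant variables are significant), for j = 0..k."""
    return [comb(k, j) * alpha ** j * (1 - alpha) ** (k - j) for j in range(k + 1)]

probs_k3 = retained_distribution(3, 0.05)     # the four events enumerated for k = 3
dist_k40 = retained_distribution(40, 0.05)
mean_k40 = sum(j * p for j, p in enumerate(dist_k40))   # equals k * alpha = 2
prob_none_k40 = dist_k40[0]                             # (1 - alpha)**k, about 0.13
```

The mean is small, but the chance of retaining no spurious variable at all from 40 candidates at 5% is also small, which is what motivates a preliminary joint test.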

However, PcGets first systematically checks simplifications of the GUM up to the empty model. To see the approximate properties of that strategy, consider a one-off F-test $F_G$ of the GUM against the null model using the critical value $c_\gamma$. Such a test would have size $P(F_G \ge c_\gamma)=\gamma$ under the null if it was the only test implemented. Consequently, path searches would only commence $100\gamma$% of the time. Let there be $k$ regressors in the GUM, of which $m$ are retained on average when t-test selection is used following rejection of the null model. When there are no relevant variables, the probability of retaining no variables using t-tests with critical value $c_\alpha$ is given by the first row in the above table, namely:

$$P(|t_i|<c_\alpha\ \ \forall i=1,\ldots,k)=(1-\alpha)^{k}.$$
(eq:10.4)

When $k=40$ and $\alpha=0.05$, there is only a 13% chance of retaining no spurious variables, and on average $k\alpha=2$ will be retained.

However, combining (eq:10.4) with the $F_G$-test, the null model will be selected with approximate probability:

$$p_G \simeq (1-\gamma)+\gamma(1-\alpha)^{k},$$
(eq:10.5)

which will be dramatically larger than $(1-\alpha)^k$ even if $\gamma$ is set at quite a high value, such as 0.10 (so the null is incorrectly rejected 10% of the time), whereas $\alpha=0.05$ is more usual (note that $F_G \ge c_{0.10}$ could occur without any $|t_i| \ge c_{0.05}$). Evaluating (eq:10.5) for $\gamma=0.10$, $\alpha=0.05$ and $k=40$ yields $p_G \simeq 0.91$, so the null model is found most of the time.

In the Hendry and Krolzig (1999b) re-run of the Hoover--Perez experiments with $k=40$, using $\gamma=0.01$ yielded $p_G=97.2$%, as against a theory prediction from (eq:10.5) of 99%. If $m$ variables are retained when the event $F_G \ge c_{0.01}$ occurs, then the average `non-deletion' probability across the null-DGP Monte Carlos (i.e., the probability any given variable will be retained) is $p_r=(1-p_G)m/k=0.21$% (approximating by $m=3$), as against the reported value of 0.19% found by Hendry and Krolzig (1999b). These are very small retention rates of spuriously-significant variables, so it is relatively easy to obtain a high probability of locating the null model even if 40 irrelevant variables are included, when relatively tight significance levels are used -- or a reasonably high probability for looser significance levels.
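Both (eq:10.5) and the non-deletion rate are easy to evaluate directly. In the sketch below the function names are hypothetical, and the t-test significance level of 0.05 used with the tight F-test is an assumption, since that setting is not stated here; the prediction rounds to 99% for any conventional choice.

```python
def prob_null_model(gamma, alpha, k):
    """Approximate probability of selecting the null model, as in (eq:10.5)."""
    return (1 - gamma) + gamma * (1 - alpha) ** k

def non_deletion_rate(p_null, m_retained, k):
    """Average per-variable retention rate, (1 - p_G) * m / k."""
    return (1 - p_null) * m_retained / k

loose = prob_null_model(gamma=0.10, alpha=0.05, k=40)   # about 0.91
tight = prob_null_model(gamma=0.01, alpha=0.05, k=40)   # about 0.99 (alpha assumed)
rate = non_deletion_rate(p_null=0.972, m_retained=3, k=40)   # 0.0021, i.e. 0.21%
```

Almost all of the probability comes from the first term, so the significance level of the joint F-test, rather than that of the individual t-tests, governs how often the null model is found.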

Thus, in contrast to the relatively high costs of inference discussed in the previous section, the costs of search arising from retaining irrelevant variables seem small. For a reasonable GUM with say 20 variables where 15 are irrelevant, even using just t-tests at 5%, fewer than one spuriously-significant variable will be retained by chance. Pre-selection and multiple path searches of PcGets lower those probabilities. Against such costs, the previous section showed that there is at most a 50% chance of retaining variables with non-centralities less than 2, and little chance of keeping several. Thus, the difficult problem is retention of relevant, not elimination of irrelevant, variables: critical values should be selected with these findings in mind. Practical usage of PcGets suggests that its operational characteristics are quite well described by this analysis.

In applications, we often find that the multi-path searches and the pre-selection procedures produce similar outcomes, so although we cannot yet present a complete probability analysis of the former, it seems to behave almost as well in practice.

10.6 Monte Carlo evidence on PcGets

Considerable Monte Carlo evidence on the behaviour of earlier incarnations of PcGets is presented in Hendry and Krolzig (1999b) and Krolzig and Hendry (2000) for two different types of experimental design. Here we re-examine its performance on the Hoover and Perez (1999) experiments reported in Hendry and Krolzig (1999b). Table 10.1 records the DGPs in those experiments that did not involve variables with population t-statistics less than unity in absolute value. In all cases, $\varepsilon_t \sim \mathsf{IN}[0,1]$.

Table:10.1 Selected Hoover--Perez DGPs

1    $y_t = 130\varepsilon_t$
2    $y_t = 0.75y_{t-1} + 130\varepsilon_t$
2*   $y_t = 0.5y_{t-1} + 130\varepsilon_t$
7    $y_t = 0.75y_{t-1} + 1.33\Delta x_t - 0.975\Delta x_{t-1} + 9.73\varepsilon_t$

The GUM nested the DGP, with the addition of between 37 and 40 irrelevant variables, depending on the experiment. The basic PcGets setting used was the Liberal strategy, with some variation in the two pre-selection tests, namely the F-test on the GUM being the null model, and the F-test for the significance of the lag length. The outcomes are based on M=1000 replications of the DGP with a sample size of T=100.
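One replication of the three purely autoregressive DGPs in Table 10.1 can be drawn as sketched below. This is not the Hoover--Perez or PcGets code: the function name, the zero start value and the burn-in length are assumptions made here, and DGP 7 is omitted because the regressor series it uses is not specified in this section.

```python
import random

def simulate_ar1(T, rho, sigma, seed=0, burn_in=50):
    """Draw y_t = rho * y_{t-1} + sigma * e_t with e_t ~ IN[0,1]; rho = 0 gives DGP 1."""
    rng = random.Random(seed)
    y_prev = 0.0                    # assumed start value, washed out by the burn-in
    sample = []
    for t in range(T + burn_in):
        y_prev = rho * y_prev + sigma * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            sample.append(y_prev)   # keep only the last T observations
    return sample

hp1 = simulate_ar1(T=100, rho=0.0, sigma=130.0)
hp2 = simulate_ar1(T=100, rho=0.75, sigma=130.0)
hp2_star = simulate_ar1(T=100, rho=0.5, sigma=130.0)
```

A full experiment would repeat such draws M=1000 times with different seeds, add the irrelevant candidate regressors to form the GUM, and record which variables the selection procedure retains in each replication.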

Table:10.2 Simulation results for Hoover--Perez experiments

Conservative strategy
                        HP1      HP2      HP2*     HP7
T:DGPfound              1.0000   1.0000   0.9920   1.0000
S:DGPfound              0.8650   0.7890   0.7290   0.7880
S:NonDeletion           0.1350   0.2110   0.2290   0.2120
S:NonSelection          ------   0.0010   0.0850   0.0050
T:Dominated (0.025)     0.1310   0.1880   0.1780   0.1870
S:Dominated (0.025)     0.0040   0.0220   0.0620   0.0200
S:Size                  0.0062   0.0109   0.0125   0.0115
S:Power                 ------   0.9990   0.9150   0.9983
reliability based
S:Size                  0.0047   0.0073   0.0085   0.0075
S:Power                 ------   0.9990   0.9132   0.9973

Liberal strategy
                        HP1      HP2      HP2*     HP7
T:DGPfound              1.0000   1.0000   0.9990   1.0000
S:DGPfound              0.4590   0.3560   0.3570   0.3590
S:NonDeletion           0.5410   0.6440   0.6380   0.6410
S:NonSelection          ------   0.0000   0.0320   0.0030
T:Dominated (0.075)     0.5400   0.6360   0.6140   0.6340
S:Dominated (0.075)     0.0010   0.0080   0.0120   0.0050
S:Size                  0.0409   0.0556   0.0545   0.0542
S:Power                 ------   1.0000   0.9680   0.9990
reliability based
S:Size                  0.0348   0.0449   0.0436   0.0431
S:Power                 ------   1.0000   0.9660   0.9982

Conservative strategy: DGP variables fixed
                        HP2      HP2*     HP7
T:DGPfound              1.0000   0.9920   1.0000
S:DGPfound              0.8210   0.8280   0.8250
S:NonDeletion           0.1790   0.1720   0.1750
S:NonSelection          ------   ------   ------
T:Dominated (0.025)     0.1660   0.1600   0.1650
S:Dominated (0.025)     0.0130   0.0120   0.0100
S:Size                  0.0094   0.0093   0.0097
S:Power                 ------   ------   ------
reliability based
S:Size                  0.0065   0.0065   0.0067
S:Power                 1.0000   0.9870   0.9996

The probabilities of retaining the DGP when commencing from it, and from the GUM (T:DGPfound and S:DGPfound) are shown first: the former is always close to unity and the latter almost always above 75% for the range of experiments shown. The power of PcGets (the probability of retaining the variables that matter) is close to that of the DGP, and the size is usually less than 0.75%, even for the Liberal strategy -- although with 37+ irrelevant variables, the Conservative strategy would almost certainly do better.

Next the non-deletion and non-selection probabilities are shown: the latter is usually tiny, so the former is close to 1-S:DGPfound. Finally, T:Dominated and S:Dominated record the probabilities that the DGP or the selected model dominates (i.e., encompasses) the other: as can be seen, the former occurs quite often, between 13--20%, whereas the latter is usually under 5%.

Overall, these findings cohere with those reported earlier (for a different version and very different settings), and suggest that PcGets performs well even in a demanding problem, where the GUM is hugely overparameterized. The outcomes suggest that loose critical values be selected for pre-selection tests as suggested above, and that the considerations in section 5.10 are apposite.

References

Akaike, H. (1973). "Information theory and an extension of the maximum likelihood principle" In Petrov, B. N., and Csáki, F. (eds.), Second International Symposium on Information Theory. Budapest.

Akaike, H. (1985). "Prediction and entropy" In Atkinson, A. C., and Fienberg, S. E. (eds.), A Celebration of Statistics, pp. 1--24. New York: Springer-Verlag.

Akerlof, G. A. (1979). "Irving Fisher on his head: The consequences of constant target-threshold monitoring of money holdings" Quarterly Journal of Economics, 93, 169--188.

Amemiya, T. (1980). "Selection of regressors" International Economic Review, 21, 331--354.

Anderson, T. W. (1971). The Statistical Analysis of Time Series. New York: John Wiley & Sons.

Andrews, D. W. K. (1991). "Heteroskedasticity and autocorrelation consistent covariance matrix estimation" Econometrica, 59, 817--858.

Banerjee, A., Dolado, J. J., Galbraith, J. W., and Hendry, D. F. (1993). Co-integration, Error Correction and the Econometric Analysis of Non-Stationary Data. Oxford: Oxford University Press.

Bårdsen, G. (1989). "The estimation of long run coefficients from error correction models" Oxford Bulletin of Economics and Statistics, 50.

Bean, C. R. (1977). "More consumers' expenditure equations" Academic panel paper (77)35, H.M. Treasury, London.

Bean, C. R. (1978). "The determination of consumers' expenditure in the UK" Government economic service working paper 4, H.M. Treasury, London.

Bontemps, C., and Mizon, G. E. (2001). "Congruence and encompassing" In Stigum, B. (ed.), Studies in Economic Methodology. Cambridge, Mass.: MIT Press.

Boswijk, H. P. (1992). Cointegration, Identification and Exogeneity, Vol. 37 of Tinbergen Institute Research Series. Amsterdam: Thesis Publishers.

Bowman, K. O., and Shenton, L. R. (1975). "Omnibus test contours for departures from normality based on √b1 and b2" Biometrika, 62, 243--250.

Box, G. E. P., and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day. First published, 1970.

Box, G. E. P., and Pierce, D. A. (1970). "Distribution of residual autocorrelations in autoregressive-integrated moving average time series models" Journal of the American Statistical Association, 65, 1509--1526.

Breusch, T. S. (1990). "Simplified extreme bounds" in Granger 1990, pp. 72--81.

Brown, R. L., Durbin, J., and Evans, J. M. (1975). "Techniques for testing the constancy of regression relationships over time (with discussion)" Journal of the Royal Statistical Society B, 37, 149--192.

Burns, A. F., and Mitchell, W. C. (1946). Measuring Business Cycles. New York: NBER.

Campos, J., and Ericsson, N. R. (1999). "Constructive data mining: Modeling consumers' expenditure in Venezuela" Econometrics Journal, 2, 226--240.

Carruth, A., and Henley, A. (1990). "Can existing consumption functions forecast consumer spending in the late 1980s?" Oxford Bulletin of Economics and Statistics, 52, 211--222.

Chatfield, C. (1995). "Model uncertainty, data mining and statistical inference" Journal of the Royal Statistical Society, A, 158, 419--466. With discussion.

Chow, G. C. (1960). "Tests of equality between sets of coefficients in two linear regressions" Econometrica, 28, 591--605.

Chow, G. C. (1981). "Selection of econometric models by the information criteria" In Charatsis, E. G.(ed.), Proceedings of the Econometric Society European Meeting 1979, Ch. 8. Amsterdam: North-Holland.

Clayton, M. K., Geisser, S., and Jennings, D. E. (1986). "A comparison of several model selection procedures" In Goel, P., and Zellner, A.(eds.), Bayesian Inference and Decision Techniques: Elsevier Science.

Clements, M. P., and Hendry, D. F. (1998). Forecasting Economic Time Series. Cambridge: Cambridge University Press.

Clements, M. P., and Hendry, D. F. (1999a). Forecasting Non-stationary Economic Time Series. Cambridge, Mass.: MIT Press.

Clements, M. P., and Hendry, D. F. (1999b). "Modelling methodology and forecast failure" Unpublished typescript, Economics Department, University of Oxford.

Coen, P. G., Gomme, E. D., and Kendall, M. G. (1969). "Lagged relationships in economic forecasting" Journal of the Royal Statistical Society A, 132, 133--163.

Cox, D. R. (1961). "Tests of separate families of hypotheses" In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, pp. 105--123 Berkeley: University of California Press.

Cox, D. R. (1962). "Further results on tests of separate families of hypotheses" Journal of the Royal Statistical Society B, 24, 406--424.

Cran, G. W., Martin, K. J., and Thomas, G. E. (1977). "A remark on algorithms. AS 63: The incomplete beta integral. AS 64: Inverse of the incomplete beta function ratio" Applied Statistics, 26, 111--112.

D'Agostino, R. B. (1970). "Transformation to normality of the null distribution of g1" Biometrika, 57, 679--681.

Davidson, J. E. H., and Hendry, D. F. (1981). "Interpreting econometric evidence: The behaviour of consumers' expenditure in the UK" European Economic Review, 16, 177--192. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Davidson, J. E. H., Hendry, D. F., Srba, F., and Yeo, J. S. (1978). "Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom" Economic Journal, 88, 661--692. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Deaton, A. S. (1982). "Model selection procedures or, does the consumption function exist" In Chow, G. C., and Corsi, P.(eds.), Evaluating the Reliability of Macro-Economic Models, Ch. 5. New York: John Wiley.

Doan, T., Litterman, R., and Sims, C. A. (1984). "Forecasting and conditional projection using realistic prior distributions" Econometric Reviews, 3, 1--100.

Doornik, J. A. (1999). Object-Oriented Matrix Programming using Ox, 3rd ed. London: Timberlake Consultants Press.

Doornik, J. A., and Hansen, H. (1994). "A practical test for univariate and multivariate normality" Discussion paper, Nuffield College.

Doornik, J. A., and Hendry, D. F. (1996). GiveWin: An Interactive Empirical Modelling Program. London: Timberlake Consultants Press.

Doornik, J. A., and Hendry, D. F. (2001a). Econometric Modelling using PcGive 10, Volume II. London: Timberlake Consultants Press.

Doornik, J. A., and Hendry, D. F. (2001b). Interactive Monte Carlo Experimentation in Econometrics using PcNaive. London: Timberlake Consultants Press.

Engle, R. F. (1982a). "Autoregressive conditional heteroscedasticity, with estimates of the variance of United Kingdom inflation" Econometrica, 50, 987--1007.

Engle, R. F. (1982b). "Autoregressive conditional heteroskedasticity with estimates of the variance of UK inflation" Econometrica, 50, 987--1007.

Engle, R. F., Hendry, D. F., and Richard, J.-F. (1983). "Exogeneity" Econometrica, 51, 277--304. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; and in Ericsson, N. R. and Irons, J. S. (eds.) Testing Exogeneity, Oxford: Oxford University Press, 1994.

Engle, R. F., Hendry, D. F., and Trumbull, D. (1985). "Small sample properties of ARCH estimators and tests" Canadian Journal of Economics, 18, 66--93.

Engle, R. F., and White, H.(eds.)(1999). Cointegration, Causality and Forecasting. Oxford: Oxford University Press.

Ericsson, N. R. (1983). "Asymptotic properties of instrumental variables statistics for testing non-nested hypotheses" Review of Economic Studies, 50, 287--303.

Ericsson, N. R., Campos, J., and Tran, H.-A. (1990). "PC-GIVE and David Hendry's econometric methodology" Revista De Econometria, 10, 7--117.

Ericsson, N. R., Hendry, D. F., and Prestwich, K. M. (1998). "The demand for broad money in the United Kingdom, 1878--1993" Scandinavian Journal of Economics, 100, 289--324.

Ericsson, N. R., and Irons, J. S.(eds.)(1994). Testing Exogeneity. Oxford: Oxford University Press.

Faust, J., and Whiteman, C. H. (1997). "General-to-specific procedures for fitting a data-admissible, theory-inspired, congruent, parsimonious, encompassing, weakly-exogenous, identified, structural model of the DGP: A translation and critique" Carnegie--Rochester Conference Series on Public Policy, 47, 121--161.

Friedman, M., and Schwartz, A. J. (1982). Monetary Trends in the United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867--1975. Chicago: University of Chicago Press.

Frisch, R., and Waugh, F. V. (1933). "Partial time regression as compared with individual trends" Econometrica, 1, 221--223.

Gilbert, C. L. (1986). "Professor Hendry's econometric methodology" Oxford Bulletin of Economics and Statistics, 48, 283--307. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.

Godfrey, L. G. (1978). "Testing for higher order serial correlation in regression equations when the regressors include lagged dependent variables" Econometrica, 46, 1303--1313.

Godfrey, L. G., and Orme, C. D. (1994). "The sensitivity of some general checks to omitted variables in the linear model" International Economic Review, 35, 489--506.

Godfrey, L. G., and Veale, M. R. (1999). "Alternative approaches to testing by variable addition" Mimeo, York University, UK.

Gourieroux, C., and Monfort, A. (1995). "Testing, encompassing, and simulating dynamic econometric models" Econometric Theory, 11, 195--228.

Granger, C. W. J. (1969). "Investigating causal relations by econometric models and cross-spectral methods" Econometrica, 37, 424--438.

Granger, C. W. J.(ed.)(1990). Modelling Economic Series. Oxford: Clarendon Press.

Haavelmo, T. (1944). "The probability approach in econometrics" Econometrica, 12, 1--118. Supplement.

Hannan, E. J., and Quinn, B. G. (1979). "The determination of the order of an autoregression" Journal of the Royal Statistical Society, B, 41, 190--195.

Harris, R. I. D. (1995). Using Cointegration Analysis in Econometric Modelling. London: Prentice Hall.

Harvey, A. C. (1981). The Econometric Analysis of Time Series. Deddington: Philip Allan.

Harvey, A. C. (1990). The Econometric Analysis of Time Series, 2nd ed. Hemel Hempstead: Philip Allan.

Hendry, D. F. (1976). "The structure of simultaneous equations estimators" Journal of Econometrics, 4, 51--88. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Hendry, D. F. (1979). "Predictive failure and econometric modelling in macro-economics: The transactions demand for money" In Ormerod, P.(ed.), Economic Modelling, pp. 217--242. London: Heinemann. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Hendry, D. F. (1980). "Econometrics: Alchemy or science?" Economica, 47, 387--406. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.

Hendry, D. F. (1985). "Monetary economic myth and econometric reality" Oxford Review of Economic Policy, 1, 72--84. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Hendry, D. F. (1994). "HUS revisited" Oxford Review of Economic Policy, 10, 86--106.

Hendry, D. F. (1995a). Dynamic Econometrics. Oxford: Oxford University Press.

Hendry, D. F. (1995b). "Econometrics and business cycle empirics" Economic Journal, 105, 1622--1636.

Hendry, D. F. (1996). "On the constancy of time-series econometric equations" Economic and Social Review, 27, 401--422.

Hendry, D. F. (1997). "On congruent econometric relations: A comment" Carnegie--Rochester Conference Series on Public Policy, 47, 163--190.

Hendry, D. F. (1999). "An econometric analysis of US food expenditure, 1931--1989" in Magnus and Morgan 1999, pp. 341--361.

Hendry, D. F. (2000a). Econometrics: Alchemy or Science? Oxford: Oxford University Press. New Edition.

Hendry, D. F. (2000b). "Epilogue: The success of general-to-specific model selection" In Econometrics: Alchemy or Science?, pp. 467--490. Oxford: Oxford University Press. New Edition.

Hendry, D. F., and Doornik, J. A. (1994). "Modelling linear dynamic econometric systems" Scottish Journal of Political Economy, 41, 1--33.

Hendry, D. F., and Doornik, J. A. (1999). "The impact of computational tools on time-series econometrics" In Coppock, T.(ed.), Information Technology and Scholarship, pp. 257--269. Oxford: Oxford University Press.

Hendry, D. F., and Doornik, J. A. (2001). Econometric Modelling using PcGive 10: Volume I. London: Timberlake Consultants Press.

Hendry, D. F., and Ericsson, N. R. (1991a). "An econometric analysis of UK money demand in `Monetary Trends in the United States and the United Kingdom' by Milton Friedman and Anna J. Schwartz" American Economic Review, 81, 8--38.

Hendry, D. F., and Ericsson, N. R. (1991b). "Modeling the demand for narrow money in the United Kingdom and the United States" European Economic Review, 35, 833--886.

Hendry, D. F., and Krolzig, H.-M. (1999a). "General-to-specific model selection using PcGets for Ox" Unpublished paper, Economics Department, Oxford University.

Hendry, D. F., and Krolzig, H.-M. (1999b). "Improving on `Data mining reconsidered' by K.D. Hoover and S.J. Perez" Econometrics Journal, 2, 202--219.

Hendry, D. F., and Krolzig, H.-M. (2000). "The econometrics of general-to-simple modelling" Mimeo, Economics Department, Oxford University.

Hendry, D. F., Leamer, E. E., and Poirier, D. J. (1990). "A conversation on econometric methodology" Econometric Theory, 6, 171--261.

Hendry, D. F., and Mizon, G. E. (1978). "Serial correlation as a convenient simplification, not a nuisance: A comment on a study of the demand for money by the Bank of England" Economic Journal, 88, 549--563. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Hendry, D. F., and Mizon, G. E. (1990). "Procrustean econometrics: or stretching and squeezing data" in Granger 1990, pp. 121--136.

Hendry, D. F., and Mizon, G. E. (1993). "Evaluating dynamic econometric models by encompassing the VAR" In Phillips, P. C. B.(ed.), Models, Methods and Applications of Econometrics, pp. 272--300. Oxford: Basil Blackwell.

Hendry, D. F., and Mizon, G. E. (1999). "The pervasiveness of Granger causality in econometrics" in Engle and White 1999.

Hendry, D. F., and Mizon, G. E. (2000). "Reformulating empirical macro-econometric modelling" Oxford Review of Economic Policy, 16, 138--159.

Hendry, D. F., and Morgan, M. S. (1995). The Foundations of Econometric Analysis. Cambridge: Cambridge University Press.

Hendry, D. F., Muellbauer, J. N. J., and Murphy, T. A. (1990). "The econometrics of DHSY" In Hey, J. D., and Winch, D.(eds.), A Century of Economics, pp. 298--334. Oxford: Basil Blackwell.

Hendry, D. F., and Neale, A. J. (1987). "Monte Carlo experimentation using PC-NAIVE" In Fomby, T., and Rhodes, G. F.(eds.), Advances in Econometrics, Vol. 6, pp. 91--125. Greenwich, Connecticut: Jai Press Inc.

Hendry, D. F., and Richard, J.-F. (1982). "On the formulation of empirical models in dynamic econometrics" Journal of Econometrics, 20, 3--33. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press and in Hendry D. F., op. cit., (1993) and (2000).

Hendry, D. F., and Richard, J.-F. (1989). "Recent developments in the theory of encompassing" In Cornet, B., and Tulkens, H.(eds.), Contributions to Operations Research and Economics. The XXth Anniversary of CORE, pp. 393--440. Cambridge, MA: MIT Press.

Hendry, D. F., and von Ungern-Sternberg, T. (1981). "Liquidity and inflation effects on consumers' expenditure" In Deaton, A. S.(ed.), Essays in the Theory and Measurement of Consumers' Behaviour, pp. 237--261. Cambridge: Cambridge University Press. Reprinted in Hendry, D. F., op. cit., (1993) and (2000).

Hendry, D. F., and Wallis, K. F.(eds.)(1984). Econometrics and Quantitative Economics. Oxford: Basil Blackwell.

Hoover, K. D., and Perez, S. J. (1999). "Data mining reconsidered: Encompassing and the general-to-specific approach to specification search" Econometrics Journal, 2, 167--191.

Jarque, C. M., and Bera, A. K. (1980). "Efficient tests for normality, homoscedasticity and serial independence of regression residuals" Economics Letters, 6, 255--259.

Johansen, S. (1988). "Statistical analysis of cointegration vectors" Journal of Economic Dynamics and Control, 12, 231--254. Reprinted in R.F. Engle and C.W.J. Granger (eds), Long-Run Economic Relationships, Oxford: Oxford University Press, 1991, 131--52.

Johansen, S. (1992). "Testing weak exogeneity and the order of cointegration in UK money demand" Journal of Policy Modeling, 14, 313--334.

Judge, G. G., and Bock, M. E. (1978). The Statistical Implications of Pre-Test and Stein-Rule Estimators in Econometrics. Amsterdam: North Holland Publishing Company.

Judge, G. G., Griffiths, W. E., Hill, R. C., Lütkepohl, H., and Lee, T.-C. (1985). The Theory and Practice of Econometrics, 2nd ed. New York: John Wiley.

Kent, J. T. (1986). "The underlying nature of nonnested hypothesis tests" Biometrika, 73, 333--343.

Keynes, J. M. (1939). "Professor Tinbergen's method" Economic Journal, 49, 558--568.

Keynes, J. M. (1940). "Comment" Economic Journal, 50, 154--156.

Kiviet, J. F. (1985). "Model selection test procedures in a single linear equation of a dynamic simultaneous system and their defects in small samples" Journal of Econometrics, 28, 327--362.

Kiviet, J. F. (1986). "On the rigor of some mis-specification tests for modelling dynamic relationships" Review of Economic Studies, 53, 241--261.

Koopmans, T. C. (1947). "Measurement without theory" Review of Economics and Statistics, 29, 161--179.

Koopmans, T. C., Rubin, H., and Leipnik, R. B. (1950). "Measuring the equation systems of dynamic economics" In Koopmans, T. C.(ed.), Statistical Inference in Dynamic Economic Models, No. 10 in Cowles Commission Monograph, Ch. 2. New York: John Wiley & Sons.

Krolzig, H.-M. (2001). "General-to-specific reductions of vector autoregressive processes" Economics discussion paper 2000-W34, Nuffield College, Oxford.

Krolzig, H.-M., and Hendry, D. F. (2001). "Computer automation of general-to-specific model selection procedures" Journal of Economic Dynamics and Control, 25, 831--866.

Leamer, E. E. (1978). Specification Searches. Ad-Hoc Inference with Non-Experimental Data. New York: John Wiley.

Leamer, E. E. (1983a). "Let's take the con out of econometrics" American Economic Review, 73, 31--43. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.

Leamer, E. E. (1983b). "Model choice and specification analysis" In Griliches, Z., and Intriligator, M. D.(eds.), Handbook of Econometrics, Vol. 1, Ch. 5. Amsterdam: North-Holland.

Leamer, E. E. (1984). "Global sensitivity results for generalized least squares estimates" Journal of the American Statistical Association, 79, 867--870.

Leamer, E. E. (1990). "Sensitivity analyses would help" in Granger 1990, pp. 88--96.

Ljung, G. M., and Box, G. E. P. (1978). "On a measure of lack of fit in time series models" Biometrika, 65, 297--303.

Lovell, M. C. (1983). "Data mining" Review of Economics and Statistics, 65, 1--12.

Lütkepohl, H. (1991). Introduction to Multiple Time Series Analysis. Berlin: Springer.

Magnus, J. R., and Morgan, M. S.(eds.)(1999). Methodology and Tacit Knowledge: Two Experiments in Econometrics. Chichester: John Wiley and Sons.

Majumder, K. L., and Bhattacharjee, G. P. (1973a). "Algorithm AS 63. The incomplete beta integral" Applied Statistics, 22, 409--411.

Majumder, K. L., and Bhattacharjee, G. P. (1973b). "Algorithm AS 64. Inverse of the incomplete beta function ratio" Applied Statistics, 22, 411--414.

Mayo, D. (1981). "Testing statistical testing" In Pitt, J. C.(ed.), Philosophy in Economics, pp. 175--230: D. Reidel Publishing Co. Reprinted as pp. 45--73 in Caldwell B. J. (1993), The Philosophy and Methodology of Economics, Vol. 2, Aldershot: Edward Elgar.

McAleer, M., Pagan, A. R., and Volker, P. A. (1985). "What will take the con out of econometrics?" American Economic Review, 75, 293--301. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.

Mizon, G. E. (1977a). "Inferential procedures in nonlinear models: An application in a UK industrial cross section study of factor substitution and returns to scale" Econometrica, 45, 1221--1242.

Mizon, G. E. (1977b). "Model selection procedures" In Artis, M. J., and Nobay, A. R.(eds.), Studies in Modern Economic Analysis, pp. 97--120. Oxford: Basil Blackwell.

Mizon, G. E. (1984). "The encompassing approach in econometrics" in Hendry and Wallis 1984, pp. 135--172.

Mizon, G. E. (1995). "Progressive modelling of macroeconomic time series: the LSE methodology" In Hoover, K. D.(ed.), Macroeconometrics: Developments, Tensions and Prospects, pp. 107--169. Dordrecht: Kluwer Academic Press.

Mizon, G. E., and Richard, J.-F. (1986). "The encompassing principle and its application to non-nested hypothesis tests" Econometrica, 54, 657--678.

Moore, H. L. (1914). Economic Cycles -- Their Law and Cause. New York: MacMillan.

Muellbauer, J. N. J. (1994). "The assessment: Consumer expenditure" Oxford Review of Economic Policy, 10, 1--41.

Nicholls, D. F., and Pagan, A. R. (1983). "Heteroscedasticity in models with lagged dependent variables" Econometrica, 51, 1233--1242.

Pagan, A. R. (1984). "Model evaluation by variable addition" in Hendry and Wallis 1984, pp. 103--135.

Pagan, A. R. (1987). "Three econometric methodologies: A critical appraisal" Journal of Economic Surveys, 1, 3--24. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.

Paruolo, P. (1996). "On the determination of integration indices in I(2) systems" Journal of Econometrics, 72, 313--356.

Pesaran, M. H. (1974). "On the general problem of model selection" Review of Economic Studies, 41, 153--171.

Pesaran, M. H.(ed.)(1987). The Limits of Rational Expectations. Oxford: Basil Blackwell.

Pike, M. C., and Hill, I. D. (1966). "Logarithm of the gamma function" Communications of the ACM, 9, 684.

Rahbek, A., Kongsted, H. C., and Jørgensen, C. (1999). "Trend-stationarity in the I(2) cointegration model" Journal of Econometrics, 90, 265--289.

Sargan, J. D. (1964). "Wages and prices in the United Kingdom: A study in econometric methodology (with discussion)" In Hart, P. E., Mills, G., and Whitaker, J. K.(eds.), Econometric Analysis for National Economic Planning, Vol. 16 of Colston Papers, pp. 25--63. London: Butterworth Co. Reprinted as pp. 275--314 in Hendry D. F. and Wallis K. F. (eds.) (1984). Econometrics and Quantitative Economics. Oxford: Basil Blackwell, and as pp. 124--169 in Sargan J. D. (1988), Contributions to Econometrics, Vol. 1, Cambridge: Cambridge University Press.

Sargan, J. D. (1973). "Model building and data mining" Discussion paper, London School of Economics. Presented to the Association of University Teachers of Economics, Meeting, Manchester, April 1973.

Sargan, J. D. (1980). "Some tests of dynamic specification for a single equation" Econometrica, 48, 879--897. Reprinted as pp. 191--212 in Sargan J. D. (1988), Contributions to Econometrics, Vol. 1, Cambridge: Cambridge University Press.

Sargan, J. D. (1981). "The choice between sets of regressors" Mimeo, Economics Department, London School of Economics.

Savin, N. E. (1984). "Multiple hypothesis testing" In Griliches, Z., and Intriligator, M. D.(eds.), Handbook of Econometrics, Vol. 2--3, Ch. 14. Amsterdam: North-Holland.

Sawa, T. (1978). "Information criteria for discriminating among alternative regression models" Econometrica, 46, 1273--1292.

Schwarz, G. (1978). "Estimating the dimension of a model" Annals of Statistics, 6, 461--464.

Shea, B. L. (1988). "Algorithm AS 239: Chi-squared and incomplete gamma integral" Applied Statistics, 37, 466--473.

Shenton, L. R., and Bowman, K. O. (1977). "A bivariate model for the distribution of √b1 and b2" Journal of the American Statistical Association, 72, 206--211.

Shibata, R. (1980). "Asymptotically efficient selection of the order of the model for estimating parameters of a linear process" Annals of Statistics, 8, 147--164.

Sims, C. A. (1980). "Macroeconomics and reality" Econometrica, 48, 1--48. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.

Sims, C. A., Stock, J. H., and Watson, M. W. (1990). "Inference in linear time series models with some unit roots" Econometrica, 58, 113--144.

Smith, G. W. (1986). "A dynamic Baumol-Tobin model of money demand" Review of Economic Studies, 53, 465--469.

Spanos, A. (1989). "On re-reading Haavelmo: A retrospective view of econometric modeling" Econometric Theory, 5, 405--429.

Sullivan, R., Timmermann, A., and White, H. (1998). "Dangers of data-driven inference: The case of calendar effects in stock returns" Mimeo, Economics Department, University of California at San Diego.

Summers, L. H. (1991). "The scientific illusion in empirical macroeconomics" Scandinavian Journal of Economics, 93, 129--148.

Teräsvirta, T. (1976). "Effect of feedback on the distribution of the portmanteau statistic" Manuscript, London School of Economics.

Theil, H. (1971). Principles of Econometrics. London: John Wiley.

Tinbergen, J. (1940a). Statistical Testing of Business-Cycle Theories. Geneva: League of Nations. Vol. I: A Method and its application to Investment Activity.

Tinbergen, J. (1940b). Statistical Testing of Business-Cycle Theories. Geneva: League of Nations. Vol. II: Business Cycles in the United States of America, 1919--1932.

Vuong, Q. H. (1989). "Likelihood ratio tests for model selection and nonnested hypotheses" Econometrica, 57, 307--333.

White, H. (1980). "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity" Econometrica, 48, 817--838.

White, H. (1984). Asymptotic Theory for Econometricians. London: Academic Press.

White, H. (1990). "A consistent model selection procedure based on m-testing" in Granger 1990, pp. 369--383.

Wooldridge, J. M. (1999). "Asymptotic properties of some specification tests in linear models with integrated processes" in Engle and White 1999, pp. 366--384.

Wright, P. G. (1915). "Moore's economic cycles" Quarterly Journal of Economics, 29, 631--641.

Yancey, T. A., and Judge, G. G. (1976). "A Monte Carlo comparison of traditional and Stein-rule estimators under squared error loss" Journal of Econometrics, 4, 285--294.