Most economic modelers using the GTAP database will want to build their own model, making decisions about the structure of technology, preferences and policy parameters. The tools provided here are intended simply to facilitate the use of GTAP data. A modeler would typically run these programs once to produce a dataset as part of a modeling exercise.
All of the dataset aggregation and recalibration tools provided here are packaged in DOS batch files. The command files include:
Unpack the GTAP V4 distribution data into GAMS-readable format, and generate a filtered version of the full dataset suitable for large-scale computation. The "filtering" step rounds all values in the dataset to the nearest $100,000 and all tax rates to the nearest percent.
Aggregate a larger GTAP dataset into a smaller GTAP dataset.
Generate a new dataset by imposing an exogenous set of tax rates on an existing GTAP dataset. This permits adjustment of tariffs, export taxes, sales taxes and factor taxes.
Read a dataset and check benchmark consistency, producing an echo-print of base year GDP and trade shares.
A utility routine to move GTAPinGAMS datasets between GAMS ZIP and GEMPACK header-array formats.
This program is typically run once to generate a GAMS-readable dataset from the original GEMPACK distribution file GSDDAT.HAR. It begins by translating the full GTAP dataset into GAMS-readable format (GTAPV4.ZIP), using the GEMPACK utility SEEHAR.EXE, a small Fortran program REWRITE.EXE and a GAMS program SCALE.GMS. The last of these scales trade and production data from billions of dollars to tens of billions of dollars.
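To make the rescaling concrete, here is a minimal sketch of what it amounts to, assuming a placeholder value array named VOM; SCALE.GMS applies the same operation to every value array in the translated dataset:

SET  I  goods   / wht, mnfcs /
     R  regions / usa, row /;

PARAMETER  VOM(I,R)  value of output (billions of dollars);

VOM(I,R) = 100;          * values come from the translated GTAP arrays

* Convert from billions of dollars to tens of billions of dollars:
VOM(I,R) = VOM(I,R) / 10;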
The next step in the translation is to "filter" the GTAPV4 dataset, removing very small coefficients, extreme tax rates and various other inconsistencies. The default filter tolerance is 0.001 (one tenth of one percent), defined in FILTER.GMS; I use this tolerance to name the filtered dataset GTAP4001. When using GTAP version 4 data, I would normally aggregate from the GTAP4001 dataset as a source. The filtering process improves numerical robustness in large-scale models while introducing only very small changes in the results. If you are working with a highly aggregated model, however, it should be possible to aggregate directly from the unfiltered dataset GTAPV4.
Specific steps in this program are as follows:
Eliminate any imports of a good into a region where the total value of imports is less than TOLERANCE times the combined value of import and domestic demand.
Define the MAGNITUDE of a trade flow as the maximum of two ratios: the trade flow net of tax relative to the associated aggregate export level, and the trade flow gross of tax relative to the associated aggregate import level (a sketch of this test follows the list).
Drop all trade flows which have MAGNITUDE less than TOLERANCE.
Rescale remaining trade flows to maintain consistent values of aggregate imports and aggregate transport cost.
Define the MAGNITUDE of an intermediate input as the maximum of two ratios: the input value gross of tax relative to total cost, and the input value net of tax relative to total domestic supply.
Define the MAGNITUDE of a factor input as the ratio of the factor payment gross of tax to the value of output gross of tax.
Drop all intermediate inputs and factor inputs which have MAGNITUDE less than TOLERANCE.
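The following sketch illustrates the MAGNITUDE test for bilateral trade flows. The parameter names (VXMD for the flow net of tax, VIMS for the flow gross of tax, VXM and VIM for the export and import aggregates) are placeholders rather than the names actually used in FILTER.GMS:

SET  I  goods   / col, mnfcs /
     R  regions / usa, eur, row /;
ALIAS (R,S);

SCALAR     TOLERANCE  filter tolerance / 0.001 /;

PARAMETER  VXMD(I,R,S)       bilateral flow net of tax (placeholder)
           VIMS(I,R,S)       bilateral flow gross of tax (placeholder)
           VXM(I,R)          aggregate exports
           VIM(I,S)          aggregate imports
           MAGNITUDE(I,R,S)  relative size of each bilateral flow;

* In the actual program these values are read from the GTAP dataset:
VXMD(I,R,S)$(NOT SAMEAS(R,S)) = UNIFORM(0,1);
VIMS(I,R,S) = 1.1 * VXMD(I,R,S);
VXM(I,R) = SUM(S, VXMD(I,R,S));
VIM(I,S) = SUM(R, VIMS(I,R,S));

* MAGNITUDE is the larger of the flow relative to aggregate exports
* and the flow relative to aggregate imports:
MAGNITUDE(I,R,S)$VXM(I,R) = VXMD(I,R,S)/VXM(I,R);
MAGNITUDE(I,R,S)$VIM(I,S) = MAX(MAGNITUDE(I,R,S), VIMS(I,R,S)/VIM(I,S));

* Drop flows which are negligible relative to both aggregates:
VXMD(I,R,S)$(MAGNITUDE(I,R,S) LT TOLERANCE) = 0;
VIMS(I,R,S)$(MAGNITUDE(I,R,S) LT TOLERANCE) = 0;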
Even with a very small TOLERANCE (0.1%), the filtering just described generates a substantial reduction in the number of nonzeros:
PARAMETER DENSITY  summary of changes in matrix density

            BEFORE     AFTER
  TRADE     53.074    43.357
  PROD      81.942    46.824
Finally, we do the same thing with final demand (private and public), filtering both imports and domestic demand. We also filter inputs to the international transport activity. This removes all tiny coefficients from the dataset.
The foregoing assignments represent a large number of small changes to the model data, and it is certain that we have introduced some inconsistencies which show up as violations of the profit and market clearance conditions defined in chkeq. For this reason, at this point we use a modified least-squares procedure to restore consistency, holding the international trade matrices fixed and recalibrating each of the regional economic flows.
This is the step where it is very helpful to use a complementarity formulation and the PATH solver, as the solution is extremely difficult to obtain with MINOS or CONOPT due to the large number of accumulated superbasics. I have included model definitions for an equivalent nonlinear programming approach, but this is not a standard feature because I have found the NLP codes to be somewhat unreliable. If you own an NLP solver but do not have PATH, it will be necessary to convert the SOLVE statements from MCP to NLP. If this proves difficult, contact GTAP and we can arrange for you to get a copy of GTAP4001.ZIP, the filtered dataset.
Ferris and Rutherford [1998] present details of how the constraints and objective function are set up; these are interesting but not essential to understanding the program. The key point is that at this stage some of the base year value flows are changed to reinstate equilibrium, holding all tax rates fixed.
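To give a flavor of the recalibration step (this is not the actual code, which as noted above is formulated as a complementarity problem and solved with PATH), the following self-contained NLP sketch adjusts a small matrix of benchmark flows as little as possible, in a least-squares sense, subject to a set of consistency conditions. All names here (X0, TARGET, LSQ, DEV) are hypothetical:

SET I / 1*3 /;
ALIAS (I,J);

PARAMETER  X0(I,J)    benchmark flows (with small inconsistencies)
           TARGET(I)  consistent row totals;

X0(I,J)   = UNIFORM(1,2);
TARGET(I) = 1.01 * SUM(J, X0(I,J));

VARIABLES           DEV     sum of squared deviations;
POSITIVE VARIABLES  X(I,J)  recalibrated flows;

EQUATIONS  OBJDEF     defines the least-squares objective
           ROWSUM(I)  consistency conditions on row totals;

OBJDEF..     DEV =E= SUM((I,J), SQR(X(I,J) - X0(I,J)));
ROWSUM(I)..  SUM(J, X(I,J)) =E= TARGET(I);

X.L(I,J) = X0(I,J);

MODEL LSQ / OBJDEF, ROWSUM /;
SOLVE LSQ MINIMIZING DEV USING NLP;

In a complementarity formulation, the first-order conditions of a problem of this kind would be posed directly as an MCP and handed to PATH.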
For energy-related analysis, I find it helpful to maintain a process-oriented representation of the oil sector. For this purpose, I have included code which routes all crude oil flows in each region through the refined oil sector. This involves some careful programming to ensure that tax payments and all base year transactions remain unchanged.
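A heavily simplified sketch of the idea follows, ignoring taxes and imported inputs; the parameter names (VDFM for domestic intermediate demand, VOM for the value of output) are placeholders. Crude oil purchased directly by sectors other than refining is rerouted through the refined oil sector, whose output expands by the same value, and the downstream sectors purchase refined oil instead:

SET  I  sectors / cru, oil, ele /
     R  regions / usa, row /;
ALIAS (I,J);

PARAMETER  VDFM(I,J,R)   domestic intermediate demand (placeholder)
           VOM(J,R)      value of domestic output (placeholder)
           REROUTE(J,R)  crude oil rerouted through refining;

* In the actual program these values are read from the GTAP dataset:
VDFM(I,J,R) = 1;
VOM(J,R)    = SUM(I, VDFM(I,J,R)) + 1;

REROUTE(J,R)$(NOT SAMEAS(J,"oil")) = VDFM("cru",J,R);

* Sectors which purchased crude oil directly now buy refined oil:
VDFM("oil",J,R) = VDFM("oil",J,R) + REROUTE(J,R);
VDFM("cru",J,R)$(NOT SAMEAS(J,"oil")) = 0;

* The refining sector absorbs the crude, and its output expands by the
* same value, preserving zero profit and market clearance:
VDFM("cru","oil",R) = VDFM("cru","oil",R) + SUM(J, REROUTE(J,R));
VOM("oil",R)        = VOM("oil",R)        + SUM(J, REROUTE(J,R));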
Inputs:    ..\gtapdata\gsddat.har
           ..\defines\gtapv4.set
           ..\defines\gtap4001.set
Outputs:   ..\data\gtapv4.zip
           ..\data\gtap4001.zip
Once you have built the initial GTAPinGAMS dataset GTAP4001 (or GTAPV4), you can begin to think about a particular application and which aggregations of the original GTAP data would be appropriate for studying those issues. I typically create two aggregations for any new model, one with a minimal number of regions and commodities and another with a larger number of dimensions. I use the small aggregation for model development and bring out the larger dataset whenever I am confident that the model is running reliably and producing sensible results.
The GTAPAGGR.BAT program is used to aggregate a GTAPinGAMS dataset. A command line argument defines the name of the target aggregation. You only need to provide the batch file with the target because the target's mapping file defines the source. Before running GTAPAGGR.BAT, you must create two files, one defining the sets of commodities, regions and primary factors in the target dataset, and another defining the name of the source dataset and a correspondence between elements of the source and target. The aggregation routine produces a brief report of GDP and trade shares in the new dataset. This is written to a file in the build directory.
Inputs:    Command line argument: target
           ..\defines\%target%.set
           ..\defines\%target%.map   (defines source)
           ..\data\%source%.zip
Outputs:   ..\data\%target%.zip
           ..\build\%target%.ech
The SET and MAP files for a new dataset are GAMS-readable files located in the defines subdirectory.
Table 20 presents a sample set file defining the dataset DOEMACRO. The file defines the sets of goods, regions, and primary factors which appear in the model. Commodity CGD, the investment-savings composite, must be included in every aggregation:
$TITLE  An Aggregation of the DOE Dataset

SET I  Sectors /
        Y       Aggregate output
        COL     Coal
        OIL     Petroleum and coal products (refined)
        CRU     Crude oil
        GAS     Natural gas
        ELE     Electricity
        CGD     Savings good /;

SET R  Aggregated Regions /
        USA     United States
        JPN     Japan
        EUR     Europe
        OOE     Other OECD
        CHN     China
        FSU     Former Soviet Union
        CEA     Central European Associates
        ASI     Other Asia
        MPC     Mexico plus OPEC
        ROW     Other countries /;

SET F  Factors of production /
        LAB     Labor,
        CAP     Capital /;
Table 21 presents the associated mapping file, DOEMACRO.MAP. The file provides a definition of the source dataset together with mapping definitions for commodities and factors. When no mapping is defined for the set of regions, the aggregation routine retains the same set as in the source data.
$SETGLOBAL source doe

* -------------------------------------------------------------------
* The target dataset has fewer sectors, so we need to specify how
* each sector in the source dataset is mapped to a sector in the
* target dataset:

SET MAPI  Sectors /
        MTL.Y           Metals-related industry (IRONSTL & NONFERR)
        EIS.Y           Other energy intensive (CHEMICAL & PAPERPRO)
        MFR.Y           Other manufactures
        SER.Y           Other Services
        COL.COL         Coal
        OIL.OIL         Petroleum and coal products (refined)
        CRU.CRU         Crude oil
        GAS.GAS         Natural gas
        ELE.ELE         Electricity
        CGD.CGD         Savings good /;

* The following statements illustrate how to aggregate factors of
* production in the model.  Unlike the aggregation of sectors or
* regions, you need to declare the set of primary factors in the
* source as set FF; then you can specify the mapping from the source
* to the target sets.  The reason for this special treatment is to
* permit the aggregation program to operate with both GTAP version 4
* and GTAP version 3 data.  Sorry for the inconvenience!  -- TFR

SET FF  / LND, SKL, LAB, CAP, RES /;

SET MAPF  mapping of primary factors
        / LND.CAP, SKL.LAB, LAB.LAB, CAP.CAP, RES.CAP /;

* NB: There is no need to specify a MAPR set when generating DOEMACRO
* from the DOE dataset.  Omitting MAPR implies that the source and
* target datasets have identical sets of regions, and the aggregation
* routine will automatically assign a one-to-one mapping from the
* source to the target regions.
Here are a couple of exercises which could help a new user learn about the error messages returned by GTAPAGGR: (i) Comment out the line with the MFR.Y mapping and run GTAPAGGR; you will get an error message indicating that MFR has not been mapped. (ii) Change the COL.COL mapping to COL.OIL and run GTAPAGGR; you will get an error message indicating that sector COL in the target dataset has no source sector mapped to it.
This program is used principally to create a new dataset by imposing a new set of benchmark tax rates on an existing GTAP dataset. Two command line arguments define the target and source datasets. The source dataset must be in the DATA subdirectory, and a file defining benchmark tax rates for the target dataset is provided in the DEFINES subdirectory (see Table 22). This program also generates a summary echo-print of trade and GDP shares for the new dataset and places this file in the BUILD subdirectory.
When you write the definitions file for adjusting tax rates, bear in mind that a gross basis tax (TY) is defined as a percentage of the gross-of-tax price, hence these tax rates have a maximum value of 100% and no lower bound. A net basis tax, such as TF, TP, TG, TX or TM, is defined as a percentage of the net-of-tax price, hence these tax rates have no upper bound and a minimum value of -100%.
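As a small worked example of the relationship between the two bases (the scalar names TN and TG are placeholders):

SCALAR  TN  a tax rate on a net-of-tax basis     / 1.0 /
        TG  the equivalent gross-basis tax rate;

TG = TN / (1 + TN);
DISPLAY TG;

* TG = 0.5: a 100% tax measured against the net-of-tax price is the
* same wedge as a 50% tax measured against the gross-of-tax price,
* consistent with the bounds described above.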
* Set up a benchmark equilibrium in which we eliminate all domestic taxes:
ty(i,r)   = 0;
tp(i,r)   = 0;
tf(f,i,r) = 0;
tg(i,r)   = 0;
ti(i,j,r) = 0;
Inputs:    Command line arguments: source target
           defines\%source%.set
           defines\%source%.map
           defines\%target%.def
           data\%source%.zip
Outputs:   defines\%target%.set
           defines\%target%.map
           data\%target%.zip
           build\%target%.ech
This utility routine reads a GTAP dataset in GAMS-ZIP format and writes the data in a self-extracting compressed header array format.
Inputs:    Command line argument: dataset
           data\%dataset%.zip
Outputs:   data\%dataset%.har