Documentation of DaTrAMo (Data Transfer- and Aggregation Module)
1. Introduction
DaTrAMo for Microsoft Excel is a solution that makes it easy to transfer or aggregate
data from one worksheet (or workbook) to another. It is based on Visual Basic for
Applications (VBA for Excel).
Data are usually transferred by copy and paste or by linking to another Excel workbook.
This approach is error-prone because even slight changes in the structure of the source
workbook can have far-reaching consequences for existing links. DaTrAMo avoids these
mistakes by assigning each data point of a time series uniquely, regardless of the
structure of the source workbook. To use this solution, the source workbook (in most
cases a database) must be organized as follows: the first row and the first column of a
source sheet must contain the identifiers used to assign the time series.
Time series in the source workbook can be organized in columns or in rows. Because
worksheets up to Excel 2003 are restricted to 256 columns, the time series in large
databases should be arranged in rows, so that the identifiers of the time series are in
the first column and the time identifiers are in the first row. In figure 1, for example,
the value of finalized construction works (C1n) in the first quarter of 2010 (q1.10) is
uniquely identified. The cell address (e.g., D9) is no longer needed to identify a
particular data point.
Figure 1
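The idea of identifying a value by its labels instead of its cell address can be illustrated outside Excel. The following Python sketch is not DaTrAMo's VBA code; the sheet contents, the numbers and the helper name `lookup` are hypothetical. It assumes a sheet held as a list of rows, with series identifiers in the first column and time identifiers in the first row.

```python
# Hypothetical source sheet: the first row holds time identifiers,
# the first column holds series identifiers. All numbers are made up.
sheet = [
    ["",    "q4.09", "q1.10", "q2.10"],
    ["C1n", 101.0,   98.5,    103.2],
    ["C1r", 95.0,    94.1,    97.8],
]

def lookup(rows, series, time_id):
    """Return the value labelled by (series, time_id), wherever it sits."""
    col = rows[0].index(time_id)      # column of the time identifier
    for row in rows[1:]:
        if row[0] == series:          # row of the requested series
            return row[col]
    raise KeyError(series)

value = lookup(sheet, "C1n", "q1.10")

# Inserting a new series row shifts the cell addresses but not the result.
shifted = [sheet[0], ["B0x", 1.0, 2.0, 3.0]] + sheet[1:]
same_value = lookup(shifted, "C1n", "q1.10")
```

Because the lookup depends only on the two labels, a structural change in the source (here an extra row) leaves the fetched value unchanged, which is the property the text describes.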
To transfer data from a source workbook, the target workbook must contain the VBA
solution. The complete program code is available as a template. The sheet “Options”
holds the minimum information required (name of the source workbook, path where it is
stored, and file suffix). It also offers a set of further options for individual
adjustment (e.g., formatting data, creating a list of names).
You can fetch the data into any sheet with a simple function that needs only the source
workbook name as its argument. The time identifiers and the variable names in the first
column or the first row of the target worksheet are compared with the corresponding
identifiers in the source file.
The function =DbF("Database name") fetches a value of a time series from the source
workbook and arranges it in columns in the worksheet (see figure 2). The name of the
source workbook must always be enclosed in quotation marks.
Figure 2
Besides their use in working files, the custom database functions can also be used to
build databases from different sources. If data are delivered at regular intervals, e.g.,
when new monthly data become available, the transfer of the new data into a database can
be automated with the custom database functions.
For DaTrAMo to work smoothly, the following settings must be enabled:
Excel => Options => Trust Center => Trust Center Settings
ActiveX Settings => enable all controls without restrictions and without prompting
Macro Settings => enable all macros
For Excel 2007 and later, files with the suffix .xlsb are recommended.
2. The worksheet “Options”
Figure 3
The template (db_fct_2012Mne_sr1.xlsb) provides the sheet “Options”. In this worksheet
you first have to specify the source workbooks (see figure 3). For technical reasons in
VBA, you must list at least one source workbook. You can identify source workbooks by
clicking cell B8 “Select DB”, or you can select the sources directly in column B from a
list. You can open or close the chosen source workbook(s) by clicking cell C8 “Open
selected databases” or cell G8 “Close db”. In addition, the data fetched from a given
source can be shown in a particular colour or font style. For example, data from a
monthly database can be formatted in green and data from a quarterly database in blue.
DaTrAMo can also count the number of values read in from the different source workbooks.
You can change the group levels for all worksheets in the workbook with a single click.
2.1. Selection of source workbooks (mostly databases)
In worksheet “Options” (column K), you find all potential source workbooks (figure 4).
Figure 4
In this example, you see a workbook for different labour force data (LFS_db) together
with a workbook for employment data (employment_db) and some others.
Now you can select the desired workbooks in column B of the worksheet “Options”
(figure 5). You have to enter the path where the workbook is stored manually in
column C. Alternatively, you can select workbooks together with path and suffix by
clicking cell B8 “Select DB”. In this case, a dialog box appears in which you can choose
the workbook. This works only from row 12 onwards.
Figure 5
The program uses this information, together with the suffix from column I, in all
functions. To avoid error messages, make sure in particular that the path information
for the databases is entered carefully. If you are working on a network, it makes sense
to store source databases centrally in a common folder, e.g., D:\DB\mne12.
In the next column, D, you can select which source workbooks should be opened
automatically with cell C8 "Open selected databases". The source workbooks must be
open for the functions to be calculated. In column G, you can specify individually which
workbooks should be closed with one click. In this case, the source files are not saved.
Use this procedure with care when the source is not a pure database. If you read data
from other working files, you should not close those files automatically.
2.2. Formatting
You can use colours and font styles to mark, for example, different data sources. In
column L, you can make a list of the desired colours (figure 6). So that Excel's internal
colour codes can be used automatically, the cell containing the written colour name must
itself be formatted in that colour. When you click cell L8 "Change list of Colours", the
internal colour codes appear in column M. In column E, a colour can be assigned to each
source workbook as required. In addition, a font style can be selected in column F for
each source file. After selecting colour and font style for each database, click cells
E10 and F10 to initialize the formats. To apply different colours, the names of the
sheets must be listed in the range C41 to C99 (see item 2.3 for further details).
Figure 6
2.3. Value copy
If you want to convert the workbook into a value copy, e.g., as a backup copy of certain
data, you can do this automatically. This function is also helpful for building a new
database by combining different sources. Other users can use this value copy as a
database. The procedure for making a value copy is located in the lower part of the sheet
(figure 7). In column K “Complete list of worksheets”, you can generate a list of the
sheets available in this workbook by pressing the button “New ws-list”.
Figure 7
A further feature converts cell entries that contain custom database functions into
values for selected sheets in the workbook. The desired sheets can be selected in
column B (figure 8). To obtain a new numbering and new entries in columns E and G, you
have to initialize the selection after choosing the sheets by clicking cell A39
"Initialize". There are two ways to generate value copies: either all functions in the
whole file are converted into values (click the range B39:D39), or only the results of
the custom database functions in a selection of sheets are converted into values (click
cell F39).
Figure 8
Before the complete conversion is executed, a confirmation prompt asks whether you
really want to convert the functions into values in all worksheets (figure 9). A
temporary value copy of the file is then stored on the hard disk. After this, the program
asks whether the macros and the option sheet should also be removed (figure 10). Finally,
a prompt asks whether the value copy should be closed (figure 11). The value copy is
always stored in a new file with the name affix "_V".
Figure 9
Figure 10
Figure 11
If you want to convert only the results of the database functions into values for one or all
sheets in the file, you can select the corresponding sheets by the entry “Yes” (or “No”) in
column E (see figure 12).
Figure 12
This conversion into values takes place without automatically saving a new file. When
you click the cell "Get values" and confirm the prompt (see figure 13), the conversion
starts immediately. The results of the database functions are then converted into values.
This procedure is irreversible: all database functions contained in the marked sheets
are lost!
Figure 13
2.4. Other features
The worksheet “Options” offers some other useful features.
For example, you can show or hide the grouping levels for all worksheets
simultaneously (see figure 14).
You can also switch the display of the grouping symbols on or off
to enlarge the visible screen area (cell Q11 “Display On/Off”).
Figure 14
To improve calculation performance, it is advisable to switch off “Automatic calculation”
(“Tools” and “Calculating”). You can then run the calculations manually by pressing F9,
for selected sheets only by activating cell Q1 in the option sheet, or for everything
with CTRL + ALT + F9.
It is also possible to remove all assigned names in the
working file at once (button "Delete names",
see figure 15).
Figure 15
3. Type of functions
Two basic function types are distinguished: functions for simple data transfer, e.g.,
from databases or other working files, and functions for data transfer with the option
of aggregation, e.g., from monthly data to quarterly or annual data. The functions can
take several arguments, but only the name of the source file is strictly required; all
other arguments are optional. In the English version, you separate the arguments with
a comma, in the German version with a semicolon. You can enter the database functions
and arguments in lower case or upper case.
3.1. Simple data transfer
Time series in the working files can be shown in columns or in rows. You can use the
function “=dbf (…)” without specifying the position of the time series (the system finds
the position itself).
Alternatively, you can select the function that suits the layout (this was the solution
in earlier releases). If the time series names are in the column head, the function
“=dbs (…)” can be used; if instead they are in the row head, use the function
“=dbz (…)”.
In the simplest case, you only have to enter the name of the source file, in quotation
marks, as the first function argument (see figure 16 for an example):
=DbF("database name") or =DbS("database name") or =DbZ("database name").
The function uses the time identifier from
the first column (or row) and the time series
name from the first row (or column) to select
the matching value from the source file. In
this simple case, the sheet name in the
source file must match the name of the
source file exactly.
Figure 16
So that the aggregation functions can also be used, the time identifiers should be
defined in the following formats:
Annual data with “a.”, e.g., a.11 for the data point of the year 2011
Semi-annual data with “h1.” or “h2.” and the year (e.g., h1.11 or h2.11)
Quarterly data with the keys “q1.” to “q4.” and the year (e.g., q1.11)
Monthly data with the keys “m1.” to “m12.” and the year (e.g., m1.11)
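The identifier formats listed above can be checked mechanically. The following Python sketch is only an illustration of the naming scheme, not part of DaTrAMo; the function name `parse_time_id` and the choice to keep the two-digit year as text are assumptions made for this example.

```python
import re

# One key letter (a, h, q or m), an optional period number, a dot and a
# two-digit year, as in the examples a.11, h1.11, q1.11 and m1.11.
_TIME_ID = re.compile(r"([ahqm])(\d{0,2})\.(\d{2})")
_MAX_PERIOD = {"h": 2, "q": 4, "m": 12}

def parse_time_id(text):
    """Split an identifier such as 'q1.11' into (key, period, year)."""
    match = _TIME_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a time identifier: {text!r}")
    key, period, year = match.groups()
    if key == "a":
        if period:                       # annual keys carry no period number
            raise ValueError(f"unexpected period number: {text!r}")
        return key, None, year
    if not period or not 1 <= int(period) <= _MAX_PERIOD[key]:
        raise ValueError(f"period out of range: {text!r}")
    return key, int(period), year

annual = parse_time_id("a.11")
quarter = parse_time_id("q1.11")
month = parse_time_id("m12.11")
```

The year is returned as the two characters found in the identifier, because the text does not say how a two-digit year maps to a century.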
To link additional information, you can use entries that start with “i.” (item) in the
same way as time identifiers. This is useful when you need information about the
variables, e.g., the complete name of the variable or the name of the data supplier.
If the source file contains several sheets, time series from sheets whose name differs
from the file name can also be selected and transferred. In that case, the sheet name
must be given as well (second argument):
=DbF("database name", "sheet name") or =DbS("database name", "sheet name") or =DbZ("database name", "sheet name").
In the usual case, the time series names are in the first column, and the third argument
of the function is then “1”. This is the default setting, so the argument is optional. If
the time series in the source file are in columns, i.e., the time series names are in the
first row (the column head) and the time identifiers in the first column, "2" must be
given as the third function argument:
=DbF("database name", "sheet name", 2) or =DbS("database name", "sheet name", 2) or =DbZ("database name", "sheet name", 2).
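The effect of the third argument can be pictured with a small Python sketch. It is not DaTrAMo's implementation: the helper `fetch`, the parameter name `orientation` and the sample data are hypothetical, and the sketch assumes that 1 means series names in the first column (time identifiers in the first row) and 2 means the transposed layout, following the layout described in the introduction.

```python
def fetch(rows, series, time_id, orientation=1):
    """Look up one value by its two labels.

    orientation=1: series names in the first column, time ids in the first row.
    orientation=2: series names in the first row, time ids in the first column.
    """
    if orientation == 1:
        row_key, col_key = series, time_id
    elif orientation == 2:
        row_key, col_key = time_id, series
    else:
        raise ValueError("orientation must be 1 or 2")
    col = rows[0].index(col_key)          # label searched in the first row
    for row in rows[1:]:
        if row[0] == row_key:             # label searched in the first column
            return row[col]
    raise KeyError(row_key)

# The same made-up data in both layouts.
series_in_rows = [
    ["",    "q1.10", "q2.10"],
    ["C1n", 98.5,    103.2],
]
series_in_columns = [
    ["",      "C1n"],
    ["q1.10", 98.5],
    ["q2.10", 103.2],
]

from_rows = fetch(series_in_rows, "C1n", "q2.10")
from_columns = fetch(series_in_columns, "C1n", "q2.10", orientation=2)
```

Only the roles of the two labels are swapped; the lookup itself stays the same, which is why a single optional argument is enough to cover both layouts.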
If you want to transfer values from different time series (variables) into one row
(column), you have to define the corresponding identifiers in the same column (row). You
then also have to enter the row number as the fourth argument and/or the column number
as the fifth argument of the value to be read in (time identifier and/or variable name).
=DbF("database name", "sheet name", , #, #) or =DbS("database name", "sheet name", , #, #) or =DbZ("database name", "sheet name", , #, #).
To calculate, e.g., a deflator by dividing the nominal values by the real values, it
makes sense to enter the variable name of the nominal values (G1 in figure 17) and, in
the second row, the variable name of the real values (G2 in figure 17). The formula
= DbF( "DBJ_Mne" ) / DbF( "DBJ_Mne" ; ; ; 2 ) * 100 - 100
divides the nominal values by the real values. The second function in this example has
as its fourth argument the row from which the variable name is taken.
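As a worked check of the arithmetic in the formula above, the following Python lines apply the same expression to two made-up numbers. The function name `deflator_change` and the input values are hypothetical; only the expression itself is taken from the formula.

```python
def deflator_change(nominal, real):
    """Nominal value over real value, scaled to 100, minus 100."""
    return nominal / real * 100 - 100

# Made-up values: a nominal value of 120 against a real value of 96.
result = deflator_change(120.0, 96.0)   # 120 / 96 = 1.25, so 125 - 100
```

With these inputs the expression yields 25, i.e. the nominal value lies 25 percent above the real value.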
Figure 17
With the fifth argument of the function, you can take the time identifier or variable
name from any column. This makes it possible to show different table structures in one
sheet (figure 18). In this example, the function takes the variable name from the tenth
column.
Figure 18
3.2. Aggregation of data
For the aggregation of source data, the basic function DbF was extended by additional
arguments. Two aggregation functions are available, depending on the structure of the
source file.
If the time series in the source file are in rows (i.e., the time identifiers are in
the first row of the source file), the function DbA should be used. If the time series
in the source file are in columns (i.e., the time identifiers are in the first column of
the source file), use DbT. The letter "A" in the function name stands for aggregation;
"T" stands for the transposed form of the source file.
You can aggregate monthly data to quarterly, semi-annual or annual data, quarterly
data to semi-annual or annual data and semi-annual data to annual data. In the target
file, the time identifier for the aggregation (q, h or a) has to be given.
As additional function arguments, the periodicity of the source data must be given as
the third argument and the type of aggregation as the fourth. For monthly source data
the periodicity is 12, for quarterly data it is 4, and for semi-annual data it is 2. The
default setting for the periodicity is 12 (monthly data), so in that case the periodicity
does not have to be given explicitly.
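How a target identifier and a source periodicity together determine the source periods to be combined can be sketched as follows. This Python example only illustrates the rule described above; it is not DaTrAMo's code, and the function name `source_ids` and its error texts are assumptions.

```python
PREFIX = {12: "m", 4: "q", 2: "h"}     # source periodicity -> key letter
PER_YEAR = {"a": 1, "h": 2, "q": 4}    # target periods per year

def source_ids(target_id, periodicity=12):
    """List the source identifiers that make up one target period."""
    key, year = target_id.split(".")
    letter, number = key[0], int(key[1:] or 1)   # 'a' has no period number
    if periodicity not in PREFIX or letter not in PER_YEAR:
        raise ValueError("unsupported periodicity or target identifier")
    if PER_YEAR[letter] >= periodicity:
        raise ValueError("target must be coarser than the source data")
    if not 1 <= number <= PER_YEAR[letter]:
        raise ValueError("period number out of range")
    span = periodicity // PER_YEAR[letter]       # source periods per target
    first = (number - 1) * span + 1
    return [f"{PREFIX[periodicity]}{i}.{year}" for i in range(first, first + span)]

second_quarter = source_ids("q2.10")        # monthly source (default 12)
full_year = source_ids("a.10", 4)           # quarterly source
```

For the second quarter of 2010 and monthly source data, this yields m4.10, m5.10 and m6.10; for the year 2010 and quarterly source data, it yields q1.10 to q4.10.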
The aggregation can be calculated as an average (e.g., for indices) or as a sum. For the
average the function argument is 2; for the sum it is 1 (default setting).
The following example shows how to aggregate. The formula in cell D7 (figure 19)
= DbA ( "DbQ_Mne" ; ; 4 )
aggregates the four quarterly values of 2010 of the variable "C1n" from the source
database to the annual sum. The values are summed because no aggregation type was given
as an argument, so the default applies. All data are taken from the source file
"DbQ_Mne". The sheet name does not have to be given, because the sheet has the same name
as the source file.
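The two aggregation rules can be pictured with a few lines of Python. This is a sketch under stated assumptions, not the VBA code of DbA or DbT: the function name `aggregate` and the quarterly values are made up, while the codes 1 (sum, default) and 2 (average) are taken from the text.

```python
def aggregate(values, rule=1):
    """Aggregate source values: rule 1 = sum (default), rule 2 = average."""
    if rule == 1:
        return sum(values)
    if rule == 2:
        return sum(values) / len(values)
    raise ValueError("rule must be 1 (sum) or 2 (average)")

quarters_2010 = [25.0, 30.0, 20.0, 25.0]    # made-up quarterly values

annual_sum = aggregate(quarters_2010)        # no rule given, so the sum
annual_avg = aggregate(quarters_2010, 2)     # average, e.g., for indices
```

With these four values the sum is 100 and the average is 25, which shows why the rule matters: flow figures are normally added up, while index values are averaged.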
Figure 19
4. The help function
The program has a short help function. You can
display the syntax of the database functions
before writing the arguments of a
function. Double-clicking any cell
activates the help (figure 20).
Figure 20
A syntax help for the different function types is then offered. If the desired function
type is selected and confirmed with Yes (Ja), the arguments and their order are described
in new windows (figures 21, 22 and 23).
Figure 21
Figure 22
Figure 23
5. Frequent errors
The program runs very reliably. No regular failures or crashes have occurred in recent
years. The most frequent mistake is incorrect syntax. For instance, if the variable name
is not to be taken from the first column or the first row, a wrong number of semicolons
(commas) between the arguments is often the cause of an error.
In addition, the program returns the following error messages for wrong inputs:
WB unknown!         The source file (database) is not known, or is not declared or not correctly declared in the sheet “Options”.
Sheet unknown!      The sheet name given as argument in the function does not exist in the source file.
Variable name?      The variable name in the target sheet does not match any variable name (identifier) in the source file.
Time?               The time identifier in the target sheet does not match any time identifier in the source file.
Source structure?   The structure of the source file does not match the argument of the function or the type of aggregation (DbA or DbT).
Source closed!      The source file is not open.
Period?             There is a wrong periodicity within an aggregation.
Aggr.Funct.?        The rule for aggregation is not known; the arguments are 1 for sum and 2 for mean.
Aggregat?           The time identifier does not fit the specifications for aggregation.
Missing values!     For the aggregation function, values are missing or empty (the number of values is less than predefined).
“”                  For the function DbF, the value in the source is empty.
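The way such messages can arise from a failed lookup is sketched below in Python. The sketch is not DaTrAMo's VBA code: the function `fetch_or_message`, the nested-dictionary data model and the order of the checks are assumptions, and only four of the messages plus the empty result are modelled.

```python
def fetch_or_message(workbooks, book, sheet, series, time_id):
    """Return the value, or one of the messages listed above if a lookup fails."""
    if book not in workbooks:
        return "WB unknown!"
    sheets = workbooks[book]
    if sheet not in sheets:
        return "Sheet unknown!"
    rows = sheets[sheet]
    if time_id not in rows[0]:            # time ids assumed in the first row
        return "Time?"
    col = rows[0].index(time_id)
    for row in rows[1:]:
        if row[0] == series:              # series names assumed in first column
            value = row[col]
            return "" if value is None else value   # empty source cell
    return "Variable name?"

# Hypothetical source: one workbook with one sheet of the same name.
books = {
    "DbQ_Mne": {
        "DbQ_Mne": [
            ["",    "q1.10", "q2.10"],
            ["C1n", 98.5,    None],
        ]
    }
}

ok = fetch_or_message(books, "DbQ_Mne", "DbQ_Mne", "C1n", "q1.10")
no_book = fetch_or_message(books, "DbX", "DbX", "C1n", "q1.10")
no_series = fetch_or_message(books, "DbQ_Mne", "DbQ_Mne", "C9n", "q1.10")
```

Returning a short text in place of the value keeps the problem visible in the affected cell, so the cause (workbook, sheet, variable or time identifier) can be read directly from the sheet.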
6. Use of names
A further way to streamline work with DaTrAMo is the use of names.
For example, you can define the name “DbQ” in the Name Manager and assign the function
=DbF("DbQ_Mne") to this name (also in the Name Manager). While working with the Name
Manager, it is important that an appropriate cell is selected.
Afterwards you can use this name (e.g.,
with the help of F3) instead of the function (but
only with exactly the same arguments,
see figure 24).
Figure 24