
Documentation of DaTrAMo (Data Transfer- and Aggregation Module)

1. Introduction

DaTrAMo for Microsoft Excel is a solution for transferring or aggregating data easily from one worksheet (workbook) to another. It is based on the programming language Visual Basic for Applications (VBA for Excel).

Data are usually transferred by copy and paste or by linking to another Excel workbook. This approach is error-prone, because even slight changes in the structure of the source workbook can have far-reaching consequences for existing links. DaTrAMo avoids these errors by uniquely identifying each data point of a time series, regardless of the structure of the source workbook. To use this solution, the source workbook (usually a database) must be organized as follows: the first row and the first column of a source sheet must be used to identify the time series.

Time series in the source workbook can be arranged in columns or in rows. Because worksheets up to Excel 2003 are limited to 256 columns, time series in large databases should be arranged in rows, so that the time series identifiers are in the first column and the time identifiers are in the first row. In figure 1, for example, the value of finalized construction works (C1n) in the first quarter of 2010 (q1.10) is uniquely identified. Cell addresses (e.g., D9) are no longer needed to identify a particular data point.
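
The principle of addressing a value by its identifiers instead of by its cell address can be sketched in a few lines of Python. This is only an illustration of the idea, not DaTrAMo's VBA code: the function name `fetch`, the sample numbers and the order of the checks are assumptions made for the example.

```python
# Illustrative sketch only. A sheet is modelled as a list of rows: the first
# row holds the time identifiers, the first column the series identifiers.
# All values are invented sample data.
sheet = [
    ["",    "q4.09", "q1.10", "q2.10"],
    ["C1n", 101.0,   95.5,    110.2],
    ["C2n", 50.0,    48.3,    52.7],
]


def fetch(sheet, series_id, time_id):
    """Return the value identified by series name and time identifier."""
    header = sheet[0]
    if time_id not in header:
        return "Time?"            # message text as listed in section 5
    col = header.index(time_id)
    for row in sheet[1:]:
        if row[0] == series_id:
            return row[col]
    return "Variable name?"       # message text as listed in section 5


value = fetch(sheet, "C1n", "q1.10")

# Inserting a new series above C1n moves the value to another cell,
# but the identifier-based lookup still finds it.
sheet.insert(1, ["B9n", 1.0, 2.0, 3.0])
value_after_change = fetch(sheet, "C1n", "q1.10")
```

In this sketch the lookup returns the same value before and after the row is inserted, which is the property that makes identifier-based transfer robust against structural changes in the source.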

Figure 1

To transfer data from a source workbook, the target workbook must contain the VBA solution. The complete program code is available as a template. The sheet “Options” holds the minimum information required (name of the source workbook, path where it is stored, and file suffix). It also contains a set of further options for individual adjustment (e.g., formatting data, creating a list of names).

You can fetch data into any sheet with a simple function that needs only the source workbook name as its argument. The time identifiers and variable names in the first column or first row of the target worksheet are compared with the corresponding identifiers in the source file.

The function =DbF("Database name") fetches a value of a time series from the source workbook and arranges it in columns in the worksheet (see figure 2). The name of the source workbook must always be enclosed in quotation marks.

Figure 2

Besides their use in working files, the custom database functions can also be used to build databases from different sources. If data are delivered at regular intervals, e.g., when new monthly data become available, the transfer of the new data into a database can be automated with the custom database functions.

For DaTrAMo to work smoothly, the following settings must be activated:

Excel => Options => Trust Center => Trust Center Settings

ActiveX Settings => enable all controls without restrictions and without prompting

Macro Settings => enable all macros

For Excel 2007 and later, files with the suffix .xlsb are recommended.

2. The worksheet “Options”

Figure 3

The template (db_fct_2012Mne_sr1.xlsb) provides the sheet “Options”. First of all, you have to specify the source workbooks in this worksheet (see figure 3). For technical reasons in VBA, at least one source workbook must be listed. You can identify source workbooks by selecting cell B8 “Select DB”, or you can select the sources directly in column B from a list. You can open or close the chosen source workbook(s) by selecting cell C8 “Open selected databases” or cell G8 “Close db”. In addition, the data fetched from a particular source can be shown in a colour or in a particular font style. For example, data from a monthly database can be formatted in green and data from a quarterly database in blue. DaTrAMo can also count the number of values read in from the different source workbooks. You can change the group levels for all worksheets in the workbook with a single click.

2.1. Selection of source workbooks (mostly databases)

In the worksheet “Options” (column K), you find all potential source workbooks (figure 4).

Figure 4

In this example, you see a workbook for different labour force data (LFS_db) together with a workbook for employment data (employment_db) and some others.

Now you can select the desired workbooks in column B of the worksheet “Options” (figure 5). The storage path of each workbook has to be entered manually in column C. Alternatively, you can select workbooks together with path and suffix by selecting cell B8 “Select DB”. In this case, a dialog box appears in which you can choose the workbook. This works only from row 12 onwards.

Figure 5

The program uses this information, together with the suffix from column I, in all functions. To avoid error messages, take particular care that the path information for the databases is entered correctly. If you are working on a network, it makes sense to store the source databases centrally in a common folder, e.g., D:\DB\mne12.

In the next column, D, you can select which source workbooks are to be opened automatically with cell C8 “Open selected databases”. The source workbooks must be open for the functions to be calculated. In column G, you can specify individually which workbooks should be closed with one click. In this case, the source files are not saved. Use this procedure with care when the source is not a pure database. When data are read from other working files rather than from pure databases, those files should not be closed automatically.

2.2. Formatting

You can use any colours and font styles to mark, for example, different data sources. In column L, you can create a list of the desired colours (figure 6). So that Excel's internal colour codes can be used automatically, format the respective cell with the desired font colour. When you select cell L8 “Change list of Colours”, the internal colour code appears in column M. In the cells of column E, a colour can be assigned to each source workbook as required. In addition, a font style can be selected for each source file in the cells of column F. After selecting colour and font style for each database, select the cells E10 and F10 to initialize the formats. To apply different colours, the names of the sheets must be listed in the range C41 to C99 (see section 2.3 for further details).

Figure 6

2.3. Value copy

If you want to convert the workbook into a value copy, e.g., as a backup copy of certain data, you can do this automatically. This function is also helpful for building a new database by combining different sources. Other users can use this value copy as a database. The procedure for making a value copy is located in the lower part of the sheet (figure 7). In column K “Complete list of worksheets”, you can generate a list of the sheets available in this workbook by pressing the button “New ws-list”.

Figure 7

A further feature converts cell entries that contain custom database functions into values for selected sheets in the workbook. The desired sheets can be selected in column B (figure 8). To obtain a new numbering and new entries in columns E and G, an initialization is necessary once the sheets have been chosen, by selecting cell A39 “Initialize”. There are two ways to generate value copies: either all functions in the whole file are converted into a value copy (select range B39:D39), or the results of the custom database functions are converted into values for a selection of sheets (select cell F39).

Figure 8

Before the complete conversion into values is executed, you have to confirm a security query asking whether you really want to convert the functions into values in all worksheets (figure 9). A temporary value copy of the file is then stored on the hard disk. After this, the program asks whether the macros and the options sheet should also be removed (figure 10). Finally, a query appears asking whether the value copy should be closed (figure 11). The value copy is always stored in a new file with the name suffix "_V".

Figure 9

Figure 10

Figure 11

If you want to convert only the results of the database functions into values for one or all sheets in the file, you can select the corresponding sheets by entering “Yes” (or “No”) in column E (see figure 12).

Figure 12

This conversion into values works without automatic saving to a new file. When you select the cell "Get values" and confirm the query (see figure 13), the conversion starts immediately. The results of the database functions are then converted into values. This procedure is irreversible: all database functions contained in the marked sheets are lost!

Figure 13

2.4. Other features

The worksheet “Options” offers some other useful features. For example, you can show or hide the grouping levels for all worksheets simultaneously (see figure 14). You can also switch the display of the grouping symbols on or off to enlarge the visible screen area (cell Q11 “Display On/Off”).

Figure 14

To improve calculation performance, it makes sense to switch off “Automatic calculation” (“Tools” and “Calculating”). You can then carry out the calculations manually by pressing F9, for selected sheets only by activating cell Q1 in the options sheet, or for the complete workbook with CTRL + ALT + F9.

Furthermore, it is possible to remove all assigned names in the whole working file at once (button "Delete names", see figure 15).

Figure 15

3. Types of functions

Two basic function types are distinguished: functions for simple data transfer, e.g., from databases or other working files, and functions for data transfer with aggregation, e.g., of monthly data to quarterly or annual data. The functions can take several arguments, but only the name of the source file is strictly required; all other arguments are optional. In the English version, the arguments are separated with a comma, in the German version with a semicolon. The database functions and arguments can be entered in lower case or in upper case.

3.1. Simple data transfer

Time series in the working files can be arranged in columns or in rows. You can use the function "=dbf(…)" without specifying the position of the time series (the system finds the position itself).

Alternatively, you can select the function that matches the layout (this was the solution in earlier releases). If the time series names are in the column heads, use the function "=dbs(…)"; if instead they are in the row heads, use the function "=dbz(…)".

In the simplest case, you only have to enter the name of the source file in quotation marks as the first function argument (see figure 16 for an example):

=DbF("database name") or =DbS("database name") or =DbZ("database name").

The function uses the time identifier from the first column (or row) and the time series name from the first row (or column) to select the matching value from the source file. In this simple case, the sheet name in the source file must match the name of the source file exactly.

Figure 16

So that the time identifiers also work with the aggregation functions, they should be defined in the following formats:

Annual data with "a.", e.g., a.11 for the data point of the year 2011;

Semi-annual data with "h1." or "h2." and the year (e.g., h1.11 or h2.11);

Quarterly data with the keys "q1." to "q4." and the year (e.g., q1.11);

Monthly data with the keys "m1." to "m12." and the year (e.g., m1.11).
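
The identifier formats listed above can be checked mechanically. The following Python sketch is only an illustration and is not part of DaTrAMo; the function name `parse_time_id` is hypothetical, and it assumes that years are always written with two digits (as in the examples) and that upper and lower case are treated alike.

```python
import re

# Formats from the list above: a., h1./h2., q1.-q4., m1.-m12., each followed
# by a two-digit year. The two-digit year is an assumption based on the examples.
_TIME_ID = re.compile(r"^(a|h|q|m)([1-9]\d?)?\.(\d{2})$")
_MAX_PERIOD = {"h": 2, "q": 4, "m": 12}


def parse_time_id(text):
    """Split e.g. 'q1.11' into ('q', 1, '11'); return None if the format is invalid."""
    match = _TIME_ID.match(text.strip().lower())
    if match is None:
        return None
    kind, period, year = match.groups()
    if kind == "a":
        # Annual identifiers carry no period number.
        return ("a", None, year) if period is None else None
    if period is None or int(period) > _MAX_PERIOD[kind]:
        return None
    return (kind, int(period), year)
```

For instance, `parse_time_id("q1.11")` yields `("q", 1, "11")`, while `parse_time_id("q5.11")` yields `None` because there is no fifth quarter.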

To link additional information, you can use entries starting with "i." (item) in the same way as time identifiers. This is useful when you need information about the variables, e.g., the complete name of the variable or the name of the data supplier.

If the source file contains several sheets, time series from sheets whose name differs from the file name can also be selected and transferred. In that case, the sheet name must be given as well (second argument):

=DbF("database name", "sheet name") or =DbS("database name", "sheet name") or =DbZ("database name", "sheet name").

Usually, the time series names are in the first column, so the third argument of the function is "1". This is the default setting, so the argument is optional. If the time series in the source file are arranged in columns, i.e., the time series names are in the first row (in the column heads), "2" must be given as the third function argument for the data transfer:

=DbF("database name", "sheet name", 2) or =DbS("database name", "sheet name", 2) or =DbZ("database name", "sheet name", 2).

If you want to transfer values from different time series (variables) into one row (column), you have to define the appropriate identifiers in the same column (row). You then also have to enter the row number as the fourth argument and/or the column number as the fifth argument, indicating where the identifier (time identifier and/or variable name) of the value to be read is located:

=DbF("database name", "sheet name", , #, #) or =DbS("database name", "sheet name", , #, #) or =DbZ("database name", "sheet name", , #, #).

To calculate, e.g., a deflator by dividing the nominal values by the real values, it makes sense to put the variable name of the nominal values in the first row (G1 in figure 17) and the variable name of the real values in the second row (G2 in figure 17). The formula

=DbF("DBJ_Mne") / DbF("DBJ_Mne"; ; ; 2) * 100 - 100

divides the nominal values by the real values. In this example, the fourth argument of the second function gives the row from which the variable name is taken.
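
The arithmetic behind this formula can be checked with a tiny Python sketch. The function name `deflator_change` and the two sample values are invented for illustration; only the expression itself mirrors the worksheet formula above.

```python
def deflator_change(nominal, real):
    """Mirror the worksheet expression: nominal / real * 100 - 100."""
    return nominal / real * 100 - 100


# Invented sample values for a single period.
result = deflator_change(120.0, 96.0)
```

With these sample values the nominal figure is 25 percent above the real figure, so the expression evaluates to 25.0.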

Figure 17

With the fifth argument of the function, you can take the time identifier or variable name from any column. This makes it possible to show different table structures in one sheet (figure 18). In this example, the function takes the variable name from the tenth column.

Figure 18

3.2. Aggregation of data

For the aggregation of source data, the basic function DbF was extended by additional arguments. Two aggregation functions are available, depending on the structure of the source file.

If the time series in the source file are arranged in rows (that is, the time identifiers are in the first row of the source file), use the function DbA. If the time series in the source file are arranged in columns (that is, the time identifiers are in the first column of the source file), use DbT. The letter "A" in the function name stands for aggregation; "T" stands for the transposed form of the source file.

You can aggregate monthly data to quarterly, semi-annual or annual data, quarterly data to semi-annual or annual data, and semi-annual data to annual data. In the target file, the time identifier for the aggregation (q, h or a) has to be given.

As additional function arguments, the periodicity of the source data must be given in third position and the aggregation method in fourth position. For monthly source data the periodicity is 12, for quarterly data it is 4, and for semi-annual data it is 2. The default setting for the periodicity is 12 (monthly data), so in that case the parameter need not be given explicitly.

The aggregation can be calculated as an average (e.g., for indices) or as a sum. For the average, the function argument is 2; for the sum, it is 1 (default setting).

The following example shows how to aggregate. The formula in cell D7 (figure 19)

=DbA("DbQ_Mne"; ; 4)

aggregates the four quarterly values of 2010 for the variable "C1n" from the source database to the annual sum. The sum is used because no aggregation method was given in the corresponding argument. All data are taken from the source file "DbQ_Mne". The sheet name need not be given, because the sheet has the same name as the source file.
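
The effect of the periodicity and aggregation arguments can be illustrated with a short Python sketch. It is not DaTrAMo's VBA code: the function name `aggregate_to_year`, the sample values and the exact conditions under which each message is returned are assumptions for the example, and the sketch covers only aggregation to annual values.

```python
# Illustrative sketch only. Source data are keyed by time identifier;
# the numbers are invented.
quarterly_c1n = {"q1.10": 10.0, "q2.10": 12.0, "q3.10": 14.0, "q4.10": 16.0}

PREFIX = {12: "m", 4: "q", 2: "h"}  # periodicity -> identifier prefix


def aggregate_to_year(series, year, periodicity=12, how=1):
    """Aggregate all sub-annual values of one year.

    periodicity: 12 (monthly, default), 4 (quarterly) or 2 (semi-annual)
    how:         1 = sum (default), 2 = average
    """
    if periodicity not in PREFIX:
        return "Period?"
    if how not in (1, 2):
        return "Aggr.Funct.?"
    keys = [f"{PREFIX[periodicity]}{i}.{year}" for i in range(1, periodicity + 1)]
    if any(series.get(key) is None for key in keys):
        return "Missing values!"
    total = sum(series[key] for key in keys)
    return total if how == 1 else total / periodicity


annual_sum = aggregate_to_year(quarterly_c1n, "10", periodicity=4)
annual_avg = aggregate_to_year(quarterly_c1n, "10", periodicity=4, how=2)
```

Calling the function with periodicity 4 and no aggregation argument corresponds to the formula in the example: the four quarterly values are summed. Leaving the periodicity at its default of 12 for quarterly data returns "Missing values!" in this sketch, because no monthly identifiers are found.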

Figure 19

4. The help function

The program has a short help function. You can display the syntax of the database functions before writing the arguments of a function. Double-clicking any cell activates the help (figure 20).

Figure 20

A syntax help for the different function types is then offered. Once the desired function type has been selected and confirmed with Yes (Ja), the arguments and their order are described in new windows (figures 21, 22 and 23).

Figure 21

Figure 22

Figure 23

5. Frequent errors

The program runs very reliably. No regular failures or crashes have occurred in recent years. The most frequent mistake is incorrect syntax. For instance, if the variable name is not to be taken from the first column or the first row, a wrong number of semicolons (commas) between the arguments is often the cause of an error. In addition, the program returns the following error messages for wrong inputs:

Error message       Meaning
WB unknown!         The source file (database) is unknown, not declared, or not correctly declared in the sheet “Options”.
Sheet unknown!      The sheet name given as an argument in the function does not exist in the source file.
Variable name?      The variable name in the target sheet does not match any variable name (identifier) in the source file.
Time?               The time identifier in the target sheet does not match any time identifier in the source file.
Source structure?   The structure of the source file does not match the argument of the function or the type of aggregation (DbA or DbT).
Source closed!      The source file is not open.
Period?             The periodicity given for an aggregation is wrong.
Aggr.Funct.?        The aggregation rule is not known; valid arguments are 1 for sum and 2 for mean.
Aggregat?           The time identifier does not fit the specifications for aggregation.
Missing values!     Values for the aggregation function are missing or empty (the number of values is less than predefined).
“”                  For the function DbF, the value in the source is empty.

6. Use of names

A further way to streamline work with DaTrAMo is the use of names.

For example, you can define the name "DbQ" (with the Name Manager) and assign the function =DbF("DbQ_Mne") to this name (also with the Name Manager). While working with the Name Manager, it is important that a suitable cell is selected.

Afterwards, you can use this name (e.g., with the help of F3) instead of the function (but only with exactly the same arguments, see figure 24).

Figure 24