Big Data and Renewable Energy

Big Data and Efficiency of PV Plants

Silvano Vergura, Department of Electrical and Information Engineering, Technical University of Bari

20th IMEKO TC4 International Symposium – Benevento (Italy)


Page 2: Big Data and Renewable Energy

Aims

1. To present the issue of monitoring the energy efficiency of a single PV plant

2. To present the issue of monitoring several PV plants (forming a constellation of PV plants) from a single supervision centre

3. To introduce the issue of the computational burden of monitoring a constellation of PV plants

4. To propose a possible effective solution: the bootstrap technique

5. To compare the results of the bootstrap technique and of standard sampling on a real case


Page 3: Big Data and Renewable Energy

Standard indices for monitoring the energy performance of a PV plant

Standard performance parameters [1]

Final PV system Yield (Yf)

Reference Yield (Yr)

Performance Ratio (PR)

Drawbacks: a) they give only rough information about the performance of the overall PV plant; b) they do not allow any assessment of the behaviour of the individual parts of the PV plant.
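As a reminder of how the three indices are usually computed, here is a minimal sketch based on the common definitions (final yield = AC energy divided by rated power, reference yield = in-plane irradiation divided by the reference irradiance of 1 kW/m², performance ratio = their quotient). The function names and the numerical values are illustrative assumptions, not data from the presentation.

```python
def final_yield(energy_ac_kwh: float, rated_power_kwp: float) -> float:
    """Final PV system yield Yf in kWh/kWp (equivalent hours at rated power)."""
    return energy_ac_kwh / rated_power_kwp


def reference_yield(irradiation_kwh_m2: float, g_ref_kw_m2: float = 1.0) -> float:
    """Reference yield Yr in hours: in-plane irradiation over reference irradiance."""
    return irradiation_kwh_m2 / g_ref_kw_m2


def performance_ratio(yf: float, yr: float) -> float:
    """Performance ratio PR (dimensionless): final yield over reference yield."""
    return yf / yr


# Hypothetical daily figures for a 20 kWp plant (made up for illustration).
yf = final_yield(energy_ac_kwh=80.0, rated_power_kwp=20.0)
yr = reference_yield(irradiation_kwh_m2=5.0)
pr = performance_ratio(yf, yr)
print(f"Yf = {yf:.2f} h, Yr = {yr:.2f} h, PR = {pr:.2f}")
```

All three are plant-level aggregates, which is why they cannot localise a fault in a single sub-array.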


[1] IEC, "Photovoltaic system performance monitoring - Guidelines for measurement, data exchange and analysis", IEC Standard 61724, Geneva, Switzerland, 1998.

Page 4: Big Data and Renewable Energy

How to monitor several PV plants constituting a Constellation?

• The Italian PV market has been very attractive for new PV plant installations.

• Several O&M enterprises have many PV plants to monitor, with a data-logger on each PV plant.

• Each dataset stored in the data-logger of a PV plant consists of voltage, current, power and energy on both the DC and AC sides, as well as solar radiation, module temperature, ambient temperature, etc.

• Data from different plants are independent of each other.

• Each datalogger (client) is linked to a server box which manages all the data.

From the monitoring point of view the PV plants constitute a constellation.


Page 5: Big Data and Renewable Energy

“high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization” (Douglas Laney, Gartner, 2012)

Hypothesis: monitoring 10 PV plants, each rated at 1 MWp and with 250 sub-arrays

• The data-logger samples every 10 minutes (6 datasets/hour) for 10 hours/day, 365 days/year, for each of the 250 sub-arrays.

• This gives 6 × 10 × 365 × 250 = 5,475,000 datasets/year for a PV plant of 1 MWp rated power.

• Each dataset consists of three-phase voltage and current, energy, cell temperature and solar radiation; counting 10 values per dataset, the data to be processed are 5,475,000 × 10 = 54,750,000.

• For a constellation of 10 PV plants, the supervision centre has to process 54,750,000 × 10 = 547,500,000 data!
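The arithmetic of this hypothesis can be reproduced with a few lines of Python. The constant names are illustrative; the figures are the ones stated on the slide, including its factor of 10 values per dataset.

```python
SAMPLES_PER_HOUR = 6         # one dataset every 10 minutes
HOURS_PER_DAY = 10           # logging hours per day
DAYS_PER_YEAR = 365
SUB_ARRAYS_PER_PLANT = 250   # 1 MWp plant
VALUES_PER_DATASET = 10      # factor used on the slide
PLANTS = 10                  # size of the constellation

datasets_per_plant = (
    SAMPLES_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR * SUB_ARRAYS_PER_PLANT
)
values_per_plant = datasets_per_plant * VALUES_PER_DATASET
values_constellation = values_per_plant * PLANTS

print(f"{datasets_per_plant:,} datasets/year per plant")
print(f"{values_per_plant:,} values/year per plant")
print(f"{values_constellation:,} values/year for the constellation")
```

Changing the sampling interval or the logging hours scales the total linearly, which is what the figure with the minutes/hour, hours/day and days/year axes illustrates.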


[Figure: axes Minutes/hour, Hours/day, Days/year; label "Big data"]

Page 6: Big Data and Renewable Energy

Statistics for preliminary information about the operation of a PV plant

Proposal

• To carry out a preliminary analysis of each PV plant using sampled energy data in place of the whole data population, so that a smaller dataset has to be managed for each PV plant. If an anomaly is detected on a specific PV plant, a subsequent in-depth analysis (based on the whole data population of that PV plant) is carried out.

• It will be shown that the bootstrap technique, compared with standard sampling techniques, makes it possible to define a representative sample of the energy data stored in the data-logger of a PV plant and saves time.

Note: In the following, all considerations about sampling refer to a single PV plant, on the understanding that the same concepts are to be applied to each PV plant of the constellation.


Page 7: Big Data and Renewable Energy

Differences between bootstrap and standard sampling

• Standard sampling draws many random samples from the population in order to find the sampling distributions of sample statistics (mean, variance, etc.).

• The bootstrap, instead, is a resampling technique (like the jackknife and permutation tests) that approximates the sampling distribution from a single sample.

• So, in place of many samples from the population, the bootstrap creates many resamples by repeatedly sampling, with replacement, from a single random sample.
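The difference between the two procedures can be sketched with the Python standard library. This is an illustration under assumptions, not the author's implementation: the population is synthetic (exponentially distributed values standing in for energy data), and the function names, the 200 repetitions and the seed are arbitrary choices.

```python
import random
import statistics


def sampling_distribution(population, sample_size, n_samples, rng):
    """Means of n_samples independent samples, each drawn without
    replacement from the whole population (standard sampling)."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


def bootstrap_distribution(sample, n_resamples, rng):
    """Means of n_resamples resamples, each drawn with replacement
    from one single sample (bootstrap)."""
    n = len(sample)
    return [
        statistics.fmean(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    ]


rng = random.Random(0)
# Synthetic, skewed population: illustrative only, not plant data.
population = [rng.expovariate(1.0) for _ in range(10_000)]
sample_size = len(population) // 5  # 20% of the population

sampling_means = sampling_distribution(population, sample_size, 200, rng)

one_sample = rng.sample(population, sample_size)
bootstrap_means = bootstrap_distribution(one_sample, 200, rng)

print("population mean:        ", statistics.fmean(population))
print("mean of sampling means: ", statistics.fmean(sampling_means))
print("mean of bootstrap means:", statistics.fmean(bootstrap_means))
```

Standard sampling goes back to the full population for every sample, whereas the bootstrap touches the population once and then works only on the 20% sample.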


N samples from the population vs. N resamples from a single sample

Page 8: Big Data and Renewable Energy

Sources of random variation between population statistics and bootstrap ones

1. variation due to the first sample

2. variation due to the bootstrap resamples with respect to the first sample

For example, let us consider the variation introduced on the mean value.

Starting from a data population, the mean of the first sample deviates from the population mean (type-1 variation). The mean of each subsequent resample, drawn from the first sample, deviates from the mean of the original sample (type-2 variation) and therefore also from the population mean.

Since the type-1 variation affects all subsequent resamples, it is more critical than the type-2 variation.
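The two sources of variation can be separated numerically: the deviation of a bootstrap mean from the population mean is the sum of the type-1 offset (sample mean minus population mean) and the type-2 offset (resample mean minus sample mean). The sketch below checks this on synthetic Gaussian data; the population, sizes and seed are assumptions made only for illustration.

```python
import random
import statistics

rng = random.Random(1)
# Synthetic population (illustrative only, not plant data).
population = [rng.gauss(100.0, 15.0) for _ in range(5_000)]
mu = statistics.fmean(population)

# Type-1 variation: a single original sample versus the population.
sample = rng.sample(population, 1_000)
sample_mean = statistics.fmean(sample)
type1 = sample_mean - mu

# Type-2 variation: each resample versus the original sample.
resample_means = [
    statistics.fmean(rng.choices(sample, k=len(sample))) for _ in range(100)
]
type2 = [m - sample_mean for m in resample_means]

# Total deviation of each bootstrap mean from the population mean.
total = [m - mu for m in resample_means]

print("type-1 offset (shared by every resample):", type1)
print("largest type-2 offset:", max(abs(d) for d in type2))
```

Because `type1` is the same for every resample, drawing more resamples averages out the type-2 term only; the type-1 term can change only with a new original sample.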


Page 9: Big Data and Renewable Energy

Case study: 20 kWp PV plant with 6 sub-arrays (one per inverter)

• Six sub-arrays can easily be shown on one slide.

• The same approach has also been applied to a 1 MWp PV plant.

• The results are similar whatever the peak power; the only differences are the total number of operations and the related computation time.

• The sample size for each sub-array is 20% of the data population, consisting of about 100,000 data.


Fig. 1. Population distribution of each sub-array.

Legend:

red line = population mean µ

blue line = sample mean

Fig. 2. Sampling distribution of the mean of each sub-array.

Page 10: Big Data and Renewable Energy

Bootstrap

As for the standard sampling, the sample size for each sub-array is 20% of the data population.


Fig. 3. Bootstrap distribution of the bootstrap mean for each sub-array

Legend:

red line = population mean µ

blue line = sample mean

fuchsia line = bootstrap mean

Page 11: Big Data and Renewable Energy

Comparison and type-2 variation

The blue line represents the mismatch between the sample means and the population means for each sub-array, while the magenta line represents the mismatch between the bootstrap means and the population means for each sub-array. Columns 1 and 3 of the table report these mismatch values numerically; they stay within about ±1% (the largest, 1.19%, occurs for the bootstrap). Moreover, the bootstrap resampling takes less time than the standard sampling (0.6 s versus 12.4 s); the bootstrap time depends on the number of resamples.

For this specific case, the bootstrap time is about 5% of the sampling time.

Similar results have been obtained over many repetitions, because resamples based on the same original sample introduce only a small variation (type-2 variation).
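The slides do not spell out how the mismatch is computed. A plausible reading, assumed here, is the relative deviation of an estimated mean from the population mean, expressed in percent. The function names and the numerical values below are hypothetical.

```python
def mismatch_percent(estimate: float, population_mean: float) -> float:
    """Relative deviation of an estimated mean from the population mean, in %.
    Assumed definition; the presentation does not state the formula."""
    return 100.0 * (estimate - population_mean) / population_mean


def within_band(mismatch: float, band_percent: float = 1.0) -> bool:
    """True if the mismatch lies inside the symmetric band of +/- band_percent."""
    return abs(mismatch) <= band_percent


# Hypothetical means for one sub-array (not values from the case study).
population_mean = 250.0
sample_mean = 249.0
bootstrap_mean = 251.5

m_sample = mismatch_percent(sample_mean, population_mean)
m_boot = mismatch_percent(bootstrap_mean, population_mean)
print(f"sampled: {m_sample:+.2f} %, bootstrap: {m_boot:+.2f} %")
print("both within +/-1 %:", within_band(m_sample) and within_band(m_boot))
```

A check of this kind, run per sub-array, is enough for the preliminary screening; the full population is needed only when a sub-array falls outside the band.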


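| Mismatch of sampled data (in %) | Sampling time [s] | Mismatch of bootstrapped data (in %) | Bootstrapping time [s] |
|---|---|---|---|
| 0.04 | 12.4 | 0.33 | 0.6 |
| -0.35 | | 0.21 | |
| 0.13 | | -0.94 | |
| -0.03 | | 1.19 | |
| -0.36 | | -0.12 | |
| -0.29 | | -0.50 | |

[Figure: mismatch (in %) for sub-arrays 1 to 6, vertical axis from -1 to 1. Blue = sampled data, magenta = bootstrap, black = min/max values.]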

Page 12: Big Data and Renewable Energy

Comparison and type-1 variation

This sub-section reports the results of bootstrapping based on a new random original sample. The type-1 variation, due to the original sample, is stronger than the variation due to the random resamples.

Once again, the bootstrap resampling takes less time than the standard sampling (0.8 s versus 12.4 s).

When the proposed approach is applied to a PV plant of 1 MWp rated power, the total amount of data is larger, and so is the saving in computation time. When the procedure is applied to each PV plant of a constellation, the efficiency of the overall system increases considerably.


| Mismatch of sampled data (in %) | Sampling time [s] | Mismatch of bootstrapped data (in %) | Bootstrapping time [s] |
|---|---|---|---|
| -0.11 | 12.4 | -0.12 | 0.8 |
| 0.29 | | -2.49 | |
| -0.20 | | -2.74 | |
| 0.04 | | -0.76 | |
| 0.04 | | -0.37 | |
| -0.21 | | -0.56 | |

[Figure: mismatch (in %) for sub-arrays 1 to 6, vertical axis from -3 to 3. Blue = sampled data, magenta = bootstrap, black = min/max values.]

Page 13: Big Data and Renewable Energy

Conclusions

• Monitoring energy efficiency becomes harder when many large PV plants have to be monitored from a single supervision centre (constellation of PV plants).

• To reduce the processing time, the data population can be replaced by a representative sample, which allows a faster evaluation of the operating conditions of the plants.

• To this end, the paper has proposed the bootstrap methodology:

– it yields a representative sample of the population, whatever its distribution;

– it requires less time than standard sampling techniques.
