The OGEMA Evaluation framework

Last modified by Su Hyun Hwang on 2019/03/18 14:52

Introduction

The OGEMA system supports the collection of time series data from various sensors on various gateways. This is important functionality, but it can also be provided by other solutions. The crucial part of an efficient monitoring system is the automated analysis of the incoming data and concise information for users and support staff on the actions that should be taken based on it. Given the large amount of incoming data, e.g. from the sema field test, regular manual inspection is hardly feasible, even at the level of graphs. For this reason the fundamental mechanism of OGEMA Monitoring is to create alarms when the system deviates from the expected behavior. The art of creating alarms is to focus on the events that are actually relevant for human analysis and that the receivers (users or support staff) can realistically process.
The basic mechanism for this is the processing of incoming data by so-called EvaluationProviders. These can be applied to any interval of the incoming data, but typically each EvaluationProvider used on a monitoring system is applied daily and calculates a set of single-value results (e.g. average, minimum and maximum temperature) for a room or a gateway for each day. The evaluation framework allows these values to be aggregated further into so-called key performance indicators (KPIs) of selected result values for each day. The KPIs are used to create alarms and are also shown to users on demand in web pages. More detailed information can be accessed as well, e.g. the raw data used for the calculations or a graphical representation of the daily KPI values over a longer period of time, e.g. several months.
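
The daily calculation of single-value results can be illustrated with a minimal sketch in plain Java. It does not use the OGEMA API: the class DailyStatsSketch, its methods and the grouping by UTC calendar day are assumptions made for illustration only. It groups timestamped values by day, collects minimum, maximum and average per day, and aggregates the daily averages once more as one possible example of a further aggregation.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for illustration; not part of the OGEMA API. */
class DailyStatsSketch {

    /**
     * Groups (epoch-millisecond timestamp, value) samples by UTC calendar day
     * and collects count, minimum, maximum and average for each day.
     */
    static Map<LocalDate, DoubleSummaryStatistics> dailyStats(long[] timestampsMs, double[] values) {
        if (timestampsMs.length != values.length) {
            throw new IllegalArgumentException("timestamps and values must have the same length");
        }
        Map<LocalDate, DoubleSummaryStatistics> perDay = new TreeMap<>();
        for (int i = 0; i < values.length; i++) {
            LocalDate day = Instant.ofEpochMilli(timestampsMs[i]).atZone(ZoneOffset.UTC).toLocalDate();
            perDay.computeIfAbsent(day, d -> new DoubleSummaryStatistics()).accept(values[i]);
        }
        return perDay;
    }

    /** Aggregates the daily averages once more into a single value (mean of the daily means). */
    static double meanOfDailyAverages(Map<LocalDate, DoubleSummaryStatistics> perDay) {
        return perDay.values().stream()
                .mapToDouble(DoubleSummaryStatistics::getAverage)
                .average()
                .orElse(Double.NaN);
    }
}
```

Each DoubleSummaryStatistics entry exposes getMin(), getMax() and getAverage(), which correspond to the example results named above.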

Overview

Monitoring Overview

  • Evaluation can take place either on an OGEMA gateway collecting the data or on an evaluation server that collects data from various gateways. The default setup is to use an evaluation server, as usually more than one OGEMA instance is involved in a monitoring solution and in this case an overall monitoring is usually preferred. EvaluationOfflineControl will also support evaluation on the gateway in the near future, though.
  • The major goal of OGEMA evaluation is to generate automated KPI reports and alarms. In the standard setting data is transferred from the gateways to the evaluation server once or twice per day and is evaluated on the server once per day. This can be adapted to a higher transfer and evaluation rate if necessary.
    Instant alarming apps such as BatteryStateControl and WindowOpenDetected are also supported on the gateway. These apps currently do not use the OGEMA Evaluation Framework.
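
How daily KPI values could be turned into alarms is sketched below under simple assumptions: every KPI is checked against one fixed lower and upper limit, and a missing value is reported as well. The class KpiAlarmSketch, the key format "gateway/room" and the message texts are hypothetical and are not taken from the OGEMA code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration; not part of the OGEMA API. */
class KpiAlarmSketch {

    /**
     * Returns one message for every KPI that is missing (null or NaN)
     * or lies outside the closed interval [lower, upper].
     */
    static List<String> checkKpis(Map<String, Double> dailyKpis, double lower, double upper) {
        List<String> alarms = new ArrayList<>();
        for (Map.Entry<String, Double> entry : dailyKpis.entrySet()) {
            Double value = entry.getValue();
            if (value == null || value.isNaN()) {
                alarms.add(entry.getKey() + ": no data");
            } else if (value < lower || value > upper) {
                alarms.add(entry.getKey() + ": value " + value
                        + " outside [" + lower + ", " + upper + "]");
            }
        }
        return alarms;
    }
}
```

In practice each KPI would probably need its own limits; a single pair of limits is used here only to keep the sketch short.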

Evaluation Processes

Adding a new KPI page for an evaluation

Usually, when a new evaluation has been developed and is to be used on an evaluation server, a KPI overview page needs to be provided. The following steps and considerations can be used to implement this:

  • One evaluation provider must implement GaRoSingleEvalProvider.getPageDefinitionsOffered() and provide the page definition there. This does not necessarily have to be the EvaluationProvider for which the page is provided. Pages can incorporate results from different EvaluationProviders, and the provider declaring the page in getPageDefinitionsOffered can be a completely different EvaluationProvider. In some cases a specific EvaluationProvider for a project declares all KPI pages for the project.
  • When an EvaluationProvider implements getPageDefinitionsOffered the respective start page in EvaluationOfflineControl will have a button in the bottom right area "Add KPI-pages offered by provider". When the button is pressed the respective pages are created and updated. See also Evaluation Offline Control page regarding this topic.
  • When new evaluation results are added to an EvaluationProvider, the KPI pages usually have to be adapted as well - or sometimes an additional page needs to be defined when the number of KPIs becomes too large for a single page.
  • Configured pages are stored in the ResourceList offlineEvaluationControlConfig/kpiPageConfigs (resources of type KPIPageConfig). If a page is to be removed, the respective entry has to be deleted (e.g. using the ResourceManipulator app) and the EvaluationOfflineControl app has to be restarted.
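
The idea that one provider declares pages which combine results of several providers can be sketched as follows. Only the method name getPageDefinitionsOffered comes from the text above; the types PageDefSketch and PageOfferingSketch, the return type, the "providerId/resultId" notation and all IDs are assumptions for illustration and will differ from the real OGEMA types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a page definition; the real OGEMA type may look different. */
class PageDefSketch {
    final String title;
    /** Entries of the form "providerId/resultId" (notation assumed for this sketch). */
    final List<String> results;

    PageDefSketch(String title, String... results) {
        this.title = title;
        this.results = Collections.unmodifiableList(Arrays.asList(results));
    }
}

/** Hypothetical stand-in for the page-offering part of an evaluation provider. */
interface PageOfferingSketch {
    List<PageDefSketch> getPageDefinitionsOffered();
}

/** A project-level provider that declares all KPI pages of a project. */
class ProjectPagesSketch implements PageOfferingSketch {
    @Override
    public List<PageDefSketch> getPageDefinitionsOffered() {
        List<PageDefSketch> pages = new ArrayList<>();
        // One page may combine results of different providers (IDs are made up).
        pages.add(new PageDefSketch("Temperature overview",
                "comfortTempEval/avgTemperature", "outsideTempEval/minTemperature"));
        // A second page keeps the first one from growing too large.
        pages.add(new PageDefSketch("Data quality", "gapEval/gapTime"));
        return pages;
    }
}
```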

Adding a new email report / alarm for an evaluation

TODO

Adding evaluation results and other calculations to an existing EvaluationProvider

There are several components typically used to implement EvaluationProviders:

  • The widget timeseries evaluation API fosters the development of evaluation modules that do not store intermediate result time series. For real online evaluation storing them is usually not possible anyway, as the next step would have to wait until the intermediate time series has been calculated completely. Instead of applying small generic evaluation methods to entire time series, there are several generic utility classes that can be used to perform certain standard tasks such as the calculation of mean, standard deviation and median/quantiles within an EvaluationProvider online on the incoming data. The collection of such standard modules is provided in online/utils.
  • GenericGaRoSingleEvalProvider: Standard abstract class for the implementation of simple GaRo-EvaluationProviders. You have to adapt ID, LABEL and DESCRIPTION, and define getGaRoInputTypes and RESULTS with the respective result definitions. Usually the core logic is implemented in the sub-class EvalCore (constructor and method processValue). See Result Levels for more details on how and where to implement the evaluation logic and results.
  • GenericGaRoSingleEvalProviderPreEval: Standard abstract class for the implementation of GaRo-EvaluationProviders requesting the results from other evaluations (transfer of results via JSON file).
  • When providing a MultiResult class extending GaRoMultiResultExtended you can also generate 'overallResults', meaning results that depend on more than one room or time period. See 'Using GaRoMultiResultExtended' below for more details on this.
  • Using GaRoMultiResultExtended: You can use git\fhg-alliance-internal\src\widgets\timeseries-tools\timeseries-heating-analysis-multi\src\main\java\de\iwes\timeseries\provider\genericcollection\OutsideTempGenericMultiResult.java and ComfortTempRB_OverallMultiResult as examples:

    • If you want to provide new values that are the result of an entire MultiEvaluation (all rooms, gateways and timesteps) you usually define additional members of the class extending GaRoMultiResultExtended as in OutsideTempGenericMultiResult. You have to make sure you get the right results into JSON (all public members and public methods starting with 'get' are exported).
    • If you want to provide per-timestep results, typically a new TimeSeries should be created as a member of the class extending GaRoMultiResultExtended as in OutsideTempGenericMultiResult.
    • If you want to provide per-gateway results, typically an "overall room" is created in RoomData and added to the map of results, so no additional members are required in the class (see ComfortTempRB_OverallMultiResult).
  • ...
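
Calculating mean and standard deviation online, without storing an intermediate time series, can be sketched with Welford's algorithm. This is not the implementation from online/utils; the class OnlineStatsSketch is hypothetical, and the method name processValue only mirrors the method mentioned for EvalCore, with an assumed signature.

```java
/** Hypothetical stand-in for illustration; not one of the utility classes in online/utils. */
class OnlineStatsSketch {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the current mean

    /** Updates the running statistics with one incoming value (Welford's algorithm). */
    void processValue(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
    }

    long getCount() {
        return count;
    }

    /** Mean of all values seen so far, or NaN if there are none. */
    double getMean() {
        return count > 0 ? mean : Double.NaN;
    }

    /** Sample standard deviation (divisor n - 1), or NaN for fewer than two values. */
    double getStdDev() {
        return count > 1 ? Math.sqrt(m2 / (count - 1)) : Double.NaN;
    }
}
```

Only three numbers are kept per evaluated series, so the memory needed does not grow with the length of the evaluated interval. Median and quantiles cannot be updated this way and need a different approach.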

General considerations:

  • Add additional results/calculations
  • Test with manual evaluation
  • Adapt Auto-Evaluation
  • Adapt KPI result page(s) and email / alarm report definitions (see above)

Adding a new evaluation to an evaluation server

Usually an existing evaluation is used as a template. Initially a new evaluation provider should be implemented as simply as possible, together with an initial evaluation page. Features can then be added as described above.

Setting up an evaluation server

Usually a new evaluation server can be set up using the rundir and configuration of an existing evaluation server as a template.

Evaluation of Gateway's state by collected messages

Unexpected errors can occur while operating gateways and servers. In everyday monitoring, the alarm messages prompt us to react to errors within a day of a problematic situation. Over the long term this produces a large number of messages. By collecting the alarm messages and evaluating them over a given period you get feedback on which gateways have fundamental problems in their functionality and on how fast we reacted to them. Furthermore, this increases the reliability of the analysis of the important values of the competition. The statistical evaluation requires no special software skills; it can easily be done in Microsoft Excel.

Overview for errors in gateways

A diagram showing when errors occurred in the gateways and how many days it took to fix them helps you get an overview. Once you have collected enough messages for the period you want to cover, build a table in Excel with the gateway numbers in the first column and the dates in the first row. In the cell for the respective gateway and date, enter the short names of the data types that had errors and give the cell a light red background. Mark the cells for the following days in the same way until the day before the error was resolved. On the day of resolution, fill the cell light green and leave it without text. After doing this for all gateways and the whole period you have an overview that is much clearer than a list of texts.

It is also important to know how many errors occurred within a day and how often errors occurred in a gateway during the period. To count the errors within a day you can use the function 'ANZAHL2' (the German name of COUNTA), which counts the non-empty cells in the given range. Apply this function in the row below the gateways: enter it once in the cell for the first day and drag it to the last day of the period. Excel automatically copies it to the following columns, so you see the number of errors for every day of the period. The number of errors per gateway in the period can be evaluated in the same way, but in the column after the dates.

image-20190227161537-1.png

This is just one way to create an overview. If you have other methods, ideas or skills, you can use them. With this data you can create statistical graphics. Evaluation periods of two weeks and one month give good feedback on the gateways' status.
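
As one such alternative, the two counts described above could also be computed in a few lines of Java. The sketch assumes the table has been exported into a rectangular array with one row per gateway and one column per day, where a cell with text marks an error and an empty or null cell marks no error; the class ErrorTableSketch is hypothetical.

```java
/** Hypothetical stand-in for illustration; mirrors the Excel table described above. */
class ErrorTableSketch {

    /** Counts the non-empty cells per column, i.e. the number of errors per day. */
    static int[] errorsPerDay(String[][] table) {
        int days = table.length == 0 ? 0 : table[0].length;
        int[] counts = new int[days];
        for (String[] gatewayRow : table) {
            for (int day = 0; day < days; day++) {
                if (isError(gatewayRow[day])) {
                    counts[day]++;
                }
            }
        }
        return counts;
    }

    /** Counts the non-empty cells per row, i.e. the number of error days per gateway. */
    static int[] errorsPerGateway(String[][] table) {
        int[] counts = new int[table.length];
        for (int gateway = 0; gateway < table.length; gateway++) {
            for (String cell : table[gateway]) {
                if (isError(cell)) {
                    counts[gateway]++;
                }
            }
        }
        return counts;
    }

    /** A cell counts as an error if it contains any text. */
    private static boolean isError(String cell) {
        return cell != null && !cell.trim().isEmpty();
    }
}
```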

Created by David Nestle on 2019/02/15 09:52