Development and Usage of EvaluationProvider Modules for Data Analysis

Introduction to the Widget Timeseries Evaluation API

Evaluation of time series data is a common task in energy management and building automation applications. The Widget Timeseries Evaluation API provides a set of interfaces and concepts for providing evaluation components and data access mechanisms, so that such evaluation modules can be started and used in various applications and contexts.

Features of the Widget Timeseries Evaluation API

  • Each evaluation logic is implemented as a class that implements the interface EvaluationProvider; such a class is called an 'EvaluationProvider' in the following. An EvaluationProvider defines a set of input time series and a set of result output values. These output values are typically single values representing an evaluation result for the entire input time interval, but they can also be result time series. Single-value results can be considered key performance indicators (KPIs) for the respective evaluation interval.
  • Access to the monitoring data that is collected from various sensors and that is the base input for the evaluations is provided via classes implementing the interface DataProvider. In this way EvaluationProviders are not tied to a specific input data source, but can use any suitable input data source for which a DataProvider is implemented. The concept is very similar to OGEMA Resources, but it is focused on providing time series and uses a simpler hierarchy that is more suitable for defining evaluation input. See the Section 'The DataProvider concept' below for more details.
    The DataProvider interface is also very suitable for implementing GUIs that let the user search and select input time series for evaluations.
  • The API supports online and offline evaluation: Data can be analyzed online during data acquisition or offline based on stored data. The evaluation components provided in the API are designed to be used for online as well as offline evaluation. This is especially the case for the interface EvaluationProvider and all related APIs. Such an evaluation typically takes online or offline data series, e.g. from a single room, as input and calculates certain results from them.
    The advantage of online evaluation is that the input time series do not have to be stored for the evaluation and that the current result variables are available much faster at a given time compared to starting an offline evaluation based on collected data. In practice, however, input data is collected and stored anyway in most cases, as measurement data is usually considered very valuable. For typical basic evaluation periods of one hour or one day, computing results from the stored input time series is so quick in most situations that the second advantage is not very relevant either. For this reason real online evaluation has not been used much so far, and for most applications it is not required. See the Section 'Management of JSON Results and KPI data' for Quasi-Online-Evaluation (Auto-Scheduling), an offline evaluation that is started automatically, e.g. each day when the data for the previous day has been collected, and that stores the KPIs for each day.
  • 'Multi-Evaluation': We explain the relevance with an example: A single evaluation may take temperature and relative humidity measurements from a single room as input and calculate the percentage of time with a risk of reaching the dew point at the wall of the room. If we collect data from various buildings with various rooms, we usually do not want to perform the evaluation on a single room only, but want to get an overview of average and maximum percentages of time with a dew point problem in any room. If you implement this evaluation just as a single-room evaluation (interface GaRoSingleEvalProvider), the API lets you apply it to all suitable data available via a DataProvider and provides several tools for handling the Multi-Evaluation result.
  • KPIs for destination intervals: In most cases you would like to see result KPIs for each day, week, month and possibly each year if you have collected data for such a time span. This can also be done with the Multi-Evaluation API (documented in the Sections below). The implementation of this interface only supports offline evaluation up to now, as real online evaluation is not really required for most applications, as explained above.
  • A standard data hierarchy for building data defines the building as the first level of the hierarchy and the room as the second level (data that does not belong to any room, such as central electricity metering and central heating, is grouped into a special 'overall' room). Very special buildings or monitoring tasks not limited to buildings may require another hierarchy, so the DataProvider interface allows an arbitrary hierarchy to be defined. Important aspects of the multi-evaluation APIs and tools need to be linked to a certain hierarchy for efficient implementation and usage, though. So the standard DataProvider is the interface GaRoMultiEvalDataProvider, which refers to the standard building data hierarchy (see the Section on the GaRo Data Setup below).
  • We assume a single OGEMA gateway per building in the API, so the term 'gateway' can be seen as a synonym for 'building' in this documentation. If a large building contains more than one OGEMA gateway and each gateway transfers its data separately to a server where a multi-evaluation is performed, then the data from each gateway is treated as a separate building unless special measures are taken on the server to aggregate the data from the gateways into a single virtual gateway that represents the entire building.
  • The same EvaluationProvider implementation can be used on OGEMA gateways and on OGEMA servers; you just have to use different DataProviders. On an OGEMA gateway the DataProvider will typically collect data from the resource tree including RecordedData; on an OGEMA server the DataProvider will typically collect data from the slotsDB or fendoDB data acquired from the connected gateways. It is also possible to have a DataProvider collect data from any database that provides time series data.

The DataProvider concept

An implementation of the interface DataProvider can be used to access data via a simple hierarchical structure. A typical data provider that is limited to the data of a single gateway and does not just group data into rooms first allows choosing a room, then a device and then a certain time series within the device (note that this is NOT a standard GaRo DataProvider). Finding the time series within the OGEMA resource structure or in an external database is the task that has to be covered by the DataProvider implementation. Even within the OGEMA resource tree the devices inside a room are not child resources of the room, and references from devices to the room may be represented in slightly different ways in the resource structure, so finding the devices for a room is part of the "intelligence" of a data provider. A data provider may be quite general, trying to find all devices inside a room, or may be explicitly limited to certain device types or devices of a certain application domain.
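
How such a hierarchical selection works can be pictured with a small sketch. The following Java example is only a simplified illustration of the room -> device -> time series selection, with hypothetical classes defined in the snippet itself; it does not reproduce the actual DataProvider/LinkingOption interfaces.

    import java.util.List;
    import java.util.Map;

    /** Simplified illustration of a hierarchical data provider (hypothetical classes,
     *  not the real DataProvider/LinkingOption API): room -> device -> time series. */
    public class RoomDeviceDataProviderSketch {

        /** Hypothetical device holding named time series (e.g. "temperature", "motion"). */
        record Device(String id, Map<String, List<double[]>> timeSeriesByType) {}

        /** Hypothetical room grouping its devices; a real provider has to find these
         *  devices via references in the resource tree or an external database. */
        record Room(String id, List<Device> devices) {}

        /** Terminal option: resolve a concrete time series for a selected room, device and type. */
        static List<double[]> selectTimeSeries(List<Room> rooms, String roomId,
                String deviceId, String type) {
            return rooms.stream()
                    .filter(r -> r.id().equals(roomId))
                    .flatMap(r -> r.devices().stream())
                    .filter(d -> d.id().equals(deviceId))
                    .findFirst()
                    .map(d -> d.timeSeriesByType().get(type))
                    .orElse(null);
        }
    }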

Tools for the Widget Timeseries Evaluation API

  • Currently only the API and core implementations are provided as Open Source. Additional components and applications relevant for using the API are provided in the Fraunhofer Git repository accessible to the OGEMA Alliance in git\fhg-alliance-internal\src\widgets\timeseries-tools.
  • For starting manual single evaluations the app timeseries-eval-viz can be used. (Description on OGEMA Home screen tile: "Tools for time series analysis"). As this cannot be used for multi-evaluation the app is not used in this documentation.
  • The data collected from the gateways on a monitoring server is stored as a combination of OGEMA SlotsDB information and OGEMA resource information that is obtained from JSON/XML backups. The latter is provided via the resource representation org.ogema.serialization.jaxb.Resource.
    Tools to view this information on the server are:
    - Server backup analysis
    - Timeseries visualisation (choose 'Backup log data provider' as data source)
  • Managing and starting evaluation providers can be done with the OGEMA app evaluation-offline-control (git\fhg-alliance-internal\src\apps\evaluation-offline-control). This app contains several pages supporting various tools and functionalities of the multi-evaluation API. Some of them are still experimental or may require adaptations in the code for specific configurations.
  • The result data from a Multi-evaluation is provided in the form of a JSON file and can be further processed with the Python-based library provided in git\sema\python. See the documentation in Analysis of (Multi-)EvaluationProvider Results with Python.

Generic EvaluationProviders, Domain-specific EvaluationProviders and the standard GaRo Data Setup

There are EvaluationProviders like BasicEvaluation that perform calculations which can be applied to almost all numerical time series, such as calculation of the mean, standard deviation, minimum and maximum. These providers are generic providers and typically directly implement the API defined in the basic bundles timeseries-eval-base and timeseries-multieval-base. In contrast, the majority of meaningful evaluations do something much more specific, like calculating the "mean room temperature when at least one person is present". This requires a room temperature measurement input time series and a presence/motion detection input time series for the same room. If other time series were applied to this EvaluationProvider the calculation might still run, but the result would most likely be completely meaningless. It is possible to document the input requirements in the input definition of the generic EvaluationProvider API, but this cannot be processed automatically by input GUIs offering potential selections to a user, nor can it be processed for Multi-Evaluation, which is a requirement for automatically performing all meaningful evaluations for a given EvaluationProvider on a given data set.
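
To make this example concrete, the following sketch calculates such a presence-weighted mean temperature offline from two value arrays that are assumed to be sampled at the same timestamps. It is a plain illustration of the evaluation logic only, not an implementation against the EvaluationProvider interface (which also handles unaligned series and gaps).

    /** Illustration only: presence-weighted mean room temperature, computed offline.
     *  Both series are assumed to be sampled at the same timestamps. */
    public class PresenceWeightedMeanSketch {

        static double meanTemperatureWhilePresent(long[] timestamps,
                double[] temperature, double[] presence) {
            double weightedSum = 0;
            long presentDuration = 0;
            for (int i = 0; i < timestamps.length - 1; i++) {
                long duration = timestamps[i + 1] - timestamps[i];
                if (presence[i] > 0.5) {            // at least one person present
                    weightedSum += temperature[i] * duration;
                    presentDuration += duration;
                }
            }
            return presentDuration > 0 ? weightedSum / presentDuration : Double.NaN;
        }

        public static void main(String[] args) {
            long[] t = {0, 600_000, 1_200_000, 1_800_000};   // 10-minute steps in milliseconds
            double[] temp = {20.5, 21.0, 21.5, 22.0};
            double[] motion = {0, 1, 1, 0};
            System.out.println(meanTemperatureWhilePresent(t, temp, motion)); // 21.25
        }
    }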

For this reason a domain-specific extension of the evaluation API should typically be provided that allows expressing input requirements for a specific domain. Currently this is available for data that is structured along rooms inside buildings, which is suitable for most building automation applications; one LinkingOption of the respective DataProvider is linked to the room of the time series. When applied to an OGEMA server the primary data structure is typically the data collected for each gateway, so the full hierarchy of this domain-specific EvaluationProvider is Gateway-Room (GaRo). The primary LinkingOption of server DataProviders will most likely always be the gateway. The terminal option of DataProviders is the actual time series within a room.

Note that the description of the required input time series types is also part of the domain-specific evaluation API. We cannot use standard app mechanisms to search the OGEMA resource database here, as evaluations are also applied to input data provided via DataProviders that does not come directly from the OGEMA resource tree. For the GaRo provider the supported input types are given in GaRoDataType and would have to be extended there.

Result structure

Each EvaluationProvider defines the result output values it can provide. If the EvaluationProvider is run as a single evaluation, these outputs can be obtained and processed directly. For an example of the acquisition of the results of a single evaluation run see the OGEMA tutorial snippet for single evaluations.

For a Multi-Evaluation, usually selected KPIs of each single evaluation are collected into a larger structure. This basic result may contain a lot of data; for this reason the result of an evaluation is not written directly to an OGEMA resource by default. The result of a Multi-evaluation is provided via the interface MultiResult (generic version). As described before, multi-evaluation usually only makes sense for domain-specific EvaluationProviders, so usually there is also a domain-specific extension of this interface. For GaRo this is GaRoMultiResult, which contains all evaluation results for a single evaluation interval. In AbstractSuperMultiResult the results of all intervals evaluated within a multi-evaluation are collected.

The result structure must fit the further usage of the results. For multi-evaluations there are currently three major applications:

  • Scientific evaluations of larger data sets collected: Here only offline evaluation is relevant. Results are typically stored as JSON files and further processed in Python. Typically only the basic evaluations on room/building level are performed in Java in day intervals, the further processing, generation of histograms and other graphs as well as the calculation of overall KPIs is done in Python. See the documentation below on the GaRo scheme for more details.
  • Usage as input to further evaluations (see Pre-Evaluation). Here also the JSON files are the standard format used.
  • System supervision and operational monitoring and reporting: In this case current results shall typically be displayed on the OGEMA server, but also on gateways, and the entire processing typically shall be done in OGEMA. A helper for bringing such results into the OGEMA resource structure is currently being designed.

For this reason the standard result format is the generation of JSON files from AbstractSuperMultiResult objects. See the Section on JSON results for GaRo for details.

Dependent Evaluations

The widget timeseries evaluation API fosters the development of evaluation modules that do not store intermediate result time series. For real online evaluation this is usually not possible anyway, as the next step would have to wait until the intermediate time series is calculated completely. Instead of applying small generic evaluation methods to entire time series, there are several generic utility classes that can be used to perform standard tasks such as the calculation of mean, standard deviation and median/quantiles within an EvaluationProvider online on the incoming data. The collection of such standard modules is provided in online/utils.
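
The typical pattern of these utilities is an accumulator that is fed value by value instead of receiving the whole series. The following self-contained sketch shows this pattern for mean and standard deviation (Welford's algorithm); the class name and API are made up for the illustration and are not the classes actually provided in online/utils.

    /** Illustrative online accumulator in the style of the utilities in online/utils
     *  (made-up name and API): values are added one by one, results can be queried
     *  at any time without storing the series. */
    public class OnlineMeanStdSketch {
        private long count;
        private double mean;
        private double m2; // sum of squared deviations from the running mean (Welford)

        public void addValue(double value) {
            count++;
            double delta = value - mean;
            mean += delta / count;
            m2 += delta * (value - mean);
        }

        public double getMean() { return mean; }

        public double getStdDeviation() {
            return count > 1 ? Math.sqrt(m2 / (count - 1)) : Double.NaN;
        }
    }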

Of course, modular evaluation development still requires dependent evaluations in some cases. Dependencies on a domain-specific EvaluationProvider can be handled with a PreEvaluationProvider, see "Pre-Evaluation" for GaRo below.

Start existing evaluations

Multi-Offline Evaluation (example for GaRo system)

The application git\fhg-alliance-internal\src\apps\evaluation-offline-control can be used to start GaRo multi-evaluations. It shows all GaRoSingleEvalProviders and starts them as multi-evaluations for predefined evaluation interval options. Also the application timeseries-teststarter can easily be extended and configured to start certain multi-evaluations for testing and scientific evaluation (see below).

You should use the rundir git\fhg-alliance-internal\rundirs\rundir-eval as a base. You need to build the eval super-project timeseries-tools and the project timeseries-teststarter. The rundir-eval as provided in Git contains only some very tiny GaRo sample data; to test an evaluation you should get another development data set (ask your contact person at the OGEMA contact point). Make a copy of the template rundir in your workspace and place the folder with the GaRo data, which should have a structure like rundir-eval-data/ogemaCollect/rest, next to your rundir folder. In the ogema.properties file add:

org.smartrplace.analysis.backup.parser.basepath=../rundir-eval-data/ogemaCollect/rest

You may specify another directory here and may copy the directories representing some gateways to this directory in order to perform an evaluation only on part of the gateways.

In the config.xml add the following components:

        <bundle dir="bin/apps" groupId="de.iwes.tools" artifactId="timeseries-multieval-base" version="&widgets-version;" startLevel="30" />
        <bundle dir="bin/apps" groupId="de.iwes.tools" artifactId="timeseries-multieval-garo-base" version="&widgets-version;" startLevel="30" />
        <bundle dir="bin/apps" groupId="de.iwes.tools" artifactId="timeseries-multieval-garo-jaxb" version="&widgets-version;" startLevel="30" />

To test the example multi evaluation, also add:

         <bundle dir="bin/apps" groupId="de.iwes.tools" artifactId="timeseries-heating-analysis-multi" version="&widgets-version;" startLevel="30" />   

Now you should be able to use your rundir and get results into the directory evaluationsresults next to your rundir.

Single Evaluation (example for GaRo system)

(Not relevant if you focus on Multi-evaluation, this Section is currently not updated)

The following steps can be used to start a single evaluation run on an OGEMA gateway:

  • You need the same bundles as listed above for the multi-evaluation, except that you would replace the example multi-evaluation bundle with your own application.
  • Implement your GaRoSingleEvalProvider and related classes as described in 'Option 1 (Standard)' below - or use one of the EvaluationProviders given in the OGEMA Alliance Repository in timeseries-energy-analysis-multi and timeseries-heating-analysis-multi.
  • An example how to start such an evaluation provider programmatically is given in an OGEMA Tutorial Snippet.
    Note that this example works only for offline evaluation. An effective way to start online evaluations still needs to be developed.
  • Note: The application timeseries-eval-viz in the OGEMA Alliance Repository can be used to start a single generic evaluation via any DataProvider that is available as OSGi service.

Developing GaRo Evaluations

Overview

A typical data source for OGEMA-based projects and field tests is a structure that comprises several buildings, with a single OGEMA gateway providing data from several rooms of each building. This structure is called Gateway-Room structure and is indicated by GaRo in class and package names. In larger buildings more than one OGEMA system may be used, but typically data delivery to a central evaluation system is only done by a single OGEMA instance providing a consistent data view of the entire building. The building may also provide some data not linked to a single room; this can also be covered by the GaRo structure, as the room level of the hierarchy does not necessarily have to be specified. More details are provided in the Section "Develop your own GaRo evaluation".

If a different data setup is used for a project, another implementation similar to the bundle timeseries-multieval-garo would have to be implemented and used; the following steps should then be applicable in a very similar way.

GaRo EvaluationProvider options

The basic EvaluationProvider class for GaRo evaluations is GaRoSingleEvalProvider. The most important extensions to the standard EvaluationProvider interface are the provision of GaRoDataTypes as input data definition and the provision of the room types that shall be evaluated, which includes the definition that an evaluation shall run on building data not connected to a specific room. There are several extension interfaces and abstract classes available for special cases, which can mostly be freely combined:

  • GenericGaRoSingleEvalProvider: Standard abstract class for the implementation of simple GaRo EvaluationProviders. You have to adapt ID, LABEL and DESCRIPTION and define getGaRoInputTypes and RESULTS with the respective result definitions. Usually the core logic is implemented in the sub-class EvalCore (constructor and method processValue); a structural sketch is given after this list. See 'Result Levels' for more details on how/where to implement the evaluation logic and results.
  • GenericGaRoSingleEvalProviderPreEval: Standard abstract class for the implementation of GaRo-EvaluationProviders requesting the results from other evaluations (transfer of results via JSON file).
  • When providing a MultiResult class extending GaRoMultiResultExtended you can also generate 'overallResults', meaning results that depend on more than one room or time period. See 'Using GaRoMultiResultExtended' below for more details on this.
  • GenericGaRoSingleEvalProviderResResult: In this variant the result map obtained via the standard EvaluationProvider interface is limited to a minimum and the real results are returned via an OGEMA resource. This is most relevant when the result of a single evaluation run already has quite a complex structure or the number of results varies depending on the input data. Note that writing such results to JSON is currently not supported.
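
The following sketch only illustrates the structure described for the first option above (ID/LABEL/DESCRIPTION, input type declaration, result definition and an inner core class processing values one by one), here again for the presence-weighted mean temperature example. All types and method signatures are simplified stand-ins; the actual GenericGaRoSingleEvalProvider, GaRoDataType and result definition classes should be taken from the existing providers in timeseries-heating-analysis-multi.

    import java.util.List;
    import java.util.Map;

    /** Structural sketch of a GaRo evaluation provider; simplified stand-in types,
     *  not the real GenericGaRoSingleEvalProvider API. */
    public class MeanPresenceTempEvalSketch {

        public static final String ID = "mean-presence-temperature";
        public static final String LABEL = "Mean room temperature during presence";
        public static final String DESCRIPTION =
                "Time-weighted mean temperature while at least one person is present";

        /** Simplified input type declaration (the real provider declares GaRoDataTypes). */
        public List<String> getGaRoInputTypes() {
            return List.of("temperatureMeasurement", "motionDetection");
        }

        /** Simplified result declaration (the real provider declares result type objects). */
        public static final List<String> RESULTS = List.of("meanTemperatureDuringPresence");

        /** Core logic fed value by value, analogous to the EvalCore.processValue callback. */
        public static class EvalCore {
            private double weightedSum;
            private long presentDuration;
            private boolean present;

            /** idxOfRequestedInput: 0 = temperature, 1 = motion (order of getGaRoInputTypes). */
            public void processValue(int idxOfRequestedInput, double value, long duration) {
                if (idxOfRequestedInput == 1) {
                    present = value > 0.5;
                } else if (present) {
                    weightedSum += value * duration;
                    presentDuration += duration;
                }
            }

            public Map<String, Double> getResults() {
                double mean = presentDuration > 0 ? weightedSum / presentDuration : Double.NaN;
                return Map.of("meanTemperatureDuringPresence", mean);
            }
        }
    }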

If the callbacks of these abstract classes are not sufficient, you might have to extend the provider and GenericGaRoMultiEvaluation, or just provide your own implementation of GaRoEvalProvider/GaRoMultiResult/GaRoMultiEvaluationInstance.

Using GaRoMultiResultExtended

You can use git\fhg-alliance-internal\src\widgets\timeseries-tools\timeseries-heating-analysis-multi\src\main\java\de\iwes\timeseries\provider\genericcollection\OutsideTempGenericMultiResult.java and ComfortTempRB_OverallMultiResult as examples:

  • In testing cases this is just an adaptation of the method getSummary.
  • If you want to provide new values that are the result of an entire MultiEvaluation (all rooms, gateways and time steps), you usually define additional members of the class extending GaRoMultiResultExtended, as in OutsideTempGenericMultiResult. You have to make sure you get the right results into JSON (all public members and public methods starting with 'get' are exported).
  • If you want to provide per-timestep results, typically a new TimeSeries should be created as a member of the class extending GaRoMultiResultExtended, as in OutsideTempGenericMultiResult.
  • If you want to provide per-gateway results, typically an "overall room" is created in RoomData and added to the map of results, so no additional members are required in the class (see ComfortTempRB_OverallMultiResult).

Generation and Usage of JSON based results in GaRo Multi-evaluations

As explained above, the standard result format of multi-evaluations is the generation of a JSON file. The JSON result files are generated by serialization of objects of type GaRoSuperEvalResult, which is identical to AbstractSuperMultiResult except for some generics specification. The method exportToJSONFile(String fileName, MultiResult multiResult) in the class MultiEvaluationUtils is used for the serialization based on the Jackson library. The SuperMultiResult contains a list of GaRoMultiResult objects, which contain the elements defined by the interface MultiResult and the extensions in GaRoMultiResult. It is important to understand that the Jackson library writes all public member fields into the result file as well as the result of all public methods starting with 'get', using the name behind the 'get' as the JSON element name. So these public fields and getter methods are used to provide the result content for the JSON file. Other fields and methods can be used to transfer information between the single evaluations of the multi-evaluation and also to transfer result information to processing results that are not linked to a single room, but to an entire gateway or to the entire multi-evaluation.
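
The following small, self-contained example shows this Jackson behaviour with a made-up result class (not a GaRo class): the public field and the public getter appear in the JSON output, the private field without a getter does not.

    import com.fasterxml.jackson.databind.ObjectMapper;

    public class JacksonExportDemo {

        /** Made-up result class, only to show what Jackson exports. */
        public static class DemoRoomResult {
            public String roomId = "room1";              // public field -> exported
            private double meanTemperature = 21.3;       // private, but exposed via a getter
            private double internalHelperValue = 42.0;   // private, no getter -> not exported

            public double getMeanTemperature() { return meanTemperature; }
        }

        public static void main(String[] args) throws Exception {
            String json = new ObjectMapper()
                    .writerWithDefaultPrettyPrinter()
                    .writeValueAsString(new DemoRoomResult());
            System.out.println(json);   // {"roomId" : "room1", "meanTemperature" : 21.3}
        }
    }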

To analyze and plot the results we recommend to use Python/Spyder. For details see Analysis of (Multi-)EvaluationProvider Results with Python.

Structure of GaRoMultiResult and calculation of different Result Levels

Each multi-evaluation is run by running all relevant single evaluations and aggregating the results into a common structure. In the simplest case a single evaluation is performed for each room and for each result KPI a single value per room is stored. Some evaluations are not performed on room level, but on building level (e.g. using the electricity consumption of the building). Some evaluations run on room level but shall provide results on building/gateway level (e.g. the maximum temperature measured in the entire building). For this reason different result levels are specified here, with documentation on how to implement the respective evaluation:

  • Results linked to a specific room are directly obtained from each single evaluation. For each room evaluated an object of class RoomData is generated containing the room and gateway id allowing to identify the room to which it belongs. The objects are stored in the list roomEvals. Each scalar result provided by the EvaluationInstance of the EvaluationProvider is stored in the public map evalResults (as String representation for safe conversion into JSON). The actual result objects are stored in the map evalResultObjects, which is not public and thus not written into the JSON file.
  • Results linked to an entire gateway are written into the default room. Such results are also written into a RoomData object, but with the room id ##Building, and stored like room results. Each GaRoSingleEvalProvider should either provide results for a specific room (standard case) or for an entire gateway. In the latter case the method getRoomTypes should return new int[] { -1 } . The data requested via getGaRoDataType must be available on building level for the gateways evaluated. For the sema data this is only GaRoDataType.PowerMeter; all other sensors are attached to rooms. EvaluationProviders that operate on room level can still set up a ##Building RoomData result in their own extension of GaRoMultiResultExtended (see the next Sections).
  • Results linked to an entire evaluation interval, i.e. to all gateways evaluated: Such results cannot be calculated in the single evaluations directly, as these evaluations always operate on room or gateway level. So in this case the GenericGaRoSingleEvalProvider has to define its own GaRoMultiResultExtended in the method extendedResultDefinition. An example providing a time series to JSON and the value generalOutsideTemperature internally for communication between the single evaluations is given in ExampleOverallGaRoMultiResult.

Pre-Evaluation

If an EvaluationProvider needs the results of another EvaluationProvider as input it must declare a Pre-Evaluation request. In the GaRo system this means that the interface GaRoSingleEvalProviderPreEvalRequesting has to be implemented, usually by using the abstract class GenericGaRoSingleEvalProviderPreEval instead of the standard GenericGaRoSingleEvalProvider. Note that this abstract class may also be used just to obtain the current gateway and room id via the method provideCurrentValues, as this information is not available in the single evaluations otherwise (there are now also Configurations named gwId, roomId and roomName that can be used to obtain this information, but this usually means more programming effort).

There are two ways of using Pre-Evaluation data inside the destination EvaluationProvider:

  • Read the respective value or time series from the PreEvaluationProvider, usually with the methods getRoomData or getGatewayData in GaRoPreEvaluationProvider, but also getIntervalData or getSuperEvalData in the parent interface PreEvaluationProvider can be relevant. Usually only the current gateway and room are used, but in principle also information from other gateways and rooms can be obtained.
  • Inject a timeseries from the pre-evaluation data into the input time series of the destination EvaluationProvider. This can just be done by adding the respective time series result type of the PreEvaluationProvider into the return list of GenericGaRoSingleEvalProviderPreEval#getGaRoInputTypes() . See git\fhg-alliance-internal\src\apps\automated-heating-time-detection\src\main\java\de\iee\app\automatedheatingtimedetection\provider\heatcontrol\HeatControlQualityEvalProvider.java for an example.

The data transfer from a PreEvaluationProvider to another EvaluationProvider using the data is usually done via the JSON result file of the source evaluation. There is also the option of using data directly from an AbstractSuperMultiResult obtained from the source application, but there is currently no API/utility support for this.
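
If you want to inspect such a JSON result file outside of the PreEvaluationProvider mechanism, it can also be read back generically, e.g. with Jackson. In the following sketch the file name and the navigated element names are purely illustrative; the actual element names depend on the public fields and getters of the result classes of the source evaluation.

    import java.io.File;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class ReadPreEvalResultSketch {
        public static void main(String[] args) throws Exception {
            // Illustrative file name; in practice this is the JSON file written by the source evaluation
            JsonNode root = new ObjectMapper().readTree(new File("preEvalResult.json"));
            // Element names below are examples only; they depend on the source result classes
            for (JsonNode interval : root.path("intervalResults")) {
                System.out.println(interval.path("startTime").asLong());
            }
        }
    }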

The initial example evaluation provider using Pre-Evaluation is in the OGEMA Alliance repository at git\fhg-alliance-internal\src\widgets\timeseries-tools\timeseries-heating-analysis-multi\src\main\java\de\iwes\timeseries\provider\genericcollection\ComfortTempRB_OverallProvider . An older example is git\fhg-alliance-internal\src\widgets\timeseries-tools\timeseries-heating-analysis-multi\src\main\java\de\iwes\timeseries\provider\heatingloss .

Evaluations that use Pre-Evaluation only

In some cases an evaluation no longer needs input from any DataProvider and does not need to process unaligned time series via the method processValue, but shall just process the MultiResults from the Pre-Evaluation data. In this case the methods executeSuperLevelOnly and performSuperEval need to be overwritten in GaRoSingleEvalProviderPreEvalRequesting. An example for this can be found in timeseries-heating-analysis-multi\src\main\java\de\iwes\timeseries\provider\valve\outtemp\ValveOuttempEval.java . Note that this is quite similar to processing evaluation results with the Python libraries provided.

Multi-Dataprovider Evaluation

The GaRo evaluation framework supports giving more than one DataProvider as input to a multi-evaluation. Using time series from more than one provider in a single evaluation has not been tested, though. If data from different sources shall be combined into a single (multi-)evaluation, it is recommended to use a DataProvider that is able to aggregate different other data providers and time series data sources. An example of such a data provider is given in SmartEffGaRoProviderTSGaRo, but this requires a special environment.
In any case the correct mapping of gatewayIds and roomIds between different DataProviders cannot generally be done without additional information, so special implementations like the SmartEffGaRoProvider may be a good solution for this. In some cases it may be easier to implement a time series injection via GaRoSingleEvalProviderPreEvalRequesting#timeSeriestoInject . In this case the capability of GenericGaRoMultiEvaluation to obtain instances of GaRoSingleEvalProviderPreEvalRequesting via an object obtained as configuration would have to be implemented.

Parameters and Scalar inputs

Up to now we have described how to define time series as input as well as result values from pre-evaluation. Usually parameters in the form of scalar values are also required as input to an evaluation. There are several options for including them:

  • Define constants in the evaluation provider. This is the easiest way and especially suitable for initial testing. As these constants are hard coded this is not suitable for evaluations that need to be used with different parameter values in (semi-)productive systems.
  • Define the constant via a system property. This is a bit more flexible than a constant defined in the evaluation provider, but it does not effectively allow changing and storing values during run time. If that is not required, this option is still the easiest to implement that does not affect the OGEMA database etc. (see the sketch after this list).
  • Define a configuration element in the evaluation provider (usually with a default value defined via one of the options above). This allows setting the value via timeseries-eval-viz, but the implementation of automated parameter management requires quite a lot of code.
  • Define a parameter resource that may reference several resources if several values from different resources are relevant for input. This is recommended if individual parameters shall be set by users/applications for the same evaluation provider via a special GUI and the values shall be stored in the OGEMA data base. See ElectricityProfileEvalProvider for an example.
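
A minimal sketch combining the first two options (the property name, default value and class name are made up for this example):

    /** Illustration: scalar evaluation parameter with a hard-coded default that can be
     *  overridden via a system property (made-up property name). */
    public class EvalParameterSketch {
        /** Option 1: constant default defined in the evaluation provider. */
        private static final float DEFAULT_COMFORT_TEMPERATURE = 21.0f;

        /** Option 2: allow overriding via -Dorg.example.eval.comfortTemperature=22.5 */
        static float getComfortTemperature() {
            String prop = System.getProperty("org.example.eval.comfortTemperature");
            if (prop == null)
                return DEFAULT_COMFORT_TEMPERATURE;
            try {
                return Float.parseFloat(prop);
            } catch (NumberFormatException e) {
                return DEFAULT_COMFORT_TEMPERATURE;   // fall back to the default on a malformed value
            }
        }
    }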

Management of JSON Results and KPI data

Up to here you have learnt how to develop evaluations that can be executed as multi-evaluations, how to run them and how to access the results. The EvalResultManagement and EvalScheduler APIs allow calculating, managing and viewing result data and KPIs for intervals like days, weeks, months etc. Check the respective pages in evaluation-offline-control and the source code to see how the API works. A very short introduction is given below.

You find the APIs in git\ogema-alliance\src\tools\util-extended-eval\src\main\java\org\ogema\util\jsonresult\management\api\EvalResultManagement.java and git\ogema-alliance\src\tools\util-extended-eval\src\main\java\org\ogema\util\evalcontrol\EvalScheduler.java .

When an EvaluationProvider shall be applied regularly (e.g. every day) and as a multi-evaluation, the settings for the evaluation that would normally be set in the evaluation-offline-control app need to be stored persistently. If an evaluation is started e.g. once per day, the result for each day usually also has to be stored persistently. Both the configurations and the result data are stored in the OGEMA resource database in resources of type MultiKPIEvalConfiguration. The management of these configurations is done in the class EvalScheduler. This management also provides the functionality to search for existing pre-evaluation results (stored as JSON with resource reference information in JSONResultFileData) and to perform pre-evaluations to obtain missing pre-evaluation data. It also has a functionality to queue evaluations to make sure that only a single evaluation is run at the same time under each instance of EvalScheduler. Furthermore it provides an auto-scheduling functionality that allows running an evaluation automatically on a regular basis, e.g. once per day.
Such configurations are used for repeated manual starting of an evaluation or for repeated automated starting, which is called Auto-Scheduling. Auto-Scheduling is part of the API but is still under development.

The widget template page class org.ogema.util.directresourcegui.kpi.KPIMonitoringReport provides a template that allows displaying KPIs calculated via this module easily.

Generation of Graphical Views of Time Series

Many graphical representations for data analysis, especially histograms, are most easily generated using Python. For generating graphical representations directly on OGEMA systems, various variants of the Schedule Viewer are provided. The latest version is the Timeseries Viewer Expert, which can be configured and opened by any application. Documentation is available on the Timeseries Viewer Expert documentation page.

Start / End and Gap Handling

When taking measurements from the field, data gaps occur almost always in practice. It is very important to handle these gaps correctly, as they may be a source of significant evaluation errors. If, e.g., there are no values for the second half of an evaluated day and the last value collected is used like a valid value for the entire second half of the day, the average value for the day would be highly dominated by this single value. Taking gaps into account also means that the total duration of valid values may differ from the total length of the evaluation interval, which has to be taken into account when calculating averages etc.

The basic class SpecificEvalBaseImpl contains a standard gap handling. See the documentation there or the respective wrapper/access methods in GenericGaRoSingleEvalProvider. The method getMaximumGapTimes can be used to define the maximum gap time for each input type. The method GenericGaRoEvaluationCore.gapNotification is called whenever a gap is detected. The duration of each valid value is also automatically limited to the maximum gap time, so that an implementation of gapNotification is usually not necessary for GaRo EvaluationProviders.
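
The effect of limiting the validity of each value to the maximum gap time can be illustrated with a plain offline calculation (self-contained sketch, not the SpecificEvalBaseImpl implementation):

    /** Illustration of gap-aware averaging: each value is valid until the next value
     *  or for maxGapTime at most, whichever is shorter. */
    public class GapAwareAverageSketch {

        static double timeWeightedAverage(long[] timestamps, double[] values,
                long intervalEnd, long maxGapTime) {
            double weightedSum = 0;
            long validDuration = 0;
            for (int i = 0; i < timestamps.length; i++) {
                long next = (i + 1 < timestamps.length) ? timestamps[i + 1] : intervalEnd;
                long duration = Math.min(next - timestamps[i], maxGapTime);   // cap at the maximum gap time
                weightedSum += values[i] * duration;
                validDuration += duration;
            }
            // validDuration may be shorter than the evaluation interval if there were gaps
            return validDuration > 0 ? weightedSum / validDuration : Double.NaN;
        }
    }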

Further notes:

  • The PrimaryPresenceEvalProvider contains a gap check for motion sensor signals. It differentiates between gap time within the daily time series and gaps that occur because there is no data at all for certain time series and days. The gap time is given relative to the evaluation time so that values can be compared even for different base evaluation intervals.
  • There is a basic gap evaluation in timeseries-eval-base\src\main\java\de\iwes\timeseries\eval\base\provider\gap (status not tested after 2017, note that this is not a GaRo-EvaluationProvider).

TODO: It should be checked how the initial interval from startTime until the first value inside the interval is processed, and the gap handling should be discussed. Maybe further options for the InterpolationMode need to be implemented.

Evaluation of estimation algorithm quality

The evaluation system can be used to test estimation processes such as sensor replacement and other algorithms on collected measurement data. In this case you typically define three EvaluationProviders that set up a Pre-Evaluation chain:

  • EvaluationProvider that calculates general parameters and models for the algorithm. Usually this is done on training data that is separated from the test data used in the next step. Depending on the algorithm requirements additional ground truth data may be used that is not available to the algorithm test evaluation in the next step.
  • The Algorithm is applied to the test data using the parameter and model data calculated in the previous step
  • An algorithm quality evaluation compares the algorithm result to some ground truth that needs to be available in order to evaluate the estimation quality
    In some cases also a plausibility quality evaluation can be defined if no ground truth is available.

An example for this is available in git\fhg-alliance-internal\src\widgets\timeseries-tools\timeseries-sensor-replacement .
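
As a minimal illustration of the quality-evaluation step (third item in the list above), a typical KPI is the root mean square error between the algorithm output and the ground truth series. The sketch below is a plain calculation assuming both series are already aligned; in the actual setup the comparison is of course performed inside an EvaluationProvider.

    /** Illustration of a simple estimation-quality KPI: root mean square error between
     *  the algorithm output and the ground truth, assuming aligned series. */
    public class EstimationQualitySketch {

        static double rootMeanSquareError(double[] estimated, double[] groundTruth) {
            double sumSquaredError = 0;
            for (int i = 0; i < estimated.length; i++) {
                double error = estimated[i] - groundTruth[i];
                sumSquaredError += error * error;
            }
            return Math.sqrt(sumSquaredError / estimated.length);
        }
    }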

Plausibility Checks

Checking plausibility is very important for any evaluation. If no other plausibility check is available, it is recommended to generate a histogram of the result file using Python; this can be applied to almost all EvaluationProviders and can reveal several kinds of errors (but not all, of course).

Helpers and Utils

Most algorithms for time series cannot be used with the EvaluationProvider interface directly, as all evaluations have to be expressed as online logic, meaning that you do not have all values at once. Algorithmic helpers that can be used with online logic, usually offering an addValue functionality instead of expecting all data to be given as a time series, are provided in timeseries-multieval-base/src/main/java/de/iwes/timeseries/eval/online/utils.
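
A sketch of such an addValue-style helper for medians and quantiles is given below. It is deliberately simplified (it keeps all added values in memory and sorts on demand, which is usually acceptable for day-interval evaluations); the class name and API are made up for this illustration, the actual helpers are those provided in online/utils.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    /** Illustrative addValue-style quantile helper (made-up API, not a class from online/utils):
     *  values are added one by one, quantiles are computed on demand from the collected samples. */
    public class QuantileEstimatorSketch {
        private final List<Double> values = new ArrayList<>();

        public void addValue(double value) {
            values.add(value);
        }

        /** quantile: e.g. 0.5 for the median, 0.95 for the 95% quantile */
        public double getQuantile(double quantile) {
            if (values.isEmpty())
                return Double.NaN;
            List<Double> sorted = new ArrayList<>(values);
            Collections.sort(sorted);
            int index = (int) Math.ceil(quantile * sorted.size()) - 1;
            return sorted.get(Math.max(0, Math.min(index, sorted.size() - 1)));
        }
    }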

Test and Run your evaluation provider with timeseries-teststarter (deprecated for most applications)

Note: The timeseries-teststarter is mainly for testing and scientific evaluation. If you would like to run a GaRo evaluation without modifying the source code of your starter app, use git\fhg-alliance-internal\src\apps\evaluation-offline-control.

To start a new evaluation with the teststarter you typically add a new button to the Eval-Starter GUI page timeseries-teststarter\src\main\java\de\iwes\app\timeseries\teststarter\gui\MainPage.java. You can use gaRoButton4 (option 1) or gaRoButton4ct_rt_overall (option 2) as a template. This also requires adding the new service to the class TeststarterApp.

Finally, run the evaluation and evaluate the JSON file that was created based on the name you specified in the GaRoEvalHelper.GaRoTestStarter constructor on the teststarter page. To analyze and plot the results we recommend using Python/Spyder. For details see Analysis of (Multi-)EvaluationProvider Results with Python.

When you have finished implementing the evaluation, you should provide an extension to the initial requirement specification describing how you actually solved the problem and where to find the sources, and provide a short presentation of the results.

Run evaluation with partial input (deprecated for most applications)

When using the multi-evaluation as described above, all data from all gateways in the data set that fits the EvaluationProvider input definition is evaluated. In order to run the evaluation with only partial input you can do the following:

  • In order not to use all gateways in the data set, prepare an input data set containing only the gateway information you want to evaluate and provide the reference in the property org.smartrplace.analysis.backup.parser.basepath as described above in "Setting up the Rundir".
  • To use only some room types, use a class inheriting from the respective EvaluationProvider that also implements GaRoSingleEvalProvider and overwrites the method getRoomTypes. Use this EvaluationProvider with the GenericGaRoMultiProvider.