Group and Aggregate Data in an EDS Extract

This page describes how to use data grouping and aggregation functions of EDS extract interface.

Extract Warehouse Positions for a Specific Date, Source, and Entity

To extract warehouse positions for a specific date, source, and entity:

image-20240730-202041.png
  1. Create a new interface based on the warehouseposition core data object. 
    You see the EDS worksheet.

  2. Map the following fields to the target taxonomy: 
    - sourceName
    - effectiveDate
    - entityId
    - localCurrency
    - securityAlias
    - marketValueBase

    EDS Worksheet in Taxonomy Mapping Mode
  3. To test run your interface, on the Solution tab, in the Debug group, click Test Solution.
    You see the Run box. 

  4. In the Run box, do the following:
    - Provide parameters for entity_id, effective_date, and source.
    - Remove the limitation on maxrows
    - Change the default EagleML output format to CSV With Header.

    Run box - Task Parameters tab
  5. Click Run to run the extract, just to make sure that it produces the output.
    Preferably, your result should contain records with multiple currencies.

  6. Adjust your parameters, if necessary.
    It is normal and expected that the data is ungrouped at this point.

Group the Result Rows by the Local Currency

To group the result by a column: 

In the properties pane, under Group By, populate the Group Key box. When you do so:

  • The key can be any expression based on columns from source data set, such as |LOCAL_CURRENCY|

  • You can type the pipe symbol for the autocomplete feature to show you all the available variables from which to select

  • Use a plus sign, +, if you need to concatenate multiple fields into one key. For example:
    |ENTITY_ID| + |LOCAL_CURRENCY|

Properties pane - Group Key box

If you test run your interface again, you see that the result is now grouped. In the Formatted Result tab, notice the localCurrency column values. 

Add a Group Header to Display the Local Currency Value

To add a group header displaying the value of group key:

  1. In the Properties pane, under Mapping Scope, select OnGroupStart.
    In the destination grid, you now can define columns and values to be shown before each group.

  2. Reuse the first column name for the eachRow section for a nice alignment.
    In this example, the column name is irrelevant, but you do not want any extra column names in your output.

  3. In the value, you paste the expression:  '-- start of group ' + |LOCAL_CURRENCY|

    EDS Worksheet - Add a Group Header

Even though you can test run your interface without publishing it, and even without saving, it is still a good idea to click Save from time to time.

Add a Group Footer with Expression to Calculate Sum of MarketValueBase and Number of Records in the Group

To create a footer for each group:

  1. In the Properties pane, under Mapping Scope, select OnGroupEnd.

  2. Add the following column names and values:
    - Add label1 with a value of '== MarketValueTotal: '
    - Add marketValueBaseTotal with a value of sum( atof(|MARKET_VALUE|) )
    - Add label2 with a value of 'count: '
    - Add countTotal with a value of count()

    NOTE: You used the atof() function on MARKET_VALUE before passing it to the sum() aggregator. You did so because by default all the values from the source data set are considered strings. And if you need to make any arithmetic operations on them, they should be converted to integer or float numbers with the atoi() or atof() functions first.

  3. Run the interface again, and see how the group headers and footers look.

    Run box for group headers and footers

Make Totals and Counts More Readable by Applying a Formatting Function

In this example, the calculations are correct, but the format of the numbers is not attractive. You can modify your expression for the total by limiting the number of decimal places to 2 and adding a comma as the thousands separator.

To do so, you have to enclose your existing expression into the convert() function, as follows.

Convert( oldsumexpression, 2, '.' , ',')
Which becomes:
Convert( sum( atof(|MARKET_VALUE|) ), 2, '.' , ',')

The additional parameters of the convert functions mean number of decimal points, decimal point symbol, and thousands separator.

Add a Grand Total Footer to Show the Sum of the marketValueBase Column Except for USD, and Total Count of Records with Negative Values of MarketValueBase

You can now add a master header and footer, and look at conditional aggregation. By the time the header is sent to output, no records were processed yet. This means the header cannot contain any kind of totals and counts, but can contain only globally available information.

To add a footer with a grand total:

  1. In the Properties pane, under Mapping Scope, select OnStart.

  2. Add the current date to the header in the onStart section, as follows: '* Run date:' + convert(today(),'dd-MMM-yyyy')

    Properties pane - OnStart used for current date header
  3. In the Properties pane, under Mapping Scope, select OnEnd.
    The onEnd footer has information about all the records in the output. And you can count grand totals across all the groups. Notice that the name of the function differs from those which had a group scope.

  4. Add the following column names and values to the onEnd mapping:
    - Add label1 with a value of '== MarketValueTotal non-USD: '
    - Add marketValueBaseTotal with a value of convert( sumTotal( atof(|MARKET_VALUE|), |LOCAL_CURRENCY| != 'USD' ), 0,'.',',')
    - Add label2 with a value of 'count of negatives: '
    - Add countTotal with a value of countTotal( atof(|MARKET_VALUE|) < 0)
    As you can see, the expressions have some new criteria:
    - You used sumTotal() and countTotal() functions instead of sum() and count(). Such 'total' functions work in the whole extract scope, ignoring the groups.
    - You provided a second parameter to the sumTotal() function. It is a condition, telling this function to consider only lines with non-USD currency. Sum() function of course supported such parameter as well.
    - Similarly, count() and countTotal() can accept a condition parameter. In this example, you only count the number of rows, where the market value was negative.

    Run box results

The system always displays the onStart output at the top of the results, appearing above the CSV header.

List and Describe All the Supported Aggregate Functions

A short summary of all the available aggregate functions in EDS follows. For more information about these functions, see List of Supported Functions for EDS Tool.

Functions: Min, Max, and Sum

  • Min

  • Max

  • Sum

Usage

func[Total](value[,condition[,defaultValue]])

[Total] - This suffix after method name means that function will be calculated for all the records. Otherwise only for records in the group.

value - Any expression that returns the same type of value as defalutValue.

Remember to use the atoi() and atof() function to convert strings into numbers.

condition - Any expression that returns a boolean type value. The function will ignore the elements for which the condition evaluates to false.

defaultValue - Any expression that will be used as initial value for calculation. Default value is 0.

Function: Count

  • Count

Usage

func[Total]([condition[,defaultValue]])

Examples

sum(atof(|MARKET_VALUE|)) - sum of all MARKET_VALUE

sum(atof(|MARKET_VALUE|), atof(|MARKET_VALUE|) >= 0) - sum of all non-negative MARKET_VALUE

sum(atof(|MARKET_VALUE|), atof(|MARKET_VALUE|) >=0, 100)- sum of all non-negative MARKET_VALUE + 100 (100 will be in result if there are no rows with non-negative MARKET_VALUE at all)

count() - count of all processed records

count(atof(|MARKET_VALUE|) >=0) - count of all records with non-negative MARKET_VALUE

sum(':'+|UPDATE_SOURCE|, 'True', '') – example of string concatenation – The result will contain colon-separated list of update sources.