
Introduction

There are many ways to design, configure, and operate your Data Mart. Because different approaches are valid in different contexts, it is not possible to define a single best practice for every task. There are, however, useful lessons from the experience of Eagle and its clients with Data Mart. This document assumes that you are familiar with how to use Data Mart; rather than covering basic concepts, it recommends specific tools and techniques that lead to better results.

Supporting Non-OLAP Data in the Mart

All data built by PACE in Data Mart tables comes from the action of an OLAP process. Given the powerful and flexible data management tools available in Data Mart, there are strong arguments for using the OLAPs to populate as much of the required reportable data as possible. However, it is sometimes necessary to support data in the Mart that was not placed there by the OLAPs.

When to Move Non-OLAP Data to the Mart

Some types of data either should or must bypass the OLAP processes that build data into mart tables:

  • Data in structures not supported by OLAPs. The Eagle data model is very flexible and supports a wide range of data structures. You can build custom tables and join them to core tables to permit access by the OLAPs. You can also create database views and build field attributes to report their contents. However, there may be instances where data is loaded to the Eagle warehouse for which OLAP access is not readily available and cannot easily be provided.

  • Report-ready, single-use data loaded in high volume. Even for single-use data, Eagle recommends as a best practice that all data be stored in the warehouse first, with Data Mart then running the OLAP processes to move the values from the warehouse schema to the Data Mart schema. This is the best approach when you load data from an external source, such as report-ready performance data.

  • Data Moved in Anticipation of a Future Release Enhancement. You may need to deliver a type of information that is supported in Data Mart only in a future release due to a product enhancement. Release upgrade may not be practical before the information is required. If the nature of the enhancement is to enable a new type of field to be built into existing Mart tables, a temporary workaround strategy may be available. Refer to “Direct Loads to Future Table Extensions” for additional information.

How to Move Non-OLAP Data to the Mart

The following sections describe how to deploy non-OLAP data to the Data Mart schema.

Non-Materialized Views

A non-materialized view is simply a SQL query against other database tables, exposed to reports and applications as if it were a physical table. This simplifies access, since joins between tables and views are supported. The main drawback of such a view is that every use incurs the latency of executing the underlying database query. Reserve this type of view for the simplest cases.
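The pattern can be sketched with any SQL database. In the sketch below, SQLite stands in for the warehouse, and the table, view, and column names are hypothetical, not the Eagle schema:

```python
import sqlite3

# Illustrative only: table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE custom_ratings (security_id INTEGER, rating TEXT)")
conn.execute("INSERT INTO custom_ratings VALUES (1, 'AA'), (2, 'BBB')")

# A non-materialized view: just stored SQL, re-executed on every query.
conn.execute("""
    CREATE VIEW v_ratings AS
    SELECT security_id, rating FROM custom_ratings WHERE rating IS NOT NULL
""")

# Reports query the view as if it were a physical table; the underlying
# SELECT runs at query time, which is the latency cost noted above.
rows = conn.execute("SELECT rating FROM v_ratings WHERE security_id = 1").fetchall()
print(rows)  # → [('AA',)]
```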

Materialized Views

Materialized views are physically built from data returned by underlying SQL. They are a sort of database copy. Queries against them perform much better than those against non-materialized views of the same datasets, since data is physically present and does not have to be assembled on the fly. They are best suited to views of larger numbers of fields. A drawback of materialized views is that they must be updated and maintained. Oracle offers a powerful toolset, Oracle Materialized Views, for deploying materialized views.
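SQLite has no native materialized views, so the sketch below simulates one by rebuilding a physical table from the defining query; Oracle Materialized Views handle this refresh natively. All names are illustrative:

```python
import sqlite3

# Sketch: simulate a materialized view with a physical table that is
# refreshed from its defining query (Oracle does this natively with
# CREATE MATERIALIZED VIEW and refresh policies).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (fund TEXT, mv REAL)")
conn.executemany("INSERT INTO positions VALUES (?, ?)",
                 [("F1", 100.0), ("F1", 50.0), ("F2", 25.0)])

DEFINING_SQL = "SELECT fund, SUM(mv) AS total_mv FROM positions GROUP BY fund"

def refresh_mview(conn):
    # A full refresh: drop and rebuild the physical copy.
    conn.execute("DROP TABLE IF EXISTS mv_fund_totals")
    conn.execute(f"CREATE TABLE mv_fund_totals AS {DEFINING_SQL}")

refresh_mview(conn)
# Queries now hit physically stored rows: no aggregation at query time.
total = conn.execute(
    "SELECT total_mv FROM mv_fund_totals WHERE fund = 'F1'").fetchone()
print(total)  # → (150.0,)
```

This illustrates the maintenance drawback noted above: the copy is only as fresh as its last refresh.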

Direct Loads to Reserved Columns of Mart Tables

The Selective Fields enhancement in V9.0 Data Mart allows you to load fields directly into Data Mart tables, as long as a regular Data Mart Submit can populate all rows of those tables. You can reserve some fields in the tables for supplemental data from direct loads. To do this, create and select a set of dummy field attributes for the model, so that the model creates columns in the table to receive the external load. Then use Selective Fields to avoid populating those fields in the regular build, reserving them for the external load. Eagle recommends that you create and populate an extension for this purpose, since the Mart does not delete data, but simply updates it.

This practice introduces complexities, and you should attempt it only in consultation with our Professional Services Consulting Group. It may be simpler just to add a custom table to the Data Mart schema and join to appropriate tables in reporting.

Direct Loads to Future Table Extensions

If you want to use a type of Mart data field that is supported only in a release later than yours, you may be able to load those fields into a table that has the same key structure as a Data Mart table extension. Give the table the same name as the extension you would create for those fields in the later release, and report against it as if it were a regular Mart table extension. Then, when you upgrade, create the table extension in the Mart and give it the name of the table you have been using for the new fields. Add to that extension the Mart fields necessary to populate your table fields, now supported in the Mart. This is a simple process with the benefit that reporting data access logic need not change at all.

Planning the Mart

Develop a Data Mart Implementation Plan

With a clear and complete functional specification, a knowledgeable resource can configure a Data Mart very quickly. This makes it worth the time to create a good implementation plan at the start.

Inventory the Fields and Groupings to be Supported

The first step in planning a Data Mart is to take as complete an inventory as possible of all the fields and groupings you plan to support in your information delivery project. The more complete the inventory, the better you can create a logical and economical design of models, model extensions, and groupings. This does not mean that adopting the Mart is an all-or-nothing, "big bang" step; it means only that, within the scope of the data to be supported in the Mart, a complete inventory makes for a better plan.

This inventory process should include an effort to standardize where possible on a minimal set of definitions of common fields like market value and security price. Standardization makes possible a compact Mart where data elements are widely shared among consumers of the data.

Start Small and Build on Your Successes

While it is best practice to create a complete inventory of the fields you will need for a specific information delivery goal, that goal may be a modest one, especially if it is your first Data Mart project. The Mart is architected for future growth, so a complete analysis of a present, known need does not require foreknowledge of all future reporting requirements.

Identify and Create any Missing Meta Data Elements

Every Data Mart field requires a field attribute, and every Data Mart grouping requires some set of field attributes, range rules, and dictionaries (regular and performance). The Mart implementation plan should identify these metadata elements quite specifically and indicate where something must be created.

Use the Data Mart Planning Template

Global Professional Services has a template that is very useful for recording all of the Data Mart and regular metadata elements involved in a Mart, and it can serve as the basis of an implementation plan. It is available as Knowledge Base article #8619.

Planning a Sources Strategy

A sources strategy for Data Mart is a design that uses the minimal set of source rules and source-specific fields required to provide all of the source perspectives needed in reporting. Often this implies a single source rule, if data of every type, such as positions and trades, can correctly be reported using just a single hierarchy of sources. More than one source rule is required only if certain individual positions, trades, or securities must be viewed in more than one way within the scope of information delivery.

Common use cases for multiple source rules or source-specific fields include:

  • Multiple providers of bond ratings or other security analytic data whose viewpoints must all be considered in the security evaluation process;

  • Multiple accounting bases, such as GAAP and statutory, that require accounting fields to be computed and reported in different ways.

When more than one type of data must be reported in more than one source hierarchy, the number of required source rules can proliferate, leading to several snapshots and heavy duplication of data. For this reason, you should make choices that limit the required list of snapshots. See Sharing Security Details Among Snapshots for additional information.

A best practice is to build best data in PACE where possible, and then use the resulting Eagle PACE source as the main source for the data types you build.

Planning for Good System Performance

Building Only the Data You Need

The single most important aspect of Mart planning for good build performance is a realistic assessment of what data you really need to send to the Mart. The best way to answer this question is to consider all of the processes that will use Mart data and understand their demands in detail. Requirements are likely to expand over time, but you can always add fields later. It is better to add later than to over-build your Mart from the start.

Which fields are needed daily? Many organizations maintain a monthly reporting cycle that generates demand for a comprehensive list of monthly fields. Daily values may be needed for a much shorter list. If this is true, use the Selective Fields capability to limit the daily build to those fields needed daily, with a full build conducted at month end.

Some types of performance returns, such as returns for specific years or quarters, do not change every day. If you report such returns, you can avoid building them every day or even every month by building them only once on their true effective date, for example on December 31, 2020 for a “Year 2020” return. Then in reporting, you would always select them as of their effective date.
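The idea can be illustrated with a hypothetical, simplified fund summary table: the period return is built once on its true effective date, and reporting always selects it as of that fixed date:

```python
import sqlite3

# Hypothetical table layout: a simplified stand-in for a Mart fund
# summary table holding a period return keyed by effective date.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE fund_summary
                (entity_id TEXT, effective_date TEXT, year_2020_return REAL)""")
# The "Year 2020" return is built once, on its true effective date...
conn.execute("INSERT INTO fund_summary VALUES ('FUND1', '2020-12-31', 0.0712)")

# ...and reporting always selects it as of that fixed date, regardless
# of the report's own as-of date.
row = conn.execute(
    "SELECT year_2020_return FROM fund_summary "
    "WHERE entity_id = ? AND effective_date = ?",
    ("FUND1", "2020-12-31")).fetchone()
print(row[0])  # → 0.0712
```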

If you plan to deliver a piece of information from Eagle Portal, you can source it from Data Mart or use the dynamic query approach, where results are computed on the fly. The dynamic query approach may be limited in the breadth of data it can cover, and the calculation may take somewhat longer than a Portal query made directly against the Mart. On the other hand, it is not best practice to over-build your Mart just to feed Portal with data that you think might be accessed but for which no clear current demand exists. Portal dynamic queries are best suited to cases where you need to support access to particular types of data or calculations, but you expect access frequency to be very low.

Indexing Warehouse Trade and Cash Date Fields

When you select trade and cash activity fields for Data Mart, you can specify which of several date fields, such as trade date or settlement date, to use as the effective date. Whichever date you choose should have an index built on it in the appropriate warehouse table to speed the processing of trades or cash events. The effective_date field itself is already indexed. Database statistics should be refreshed regularly, ideally daily.
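A minimal sketch of the recommendation, using SQLite and hypothetical table and column names rather than the actual Eagle warehouse schema:

```python
import sqlite3

# Illustrative only: the real warehouse trade table differs.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE trades
                (trade_id INTEGER, entity_id TEXT,
                 trade_date TEXT, settlement_date TEXT, amount REAL)""")

# If trade_date is the chosen effective date for Mart activity fields,
# index it so date-range scans during the build avoid full table scans.
conn.execute("CREATE INDEX idx_trades_trade_date ON trades (trade_date)")

# Confirm the query planner can use the index for a date-bounded scan.
plan = conn.execute("""EXPLAIN QUERY PLAN
    SELECT * FROM trades
    WHERE trade_date BETWEEN '2024-01-01' AND '2024-01-31'
""").fetchall()
print(plan)  # the plan should mention idx_trades_trade_date
```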

Table indexing does not really apply to building performance analysis fields, since they are compute-intensive.

Using Extensions to Exploit Parallel Processing

System performance in building Mart data is enhanced when your design takes advantage of parallel processing by multiple concentration (OLAP) engine instances. Each Data Mart table extension is populated under the direction of a separate Model Manager instance that generates its own set of underlying OLAP processes. This means that the use of extensions can increase parallel processing and speed population of the Mart.

A Global Professional Services study has documented the advantages of extensions in populating performance measurement data in particular. See the Data Mart White Paper in Knowledge Base article #8459. Performance Analysis OLAP processes underlying a Data Mart model are among the most I/O- and compute-intensive of any, so extensions can be particularly helpful in minimizing run time when you build performance linked returns, risk fields, and other complex calculations. When you assign performance fields to an extension, it is best to group together fields that you would logically assign to the same Performance Analysis OLAP report, such as a single attribution group or a series of performance link analysis fields that cover either short-term or long-term periods.

If you install Eagle Mart and the Performance Toolkit, you can see examples of best practices for using extensions to support different types of performance fields. The toolkit uses separate Fund Summary extensions for the following field types:

  • Performance link fields for periods up to one year

  • Performance link fields for periods longer than one year

  • Performance risk analysis fields

  • GIPS Disclosure fields

  • GIPS Composite Performance Analysis fields

Note that restrictions may require you to build only one type of field attribute in a single extension.

Sharing Security Details Among Snapshots

If you build data for multiple accounting sources as different snapshots, you may not want to build the same Security Details model data for each snapshot, as this would be repetitive. You may find it preferable to build Security Details for just one snapshot, such as the Default snapshot.

Since Data Mart requires you to build Security Details whenever you build any detail-level table, you cannot completely avoid building Security Details in each snapshot. However, you can use Selective Fields to build just one field in Security Details in all but the Default snapshot, and save considerable time. Data Mart offers the option of building detail-level tables without also building any Security Details.

If different snapshots contain different securities, you should build Security Details for all securities. However, if different snapshots contain the same securities, you need to build them only once.

Limiting the Number of Entities per OLAP Process

Experience shows that performance OLAP processes typically run best when you change the No. of Funds per Report option in the Configuration dialog box in Manage Mart to about 20. The option is set to 50 by default. You can experiment with this setting to achieve optimal results, especially if you are building a significant amount of performance data.

Limiting the Number of Entities per Model Manager

Similar to the number of entities per OLAP process, you can also control the number of entities per Model Manager. You can increase the default value of 1,000 to find the value that works best for you.

Minimizing Log Level when Not Troubleshooting

Logging at a high level, such as 8 or 10, can consume material time in a build. During normal production you can set the level to 1 to save time, and raise it again if necessary for troubleshooting.

Selecting Settings for Process Manager

If you use Process Manager to build your Mart, pay special attention to the settings described in the following sections.

Avoid the Sequential Processing Setting

You can use Process Manager to manage a variety of Eagle processes. For some processes, each engine should launch in sequence. For others, parallel processing is preferable.

Parallel processing is required for Data Mart to build data with the greatest possible efficiency. To use parallel processing when you use Process Manager to build your Mart, deselect the Process Effective Date Change by Batch checkbox in the Workflow Definition tab of the Workflows Setup wizard when you configure the Process Manager workflows. Deselect this option only when the Funds per Model Manager and Funds per Mart conditions are met. This allows the system to "flood the cluster" and take full advantage of parallel processing.

Ensure Parallel Processing

You must complete the following configuration step for Data Mart parallel processing whether or not you use Process Manager with the Mart. In Eagle Navigator, type PACE Engines and select the PACE Engines (System Management Center) link. Set a Limit Ratio value of 0.25, or another value less than 1.0 as directed by Eagle Technical Services, for three engine types: Data Mart Manager Engine, Model Manager Engine, and OLAP reports. The Diagnostic checks that you run in Data Mart verify that these Limit Ratios are below 1.0.

Find the Best Process Manager Run Group Batch Size

When you set up a Process Manager workflow, you see the Run Groups tab in the wizard. On the Run Groups tab, you define events within the workflow. The Batch Size setting on the Run Groups tab indicates the number of entities that are batched together for purposes of your Mart build. The default setting is 1, which means the workflow starts as soon as any eligible entity is found. This setting may be inefficient because the overhead of starting a Data Mart build, though small, is potentially repeated for every entity built in the Mart. A setting in the range of 25 to 50 is preferable. You can try various settings to determine which gives you the best results.

Special Considerations for Performance Measurement Fields

Performance measurement fields can involve complex database queries and extensive processing requirements. Data Mart builds that require many of these fields are the most challenging for achieving good system performance. There are several ways to improve efficiency as described in the following sections.

Limit the Lowest Level of Performance OLAP Builds

Performance returns are typically stored in the database at the security, group, and fund levels. When a Data Mart performance OLAP process runs to build performance analysis fields, it computes returns for all levels represented in the performance dictionary that is linked to the model by default. For the Fund Summary and group models, you can use the Performance Report Options link in the Filters/Mappings tab of the model user interface to limit the computation of returns to just what the model requires, for example, the Entity level for Fund Summary. Note the following:

If you are building performance attribution fields where your fund and/or group returns are based on rollup of lower-level contributions, you must not limit the rollup process or you will not get the expected attribution data at a rollup level. Note that this does not apply to models where you use dynamic performance rather than a performance dictionary.

If you plan to report security-level performance returns linked to your groups, you can use the Create Mapping for Details option on the group model’s Filters/Mappings tab to create a convenient table of pointers from group table groups to detail table rows. However, if you do this, you cannot also limit the OLAP build to exclude the detail level. You must choose one or the other. If the choice is to avoid building the OLAP to security level, you can still associate groups and their underlying detail members by including grouping fields as data fields in the detail model, and using reporting to sort on those fields shared by groups and detail, for example, Country and Sector.

Economize on Associated-Benchmark Fields where Possible

You can associate one or more index entities with a fund as its custom list of benchmarks. Then you can define performance analysis field attributes for the fund that point to any of its associated benchmarks. In Data Mart, all of these associated-benchmark fields are built on the same table rows as the fund, which is convenient for reporting purposes. Further, if the benchmark is of the custom blended variety and fund-specific, or if the field links benchmark returns since the inception date of the fund, only these associated-benchmark fields will do the job.

However, you might need to report benchmark returns whose values do not depend upon association with a particular fund. For example, the QTD return of the “vanilla” S&P 500 index as of a given date is the same number for all funds. Obviously you can reduce your Data Mart build requirements by building that number just once rather than once per fund. Benchmark returns may be built on their own rows in Fund Summary. Then for each individual fund you can populate Fund Summary with the entity IDs of its associated benchmarks, and design your SQL accordingly.
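A sketch of this layout, using a hypothetical, simplified Fund Summary table: the benchmark's returns are built once on their own row, and each fund row carries its associated benchmark's entity ID for the join:

```python
import sqlite3

# Hypothetical simplified Fund Summary: benchmark returns live on their
# own rows, and each fund row carries its associated benchmark's ID.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE fund_summary
                (entity_id TEXT, effective_date TEXT,
                 qtd_return REAL, benchmark_entity_id TEXT)""")
conn.executemany("INSERT INTO fund_summary VALUES (?, ?, ?, ?)", [
    ("SP500", "2024-03-31", 0.1056, None),    # index built once, not per fund
    ("FUNDA", "2024-03-31", 0.0980, "SP500"),
    ("FUNDB", "2024-03-31", 0.1102, "SP500"),
])

# Join each fund row to its benchmark's row to report fund vs. benchmark.
rows = conn.execute("""
    SELECT f.entity_id, f.qtd_return, b.qtd_return AS bmk_qtd_return
    FROM fund_summary f
    JOIN fund_summary b
      ON b.entity_id = f.benchmark_entity_id
     AND b.effective_date = f.effective_date
    ORDER BY f.entity_id
""").fetchall()
print(rows)
# → [('FUNDA', 0.098, 0.1056), ('FUNDB', 0.1102, 0.1056)]
```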

Engage Eagle Technical Services

Database tuning can almost always make a well-designed Mart run even better. In addition, if your Data Mart requirements involve hundreds of fields or thousands of funds, hardware sizing also deserves expert attention. Eagle Technical Services offers the benefits of their experience with a variety of significant Data Mart implementations, and can add substantial value to your Mart planning process.

Assign Roles and Access Privileges

Mart configuration is usually considered an administrator function, and is not a business user function. Data Mart’s privileges in User Manager are granular enough to support the differentiation among configuration roles that most organizations need.

Where there is more than one Data Mart role required, the roles usually fall under the following categories:

  • Administration, with all privileges.

  • Model management, with Maintain, Configuration, Migrate, and View privileges.

  • Production, with Submit and View privileges.

  • Data quality assurance, with Audit, View, and Validate privileges.

Adopt an Archive Policy

Archiving Data Mart tables can improve index rebuilding and query performance simply by reducing the number of rows to be navigated. A logical policy preserves each data frequency (daily and monthly) for the time span that reporting may need it. Many clients preserve month-end data indefinitely, but keep daily data for only a few months or a year.

Remember that all as-of reporting against Data Mart requires that data be kept for all possible prior dates you might want to report. If you need data for time periods already purged, the lost rows can be rebuilt, but there will likely be restatements of numbers previously reported.

In V9.0, under Archive Engine support of Data Mart, if you want to archive to a table and keep each year's rows together with a labeling effect, you can simply assign archive table names using a naming convention that includes the desired date information.

You should purge the Data Mart execution tables, leaving no more than one month's worth of data; your specific needs may differ. Failing to purge can result in system performance issues.
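As an illustration of such a retention policy, here is a sketch against a hypothetical execution-log table; the real Mart execution tables and purge tooling differ:

```python
import sqlite3

# Illustrative retention purge; the actual Mart execution tables differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mart_execution_log (run_id INTEGER, run_date TEXT)")
conn.executemany("INSERT INTO mart_execution_log VALUES (?, ?)", [
    (1, "2024-01-05"), (2, "2024-02-20"), (3, "2024-03-10"),
])

# Keep no more than one month of execution history relative to a cutoff.
cutoff = "2024-02-10"  # e.g. today minus one month
conn.execute("DELETE FROM mart_execution_log WHERE run_date < ?", (cutoff,))
remaining = conn.execute(
    "SELECT COUNT(*) FROM mart_execution_log").fetchone()[0]
print(remaining)  # → 2
```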


Building Mart Data

Protecting Data Mart Field Attributes

You should protect the field attributes you use to build your Data Mart fields against unintended changes by PACE users who are unaware of their use in your Mart models. Any change could produce invalid data in your reports, and some changes, especially changes of field type, could cause a model to fail. A field attribute change is otherwise difficult to detect except by checking each field used in a model for updates.

Since Eagle PACE V9.1, the Usage capability that shows which components use a field attribute includes Data Mart as one of those components. Because of this, if you use a field attribute to underlie a Data Mart field, you may not delete that field attribute until you delete the Mart field or use the Modify feature in the Fields tab of the model user interface to change the Mart field’s underlying field attribute to another one. In V10.0, Usage also indicates the Data Mart schema of each Data Mart model where usage occurs.

In addition, field attributes used by Data Mart might be inadvertently changed for some other purpose in a way that undermines their intended use in the Mart. The following sections describe some simple ways to minimize this risk.

Restricting Access to Data Mart Fields

You can restrict these field attributes so that only one user or business group is authorized to change them. This requires that your Data Mart field attributes not be used for OLAP reports or other purposes.

Using a Naming Convention to Clarify Mart Usage

You can reduce the use of your Data Mart fields for unintended purposes by prepending their names with "dm" or devising a similar convention.

Using Comments to Identify Mart Fields

Each field attribute has a Comments section that is displayed in the Data Mart field selector. You can use a standardized comment so that, during model field selection, users are guided to select only the designated Mart field attributes.

Usage Indicator in the Mart Field Add Window

A blue dot next to a field in the Add window, linked from the model UI Fields tab, indicates that the field is used in another model. Your field attribute list may contain multiple field attributes with the same description; if so, the blue dot indicates which of those same-named fields you have used elsewhere in the Mart. You can then reselect the blue-dot fields to maintain a consistent business definition across models.

Clarifying the Identity of Grouping Fields

You may find it useful to adopt a naming convention that distinguishes the Mart table column names of your grouping fields. When these fields appear in a group-level table, report developers may otherwise find it difficult to quickly tell the grouping fields from regular non-grouping fields. One suggestion is to prepend "GRP_" or append "_GRP".

Converting Group Models to Dynamic Performance

In V9.0, with Data Mart support of Dynamic Performance, it is possible to convert a group-level model from using a performance dictionary to the dynamic performance approach. This requires, however, that the Mart model's grouping not be based on the performance dictionary; the dictionary is only mapped to the Mart model in the Fields tab. This restriction applies because you cannot change the grouping structure of a group-level model once it is populated with data. Use the Remove link opposite the Performance Dictionary Mapping link to cancel use of the dictionary and use dynamic performance going forward. You can change back to use of the dictionary at a later time if necessary.

The most common use case for doing this is if your organization has decided to discontinue running performance calculations for a particular performance dictionary, and use a dynamic approach instead. If you do this, you can also apply this policy to the Data Mart.

Using Filtering

Deciding When to Use Filtering

You are unlikely to want to filter data in your Fund Summary or detail models, though you may do so if necessary. The most likely use case for filtering is when certain types of securities, trades, or positions must always be excluded from reporting and can be identified by a simple filter.

However, it often makes sense to filter group-level data. Group models are made for delivering information in an analytical way, and a filter can omit information that is not pertinent to the analysis provided by the model. An example is a grouping on ranges of equity beta, in which it is natural to filter out all but equity securities.

Filtering for Specific Versus General Purposes

Any filter in a Data Mart model is applied to just one type of data at a time, such as positions, cash, trades, or securities. This means that in a group model you may filter, for example, trades for purchases only, while leaving your position data unfiltered. It also means that an across-the-board filter, such as "supervised assets only", must be applied to each type of data separately and identically. So, to create a fixed-income-only group model featuring position, performance, and transaction fields, you would need to set up a filter for "investment type = fixed income" three times in the Filters and Mappings tab.

When you filter data in a group model, map details to the group, and use the map table "pointers" in your report SQL to retrieve data, only detail rows that pass the filter are retrieved. This means that a group model filtered for fixed income reports only bond positions under a set of bond groups. However, the Position Detail table does not lose any non-bond positions; only the map table from groups to positions is missing rows for non-bonds as a result of the filter.
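The map-table pattern can be sketched as follows; table and column names are illustrative, not the actual Mart schema:

```python
import sqlite3

# Sketch of the group-to-detail "map table" pattern; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE position_detail (pos_id INTEGER, security TEXT, inv_type TEXT);
    CREATE TABLE group_table     (group_id INTEGER, group_name TEXT);
    CREATE TABLE group_detail_map(group_id INTEGER, pos_id INTEGER);

    INSERT INTO position_detail VALUES (1, 'T-NOTE', 'FI'), (2, 'AAPL', 'EQ'),
                                       (3, 'CORP-BOND', 'FI');
    INSERT INTO group_table VALUES (10, 'Governments'), (11, 'Corporates');
    -- The build filter (fixed income only) means the map table simply has
    -- no rows for non-bond positions; position_detail keeps all its rows.
    INSERT INTO group_detail_map VALUES (10, 1), (11, 3);
""")

# Report SQL follows the map-table pointers: only details that passed
# the filter are reachable under the groups.
rows = conn.execute("""
    SELECT g.group_name, d.security
    FROM group_table g
    JOIN group_detail_map m ON m.group_id = g.group_id
    JOIN position_detail d  ON d.pos_id   = m.pos_id
    ORDER BY g.group_name
""").fetchall()
print(rows)  # → [('Corporates', 'CORP-BOND'), ('Governments', 'T-NOTE')]
```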

Filtering Performance Data with Dynamic Performance

Regular Performance Analysis OLAP reports based upon performance dictionaries do not support filtering because returns and other performance data stored for a dictionary are pre-computed based upon an unfiltered portfolio.

However, in V9.0, Performance Analysis reports may be generated without a performance dictionary. This is the Dynamic Performance approach supported by Data Mart. Since Dynamic Performance computes security-level returns first and then aggregates and links them, filtering can take place during the Mart build process to eliminate any investment category from the analysis, as required.

Simplifying Entity Hierarchies

Entity Hierarchy field attributes allow you to report data for groups of entities that have parent-child relationships such as members of a composite, where composite members can themselves be composites, etc. You can maintain more than one such hierarchy, if necessary, to express all of the composite relationships among your entities.

Best practice for reporting Entity Hierarchies in Data Mart is to set up and regularly submit, for each distinct entity hierarchy structure supported, a dedicated group-level model whose groups are the Entity Hierarchy field attributes of the hierarchy. You must select some field for any model, but you can simply choose a single position field like market value just to enable the model to populate its table. The model should be built for every entity at the topmost level of the hierarchy, which might be a single entity if the top level is “total firm”.

Your SQL queries in reports and other downstream applications will use the hierarchy table as a directory to enable you to find the up- and down-hierarchy entities for any given entity. You must also populate the Fund Summary model for all entities.
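Assuming a hypothetical hierarchy table with one row per parent-child link, a recursive query can use the table as a directory in either direction; the sketch below walks downward:

```python
import sqlite3

# Hypothetical hierarchy table: one row per parent-child link, as a
# dedicated group-level model might expose it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity_hierarchy (parent_id TEXT, child_id TEXT)")
conn.executemany("INSERT INTO entity_hierarchy VALUES (?, ?)", [
    ("FIRM", "COMP_EQ"), ("FIRM", "COMP_FI"),
    ("COMP_EQ", "FUND1"), ("COMP_EQ", "FUND2"), ("COMP_FI", "FUND3"),
])

# Walk down the hierarchy from any entity with a recursive query; the
# same table supports up-hierarchy lookups by joining on child_id.
rows = conn.execute("""
    WITH RECURSIVE descendants(entity_id) AS (
        SELECT child_id FROM entity_hierarchy WHERE parent_id = ?
        UNION
        SELECT h.child_id FROM entity_hierarchy h
        JOIN descendants d ON h.parent_id = d.entity_id
    )
    SELECT entity_id FROM descendants ORDER BY entity_id
""", ("COMP_EQ",)).fetchall()
children = [r[0] for r in rows]
print(children)  # → ['FUND1', 'FUND2']
```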

Supporting Portfolio Look-through

Look-through reporting at the detail level is supported in a new Position Details look-through model and table. The granularity of a row in this table is the look-through position piece: the position in the portfolio or in any child fund, at whatever level of nesting. This is the same granularity as in a position OLAP report having look-through fields. The data is returned flattened for look-through in Data Mart.

Look-through in Data Mart Group Level Models

Group-level models support look-through when you enable look-through processing. By default, the Enable Look thru Processing checkbox on the Model dialog box is cleared. When you select it, the model supports only look-through processing.

Supporting Long-Term Summaries of Trade Activity

You may need to report summaries of trade or cash activity over long periods, such as the sum of buys year-to-date. Any generic trade field attribute can build this type of data. While it is possible to rebuild these field attributes over a long time period every day in the Mart, this is inefficient. A more efficient strategy is to build the summary field on a month-to-date basis daily, then sum over a set of monthly fields at the report level, perhaps with a general-purpose stored procedure.
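The report-level summation might look like the following sketch, with a hypothetical simplified table in place of the real Mart schema:

```python
import sqlite3

# Sketch: build a month-to-date buy total daily, then sum the monthly
# rows at report time for a year-to-date figure. Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE fund_summary
                (entity_id TEXT, effective_date TEXT, mtd_buys REAL)""")
# Month-end rows built once per month, plus a current month-to-date row.
conn.executemany("INSERT INTO fund_summary VALUES (?, ?, ?)", [
    ("FUND1", "2024-01-31", 100.0),
    ("FUND1", "2024-02-29", 250.0),
    ("FUND1", "2024-03-15",  40.0),   # current month-to-date
])

# YTD buys = sum of the MTD values across the year's rows.
ytd = conn.execute("""
    SELECT SUM(mtd_buys) FROM fund_summary
    WHERE entity_id = ?
      AND effective_date BETWEEN '2024-01-01' AND '2024-12-31'
""", ("FUND1",)).fetchone()[0]
print(ytd)  # → 390.0
```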

Supporting General Ledger Data

Data Mart support of General Ledger data was introduced in Eagle PACE V9.1. For additional information, refer to the Data Mart User Guide.

Although Eagle Accounting was enhanced to facilitate building ledger data in Data Mart, other accounting systems may send ledger data to Data Mart if they load the gl_detail_posting table of the ledger database in the same way as Eagle Accounting does.

Eagle Accounting sends ledger data under different sources for different accounting bases and for daily versus monthly balances. If you want to support more than one basis or periodicity in your reporting, it is best practice to use a separate snapshot for each.

Supporting NAV Data

Data Mart does not directly support fields from the Net Asset Value (NAV) table. However, it does support Dynamic Mutual Fund field attributes.

Before V8.0, the Dynamic Mutual Fund field attribute could only show a return. This left several NAV data items unreportable in Data Mart. In V8.0, a series of field categories and about 20 different effects were added to the Dynamic Mutual Fund field attribute type. This enhancement enabled a wide range of NAV fields to be supported in Data Mart. Refer to Knowledge Base article #8874 for additional information.

Limitations of Selective Fields

The Build Selective Fields option has significant benefits, such as selective back-filling. For example, when you add one or more new fields to a model that you have been populating for a while, you can use this option to populate values for the new fields only; otherwise, you have to submit the model to populate every field of each row. Be aware, however, that when you re-populate fields that already have data values, the existing values are not deleted but regenerated, which can lead to an unintended restatement of reported data values.

Maintaining the Mart

Back-filling History for a New Field

You can add fields to any Data Mart model at any time. When you do, you will likely want to back-fill history by building data values for past dates. The Selective Fields enhancement allows you to do this.

Before the enhancement, whenever you built data for a given table, fund and date, you had to build all fields in each row. This was a relatively time-consuming process if there were many fields in the model’s table and possible extensions. Also, rebuilding data for past periods might run counter to your preferences to preserve or “rule off” data values already reported to clients for past periods.

Using Selective Fields, you can choose to build as few as a single field in a model. This may be a new field or an overwrite of an existing field. You can completely bypass model extensions that do not contain any fields you want to build, which saves time. There is complete flexibility with respect to field selection.

Regardless of your release, to back-fill data use the manual or ad hoc submission process accessible via the Submit link in Manage Mart. Take advantage of the date-range approach to specifying a series of effective dates covering the time interval for back-filling.

Building Data for a Series of Dates in One Step

When you use the Submit link in Manage Mart to build data on an ad hoc rather than scheduled basis, you usually specify a single as-of or effective date. However, if you want data for multiple past dates such as for historical back-fill of a newly added field, you can use the date range option in Submit as an alternative to repeatedly running for a single date.

However, you should plan your use of this functionality so that you do not disrupt your regular daily Eagle workflow. A full build of your mart, executed for multiple dates of a date range, takes roughly that multiple of the time taken for a daily build, which can constitute a significant commitment of resources. You should also account for other processing that may be taking place in your system while you build for a date range.

There are two options when you submit a date range:

Process Dates Concurrently. Useful for smaller submits. Engines for multiple dates are launched without strict regard to date order, flooding the cluster to maximize process concurrency. If the system goes down during a concurrent build, you may need to restart your build from the beginning, because multiple days are probably partially finished.

Process Dates Sequentially. Useful for larger submits. Engines are launched for the first date, and only after they complete are the next day’s engines launched, and so on. This is slower, but it “throttles” processing so that other workflows are less likely to be adversely impacted. Also, if the system goes down during a sequential build, you can resubmit starting with the day that was being processed when the failure occurred.


You can determine if your job is small enough for a concurrent submission by following these guidelines:

Count the number of engines required for your submit, as follows:

For each model submitted, there is one Model Manager engine for each group of N entities submitted, where N is the “No. of Funds per Model Manager” that you set in the Data Mart Configuration dialog box.

Within each model, there are a number of OLAP report engines launched. To determine how many, find the Execution Logs of a previous build of that model and count the distinct types of OLAP you see. The basic logic is:

  • One per type of OLAP report required (for example, position, entity, security).

  • Times number of table extensions including the main extension.

  • Times number of currencies built including BASE currency.

  • Times number of entities built divided by M, where M is “No. of Funds per Report” that you entered in the Data Mart Configuration window.

For example, suppose you have a group model built with entity, position, and performance OLAP engines (factor of 3). You have one extension that also requires all 3 OLAP types, giving two extensions including the main extension (factor of 2). You are building for one currency in addition to BASE (factor of 2). You are building 1000 entities, and “No. of Funds per Report” is 50 (factor of 20). That model launches 3x2x2x20 = 240 engines.

Sum the engines launched in all models that you are building in the same date range submission to get a grand total.

Divide your total by the number of concurrent PACE engines dedicated to OLAPs and Data Mart. If the result is 2 or fewer, you may use concurrent builds. If between 2 and 4, you can perform the entire build as one overall event but on a sequential rather than concurrent basis. If the result is 4 or more, you should use sequential builds phased over a period of time, completing one before launching the next. If you use sequential builds, try to schedule them off prime shift.
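The counting rule and the concurrent-versus-sequential thresholds above can be sketched as follows. The function and parameter names are illustrative; the factors are exactly those listed in the text.

```python
# Sketch of the engine-count guideline: multiply OLAP report types,
# extensions (including main), currencies (including BASE), and entity
# batches, then compare the total against available PACE engines.

def engines_for_model(olap_types, extensions_incl_main,
                      currencies_incl_base, entities, funds_per_report):
    """OLAP report engines launched for one model build."""
    entity_batches = -(-entities // funds_per_report)  # ceiling division
    return olap_types * extensions_incl_main * currencies_incl_base * entity_batches

def build_recommendation(total_engines, concurrent_pace_engines):
    """Apply the rule of thumb from the text to the engine ratio."""
    ratio = total_engines / concurrent_pace_engines
    if ratio <= 2:
        return "concurrent"
    if ratio <= 4:
        return "sequential, single submission"
    return "sequential, phased over time"
```

The worked example in the text (3 OLAP types, 2 extensions, 2 currencies, 1000 entities at 50 per report) gives `engines_for_model(3, 2, 2, 1000, 50)` = 240.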

Note:

Use the spreadsheet attached to Knowledge Base article #11487 to enter a date range submission and see Eagle’s recommended best practice for using a Concurrent versus a Sequential build approach.

The date-range submit produces a top-level Data Mart Manager Engine log in Execution Logs that shows zero models. This log is normal: it belongs to the process that launches the individual Data Mart Manager Engines of a date-range submit. In Data Steward, the event associated with this log appears as the parent event, with each daily build as a child event. The parent event is marked Complete almost immediately, because it completes as soon as it spawns its child events. Drill into the parent event to get a correct picture of the child event completion statuses.

Deleting Fields

Before V9.0, if you deleted a field from the user interface, the field was no longer built, but its table column was not dropped. Beginning in V9.0 you have the option to drop the table column when you delete a field. Determine if you want to drop the table column:

If you want to keep past data for the field for reporting purposes, but not populate it going forward, do not drop the table column.

If you have no use for past data for the field in question, drop the table column.

Keeping the Mart in Synch with the Warehouse Using Process Manager

Late trades and security data vendor corrections are examples of events that can create stale data in your Mart. Use the Business Process Manager to automate the process of re-populating the Mart as warehouse data changes make this necessary.

You can set up Process Manager to detect changes to the update_date field value in each of several different warehouse tables. Since a change of update_date on a given row indicates a change to table data for the row’s entity on that date, your Data Mart should be rebuilt for that entity and effective date in all tables containing data of that type. Such a change-detection rule is known as a Dynamic Process Initiator.

A Dynamic Process Initiator can then trigger a Process Workflow, which is an event or series of events required to start when the initiator is activated. For Data Mart, an event is a Schedule. You can set up a Schedule to re-populate each table that must be brought into synch after a change to each type of Warehouse table covered by your Business Process Manager workflows.
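The change-detection idea behind a Dynamic Process Initiator can be sketched as a sweep over update_date values. The row shape here is a deliberately simplified stand-in for a warehouse table; Process Manager performs this detection inside the product, not in user code.

```python
# Sketch: find (entity, effective date) pairs whose update_date is newer
# than the last sweep, i.e. the mart keys that have gone stale and need
# a rebuild. Row dictionaries are a hypothetical simplification.

def stale_mart_keys(warehouse_rows, last_sweep):
    """Return (entity_id, effective_date) pairs updated since last_sweep."""
    return sorted({(r["entity_id"], r["effective_date"])
                   for r in warehouse_rows
                   if r["update_date"] > last_sweep})
```

Each returned pair corresponds to one entity/effective-date combination for which the relevant mart tables should be re-populated by a Schedule.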

Using History

Data Mart allows you to use or omit history tables for entity and security data. Using history increases build time for the Data Mart, since additional queries must be made against history tables. However, if you are building data for past dates, you must build with history tables in use. Eagle’s best practice is to always use history when running models: use source rules in which a source hierarchy is set for security-related fields and/or source-specific fields are used in the models, and set up Entity History in the Configuration window.

If entity build is not used and the Always Composite Holdings setting is selected on the Configuration dialog box, PACE enumerates holdings and populates the positions of all composite members in the Position Details table. You can then roll these up to the security level in the Reporting tool to create composite level positions.
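The security-level roll-up described above happens in the Reporting tool; as a purely illustrative sketch of the aggregation, with a hypothetical simplified row shape:

```python
# Sketch: rolling member-level Position Details rows up to the security
# level to form composite positions. Row shape (security_alias,
# market_value) is a simplified illustration, not the real table layout.

def composite_positions(member_rows):
    """Aggregate member-level position rows by security."""
    totals = {}
    for r in member_rows:
        key = r["security_alias"]
        totals[key] = totals.get(key, 0.0) + r["market_value"]
    return totals
```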

Controlling Builds of Security and Issuer Data

Adhoc Submissions

When you build or update data for detail tables, your build automatically includes Security Details and, if the model is configured, Issuer Details. This “come along” build of securities with any detail table build addresses the need to be able to identify the security to which any detail table row pertains.

The options you use to build Security Details are global settings located on the Configuration dialog box, in the Security Build Information section: one option builds all securities, the other builds held securities only. If you select the Build all Securities option, the Security Details model is built for every security in the SECURITYDBO.SECURITY_MASTER table. If you select the Held securities option, the engines determine which securities in the database are actually held by some entity; because the Security Details model runs last, it can look at which securities were used to build the various detail models in that run. For example, if the Position Details, Performance Details, and Transaction Details models were run, the Security Details model runs only for the securities present in those detail model builds. Click the Define Criteria link to include other non-held securities that may be required for other reporting purposes. The default behavior is to build only held securities.

Scheduled Submissions

When running scheduled submissions of details models, the Security and Issuer Details models are also run at the end of the process and are included in the schedule unless you choose not to include them. Refer to the Data Mart User Guide for details on creating schedules.

Sometimes your workflow needs a rebuild or partial update of previously built detail data, and you want to save build time by not rebuilding all security details. This may be the case if you use a dynamic workflow in Process Manager to update detail data as granular as one entity on one date. The more extensive your list of Security Details fields, the stronger the case for not rebuilding them all.

You can limit that build by using the selective fields feature to build a bare minimum of these fields, as few as one per model. Subject to the one-field rule, selective fields does not require any particular table extension to be built.

Data Mart submissions scheduled using Automation Center handle detail builds with Security and Issuer data differently than ad hoc submissions. When running builds with Process Manager, there are some things to remember to minimize the processing cycle times. Since Security Detail and Issuer Detail builds generally occur with all other detail model builds at the end, it is recommended to ensure that these security and issuer builds occur only once at the end. This is especially important when Process Manager is set up to handle builds for adjustments and “as of” changes. The best practice for these events is to create an initiator that triggers these detail model builds, since there may be many running at different times. Once all the builds are complete, create a second event to run the security details and issuer details model once at the end. This requires separate schedules for the details models and the security/issuer models.

If You Build Tax Lot Data

A proper build of the Data Mart Lot Level Details table requires that the lot_number field of the holdingdbo.lot_level_details table be unique within a given security_alias. This field should be the sequence number of the portfolio’s buy lot for the security in question. However, some accounting systems, including Eagle STAR, do not populate this field in the way that Data Mart’s underlying OLAP process requires.

If your lot_number field is not unique within each security, you may load rows to this table while populating the lot_number field with the value of the lot_level_position field, which is unique within the entire table. Refer to Knowledge Base Article #8005 for additional information.
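The workaround above can be demonstrated on a toy copy of the table. This sketch uses an in-memory SQLite database with a deliberately simplified schema; the real holdingdbo.lot_level_details table has many more columns, and the update would normally be applied in your load process rather than after the fact.

```python
# Sketch of the lot_number workaround: lot_number is duplicated within
# security_alias 1 below, so we overwrite it with the table-unique
# lot_level_position value, as the text recommends.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE lot_level_details (
    security_alias INTEGER, lot_number INTEGER, lot_level_position INTEGER)""")
# Two lots under alias 1 share lot_number 1 -- the problem case.
con.executemany("INSERT INTO lot_level_details VALUES (?, ?, ?)",
                [(1, 1, 101), (1, 1, 102), (2, 1, 103)])
# Workaround: copy the table-unique lot_level_position into lot_number.
con.execute("UPDATE lot_level_details SET lot_number = lot_level_position")
```

After the update, every lot_number is unique within its security_alias (indeed, within the whole table), which is what the Data Mart OLAP process requires.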

When to Use Multiple Marts

Before V10.0, all Data Mart tables were located in a single database schema called DATAMART in the same Eagle warehouse that contains other schemas like HOLDING and SECURITY. Now, you can use multiple marts to create, maintain and use more than one schema for your Data Marts. Different schemas share a single Eagle data warehouse as their source of data. You control all of your multiple Data Marts from the Data Mart component of PACE.

Should You Use Multiple Marts?

The multiple marts option offers many possibilities for information delivery and can help you scale your data delivery as your business grows. However, multiple marts add complexity along with flexibility, so use them only after careful analysis and planning. Each mart is configured and updated independently. There is no support in the product for sharing physical tables across marts, so there may be duplication of data that places additional demands upon system resources in the build process. When duplicate data is built, marts can temporarily get out of synchronization with respect to some of their content. Because each mart is configured separately, it is possible to select a different field attribute for a field in one mart than was selected for that field in another mart; you can use the Copy Model feature to mitigate this risk. In addition, firm-wide reporting is likely done from just one mart, so some level of data duplication is unavoidable.

The following are examples of when multiple marts may be useful. For additional information, refer to the Data Mart User Guide.

Departmental Marts

Operations in different time zones

Focused business marts

EA and BNYM outsourcing

Marts dedicated to third-party apps

Troubleshooting the Mart

Using log files generated as part of a Data Mart build is described in the Data Mart User Guide. These files, combined with the ERX files also described in the Guide, are your best troubleshooting tools. The following sections recommend how to use those files.

When Data Mart Returns No or Unexpected Results

You may find that a Data Mart build sometimes returns no data, or data you were not expecting. The log files produced at the lowest level of drill-down in Data Mart Execution Logs (the Report, OLAP or Concentration engine level) can help if you run the problem scenario at the most detailed log level of 10.

In the log file, search for “Begin:Concentration data acquisition”. What follows is the query that fetches the basic data required to build the fields provided by that OLAP engine. You can copy and paste it into a SQL tool to verify what is returned. If you do not see expected results, experiment with commenting out parts of the query, such as the source, to see if the anomaly stems from one of the assumptions (like source) built into your mart configuration. This exercise may lead you to correct an aspect of your configuration and resolve the problem.
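Extracting the acquisition query from a large level-10 log can be scripted. The marker string comes from the text above; the assumption that the query ends at the next blank line is a guess about log layout, so adjust the terminator to match your actual log files.

```python
# Sketch: pull the data-acquisition query out of an engine log so it can
# be rerun in a SQL tool. Assumes (hypothetically) that the query follows
# the marker line and ends at the first blank line.

def extract_acquisition_query(log_lines):
    """Return the lines following the acquisition marker, up to a blank line."""
    query, capturing = [], False
    for line in log_lines:
        if "Begin:Concentration data acquisition" in line:
            capturing = True
            continue
        if capturing:
            if not line.strip():
                break
            query.append(line.rstrip())
    return "\n".join(query)
```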

When the Build Runs Long

A long-running mart build can result from a number of factors. Some of these might be resolved with database tuning help from your database administrator or Eagle Technical Services. The following strategies may be helpful:

Determine if processing time is concentrated in one or a few steps of the build. After running at log level 10, check the time stamps of the log file to see if there are sizeable gaps between one or more successive rows of the log. If you find that one or a few steps account for most of the run time, that can indicate what is taking time and lead to a faster solution.
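Scanning a level-10 log for timestamp gaps is easy to script. The timestamp format used here is an assumption; change the strptime pattern to match whatever your log files actually emit.

```python
# Sketch: locate the biggest time gaps between successive log rows, which
# usually point at the long-running build steps. Timestamp format is a
# hypothetical example.
from datetime import datetime

def largest_gaps(timestamps, top_n=3):
    """Return (gap_seconds, row_index) pairs for the biggest gaps, largest first."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    gaps = [((b - a).total_seconds(), i)
            for i, (a, b) in enumerate(zip(parsed, parsed[1:]))]
    return sorted(gaps, reverse=True)[:top_n]
```

A single dominant gap points to one slow step (often a query, as discussed below); many small, even gaps suggest the build is simply large.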

Sometimes the time-consuming step is the execution of a particular query. Your database administrator can check the query’s execution plan to determine if indexing or a database hint can reduce run time. Your use of warehouse tables may be atypical and require intervention to accommodate it.
