Introduction

There are many ways to design, configure, and operate your Data Mart. Because different approaches are valid in different contexts, it is not possible to define best practices for every task. However, there are some useful lessons from the experience of Eagle and its clients with Data Mart. This document assumes that you are familiar with how to use Data Mart. The information in this section does not cover basic concepts; instead, it recommends specific choices of tools and techniques that lead to better results.

Maintaining the Mart

Back-filling History for a New Field

You can add fields to any Data Mart model at any time. When you do, you will likely want to back-fill history by building data values for past dates. The Selective Fields enhancement allows you to do this.

Before the enhancement, whenever you built data for a given table, fund, and date, you had to build all fields in each row. This was relatively time-consuming if the model’s table and its extensions contained many fields. Also, rebuilding data for past periods might run counter to your preference to preserve, or “rule off,” data values already reported to clients for those periods.

Using Selective Fields, you can choose to build as few as a single field in a model. This may be a new field or an overwrite of an existing field. You can completely bypass model extensions that do not contain any fields you want to build, which saves time. There is complete flexibility with respect to field selection.

Regardless of your release, to back-fill data, use the manual or ad hoc submission process accessible via the Submit link in Manage Mart. Take advantage of the date-range approach to specify a series of effective dates covering the time interval for back-filling.

Building Data for a Series of Dates in One Step

When you use the Submit link in Manage Mart to build data on an ad hoc rather than scheduled basis, you usually specify a single as-of or effective date. However, if you want data for multiple past dates such as for historical back-fill of a newly added field, you can use the date range option in Submit as an alternative to repeatedly running for a single date.

However, you should plan your use of this functionality so that you do not disrupt your regular daily Eagle workflow. A full build of your mart executed for a range of dates takes roughly the single-date build time multiplied by the number of dates, which can be a significant commitment of resources. You should also account for other processing that may be taking place in your system while you build for a date range.

There are two options when you submit a date range:

Process Dates Concurrently. Useful for smaller submits. Engines for multiple dates are launched without strict regard to date order, flooding the cluster to maximize concurrency. If the system goes down during a concurrent build, you may need to restart the build from the beginning, because multiple days are probably partially finished.

Process Dates Sequentially. Useful for larger submits. Engines are launched for the first date, and only after they complete are the next day’s engines launched, and so on. This is a slower process, but it “throttles” processing so that other workflows are less likely to be adversely affected. Also, if the system goes down during a sequential build, you can resubmit starting with the day that was being processed at the time.

...

Count the number of engines required for your submit, as follows:

For each model submitted, there is one Model Manager engine for each group of N entities submitted, where N is the “No. of Funds per Model Manager” that you set in the Data Mart Configuration dialog box.

Within each model, a number of OLAP report engines are launched. To determine how many, find the Execution Logs of a previous build of that model and count the distinct types of OLAP report you see. The basic logic is:

  • One per type of OLAP report required (for example, position, entity, security).

  • Times number of table extensions including the main extension.

  • Times number of currencies built including BASE currency.

  • Times number of entities built divided by M, where M is “No. of Funds per Report” that you entered in the Data Mart Configuration window.

For example, suppose you have a group model built with entity, position, and performance OLAP engines (factor of 3). You have one extension that also requires all three OLAP types, in addition to the main extension (factor of 2). You are building one currency in addition to BASE (factor of 2). You are building 1,000 entities, and “No. of Funds per Report” is 50 (factor of 20). In total, 3 x 2 x 2 x 20 = 240 engines are launched to build that model.

Sum the engines launched in all models that you are building in the same date range submission to get a grand total.

Divide your total by the number of concurrent PACE engines dedicated to OLAPs and Data Mart. If the result is 2 or fewer, you may use a concurrent build. If it is between 2 and 4, perform the entire build as one submission, but on a sequential rather than concurrent basis. If it is 4 or more, use sequential builds phased over a period of time, completing one before launching the next, and try to schedule them off prime shift.
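
As a quick sanity check, the arithmetic above is easy to script. The following is a minimal sketch in Python; the function names and the example inputs are illustrative only, and the thresholds simply restate the rule of thumb above.

    import math

    def model_engine_count(olap_types, extensions, currencies, entities, funds_per_report):
        """Engines launched for one model, per the factors listed above.

        olap_types       -- distinct OLAP report types (position, entity, security, ...)
        extensions       -- table extensions, counting the main extension
        currencies       -- currencies built, counting BASE
        funds_per_report -- "No. of Funds per Report" from Data Mart Configuration
        """
        report_groups = math.ceil(entities / funds_per_report)
        return olap_types * extensions * currencies * report_groups

    def build_recommendation(total_engines, concurrent_pace_engines):
        """Restates the concurrent-versus-sequential rule of thumb above."""
        ratio = total_engines / concurrent_pace_engines
        if ratio <= 2:
            return "concurrent build is acceptable"
        if ratio < 4:
            return "one submission, sequential rather than concurrent"
        return "sequential builds phased over time, off prime shift"

    # The worked example above: 3 OLAP types, 2 extensions (main + 1),
    # 2 currencies (BASE + 1), 1,000 entities at 50 funds per report.
    total = model_engine_count(3, 2, 2, 1000, 50)    # 3 x 2 x 2 x 20 = 240
    # 100 concurrent PACE engines is a hypothetical figure for illustration.
    print(total, "->", build_recommendation(total, concurrent_pace_engines=100))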

...

Note:

...

Use the spreadsheet attached to Knowledge Base article #11487 to enter a date range submission and see Eagle’s recommended best practice for using a Concurrent versus a Sequential build approach.

The date-range submit produces a top-level Data Mart Manager Engine log in Execution Logs that shows zero models. This log is normal: it belongs to the process that launches the individual Data Mart Manager Engines of a date-range submit. In Data Steward, the event associated with this log appears as the parent event, with each daily build as a child event. The parent event is marked Complete almost immediately, because it completes as soon as it spawns its child events. Drill into the parent event to get a correct picture of the child events’ completion statuses.

Deleting Fields

Before V9.0, if you deleted a field from the user interface, the field was no longer built, but its table column was not dropped. Beginning in V9.0 you have the option to drop the table column when you delete a field. Determine if you want to drop the table column:

If you want to keep past data for the field for reporting purposes, but not populate it going forward, do not drop the table column.

If you have no use for past data for the field in question, drop the table column.

Keeping the Mart in Synch with the Warehouse Using Process Manager

Late trades and security data vendor corrections are examples of events that can create stale data in your Mart. Use the Business Process Manager to automate the process of re-populating the Mart as warehouse data changes make this necessary.

You can set up Process Manager to detect changes to the update_date field value in each of several different warehouse tables. Because a change of update_date on a given row indicates a change to table data for the row’s entity on that date, your Data Mart should be rebuilt for that entity and effective date in all tables containing data of that type. Such a change detection rule is known as a Dynamic Process Initiator.

A Dynamic Process Initiator can then trigger a Process Workflow, which is an event or series of events that starts when the initiator is activated. For Data Mart, an event is a Schedule. You can set up a Schedule to re-populate each table that must be brought into synch after a change to each type of warehouse table covered by your Business Process Manager workflows.
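
To make the detection rule concrete, here is a rough Python sketch of the logic a Dynamic Process Initiator applies. In practice you configure the initiator in Process Manager rather than writing code, and the table and column names other than update_date are assumptions for illustration.

    from datetime import datetime

    # Hypothetical warehouse table and key columns; only update_date is taken
    # from the text above. This is NOT a Process Manager API -- it only
    # illustrates the change-detection logic the initiator performs.
    CHANGE_DETECTION_SQL = """
        SELECT entity_id, effective_date
          FROM holdingdbo.position
         WHERE update_date > :last_poll_time
    """

    def rebuild_targets(cursor, last_poll_time: datetime):
        """Return (entity, effective date) pairs whose warehouse rows changed.

        Each pair is a rebuild target: the initiator would trigger a Process
        Workflow whose Schedule re-populates every mart table containing that
        type of data for that entity and effective date.
        """
        cursor.execute(CHANGE_DETECTION_SQL, {"last_poll_time": last_poll_time})
        return cursor.fetchall()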

Using History

Data Mart allows you to use history tables for entity and security data, or to omit them. Using history increases build time for the Data Mart, because additional queries must be made against history tables. However, if you are building data for past dates, you need to build with history tables in use. Eagle’s best practice is to always use history when running models. To do this, use source rules in which a source hierarchy is set for security-related fields, and/or use source-specific fields in the models. Set up Entity History in the Configuration window.

If entity build is not used and the Always Composite Holdings setting is selected on the Configuration dialog box, PACE enumerates holdings and populates the positions of all composite members in the Position Details table. You can then roll these up to the security level in the Reporting tool to create composite-level positions.

Controlling Builds of Security and Issuer Data

Ad Hoc Submissions

When you build or update data for detail tables, your build automatically includes Security Details and, if the model is configured for it, Issuer Details. This “come along” build of securities with any detail table build ensures that the security to which any detail table row pertains can be identified.

The options you use to build Security Details are global settings located on the Configuration dialog box. In the Security Build Information section, there is an option to build all securities and an option to build held securities. If you select the Build all Securities option, the Security Details model is built for every security in the SECURITYDBO.SECURITY_MASTER table.

If you select Held securities, the Security Details model runs last so that the engines can determine which securities in the database are actually held by some entity: the engine checks which securities were used to build the various detail models in that run. For example, if the Position Details, Performance Details, and Transaction Details models were run, the Security Details model runs only for the securities present in those detail models. Click the Define Criteria link to include other, non-held securities that may be required for other reporting purposes. The default behavior is to build only held securities.
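
Conceptually, the held-securities option restricts the Security Details build to securities that appear in the detail models just built, roughly as in the sketch below. The engine does this internally; the mart table and column names here are assumptions for illustration, not documented schema.

    # Roughly how "held securities" are derived: the distinct securities
    # present in the detail models built in the same run. Table and column
    # names are illustrative assumptions.
    HELD_SECURITIES_SQL = """
        SELECT DISTINCT security_alias
          FROM datamart.position_details
         WHERE effective_date = :d
        UNION
        SELECT DISTINCT security_alias
          FROM datamart.performance_details
         WHERE effective_date = :d
        UNION
        SELECT DISTINCT security_alias
          FROM datamart.transaction_details
         WHERE effective_date = :d
    """

    def held_securities(cursor, effective_date):
        """Securities the Security Details model would build for this run."""
        cursor.execute(HELD_SECURITIES_SQL, {"d": effective_date})
        return [row[0] for row in cursor.fetchall()]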

...

Scheduled Submissions

When running scheduled submissions of details models, the Security and Issuer Details models are also run at the end of the process and are included in the schedule unless you choose not to include them. Refer to the Data Mart User Guide for details on creating schedules.

Sometimes your workflow needs a rebuild or partial update of previously built detail data, and you want to save build time by not rebuilding all security details. This may be the case if you use a dynamic workflow in Process Manager to update detail data as granular as one entity on one date. If your model includes an extensive list of security detail fields, rebuilding them all is even less attractive.

You can limit that build by using the Selective Fields feature to build a bare minimum of these fields, as few as one per model. Subject to that one-field minimum, Selective Fields does not require any particular table extension to be built.

Data Mart submissions scheduled using Automation Center handle detail builds with Security and Issuer data differently than ad hoc submissions. When running builds with Process Manager, there are a few things to remember to minimize processing cycle times. Because Security Details and Issuer Details builds normally occur with all other detail model builds, ensure that they run only once, at the end. This is especially important when Process Manager is set up to handle builds for adjustments and “as of” changes. The best practice for these events is to create an initiator that triggers the detail model builds, since many may run at different times, and then a second event that runs the Security Details and Issuer Details models once at the end. This requires separate schedules for the detail models and the security/issuer models.

If You Build Tax Lot Data

A proper build of the Data Mart Lot Level Details table requires that the lot_number field of the holdingdbo.lot_level_details table be unique within a given security_alias. This field should be the sequence number of the portfolio’s buy lot for the security in question. However, some accounting systems, including Eagle STAR, do not populate this field in the way that Data Mart’s underlying OLAP process requires.

If your lot_number field is not unique within each security, you may load rows to this table while populating the lot_number field with the value of the lot_level_position field, which is unique within the entire table. Refer to Knowledge Base Article #8005 for additional information.
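
The substitution can be made at load time, as in this hedged Python sketch; follow Knowledge Base Article #8005 for the supported procedure rather than treating this as the definitive fix.

    def map_lot_row(row: dict) -> dict:
        """Substitute lot_level_position for lot_number while loading a row.

        lot_level_position is unique across the entire table, so using it as
        lot_number satisfies the uniqueness the underlying OLAP process
        requires. Apply this through your normal load process, per KB #8005.
        """
        mapped = dict(row)
        mapped["lot_number"] = mapped["lot_level_position"]
        return mapped

    # Hypothetical example row:
    print(map_lot_row({"security_alias": 1234, "lot_level_position": 98765, "lot_number": 0}))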

When to Use Multiple Marts

Before V10.0, all Data Mart tables were located in a single database schema called DATAMART, in the same Eagle warehouse that contains other schemas such as HOLDING and SECURITY. Now you can use multiple marts: you can create, maintain, and use more than one schema for your Data Marts. The different schemas share a single Eagle data warehouse as their source of data. You control all of your Data Marts from the Data Mart component of PACE.

Should You Use Multiple Marts?

Multiple marts offer many possibilities for information delivery and can help you scale your data delivery as your business grows. However, multiple marts add complexity along with flexibility, so use them only after careful analysis and planning. Each mart is configured and updated independently. The product does not support sharing physical tables across marts, so there may be duplication of data that places additional demands on system resources during builds. When duplicate data is built, marts can temporarily fall out of synchronization with respect to some of their content. Because each mart is configured separately, it is possible to select a different field attribute for a field in one mart than was selected for the same field in another; you can use the Copy Model feature to mitigate this risk. In addition, firm-wide reporting is likely done from just one mart, so some level of data duplication is unavoidable.

The following are examples of when multiple marts may be useful. For additional information, refer to the Data Mart User Guide.

  • Departmental marts

  • Operations in different time zones

  • Focused business marts

  • EA and BNYM outsourcing

  • Marts dedicated to third-party apps

Troubleshooting the Mart

The Data Mart User Guide describes the log files generated as part of a Data Mart build. These files, combined with the ERX files also described in the Guide, are your best troubleshooting tools. The following sections recommend how to use them.

When Data Mart Returns No or Unexpected Results

You may find that a Data Mart build sometimes returns no data, or data you were not expecting. The log files produced at the lowest level of drill-down in Data Mart Execution Logs (the Report, OLAP or Concentration engine level) can help if you run the problem scenario at the most detailed log level of 10.

In the log file, search for “Begin:Concentration data acquisition”. What follows is the query that fetches the basic data required to build the fields provided by that OLAP engine. Copy and paste the query into a SQL tool to verify what it returns. If you do not see the expected results, experiment with commenting out parts of the query, such as the source criteria, to see whether the anomaly is due to one of the assumptions (like source) built into your mart configuration. This exercise may lead you to correct an aspect of your configuration and resolve the problem.
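
If you do this often, extracting the query can be scripted. The sketch below pulls the text following the marker out of a level-10 log; the marker string comes from this section, but the end-of-query heuristic is an assumption about the log layout, so adjust it to match your files.

    def extract_acquisition_query(log_path):
        """Return the SQL that follows the data-acquisition marker in the log."""
        lines = []
        capturing = False
        with open(log_path, encoding="utf-8", errors="replace") as log:
            for line in log:
                if "Begin:Concentration data acquisition" in line:
                    capturing = True
                    continue
                # Assumed layout: the query ends at the next "End:" line.
                if capturing and line.lstrip().startswith("End:"):
                    break
                if capturing:
                    lines.append(line.rstrip("\n"))
        return "\n".join(lines)

    print(extract_acquisition_query("olap_engine_level10.log"))  # hypothetical file name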

When the Build Runs Long

A long-running mart build can result from a number of factors. Some of these might be resolved with database tuning help from your database administrator or Eagle Technical Services. The following strategies may be helpful:

Determine whether processing time is concentrated in one or a few steps of the build. After running at log level 10, check the time stamps in the log file for sizeable gaps between successive rows. If one or a few steps account for most of the run time, that points to the bottleneck and can lead to a faster solution.
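
Scanning for those gaps by eye is tedious in a large log, so a short script can help. In this sketch the leading timestamp format is an assumption; adjust TS_FORMAT and TS_LEN to match your log files.

    from datetime import datetime

    TS_FORMAT = "%Y-%m-%d %H:%M:%S"      # assumed timestamp layout at line start
    TS_LEN = len("2024-01-01 00:00:00")

    def find_gaps(log_path, threshold_seconds=30.0):
        """Yield (gap_seconds, line) for steps that took longer than the threshold."""
        prev = None
        with open(log_path, encoding="utf-8", errors="replace") as log:
            for line in log:
                try:
                    ts = datetime.strptime(line[:TS_LEN], TS_FORMAT)
                except ValueError:
                    continue              # skip lines without a leading timestamp
                if prev is not None:
                    gap = (ts - prev).total_seconds()
                    if gap > threshold_seconds:
                        yield gap, line.rstrip()
                prev = ts

    for gap, line in find_gaps("model_manager_level10.log"):  # hypothetical file name
        print(f"{gap:8.1f}s gap before: {line}")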

Sometimes the time-consuming step is the execution of a particular query. Your database administrator can check the query’s execution plan to determine whether indexing or a database hint can reduce run time. Your use of warehouse tables may be atypical and require intervention to accommodate it.