Back-fill History for a New Field

You may add fields to any Data Mart model at any time. When you do, you will likely want to build data values for past dates, that is, back-fill history. You build back-fill history using Selective Fields.

Before the introduction of Selective Fields in the Eagle solution, any time you built data for a given table, fund, and date, you had to build all fields in each row. This could be time-consuming if the model’s table and its extensions contained many fields. Rebuilding data for past periods could also run counter to your preference to preserve, or “rule off,” data values already reported to clients for those periods.

Using Selective Fields, you may choose to build as little as a single field in a model, whether a new field or an overwrite of an existing one. You can bypass entirely any model extension that contains none of the fields you want to build, which saves time. Field selection is completely flexible.

Regardless of your release, back-fill data using the manual (ad hoc) submission process, accessed via the Submit link in Manage Mart, and take advantage of the date-range option to specify a series of effective dates covering the back-fill interval.

Build Data for a Series of Dates in One Step

When you use the Submit link of Manage Mart to build data on an ad hoc rather than scheduled basis, you will normally specify a single as-of or effective date. However, if you want data for multiple past dates, such as for historical back-fill of a newly added field, you may use the date-range option in Submit rather than repeatedly running for a single date.

Plan your use of this functionality so that you do not disrupt your regular daily Eagle workflow. A full build of your mart executed for N dates of a date range takes roughly N times as long as a daily build, which can be a significant commitment of resources. You should also account for other processing that may be taking place in your system while you build for a date range.

When you do a date-range submission, you have two choices:

  • Process Dates Concurrently: good for smaller submits; engines for multiple dates are launched without strict regard to date order. Engines “flood the cluster” to maximize process concurrency. If the system goes down during a concurrent build, you will likely need to restart your build from the beginning, since multiple days would probably be partially finished.

  • Process Dates Sequentially: good for larger submits; engines are launched for the first date, and only after they complete are the next day’s engines launched, and so on. This is a slower process, but it “throttles” processing so that other workflows are less likely to be adversely impacted. Also, if the system goes down during a sequential build, you can resubmit starting with the day that was being processed when the failure occurred. A sketch contrasting the two modes follows this list.
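
As a rough illustration, here is a minimal sketch in Python of the difference between the two modes, assuming a hypothetical launch_engines_for() function; actual engine dispatch is internal to PACE.

```python
# Minimal sketch contrasting the two date-range modes; launch_engines_for()
# is hypothetical, since engine dispatch is internal to PACE.
from concurrent.futures import ThreadPoolExecutor

def launch_engines_for(date):
    ...  # submit all engines for one effective date and wait for completion

def process_dates_concurrently(dates):
    # All dates are launched together; engines "flood the cluster", and a
    # failure likely leaves several days partially finished.
    with ThreadPoolExecutor() as pool:
        list(pool.map(launch_engines_for, dates))

def process_dates_sequentially(dates):
    # One date finishes before the next begins, so after a failure you can
    # resubmit starting with the date that was in flight.
    for date in dates:
        launch_engines_for(date)
```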

You need to determine whether your job is small enough for a concurrent submission.  Start by counting the number of engines required for your submit, as follows:

  • For each model submitted, there is one Model Manager engine for each group of N entities submitted, where N is the “No. of Funds per Model Manager” that you set in the Data Mart Configuration screen.

  • Within each model, a number of OLAP report engines is launched. To determine how many, find the Execution Logs of a previous build of that model and count the distinct types of OLAP you see.  The basic logic is:

    • One per type of OLAP report required (e.g., position, entity, security)

    • Times the number of table extensions, including the main extension

    • Times the number of currencies built, including the BASE currency

    • Times the number of entities built divided by M, where M is the “No. of Funds per Report” value that you entered in the Data Mart Configuration screen

  • For example, you have a group model that is built with entity, position, and performance OLAP engines (factor of 3). You have one extension in addition to the main extension, and it also requires all 3 OLAP types (factor of 2). You are building for one currency in addition to BASE (factor of 2). You are building 1,000 entities, and “No. of Funds per Report” is 50 (factor of 20). You will therefore have 3 x 2 x 2 x 20 = 240 engines launched to build that model.

  • Sum the engines launched for all models that you are building in the same date-range submission to get a grand total.

  • Divide your grand total by the number of concurrent PACE engines dedicated to OLAPs and Data Mart. If the result is 2 or fewer, you may use concurrent builds. If it is between 2 and 4, you can perform the entire build as one overall event, but on a sequential rather than concurrent basis. If it is 4 or more, you should use sequential builds phased over a period of time, completing one before launching the next; try to schedule them off prime shift. A worked sketch of this arithmetic follows.
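
The arithmetic above can be summarized in a short sketch. This is an illustration only, with assumed values and hypothetical names, not an Eagle API:

```python
# Minimal sketch of the engine-count arithmetic above; all names and the
# PACE engine count are hypothetical/assumed.
import math

def olap_engines(olap_types, extensions, currencies, entities, funds_per_report):
    # types x extensions (incl. main) x currencies (incl. BASE) x ceil(entities / M)
    return olap_types * extensions * currencies * math.ceil(entities / funds_per_report)

# Worked example from the text: 3 OLAP types, 2 extensions, 2 currencies,
# 1,000 entities with "No. of Funds per Report" = 50.
total = olap_engines(3, 2, 2, 1000, 50)   # 3 * 2 * 2 * 20 = 240

pace_engines = 120                        # concurrent PACE engines (assumed value)
ratio = total / pace_engines              # 240 / 120 = 2.0
if ratio <= 2:
    print("concurrent build is reasonable")
elif ratio < 4:
    print("run as one event, but sequentially")
else:
    print("phase sequential builds, off prime shift")
```

With the worked example’s 240 engines against an assumed 120 PACE engines, the ratio is exactly 2, which falls on the concurrent side of the guideline.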

You will notice that your date-range submit produces a top-level Data Mart Manager Engine log in Execution Logs that shows zero models. This log is normal; it belongs to the process that launches the several individual Data Mart Manager Engines of a date-range submit.

Deleting Fields

You have the option to drop the table column when you delete a field, but should you?

  • No, if you want to keep past data for the field for reporting purposes, but not populate it going forward.

  • Yes, if you have no use for past data for the field in question.

Process Manager Keeps the Mart in Synch with the Warehouse

Late trades and security data vendor corrections are examples of events that can create stale data in your Mart.  Using the Business Process Manager, you can automate the process of re-populating the Mart as warehouse data changes make this necessary. 

Process Manager can be set up to detect changes to the update_date field value in each of several different warehouse tables. Because a change of update_date on a given row indicates a change to table data for the row’s entity on that date, your Data Mart should be rebuilt for that entity and effective date in all tables containing data of that type. Such a change-detection rule is known as a Dynamic Process Initiator.

A Dynamic Process Initiator can then trigger a Process Workflow, which is an event or series of events started when the initiator is activated. For Data Mart, an event is a Schedule. You can set up a Schedule to re-populate each table that must be brought into sync after a change to each type of warehouse table covered by your Business Process Manager workflows.
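
As an illustration of the initiator idea, here is a minimal sketch, assuming a hypothetical warehouse table name and a submit_schedule() callback; the actual rule is configured in Business Process Manager, not in code:

```python
# Minimal sketch of update_date change detection; the table name and the
# submit_schedule() callback are hypothetical.
import sqlite3  # stand-in for your warehouse connection

def changed_rows(conn, since):
    """Find entity/effective-date pairs whose update_date changed after `since`."""
    sql = """SELECT entity_id, effective_date
             FROM warehouse_detail_table      -- hypothetical warehouse table
             WHERE update_date > ?"""
    return conn.execute(sql, (since,)).fetchall()

def on_initiator_fired(conn, since, submit_schedule):
    # One rebuild per affected entity and effective date, mirroring the rule above.
    for entity_id, effective_date in changed_rows(conn, since):
        submit_schedule(entity_id, effective_date)
```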

Using History

Data Mart allows you to use or omit history tables for entity and security data. Using history increases build time for the Data Mart, since additional queries must be made against history tables. However, if you are building data for past dates, you will need to build with history tables in use. Eagle’s best practice is to always use history when running models. For security-related fields, this is accomplished by using source rules in which a source hierarchy is set and/or by using source-specific fields in the models. Entity history is enabled via the configuration screens.

If entity build is not used and the Always Composite Holdings setting is selected in the Configuration dialog box, PACE will enumerate holdings and populate the positions of all composite members in the Position Details table. These may then be rolled up to the security level within the reporting tool to create composite-level positions.
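
For illustration, here is a minimal sketch of the security-level rollup a reporting tool would perform; the row fields are hypothetical:

```python
# Minimal sketch of rolling composite-member positions up to the security
# level; field names are hypothetical.
from collections import defaultdict

def rollup_to_security(position_rows):
    """position_rows: iterable of dicts with 'security_alias' and 'market_value'."""
    totals = defaultdict(float)
    for row in position_rows:
        totals[row["security_alias"]] += row["market_value"]
    return dict(totals)

members = [
    {"entity": "FUND_A", "security_alias": 101, "market_value": 1000.0},
    {"entity": "FUND_B", "security_alias": 101, "market_value": 500.0},
]
print(rollup_to_security(members))  # {101: 1500.0} -> composite-level position
```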

Controlling Builds of Security and Issuer Data

Ad Hoc Submissions

When you build or update data for detail tables, your build automatically includes Security Details and, if the model is configured, Issuer Details. This “come along” build of securities with any detail table build ensures that the security to which any detail table row pertains can be identified.

There are two options for building Security Details, controlled in the configuration screens as global settings: build all securities, or build held securities. Building all securities means building the Security Details model for every security contained in the SECURITYDBO.SECURITY_MASTER table, which is straightforward. Building only held securities has a couple of nuances. Because the Security Details model runs last, the engines can determine which securities in the database are actually held by some entity: the engine looks at which securities were used to build the various detail models in that run. For example, if the Position Details, Performance Details, and Transaction Details models were run, the Security Details model runs only for those securities that were present in those three detail model runs. There is an additional option in which the Define Criteria link can be populated to include other, non-held securities that may be required for other reporting purposes. The default behavior is to build only held securities.
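
A minimal sketch of the held-securities determination described above, with hypothetical data structures (the real logic is internal to the engines):

```python
# Minimal sketch of the held-securities logic: the union of security aliases
# referenced by the detail models in this run. Structures are hypothetical.
def held_securities(detail_model_results):
    held = set()
    for model_name, rows in detail_model_results.items():
        held.update(row["security_alias"] for row in rows)
    return held

run = {
    "Position Details":    [{"security_alias": 101}, {"security_alias": 102}],
    "Performance Details": [{"security_alias": 101}],
    "Transaction Details": [{"security_alias": 103}],
}
# Security Details would then be built only for aliases 101, 102, and 103.
print(sorted(held_securities(run)))
```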

Scheduled Submissions

When running scheduled submissions of detail models, the Security and Issuer Details models are also run at the end of the process and are included in the schedule unless you choose to exclude them, which you can do via the user interface. Please see the Data Mart User Guide for details on creating schedules.

Sometimes your workflow calls for a rebuild or partial update of previously built detail data, for which you would prefer to save build time by not rebuilding all security details. This may be particularly true if you use a dynamic workflow in Process Manager to update detail data as granular as one entity on one date. An extensive list of security details fields strengthens the case for not rebuilding them all.

You can limit that build by using the Selective Fields feature to build a bare minimum of these fields, as few as one per model. Subject to the one-field rule, Selective Fields does not require any particular table extension to be built.

Data Mart submissions that are scheduled using Automation Center handle detail builds with Security and Issuer data differently than ad hoc submissions. When running builds with Process Manager, there are a couple of things to keep in mind to minimize processing cycle times. Because Security Details and Issuer Details builds generally occur with all other detail model builds at the end, it is wise to ensure that these security and issuer builds occur only once, at the end. This is especially important when Process Manager is set up to handle builds for adjustments and “as of” changes. The best practice for these events is to create an initiator that triggers the detail model builds (there may be many going on at different times). Once all of these builds are complete, a second event should run the Security Details and Issuer Details models once at the end. This requires separate schedules for the detail models and the security/issuer models.
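
A minimal sketch of this two-phase pattern, using a stand-in Schedule class rather than any Process Manager API:

```python
# Minimal sketch of the two-phase scheduling pattern; Schedule is a stand-in,
# not a Process Manager API.
class Schedule:
    def __init__(self, name):
        self.name = name
    def run(self):
        print("building", self.name)   # placeholder for a blocking Mart build

detail_schedules = [Schedule("Position Details"), Schedule("Transaction Details")]
security_issuer = Schedule("Security/Issuer Details")

# Phase 1: run every detail-model schedule (there may be many, at different times).
for s in detail_schedules:
    s.run()
# Phase 2: after all detail builds complete, run Security/Issuer Details once.
security_issuer.run()
```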

If You Build Tax Lot Data

A proper build of the Data Mart Lot Level Details table requires that the lot_number field of the holdingdbo.lot_level_details table be unique within a given security_alias. This field should be the sequence number of the portfolio’s buy lot for the security in question. However, some accounting systems, including STAR, do not populate this field in the way that Data Mart’s underlying OLAP process requires.

If your lot_number field is not unique within each security, you may load rows to this table while populating the lot_number field with the value of the lot_level_position field, which is unique within the entire table. 
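
For illustration, here is a minimal sketch of that workaround as a row-preparation step during loading; the loader shape is hypothetical, while the column names come from the text above:

```python
# Minimal sketch of the lot_number workaround: copy lot_level_position into
# lot_number while loading rows. The loader shape is hypothetical.
def prepare_lot_row(raw_row):
    row = dict(raw_row)
    # lot_level_position is unique across the whole table, so it also
    # satisfies uniqueness of lot_number within a security_alias.
    row["lot_number"] = row["lot_level_position"]
    return row

raw = {"security_alias": 101, "lot_level_position": 987654, "lot_number": None}
print(prepare_lot_row(raw))
```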
