...
...
...
...
...
...
...
Develop a Data Mart
...
You may have built your presentation-quality reports using PACE Active reports. Now that Data Mart is available as an information delivery tool, should you consider converting existing reports to source data against the Mart? This is a complex question that involves several considerations, discussed in the following sections.
Field Standardization
Each Advanced report built with underlying OLAP reports has a list of field attributes associated with it, and Advanced reports often involve one or more individual field rules for each underlying OLAP report. Standardization and re-use of field attributes and field rules is possible, but many organizations find that it is all too easy to duplicate fields. Data Mart, by contrast, encourages business definition standardization and sharing of field attributes, since duplication of data fields is much more easily recognized, and thus avoided, in the Mart.
Preserve Advanced Report Functionality
You will not lose the benefits of using Advanced reports under a conversion to Data Mart. You can still schedule Active reports built against the Mart to send to Portal, and still develop them as a Client reporting batch. Report ordering functionality and passing of parameters work the same as when actual OLAP reports underlie the Advanced report.
If you implement Data Mart to adopt a reporting tool not supported by Advanced reports, you forgo the benefits of Advanced reports related to Portal and Client Reporting.
Data Retrieval Staging versus Data Processing
When you run an Advanced report with underlying OLAPs, all of the underlying processing is done on the fly. Some calculations, especially performance calculations, can be time consuming and lead to long rendering times for reports. With Data Mart as the data source, all computations are completed before reports are run, so that reports simply retrieve field values from one or more tables.
However, if your organization runs reports in the overnight cycle rather than during normal business hours, Data Mart does not offer as much of a total production time savings.
You can create multiple snapshots, allowing you to build and save your data more than once per day and to use more than one source rule in the build process. For example, an organization could maintain a custodian source and a manager source for the same accounting data. By saving multiple snapshots, you could also report on intra-day comparisons of data values.
Potential for Sharing Fields
Your reports share many common fields. Data Mart provides the efficiency of computing each of them just once, rather than once per report. Mart data is meant to be shared among reports. For this reason, it is critical to plan the data fields in the Mart carefully, to ensure that all reporting needs are covered by a minimum of fields.
Report Conversion
There is no utility that automates report conversion, and it is difficult to design such a capability in a way that would be useful in a large number of cases.
A report built with Active reports sources data from PACE field attributes. Data Mart can store data for the same field attributes, with the exception of a short list of field types that Data Mart does not support; refer to the Data Mart User Guide for additional information. The SQL code used by an Advanced report must be reworked in a Mart conversion, because the tables accessed and the joins used change.
However, Eagle Mart, which sources from Data Mart, comes with a set of standard reports including report source code for the non-performance set. These reports provide examples of building Advanced reports against the Mart that may be useful to you during report conversion.
Report Development
The Mart offers efficiencies in report development:
Data Mart simplifies the process of building the SQL code that brings your data together in a report. Data Mart places data fields of multiple types, such as position, performance, and transaction, on the same line of a data source table at the group and fund levels of aggregation. This saves you the effort of creating multiple OLAP report rules and profiles and then joining their results in SQL that you must develop; see the sketch following this list.
If you report data using time series, you can do so for some performance fields using the Performance Query Tool, but OLAP processes are limiting for other types of data. Through regular daily or monthly builds, Data Mart establishes a history of all data, making time series reporting easy.
Data Mart encourages standardization of field definitions and helps reduce unintended duplication of field attributes that sometimes accompanies OLAP report development. See “Field Standardization” for additional information.
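As a minimal sketch of the first point, assuming a hypothetical Fund Summary table named FUND_SUMMARY_DM with illustrative column names, a single query can retrieve fields of several types from one row, with no joins across OLAP result sets:

-- One row per fund per effective date carries position, performance,
-- and transaction fields together (all names here are illustrative).
SELECT entity_id,
       effective_date,
       total_market_value,   -- position-type field
       mtd_return,           -- performance-type field
       net_cash_activity     -- transaction-type field
FROM   fund_summary_dm
WHERE  effective_date = DATE '2020-12-31';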
Support Non-OLAP Data in the Mart
All data built by PACE in Data Mart tables comes from the action of an OLAP process. Given the powerful and flexible data management tools that Data Mart provides, there are strong arguments for using the OLAPs to populate as much of the required reportable data as possible. However, it is sometimes necessary to support data in the Mart that was not placed there by the OLAPs.
When to Move Non-OLAP Data to the Mart
Some types of data either should or must bypass the OLAP processes that build data into Mart tables:
Data in structures not supported by OLAPs. The Eagle data model is very flexible and supports a wide range of data structures. You can build custom tables and join them to core tables to permit access by the OLAPs. You can also create database views and build field attributes to report their contents. However, there may be instances where data is loaded to the Eagle warehouse for which OLAP access is not readily available and cannot easily be provided.
Report-ready, single-use data loaded in high volume. Even in the case of single-use data, Eagle recommends as a best practice that you store all data in the warehouse and then let Data Mart run the OLAP processes to move a set of values from the warehouse schema to the Data Mart schema. This is the best approach if you want to load data from an external source, such as report-ready performance data.
Data Moved in Anticipation of a Future Release Enhancement. You may need to deliver a type of information that is supported in Data Mart only in a future release due to a product enhancement. Release upgrade may not be practical before the information is required. If the nature of the enhancement is to enable a new type of field to be built into existing Mart tables, a temporary workaround strategy may be available. Refer to “Direct Loads to Future Table Extensions” for additional information.
How to Move Non-OLAP Data to the Mart
The following sections describe how to deploy non-OLAP data to the Data Mart schema.
Non-Materialized Views
A non-materialized view is simply a SQL query against other database tables that is exposed to reports and applications as if it were a physical table. This simplifies access, since joins between tables and views are supported. The main drawback of such a view is that every use incurs the latency of executing the underlying database query. Reserve this type of view for the simplest cases.
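As a simple sketch, assuming a hypothetical custom ratings table and illustrative warehouse table and column names, such a view might look like this:

-- A non-materialized view: the underlying join runs every time the view is queried.
CREATE VIEW v_security_ratings AS
SELECT s.security_alias,
       s.issue_name,
       r.rating_source,
       r.rating_value
FROM   security_master s
JOIN   custom_ratings r
       ON r.security_alias = s.security_alias;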
Materialized Views
Materialized views are physically built from the data returned by their underlying SQL; they are, in effect, a database copy. Queries against them perform much better than those against non-materialized views of the same datasets, since the data is physically present and does not have to be assembled on the fly. They are best suited to views with larger numbers of fields. A drawback of materialized views is that they must be refreshed and maintained. Oracle offers a powerful toolset for deploying and refreshing materialized views.
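A minimal Oracle sketch of the same view in materialized form, again with illustrative table and column names; the build and refresh options shown are standard Oracle syntax:

-- A materialized view: results are physically stored and refreshed on demand,
-- so queries read precomputed rows instead of executing the join each time.
CREATE MATERIALIZED VIEW mv_security_ratings
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT s.security_alias,
       s.issue_name,
       r.rating_source,
       r.rating_value
FROM   security_master s
JOIN   custom_ratings r
       ON r.security_alias = s.security_alias;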
Direct Loads to Reserved Columns of Mart Tables
The Selective Fields enhancement in V9.0 Data Mart allows you to load fields directly to Data Mart tables, as long as a regular Data Mart Submit can populate all rows of those tables. You can reserve some fields in the tables that need supplemental data from direct loads. To do this, create and select a set of dummy field attributes for the model, so that the model creates columns in the table to receive the external load. Selective Fields are then used to avoid populating those fields in the regular build, reserving them for the external load. Because the Mart does not delete data, but simply updates it, Eagle recommends that you create and populate an extension.
This practice introduces complexities, and you should attempt it only in consultation with our Professional Services Consulting Group. It may be simpler just to add a custom table to the Data Mart schema and join to appropriate tables in reporting.
Direct Loads to Future Table Extensions
...
Implementation Plan
With a clear and complete functional specification, a knowledgeable resource can configure a Data Mart very quickly. This makes it worth the time to create a good implementation plan at the start.
Inventory the Fields and Groupings to be Supported
The first step in planning a Data Mart is to take as complete an inventory as possible of all the fields and groupings you plan to support in your information delivery project. The more complete the inventory, the better you can create a logical and economical design of models, model extensions, and groupings. This does not mean that use of the Mart is an all-or-nothing, big bang step; it means only that, within the scope of the data to be supported in the Mart, a complete inventory makes for a better plan.
This inventory process should include an effort to standardize where possible on a minimal set of definitions of common fields like market value and security price. Standardization makes possible a compact Mart where data elements are widely shared among consumers of the data.
Start Small and Build on Your Successes
While it is best practice to create a complete inventory of the fields you will need for a specific information delivery goal, that goal may be a modest one, especially if it is your first Data Mart project. The Mart is architected for future growth, so a complete analysis of a present, known need does not require foreknowledge of all future reporting requirements.
Identify and Create any Missing Meta Data Elements
Every Data Mart field requires a field attribute, and every Data Mart grouping requires some set of field attributes, range rules, and dictionaries (regular and performance). The Mart implementation plan should identify these metadata elements quite specifically, and indicate where something must be created.
Use the Data Mart Planning Template
Global Professional Services has a template that is very useful for storing all of the Data Mart and regular metadata elements involved in a Mart, and it can serve as the basis of an implementation plan. It is available as Knowledge Base article #8619.
Planning a Sources Strategy
A sources strategy for Data Mart is a design that uses the minimal list of source rules and source-specific fields required to provide all of the source perspectives needed in reporting. Often this implies a single source rule, if data of every type, such as positions and trades, can correctly be reported using just a single hierarchy of sources. More than one source rule is required only if certain individual positions, trades, or securities must be viewed in more than one way within the scope of information delivery.
Common use cases for multiple source rules or source-specific fields include:
Multiple providers of bond ratings or other security analytic data whose viewpoints must all be considered in the security evaluation process;
Multiple accounting bases, such as GAAP and statutory, that require accounting fields to be computed and reported in different ways.
When more than one type of data must be reported in more than one source hierarchy, the number of required source rules can proliferate and lead to several snapshots and heavy duplication of data. For this reason, you should make choices to limit the required list of snapshots. See “Sharing of Security Details Among Snapshots” for additional information.
A best practice is to build best data in PACE where possible, and then use the resulting Eagle PACE source as the main source for the data types built that way.
Planning for Good System Performance
Building Only the Data You Need
The single most important aspect of Mart planning for good system performance in the build process is a realistic assessment of what data you really need to send to the Mart. The best way to answer this question is to consider all of the processes that will use Mart data and understand their demands in detail. Requirements are likely to expand over time, but you can add fields later. It is better to add later than to over-build your Mart from the start.
Which fields are needed daily? Many organizations maintain a monthly reporting cycle that generates demand for a comprehensive list of monthly fields. Daily values may be needed for a much shorter list. If this is true, use the Selective Fields capability to limit the daily build to those fields needed daily, with a full build conducted at month end.
Some types of performance returns, such as returns for specific years or quarters, do not change every day. If you report such returns, you can avoid building them every day or even every month by building them only once on their true effective date, for example on December 31, 2020 for a “Year 2020” return. Then in reporting, you would always select them as of their effective date.
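As a sketch, with a hypothetical table and field name, a report would always read such a static-period return as of its true effective date:

-- The "Year 2020" return is built once, on its true effective date,
-- and every later report selects it as of that date (names are illustrative).
SELECT entity_id, year_2020_return
FROM   fund_summary_dm
WHERE  effective_date = DATE '2020-12-31';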
If you plan to deliver a piece of information from Eagle Portal, you can source it from Data Mart or use the dynamic query approach, where results are computed on the fly. The dynamic query approach may have limitations as to breadth of data, and the calculation may take somewhat longer than if Portal queried the Mart directly. On the other hand, it is not best practice to build out your Mart in an overly comprehensive way just to feed Portal with types of data that you think might be accessed, but for which no clear current demand exists. Portal dynamic queries are best suited to cases where you need to support access to particular types of data or calculations, but you expect access frequency to be very low.
Indexing Warehouse Trade and Cash Date Fields
When you select trade and cash activity fields for Data Mart, you can specify which of several different date fields, such as trade date and settlement date, you want to use as the effective date. Whichever date you choose should have an index built on it in the appropriate warehouse table to speed the processing of trades or cash events. The effective_date field itself is already indexed. Database statistics should be rebuilt frequently, ideally daily.
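As a sketch, assuming settlement date is the chosen effective date and using illustrative schema, table, and column names, the index and the statistics refresh might look like this in Oracle:

-- Index the date column chosen as the effective date for trade activity.
CREATE INDEX ix_trades_settle_date ON trades (settlement_date);

-- Refresh optimizer statistics, ideally daily (SQL*Plus syntax, standard Oracle package).
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'WAREHOUSE', tabname => 'TRADES');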
Table indexing does little to speed the building of performance analysis fields, which are compute-intensive rather than retrieval-intensive.
Using Extensions to Exploit Parallel Processing
System performance in building Mart data is enhanced when your design takes advantage of parallel processing by multiple concentration (OLAP) engine instances. Each Data Mart table extension is populated under the direction of a separate Model Manager instance that generates its own set of underlying OLAP processes. This means that the use of extensions can increase parallel processing and speed population of the Mart.
A Global Professional Services study has documented the advantages of extensions in populating performance measurement data in particular; see the Data Mart White Paper in Knowledge Base article #8459. The Performance Analysis OLAP processes underlying a Data Mart model are among the most I/O- and compute-intensive of any, so extensions can be particularly helpful in minimizing run time as you build performance linked returns, risk fields, and other complex calculations. When you assign performance fields to an extension, it is best to group together fields that you would logically assign to the same Performance Analysis OLAP report, such as a single attribution group, or a series of performance link analysis fields that cover a similar time horizon, whether short or long term.
If you install Eagle Mart and the Performance Toolkit, you can see examples of best practices for using extensions to support different types of performance fields. The toolkit uses separate Fund Summary extensions for the following field types:
Performance link fields for periods up to one year
Performance link fields for periods longer than one year
Performance risk analysis fields
GIPS Disclosure fields
GIPS Composite Performance Analysis fields
There are restrictions that limit a single extension to building only one type of field attribute.
Sharing Security Details Among Snapshots
If you build data for multiple accounting sources as different snapshots, you may not want to build the same Security Details model data for each snapshot, as this would be repetitive. You may find it preferable to build Security Details for just one snapshot, such as the Default snapshot.
Since Data Mart requires you to build Security Details whenever you build any detail-level table, you cannot completely avoid building Security Details in each snapshot. However, you can use Selective Fields to build just one field in Security Details in all but the Default snapshot, and save considerable time. Alternatively, Data Mart offers the option of building detail-level tables without also building any Security Details.
If your snapshots contain different securities, you should build Security Details for all of them. However, if your snapshots contain the same securities, you need to build Security Details only once.
Limiting the Number of Entities per OLAP Process
Experience shows that performance OLAP processes typically run best when you change the No. of Funds per Report option in the Configuration dialog box in Manage Mart to about 20. The option is set to 50 by default. You can experiment with this setting to achieve optimal results, especially if you are building a significant amount of performance data.
Limiting the Number of Entities per Model Manager
Similar to the number of entities per OLAP process, you can also control the number of entities per Model Manager. You can increase the default value of 1,000 to find the value that works best for you.
Minimizing Log Level when Not Troubleshooting
Logging at a high level, such as 8 or 10, can add material time to a build. During normal production, you can set the level to 1 to save time, and then raise it again if necessary for troubleshooting purposes.
Selecting Settings for Process Manager
If you use Process Manager to build your Mart, pay special attention to the settings described in the following sections.
Avoid the Sequential Processing Setting
You can use Process Manager to manage a variety of Eagle processes. For some processes, each engine should launch in sequence. For others, parallel processing is preferable.
Parallel processing is required for Data Mart to build data with the greatest possible efficiency. To enable it when you use Process Manager to build your Mart, deselect the Process Effective Date Change by Batch checkbox in the Workflow Definition tab of the Workflows Setup wizard when you configure the Process Manager workflows. Deselect this option only when the Funds per Model Manager and Funds per Mart conditions are met. This allows the system to "flood the cluster" and take full advantage of parallel processing.
Ensure Parallel Processing
You must complete the following configuration step for parallel processing for Data Mart, whether or not you use Process Manager with the Mart. In Eagle Navigator, type PACE Engines and select the PACE Engines (System Management Center) link. Set a Limit Ratio value of 0.25, or another value less than 1.0 as directed by Eagle Technical Services, for three engine types: Data Mart Manager Engine, Model Manager Engine, and OLAP reports. The diagnostic checks that you run in Data Mart verify that these Limit Ratios are below 1.0.
Find the Best Process Manager Run Group Batch Size
When you set up a Process Manager workflow, you see the Run Groups tab in the wizard, where you define events within the workflow. The Batch Size setting on the Run Groups tab indicates the number of entities that are batched together for purposes of your Mart build. The default setting is 1, which means the workflow starts as soon as any eligible entity is found. This setting may be inefficient, because the small overhead of starting a Data Mart build is potentially repeated for every entity built in the Mart. A setting in the range of 25 to 50 is preferable. You can try various settings to determine the one that gives you the best results.
Special Considerations for Performance Measurement Fields
Performance measurement fields can involve complex database queries and extensive processing requirements. Data Mart builds that require many of these fields are the most challenging for achieving good system performance. There are several ways to improve efficiency as described in the following sections.
Limit the Lowest Level of Performance OLAP Builds
Performance returns are typically stored in the database at the security, group, and fund levels. When a Data Mart performance OLAP process runs to build performance analysis fields, by default it computes returns for all levels represented in the performance dictionary linked to the model. For the Fund Summary and group models, you can use the Performance Report Options link in the Filters/Mappings tab of the model user interface to limit the computing of returns to just what the model requires, for example, the Entity level for Fund Summary. Note the following:
If you are building performance attribution fields where your fund and/or group returns are based on rollup of lower-level contributions, you must not limit the rollup process or you will not get the expected attribution data at a rollup level. Note that this does not apply to models where you use dynamic performance rather than a performance dictionary.
If you plan to report security-level performance returns linked to your groups, you can use the Create Mapping for Details option on the group model's Filters/Mappings tab to create a convenient table of pointers from groups in the group table to rows in the detail table. However, if you do this, you cannot also limit the OLAP build to exclude the detail level; you must choose one or the other. If you choose not to build the OLAP to the security level, you can still associate groups and their underlying detail members by including grouping fields as data fields in the detail model, and using reporting to sort on the fields shared by groups and detail, for example, Country and Sector, as in the sketch following this list.
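The following sketch, with illustrative table and column names, shows that reporting approach: because group rows and detail rows share the same grouping fields, sorting detail rows on those fields lines them up under their groups:

-- Group and detail rows share the Country and Sector grouping fields,
-- so sorting on them associates each detail row with its group.
SELECT country, sector, security_alias, mtd_return
FROM   position_details_dm
WHERE  entity_id = 'FUND1'
  AND  effective_date = DATE '2020-12-31'
ORDER BY country, sector;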
Economize on Associated-Benchmark Fields where Possible
You can associate one or more index entities with a fund as its custom list of benchmarks. Then you can define performance analysis field attributes for the fund that point to any of its associated benchmarks. In Data Mart, all of these associated-benchmark fields are built on the same table rows as the fund, which is convenient for reporting purposes. Further, if the benchmark is of the custom blended variety and fund-specific, or if the field links benchmark returns since the inception date of the fund, only these associated-benchmark fields will do the job.
However, you might need to report benchmark returns whose values do not depend upon association with a particular fund. For example, the QTD return of the “vanilla” S&P 500 index as of a given date is the same number for all funds. Obviously you can reduce your Data Mart build requirements by building that number just once rather than once per fund. Benchmark returns may be built on their own rows in Fund Summary. Then for each individual fund you can populate Fund Summary with the entity IDs of its associated benchmarks, and design your SQL accordingly.
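A sketch of the resulting SQL, with illustrative names: each fund row stores the entity ID of an associated benchmark, which joins to the benchmark's own row in Fund Summary:

-- Join each fund row to its benchmark's own row via the stored benchmark entity ID.
SELECT f.entity_id   AS fund,
       f.qtd_return  AS fund_qtd_return,
       b.qtd_return  AS benchmark_qtd_return
FROM   fund_summary_dm f
JOIN   fund_summary_dm b
       ON  b.entity_id      = f.benchmark_entity_id
       AND b.effective_date = f.effective_date
WHERE  f.effective_date = DATE '2020-12-31';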
Engage Eagle Technical Services
Database tuning can almost always make a well-designed Mart run even better. In addition, if your Data Mart requirements involve hundreds of fields or thousands of funds, hardware sizing also deserves expert attention. Eagle Technical Services offers the benefits of their experience with a variety of significant Data Mart implementations, and can add substantial value to your Mart planning process.
Assign Roles and Access Privileges
Mart configuration is usually considered an administrator function rather than a business user function. Data Mart’s privileges in User Manager are granular enough to support the differentiation among configuration roles that most organizations need.
Where more than one Data Mart role is required, the roles usually fall into the following categories:
Administration, with all privileges.
Model management, with Maintain, Configuration, Migrate and View privileges.
Production, with Submit and View privileges.
Data quality assurance, with Audit, View and Validate privileges.
Adopt an Archive Policy
Archiving Data Mart tables can improve index rebuilding and query performance simply by reducing the number of rows to be navigated. A logical policy preserves each data frequency (daily and monthly) for the time span that reporting may need it. Many clients preserve month-end data indefinitely, but keep daily data for only a few months or a year.
Remember that all as-of reporting against Data Mart requires that data be kept for all possible prior dates you might want to report. If you need data for time periods already purged, the lost rows can be rebuilt, but there will likely be restatements of numbers previously reported.
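As a sketch of such a retention policy, with an illustrative table name, daily rows older than one year can be purged while month-end rows are preserved indefinitely:

-- Purge daily rows older than twelve months; rows whose effective date
-- is the last day of its month (month-end data) are kept.
DELETE FROM fund_summary_dm
WHERE  effective_date < ADD_MONTHS(TRUNC(SYSDATE), -12)
  AND  effective_date <> LAST_DAY(effective_date);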
In V9.0, under Archive Engine support of Data Mart, if you want to archive to a table and keep each year’s rows together with a labeling effect, you can simply assign names to archive tables using a naming convention that includes the desired date information.
You should purge the Data Mart execution tables, leaving them with no more than one month’s worth of data. Your specific needs may be different, but not purging can result in system performance issues.