Customer-controlled Optimization in D365 Customer Insights Data

CharlesO

Introduction

Enterprises want to get insights from data as fast as possible. But the scale of enterprise data, with millions to billions of records that must be consolidated, compared, unified, and processed, presents challenges. This blog outlines best practices to improve Dynamics 365 Customer Insights Data (CI-D) performance, helping you derive actionable insights faster and enhance overall system efficiency.

Core Operations in D365 Customer Insights Data

Understanding key operations in Customer Insights Data is critical for identifying performance bottlenecks and opportunities for optimization. Please refer to Understanding Job Execution Flow in Customer Insights - Data Batch Runs for additional details.
That post will also help you prioritize optimizations by indicating which ones are likely to have a broader performance impact on downstream processes within the batch execution.

Recommendations to improve processing time and get insights faster

The following guidance provides specific actions you can take to reduce runtimes and get your results faster.

Optimized Data Ingestion

Data ingestion is the foundation of your Customer Insights Data pipeline. Efficient ingestion practices directly influence performance. Please refer to Unlocking the Power of D365 Customer Insights: Best Practices for Data Modelling and Data Quality for additional details.

Limit the data ingested into CI-D

Steps to Take
  • Identify key data elements—such as data tables and attributes—that directly support business objectives and bring only those into CI-D. Exclude attributes that are rarely used in analysis to optimize performance and storage.
  • Limit the volume of historical transaction data ingested into CI-D to what is necessary for meeting business needs, thereby reducing unnecessary data load and enhancing processing efficiency.
Example: A retailer reduced data volume by 40% by ingesting only critical transaction and loyalty data, improving overall end-to-end batch performance.
Expected Benefits: Reduced ingestion processing time due to smaller data volume. Faster analytics due to streamlined data.
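If you stage data before ingestion (for example, in a data lake feeding CI-D), this trimming can be a simple pre-processing step. Below is a minimal PySpark sketch; the storage path, table, and column names are illustrative assumptions, not CI-D APIs.

```python
# Hypothetical pre-ingestion trim: keep only the attributes and history CI-D needs.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Placeholder path and schema; adjust to your own lake layout.
lake = "abfss://lake@account.dfs.core.windows.net"
transactions = spark.read.parquet(f"{lake}/raw/transactions")

trimmed = (
    transactions
    # Keep only the attributes that downstream measures/segments actually use.
    .select("TransactionId", "CustomerId", "TransactionDate", "Amount", "LoyaltyPoints")
    # Keep only the history the business needs, e.g., the last 24 months.
    .filter(F.col("TransactionDate") >= F.add_months(F.current_date(), -24))
)

trimmed.write.mode("overwrite").parquet(f"{lake}/cid/transactions")
```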

Decouple profile and activity/transactional data sources

Steps to Take:
  • Where possible, bring profile and transactional data into CI-D as separate data sources (see the sketch below).
Expected Benefits: As described in the Understanding Job Execution Flow in Customer Insights - Data Batch Runs blog, the match job depends on the successful completion of all data sources that bring in profile data. If profile and activity data share a single data source, the match job can only start once the entire data source has been ingested successfully. If you decouple them, matching can start as soon as the profile sources are ingested, while activity ingestion continues to run.
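As a rough illustration of the decoupling, the sketch below splits a combined extract into separate profile and activity datasets before ingestion, so CI-D can treat them as distinct data sources. The paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
lake = "abfss://lake@account.dfs.core.windows.net"  # placeholder

# Illustrative combined extract holding both profile and activity columns.
combined = spark.read.parquet(f"{lake}/raw/crm_export")

profile_cols = ["CustomerId", "Email", "Phone", "FirstName", "LastName"]
activity_cols = ["CustomerId", "ActivityId", "ActivityType", "ActivityDate"]

# Land the two datasets separately so CI-D ingests them as distinct data sources;
# match can then start as soon as the (smaller) profile source finishes ingesting.
combined.select(*profile_cols).dropDuplicates(["CustomerId"]) \
    .write.mode("overwrite").parquet(f"{lake}/cid/profiles")
combined.select(*activity_cols) \
    .write.mode("overwrite").parquet(f"{lake}/cid/activities")
```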

Denormalize the data model

Steps to Take:
  • Denormalize the data before ingestion into CI-D as per guidelines provided in the Unlocking the Power of D365 Customer Insights: Best Practices for Data Modelling and Data Quality blog. A common pitfall to avoid is treating a table containing profile attributes as a supporting table, rather than denormalizing those attributes directly into the primary profile source.   
  • If data is ingested from a data warehouse layer, chances are that your data from the various business source systems already shares the same data model for profiles. Where possible, bring the profiles into CI-D as a single data source (as opposed to one profile data source per business source system).
Example: A financial services firm ran into this pitfall by ingesting compliance flags as several customer demographic ‘supporting tables’ instead of denormalizing these attributes and making them part of the primary profile source.
Expected Benefits: Efficient and performant execution of measures and segments, without joining ‘supporting tables’.
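A minimal PySpark sketch of the denormalization step, assuming the compliance flags live in a separate table keyed by CustomerId; the table and column names are illustrative.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
lake = "abfss://lake@account.dfs.core.windows.net"  # placeholder

profiles = spark.read.parquet(f"{lake}/raw/customer_profiles")
flags = spark.read.parquet(f"{lake}/raw/compliance_flags")  # one row per CustomerId (assumed)

# Fold the flag attributes into the profile row before ingestion, instead of
# shipping a separate 'supporting table' that measures and segments must join.
denormalized = profiles.join(
    flags.select("CustomerId", "KycStatus", "MarketingOptIn"),  # illustrative flag columns
    on="CustomerId",
    how="left",
)

denormalized.write.mode("overwrite").parquet(f"{lake}/cid/customer_profiles")
```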

Data modelling: Primary Keys

Steps to Take:
  • Identify a ‘true’ primary key for each dataset ingested into Customer Insights Data. The primary key is not only the backbone of the data model but is also used to identify changes between the previous and current batch execution.
  • If you don’t have a primary key and need to assign a surrogate key, ensure that the surrogate key is ‘stable’. In other words, the surrogate keys must not change with each assignment (see the sketch after this list).
Example: A multi-brand retail organization reduced the online activity hydration time by over an hour by allocating a stable surrogate key to one of their transactional sources which didn’t have a ‘true’ primary key.
Expected Benefits: Less data to insert/update when hydrating to Dataverse.
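One common way to make a surrogate key ‘stable’ is to derive it deterministically from the row’s natural identifying columns rather than generating it per run. A sketch, assuming CustomerId, SessionId, and EventTimestamp together uniquely identify a row:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

activity = spark.read.parquet("abfss://lake@account.dfs.core.windows.net/raw/web_activity")

# A deterministic hash of the natural-key columns yields the same surrogate key on
# every run, so unchanged rows are seen as unchanged rather than as delete+insert.
# The chosen columns must uniquely identify a row and should not be null.
keyed = activity.withColumn(
    "ActivityKey",
    F.sha2(
        F.concat_ws("||", F.col("CustomerId"), F.col("SessionId"), F.col("EventTimestamp")),
        256,
    ),
)
```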

Periodic reset/full refresh of data

Steps to Take:
  • If you are using incremental ingestion with Parquet or CSV files, set up a process to reset/full load every 6-9 months.
Expected Benefits: CI-D’s backend processing runs on Spark, which works most efficiently with large files. Periodically resetting to a full load therefore optimizes performance, because Spark has fewer small delta/incremental files to process.
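If your increments are append-only files, the periodic reset can be as simple as rewriting the accumulated files into one compacted full snapshot and pointing the next CI-D run at it. A sketch under that assumption (paths illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
lake = "abfss://lake@account.dfs.core.windows.net"  # placeholder

# Read the initial load plus all accumulated incremental files...
history = spark.read.parquet(f"{lake}/cid/transactions_incremental")

# ...and rewrite them as a small number of large files for the next full load.
# If increments can contain updates, deduplicate by primary key first.
history.coalesce(64).write.mode("overwrite").parquet(f"{lake}/cid/transactions_full")
```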

Alignment with Microsoft’s roadmap

Where possible, ingest the data using ‘Connect to Delta tables’, as CI-D end-to-end processes will be optimized to process only the changes (new and changed data) when source data is ingested using this method.
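If your pipeline already runs on Spark (for example, in Synapse or Fabric), landing the data in Delta format is typically a small change to the write step. A sketch, assuming the Delta Lake libraries are available in your Spark environment:

```python
from pyspark.sql import SparkSession

# Assumes a Spark runtime with Delta Lake support (e.g., Synapse or Fabric).
spark = SparkSession.builder.getOrCreate()

customers = spark.read.parquet("abfss://lake@account.dfs.core.windows.net/raw/customers")

# Writing in Delta format lets CI-D's 'Connect to Delta tables' ingestion pick up
# only new and changed rows via the Delta transaction log.
customers.write.format("delta").mode("overwrite") \
    .save("abfss://lake@account.dfs.core.windows.net/delta/customers")
```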


Optimizing Profile Unification (Match and Merge)

Profile unification combines data from multiple sources to create accurate, single-view customer profiles. The unified profile table is used in all downstream processes, including segments and measures (excluding business measures). Hence, keeping the profile table small has a positive impact on your overall batch run time.

Refine Matching Logic

Steps to Take:
  • Prioritize stable identifiers such as email addresses, customer IDs, and phone numbers for matching.
  • Limit the use of fuzzy matching on unreliable attributes.
  • Where possible, combine an exact condition with a fuzzy condition instead of using a fuzzy matching rule on its own.
  • Regularly review matching rules to identify ineffective matching criteria.
Example: An e-commerce brand previously matched customers on email, name, and address fields with extensive fuzzy matching, causing delays and inaccurate profile merges. They revised their logic to prioritize email and phone number fields, removing fuzzy matches on addresses, cutting processing time by 50% and improving accuracy.
Expected Benefits: Faster profile unification. Increased profile accuracy.
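Exact match conditions work best when identifiers are normalized consistently across sources before ingestion. A hypothetical PySpark cleanup step (column names assumed):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

profiles = spark.read.parquet("abfss://lake@account.dfs.core.windows.net/raw/profiles")

# Normalize identifiers so exact (non-fuzzy) match conditions can do more of the
# work: lowercase and trim emails, reduce phone numbers to digits only.
clean = (
    profiles
    .withColumn("Email", F.lower(F.trim(F.col("Email"))))
    .withColumn("Phone", F.regexp_replace(F.col("Phone"), r"[^0-9]", ""))
)
```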

Limit the attributes in customer profile

Steps to Take:
  • Keep only the necessary attributes in the unified customer profile.
Expected Benefits: Customer profile table is used in most processes like activity unification, measures, segments and so on. Limiting the number of attributes in this table will limit the size of the customer profile table, improving efficiency of all the processes using it.


Improving Search Performance

Efficient search performance enables analysts and marketers to quickly access customer profile data, enhancing analytics accuracy.

Index Essential Fields Only

Steps to Take:
  • Index only frequently queried fields like customer ID, email, and phone number.
  • Avoid indexing infrequently searched attributes.
Example: A hospitality chain improved query speeds dramatically by indexing only customer IDs and email addresses.
Expected Benefits: Faster search response times. Reduced indexing overhead and cost. Enhanced user productivity.


Optimizing Customer Segmentation

Effective segmentation ensures that you target the right customer groups quickly and efficiently.

Segmentation governance

Steps to Take:
  • Establish a governance process to periodically review segments and deactivate or delete those no longer used in journeys or activations.
Expected Benefits: A lower number of segments to be refreshed in each batch optimizes the end-to-end batch performance.

Complex segment creation

For complex segments that rely on the same underlying data, consider creating a base segment to consolidate heavy processing. You can then build additional segments with simpler conditions using this base segment.
For example, if multiple segments require filtering transactional data from the last 6 months, create a base segment that captures this filter, and then use it as the foundation for the others.
Keep in mind that segments built on other segments execute sequentially, which may impact performance—especially when many dependent segments are involved. While there’s no golden rule for the ideal ratio, a good practice is to ensure that no more than 20% of your total segments are dependent on other segments. Following this guideline typically strikes a healthy balance between optimized processing and parallel execution.
Steps to Take:

  • Break down complex segments relying on the same underlying data into a base segment, and then build additional segments with simpler conditions on top of this base segment.
  • Ensure that no more than 20% of your total segments are dependent on other segments.
  • Schedule segments on slower refresh cadence to align with journey/activation schedule. For example: Segments used in weekly newsletter can be refreshed the day before and/or on the day the newsletter is sent out.
Expected Benefits: Faster segment refresh. Faster end-to-end batch refresh time.


Optimizing Customer Measures

Customer measures provide important KPIs that influence strategic business decisions.

Customer Measures Optimization

Steps to Take:
  • Aggregate data before processing heavy computations.
  • Limit calculations involving high-cardinality dimensions (e.g., product-level calculations across millions of products).
  • Schedule measures with intensive calculations or ones that don’t change often on slower cadence.
Example: An electronics retailer previously calculated average product sales by analyzing millions of individual product records daily. They shifted to monthly aggregated product category calculations, reducing calculation times from hours to minutes.
Expected Benefits: Faster processing. Improved system performance and stability. Timely, actionable insights.
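The aggregation shift in the example above can be performed before ingestion. A minimal sketch, pre-aggregating product-level sales to month and category (names and paths illustrative):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
lake = "abfss://lake@account.dfs.core.windows.net"  # placeholder

sales = spark.read.parquet(f"{lake}/raw/product_sales")

# Pre-aggregate to customer/month/category so measures in CI-D operate on
# summary rows instead of millions of product-level records.
monthly = (
    sales
    .withColumn("SalesMonth", F.date_trunc("month", F.col("SaleDate")))
    .groupBy("CustomerId", "SalesMonth", "ProductCategory")
    .agg(
        F.sum("Amount").alias("TotalAmount"),
        F.count("*").alias("TransactionCount"),
    )
)

monthly.write.mode("overwrite").parquet(f"{lake}/cid/monthly_sales")
```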


Conclusion
Optimizing performance in Dynamics 365 Customer Insights Data requires a comprehensive understanding of system operations and a strategic approach to configuration and data management. Each operation within the pipeline offers opportunities to improve speed, reduce costs, and enhance scalability. By consistently applying the recommended best practices throughout the data lifecycle, businesses can achieve faster insights, improved efficiency, and more effective decision-making. 
