Overview
The Convert to Case Attributes enrichment automatically identifies event-level attributes whose values remain constant throughout each case and converts them to case-level attributes. It analyzes the entire dataset to find event attributes that never change within a case, such as customer IDs, product categories, or region codes that are repeated on every event, and elevates them to case attributes for improved performance and a cleaner data model.
This enrichment addresses a common data quality issue in process mining: source systems export redundant data at the event level, which bloats datasets and complicates analysis. By detecting these stable attributes and converting them to the case level, the enrichment reduces data redundancy, improves query performance, and creates a more logical data structure. The conversion requires no configuration, which makes it a useful first step in data preparation: it can significantly reduce dataset size without losing any information.
Common Uses
- Optimize imported ERP data where customer information is repeated in every event but never changes within an order
- Convert static product attributes like category, family, or type from event to case level in manufacturing processes
- Elevate fixed project attributes such as project manager, budget, or department in project management datasets
- Move constant patient demographics like age group, insurance type, or admission type to case level in healthcare data
- Convert stable financial attributes like loan type, interest rate, or branch code in banking process data
- Clean up procurement data by moving vendor information, contract numbers, and payment terms to case level
- Optimize logistics data by converting shipment properties like destination country, service level, or carrier to case attributes
Settings
This enrichment operates automatically without requiring any configuration. It analyzes all non-system event attributes in your dataset and determines which ones can be safely converted to case attributes based on value consistency within each case.
Examples
Example 1: Optimizing Order Processing Data
Scenario: An e-commerce company's order processing system exports data where customer information, shipping details, and order properties are repeated in every event, so that roughly 60% of the event table consists of redundant values.
Event Data Before Enrichment:

| Case ID | Activity | Customer_Name | Customer_Region | Order_Priority | Product_Category | Timestamp |
|---------|----------|---------------|-----------------|----------------|------------------|-----------|
| ORD-001 | Create Order | John Smith | North America | High | Electronics | 2024-01-10 08:00 |
| ORD-001 | Verify Payment | John Smith | North America | High | Electronics | 2024-01-10 08:15 |
| ORD-001 | Pick Items | John Smith | North America | High | Electronics | 2024-01-10 09:00 |
| ORD-001 | Ship Order | John Smith | North America | High | Electronics | 2024-01-10 14:00 |
| ORD-002 | Create Order | Jane Doe | Europe | Normal | Clothing | 2024-01-10 08:30 |
| ORD-002 | Verify Payment | Jane Doe | Europe | Normal | Clothing | 2024-01-10 08:45 |

Case Attributes After Enrichment:

| Case ID | Customer_Name | Customer_Region | Order_Priority | Product_Category |
|---------|---------------|-----------------|----------------|------------------|
| ORD-001 | John Smith | North America | High | Electronics |
| ORD-002 | Jane Doe | Europe | Normal | Clothing |

Event Data After Enrichment:

| Case ID | Activity | Timestamp |
|---------|----------|-----------|
| ORD-001 | Create Order | 2024-01-10 08:00 |
| ORD-001 | Verify Payment | 2024-01-10 08:15 |
| ORD-001 | Pick Items | 2024-01-10 09:00 |
| ORD-001 | Ship Order | 2024-01-10 14:00 |
| ORD-002 | Create Order | 2024-01-10 08:30 |
| ORD-002 | Verify Payment | 2024-01-10 08:45 |

Output: The enrichment identified that Customer_Name, Customer_Region, Order_Priority, and Product_Category never change within each case and automatically converted them to case attributes. The event table is now 60% smaller, containing only the essential event-specific information.
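As a rough cross-check of the size reduction, the cells in the sample tables of this example can be counted directly. This is only a sketch: it counts table cells rather than stored bytes, and it uses the six sample rows rather than a full dataset, so the result is an approximation of the figure quoted above.

```python
# Cell counts for the six sample rows of Example 1 (cells, not bytes).
event_rows = 6
columns_before = 7   # Case ID, Activity, 4 constant attributes, Timestamp
columns_after = 3    # Case ID, Activity, Timestamp

case_rows = 2
case_columns = 5     # Case ID plus the 4 converted attributes

cells_before = event_rows * columns_before
cells_after = event_rows * columns_after
case_cells = case_rows * case_columns

# Reduction of the event table alone.
event_reduction = 1 - cells_after / cells_before
# Reduction once the new case table is counted as well.
overall_reduction = 1 - (cells_after + case_cells) / cells_before

print(f"{event_reduction:.0%}")    # 57%
print(f"{overall_reduction:.0%}")  # 33%
```

With only two to four events per case the overall saving is modest; it grows as the number of events per case increases, because the case table adds one row per case regardless of case length.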
Insights: After conversion, dashboard queries run 3x faster due to the reduced data volume. Case-level filtering for customer segments and product categories is now more intuitive, and the data model clearly distinguishes between case properties and event details, making it easier for analysts to understand and work with the data.
Example 2: Healthcare Patient Journey Optimization
Scenario: A hospital's patient management system exports admission data where patient demographics, insurance information, and medical classifications are repeated in every treatment event, making the dataset unnecessarily complex and slow to analyze.
Event Data Before Enrichment:

| Case ID | Activity | Patient_Age_Group | Insurance_Type | Admission_Type | Department | Diagnosis_Code | Resource |
|---------|----------|-------------------|----------------|----------------|------------|----------------|----------|
| PAT-501 | Registration | 45-60 | Private | Emergency | ER | CARD-01 | Nurse Smith |
| PAT-501 | Triage | 45-60 | Private | Emergency | ER | CARD-01 | Dr. Jones |
| PAT-501 | Treatment | 45-60 | Private | Emergency | ER | CARD-01 | Dr. Jones |
| PAT-501 | Discharge | 45-60 | Private | Emergency | ER | CARD-01 | Nurse Brown |

After Enrichment:
Case Attributes:

| Case ID | Patient_Age_Group | Insurance_Type | Admission_Type | Diagnosis_Code |
|---------|-------------------|----------------|----------------|----------------|
| PAT-501 | 45-60 | Private | Emergency | CARD-01 |

Event Attributes (remaining at event level):

| Case ID | Activity | Department | Resource |
|---------|----------|------------|----------|
| PAT-501 | Registration | ER | Nurse Smith |
| PAT-501 | Triage | ER | Dr. Jones |
| PAT-501 | Treatment | ER | Dr. Jones |
| PAT-501 | Discharge | ER | Nurse Brown |

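Output: Patient demographics and fixed medical information are moved to case level. Resource stays at event level because it is a system column and its values change between events. Department also remains an event attribute even though it is constant for PAT-501: conversion requires a constant value within every case in the dataset, and in this dataset patients in other cases move between departments. The dataset is now 40% smaller and more logically organized.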
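The sample table shows a single case in which Department is always ER, so whether it stays at event level depends on the rest of the dataset. The following is a minimal sketch of that check, assuming a hypothetical second case PAT-502 (not part of the sample above) in which the patient changes department. The function name and data layout are illustrative and are not the product's implementation.

```python
from collections import defaultdict

events = [
    ("PAT-501", {"Insurance_Type": "Private", "Department": "ER"}),
    ("PAT-501", {"Insurance_Type": "Private", "Department": "ER"}),
    # Hypothetical second case, not taken from the sample table.
    ("PAT-502", {"Insurance_Type": "Public", "Department": "ER"}),
    ("PAT-502", {"Insurance_Type": "Public", "Department": "Cardiology"}),
]


def cases_where_varying(events):
    """Map each attribute to the cases in which it takes more than one value."""
    seen = defaultdict(lambda: defaultdict(set))
    for case_id, attrs in events:
        for name, value in attrs.items():
            seen[name][case_id].add(value)
    return {
        name: sorted(c for c, values in per_case.items() if len(values) > 1)
        for name, per_case in seen.items()
    }


varying = cases_where_varying(events)
print(varying)  # {'Insurance_Type': [], 'Department': ['PAT-502']}
```

An attribute with an empty list is constant in every case and can be converted. A single case with more than one value, as with Department in PAT-502 here, is enough to keep the attribute at event level.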
Insights: The optimized data structure enables faster patient cohort analysis, with insurance type and age group filters running instantly at the case level. Diagnosis-based process mining is now more efficient, and the hospital can quickly identify treatment patterns for specific patient segments without processing redundant event data.
Example 3: Manufacturing Process Data Cleanup
Scenario: A manufacturing plant's MES system exports production data where product specifications, order details, and quality standards are duplicated across every production step, creating performance issues in process analysis.
Before Enrichment: Every production event contains: Product_ID, Product_Type, Material_Grade, Quality_Standard, Customer_Code, Order_Size, Target_Date
After Enrichment:
- Converted to Case Attributes: Product_ID, Product_Type, Material_Grade, Quality_Standard, Customer_Code, Order_Size, Target_Date (all constant within each production run)
- Remaining Event Attributes: Activity, Timestamp, Machine_ID, Operator, Temperature, Pressure (varying values)
Output: Seven attributes that never change within a production run are automatically elevated to case level. The event table now focuses solely on process execution details that vary between activities.
Insights: The conversion reduced the dataset size by 65%, enabling real-time process monitoring that was previously impossible due to data volume. Quality analysis by product type and material grade is now straightforward using case-level filters, and the plant can efficiently track KPIs across different product categories.
Example 4: Financial Loan Processing Simplification
Scenario: A bank's loan processing system exports application data where loan parameters, customer profiles, and regulatory classifications are repeated in every workflow step, complicating compliance reporting and process optimization.
Event Data Sample (Before): Each event includes: Loan_Type, Interest_Rate, Loan_Amount, Credit_Score_Range, Branch, Region, Product_Code, Regulatory_Class, Customer_Segment
After Enrichment:
- Case Level: All loan parameters and customer classifications (9 attributes) moved to case table
- Event Level: Only Activity, Timestamp, Approver, Decision, and Comments remain
Output: The enrichment detected that loan parameters and customer information never change during the application process and converted them to case attributes. The event table is reduced to essential workflow information only.
Insights: Regulatory compliance reports that previously took hours now run in minutes. The bank can instantly analyze approval patterns by credit score range and loan type using case-level data, and process mining reveals bottlenecks specific to certain customer segments without the overhead of redundant event data.
Example 5: Supply Chain Data Optimization
Scenario: A logistics company's tracking system records shipment details at every scan point, with fixed shipment properties like service level, destination, weight class, and customer account repeated millions of times across tracking events.
Before Enrichment: 500,000 shipments × 15 scan points × 8 static attributes = 60 million stored values for attributes that never change within a shipment
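The arithmetic behind this figure can be worked through as follows. The sketch assumes exactly 15 scan points per shipment, as in the scenario, and counts stored values rather than bytes.

```python
shipments = 500_000
scans_per_shipment = 15
static_attributes = 8

# Every static attribute is stored once per scan event.
stored_at_event_level = shipments * scans_per_shipment * static_attributes
# After conversion, each static attribute is stored once per shipment.
stored_at_case_level = shipments * static_attributes
values_eliminated = stored_at_event_level - stored_at_case_level

print(stored_at_event_level)  # 60000000
print(stored_at_case_level)   # 4000000
print(values_eliminated)      # 56000000
```

Storing the eight properties once per shipment keeps 4 million values and removes the other 56 million repeats.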
After Enrichment:
- Case Attributes: Service_Level, Origin_Country, Destination_Country, Weight_Class, Customer_Account, Declared_Value, Shipment_Type, Contract_ID
- Event Attributes: Activity (Scan Location), Timestamp, Scanner_ID, Location_Code, Exception_Flag
Output: Eight shipment properties are converted to case level, stored once per shipment instead of repeated at every scan. The event table size is reduced by 70%, containing only dynamic tracking information.
Insights: Route analysis by destination and service level is now 10x faster using case-level queries. The company can efficiently identify delivery patterns for different customer segments and optimize routes based on shipment characteristics without processing massive amounts of duplicate data. Real-time tracking performance improved dramatically, enabling live dashboard updates that were previously impossible.
Output
The Convert to Case Attributes enrichment modifies your dataset structure by moving attributes from the event level to the case level. It checks every eligible event attribute for values that never change within each case, then automatically converts those attributes to case attributes.
Conversion Process:
- Analyzes all event-level data columns except system columns (Activity, Timestamp, Start Time, Resource, Expected Order)
- For each attribute, checks if values remain constant within every case in the dataset
- Only converts attributes that have identical values across all events within each case
- Preserves the original attribute names and data types during conversion
- Maintains data integrity by setting each case attribute to the last non-null value found in that case's events
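The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not mindzieStudio's implementation: the function name, the list-of-dicts layout, and the Status column are hypothetical, Case ID is treated as the case key, and null values are assumed to be ignored when checking whether an attribute is constant.

```python
from collections import defaultdict

# Assumed key and system columns; these are never converted.
SYSTEM_COLUMNS = {"Case ID", "Activity", "Timestamp", "Start Time",
                  "Resource", "Expected Order"}


def split_attributes(events, system_columns=SYSTEM_COLUMNS):
    """Return (case_table, event_table) for a list of event dicts."""
    by_case = defaultdict(list)
    for event in events:
        by_case[event["Case ID"]].append(event)

    candidates = [c for c in events[0] if c not in system_columns]

    def is_constant(column):
        # Assumption: nulls are ignored, so a column is convertible when
        # every case holds at most one distinct non-null value.
        return all(
            len({e[column] for e in rows if e[column] is not None}) <= 1
            for rows in by_case.values()
        )

    convertible = [c for c in candidates if is_constant(c)]

    case_table = {}
    for case_id, rows in by_case.items():
        case_table[case_id] = {}
        for column in convertible:
            non_null = [e[column] for e in rows if e[column] is not None]
            # Last non-null value, as described in the conversion steps.
            case_table[case_id][column] = non_null[-1] if non_null else None

    event_table = [
        {k: v for k, v in e.items() if k not in convertible} for e in events
    ]
    return case_table, event_table


events = [
    {"Case ID": "ORD-001", "Activity": "Create Order",
     "Customer_Region": "North America", "Status": "Open"},
    {"Case ID": "ORD-001", "Activity": "Ship Order",
     "Customer_Region": "North America", "Status": "Shipped"},
    {"Case ID": "ORD-002", "Activity": "Create Order",
     "Customer_Region": "Europe", "Status": "Open"},
    {"Case ID": "ORD-002", "Activity": "Verify Payment",
     "Customer_Region": None, "Status": "Open"},
]

case_table, event_table = split_attributes(events)
print(case_table)
```

In this sample Customer_Region is converted, with ORD-002 taking its last non-null value. Status stays at event level because it changes within ORD-001, even though it is constant within ORD-002.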
Attributes That Are Converted:
- Event attributes with constant values throughout each case (customer IDs, product codes, categories)
- Static properties that are unnecessarily repeated at event level (regions, types, classifications)
- Reference data that logically belongs at case level (contract numbers, project codes, order properties)
Attributes That Remain at Event Level:
- System columns (Activity, Timestamp, Start Time, Resource, Expected Order)
- Attributes with varying values within cases (different resources, changing statuses, measurements)
- Hidden system attributes that should not be modified
- Attributes that already exist at the case level with the same name
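The exclusion rules in this list amount to a filter applied before any values are compared. A minimal sketch follows, assuming column names are available as plain strings; the function name, the `_row_hash` hidden column, and the Vendor and Region columns are hypothetical and serve only as illustration.

```python
# System columns as listed above.
SYSTEM_COLUMNS = {"Activity", "Timestamp", "Start Time", "Resource", "Expected Order"}


def candidate_attributes(event_columns, case_columns, hidden_columns=frozenset()):
    """Return the event columns eligible for the constancy check, in order."""
    return [
        c for c in event_columns
        if c not in SYSTEM_COLUMNS       # system columns always stay
        and c not in hidden_columns      # hidden system attributes are untouched
        and c not in case_columns        # name already exists at case level
    ]


candidates = candidate_attributes(
    event_columns=["Activity", "Timestamp", "Resource", "Region", "Vendor", "_row_hash"],
    case_columns={"Region"},
    hidden_columns={"_row_hash"},
)
print(candidates)  # ['Vendor']
```

Only the columns that pass this filter would then be checked for constant values within each case.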
Impact on Your Dataset: The enrichment creates a cleaner, more efficient data structure where each piece of information exists at its logical level. Case-level filtering and aggregation become more intuitive since case properties are properly organized. Query performance improves significantly due to reduced data redundancy, and the dataset size typically decreases by 30-70% depending on the amount of redundant event data.
The converted attributes integrate seamlessly with all mindzieStudio features. Filters can efficiently query case attributes without scanning event data, calculators can reference case attributes directly without aggregation functions, and other enrichments benefit from the optimized data structure. Process discovery and conformance checking operate more efficiently on the streamlined event data, while maintaining full access to case properties when needed.
This documentation is part of the mindzie Studio process mining platform.