Upper Case

Overview

The Upper Case enrichment is a data standardization operator that converts all text values in selected attributes to uppercase letters throughout your dataset. This transformation ensures consistent text formatting across your process data, enabling reliable case-insensitive matching, filtering, and analysis operations. When working with data from multiple sources where text case varies inconsistently - such as customer names entered differently across systems or product codes with mixed capitalization - this enrichment creates uniform uppercase formatting that eliminates case-related data quality issues.

By standardizing text to uppercase, this enrichment addresses a common challenge in process mining: the same entity appearing as several different values because of capitalization variations. For example, customer names like "Acme Corp", "ACME CORP", and "acme corp" would be treated as three distinct values without standardization, fragmenting your analysis. The Upper Case enrichment unifies these variations, providing accurate metrics for customer analysis, product categorization, and resource utilization. This standardization is particularly critical when preparing data for conformance checking, where consistent activity names and attributes are essential for pattern recognition.
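The fragmentation described above can be illustrated with a minimal Python sketch. It is not mindzie Studio code: Python's built-in `str.upper()` stands in for the enrichment as an assumption, and the list `customer_names` is a hypothetical sample built from the three spellings mentioned in the text.

```python
from collections import Counter

# Hypothetical sample: the three spellings from the paragraph above.
customer_names = ["Acme Corp", "ACME CORP", "acme corp"]

# Without standardization, each spelling is counted as its own value.
raw_counts = Counter(customer_names)

# str.upper() stands in for the enrichment (an assumption, not the product's code).
upper_counts = Counter(name.upper() for name in customer_names)

print(len(raw_counts))     # 3
print(dict(upper_counts))  # {'ACME CORP': 3}
```

After uppercasing, the three spellings collapse into a single value, so counts and other metrics are attributed to one customer.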

The enrichment processes string attributes at the case level, transforming every text value while preserving the original data structure. Unlike manual text manipulation that risks errors and inconsistencies, this automated approach ensures every instance of the selected attribute is transformed uniformly across all cases in your dataset.

Common Uses

  • Standardize customer names and company identifiers for accurate customer journey analysis and segmentation
  • Normalize product codes and SKUs that may have inconsistent capitalization across different systems
  • Prepare text attributes for case-insensitive matching when joining data from multiple sources
  • Create consistent activity names for process discovery when source systems use different capitalization conventions
  • Standardize location codes, department names, and organizational units for accurate resource analysis
  • Format reference numbers and identifiers consistently for reliable filtering and grouping operations
  • Prepare text data for integration with external systems that require uppercase formatting

Settings

Attribute Name: Select the text attribute whose values you want to convert to uppercase. The dropdown list shows all available text (string) attributes in your dataset, excluding hidden columns. You must select exactly one attribute to transform. The enrichment will process every value in the selected attribute across all cases, converting lowercase and mixed-case text to uppercase while leaving already uppercase text unchanged. Only attributes with string data type are available for selection.

Examples

Example 1: Standardizing Customer Names in Order Processing

Scenario: A distribution company's order management system contains customer names with inconsistent capitalization from different data entry points - web orders, phone orders, and EDI transmissions - causing fragmented customer analysis and inaccurate order volume calculations.

Settings:

  • Attribute Name: Customer_Name

Before Enrichment:

| Case ID | Customer_Name | Order_Value | Region |
|---------|--------------|-------------|---------|
| ORD-001 | Acme Corporation | 15000 | North |
| ORD-002 | ACME CORPORATION | 22000 | North |
| ORD-003 | acme corporation | 18500 | North |
| ORD-004 | Beta Industries | 9500 | South |
| ORD-005 | BETA INDUSTRIES | 11000 | South |

After Enrichment:

| Case ID | Customer_Name | Order_Value | Region |
|---------|--------------|-------------|---------|
| ORD-001 | ACME CORPORATION | 15000 | North |
| ORD-002 | ACME CORPORATION | 22000 | North |
| ORD-003 | ACME CORPORATION | 18500 | North |
| ORD-004 | BETA INDUSTRIES | 9500 | South |
| ORD-005 | BETA INDUSTRIES | 11000 | South |

Output: All values in the Customer_Name attribute are converted to uppercase. The three variations of "Acme Corporation" are now unified as "ACME CORPORATION", and both variations of "Beta Industries" are standardized to "BETA INDUSTRIES".

Insights: After standardization, the company discovered that Acme Corporation actually represented 55,500 in total orders (not three separate customers with individual orders), making them the largest account. This accurate view enabled proper account prioritization and revealed that 30% of revenue came from customers whose names had capitalization variations.
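The 55,500 total can be checked against the Example 1 table with a short Python sketch. The helper `total_by_customer` and the `orders` list are hypothetical names introduced for illustration, and `str.upper()` is assumed to behave like the enrichment for these values.

```python
from collections import defaultdict

# Rows copied from the Example 1 table: (Case ID, Customer_Name, Order_Value).
orders = [
    ("ORD-001", "Acme Corporation", 15000),
    ("ORD-002", "ACME CORPORATION", 22000),
    ("ORD-003", "acme corporation", 18500),
    ("ORD-004", "Beta Industries", 9500),
    ("ORD-005", "BETA INDUSTRIES", 11000),
]

def total_by_customer(rows, normalize=False):
    """Sum order values per customer name, optionally uppercasing the name first."""
    totals = defaultdict(int)
    for _case_id, name, value in rows:
        key = name.upper() if normalize else name
        totals[key] += value
    return dict(totals)

before = total_by_customer(orders)
after = total_by_customer(orders, normalize=True)

print(len(before))  # 5 (every spelling looks like a separate customer)
print(after)        # {'ACME CORPORATION': 55500, 'BETA INDUSTRIES': 20500}
```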

Example 2: Normalizing Product Codes in Manufacturing

Scenario: A manufacturing plant's quality control system tracks defects by product code, but codes are entered with different capitalization patterns by operators across three shifts, preventing accurate defect rate analysis by product.

Settings:

  • Attribute Name: Product_Code

Before Enrichment:

| Case ID | Product_Code | Defect_Type | Shift | Severity |
|---------|-------------|-------------|-------|----------|
| QC-001 | prd-A1234 | Surface | Day | Minor |
| QC-002 | PRD-A1234 | Surface | Night | Minor |
| QC-003 | Prd-A1234 | Dimension | Evening | Major |
| QC-004 | prd-b5678 | Assembly | Day | Critical |
| QC-005 | PRD-B5678 | Assembly | Night | Critical |

After Enrichment:

| Case ID | Product_Code | Defect_Type | Shift | Severity |
|---------|-------------|-------------|-------|----------|
| QC-001 | PRD-A1234 | Surface | Day | Minor |
| QC-002 | PRD-A1234 | Surface | Night | Minor |
| QC-003 | PRD-A1234 | Dimension | Evening | Major |
| QC-004 | PRD-B5678 | Assembly | Day | Critical |
| QC-005 | PRD-B5678 | Assembly | Night | Critical |

Output: All Product_Code values are converted to uppercase. The three variations of product A1234 are unified as "PRD-A1234", and both variations of product B5678 are standardized as "PRD-B5678".

Insights: Standardization revealed that product PRD-A1234 had a 60% defect rate across all shifts (3 defects from 5 production runs), triggering an immediate quality investigation. Previously, each capitalization variant appeared to have acceptable defect rates when analyzed separately.

Example 3: Standardizing Department Codes in Healthcare

Scenario: A hospital's patient flow system uses department codes that staff enter with inconsistent capitalization, making it impossible to accurately track patient wait times and department utilization across the facility.

Settings:

  • Attribute Name: Department_Code

Before Enrichment:

| Case ID | Patient_ID | Department_Code | Wait_Time | Priority |
|---------|-----------|----------------|-----------|----------|
| ADM-001 | P1234 | ER-main | 45 | High |
| ADM-002 | P1235 | er-Main | 38 | High |
| ADM-003 | P1236 | ER-MAIN | 52 | Critical |
| ADM-004 | P1237 | icu-west | 15 | Medium |
| ADM-005 | P1238 | ICU-West | 18 | Low |

After Enrichment:

| Case ID | Patient_ID | Department_Code | Wait_Time | Priority |
|---------|-----------|----------------|-----------|----------|
| ADM-001 | P1234 | ER-MAIN | 45 | High |
| ADM-002 | P1235 | ER-MAIN | 38 | High |
| ADM-003 | P1236 | ER-MAIN | 52 | Critical |
| ADM-004 | P1237 | ICU-WEST | 15 | Medium |
| ADM-005 | P1238 | ICU-WEST | 18 | Low |

Output: All Department_Code values are standardized to uppercase. The three variations of the emergency room code are unified as "ER-MAIN", and ICU west variations become "ICU-WEST".

Insights: After standardization, the hospital identified that the ER-MAIN department had an average wait time of 45 minutes across all patients, exceeding the 30-minute target. This accurate departmental view enabled resource reallocation that reduced wait times by 25%.
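The 45-minute average follows directly from the three ER rows in the Example 3 table. The sketch below reproduces it in plain Python. The names `admissions` and `average_wait_by_department` are hypothetical, and `str.upper()` is an assumed stand-in for the enrichment.

```python
from collections import defaultdict

# Rows copied from the Example 3 table: (Department_Code, Wait_Time in minutes).
admissions = [
    ("ER-main", 45),
    ("er-Main", 38),
    ("ER-MAIN", 52),
    ("icu-west", 15),
    ("ICU-West", 18),
]

def average_wait_by_department(rows):
    """Group wait times by uppercased department code and return the mean per group."""
    waits = defaultdict(list)
    for department, wait in rows:
        waits[department.upper()].append(wait)
    return {dept: sum(values) / len(values) for dept, values in waits.items()}

averages = average_wait_by_department(admissions)
print(averages)  # {'ER-MAIN': 45.0, 'ICU-WEST': 16.5}
```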

Example 4: Unifying Region Codes in Logistics

Scenario: A logistics company's shipment tracking system contains region codes with mixed capitalization from different booking channels, preventing accurate regional performance analysis and route optimization.

Settings:

  • Attribute Name: Region_Code

Before Enrichment:

| Case ID | Shipment_ID | Region_Code | Delivery_Days | Service_Type |
|---------|------------|-------------|---------------|--------------|
| SHP-001 | S1234 | na-west | 3 | Express |
| SHP-002 | S1235 | NA-WEST | 2 | Express |
| SHP-003 | S1236 | Na-West | 4 | Standard |
| SHP-004 | S1237 | eu-central | 5 | Standard |
| SHP-005 | S1238 | EU-Central | 6 | Economy |

After Enrichment:

| Case ID | Shipment_ID | Region_Code | Delivery_Days | Service_Type |
|---------|------------|-------------|---------------|--------------|
| SHP-001 | S1234 | NA-WEST | 3 | Express |
| SHP-002 | S1235 | NA-WEST | 2 | Express |
| SHP-003 | S1236 | NA-WEST | 4 | Standard |
| SHP-004 | S1237 | EU-CENTRAL | 5 | Standard |
| SHP-005 | S1238 | EU-CENTRAL | 6 | Economy |

Output: All Region_Code values are converted to uppercase, unifying the different capitalizations into consistent region identifiers.

Insights: Standardization revealed that the NA-WEST region averaged 3 days across all deliveries, meeting SLA requirements. Previously, the analysis was fragmented across capitalization variants, which made some regions appear to be underperforming.

Example 5: Normalizing Status Codes in Financial Processing

Scenario: A bank's loan processing system has status codes that agents enter with varying capitalization, making it difficult to track loan pipeline stages and identify process bottlenecks accurately.

Settings:

  • Attribute Name: Status_Code

Before Enrichment:

| Case ID | Loan_ID | Status_Code | Amount | Days_In_Status |
|---------|---------|------------|--------|----------------|
| LN-001 | L1234 | approved | 50000 | 2 |
| LN-002 | L1235 | APPROVED | 75000 | 3 |
| LN-003 | L1236 | Approved | 45000 | 2 |
| LN-004 | L1237 | pending | 100000 | 5 |
| LN-005 | L1238 | PENDING | 85000 | 7 |

After Enrichment:

| Case ID | Loan_ID | Status_Code | Amount | Days_In_Status |
|---------|---------|------------|--------|----------------|
| LN-001 | L1234 | APPROVED | 50000 | 2 |
| LN-002 | L1235 | APPROVED | 75000 | 3 |
| LN-003 | L1236 | APPROVED | 45000 | 2 |
| LN-004 | L1237 | PENDING | 100000 | 5 |
| LN-005 | L1238 | PENDING | 85000 | 7 |

Output: All Status_Code values are standardized to uppercase, consolidating status variations into consistent values for accurate pipeline analysis.

Insights: After standardization, the bank discovered that 170,000 in loans (not 50,000 as previously thought) were in approved status, requiring immediate funding arrangement. The pending status showed 185,000 in applications averaging 6 days in review, highlighting the need for additional underwriting resources.

Output

The Upper Case enrichment modifies the selected text attribute in-place, converting all string values to uppercase letters. The transformation affects only the chosen attribute while preserving all other attributes unchanged. The enrichment handles all standard text characters, converting lowercase letters (a-z) to their uppercase equivalents (A-Z) while leaving uppercase letters, numbers, special characters, and symbols unchanged.

The modified attribute retains its original column name and position in your dataset structure. All case-level data relationships are preserved, and the attribute remains available for use in filters, calculators, and other enrichments. Empty strings and null values are handled appropriately - null values remain null, while empty strings remain empty strings.
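The behavior described in this section can be summarized in a small Python sketch. It is an illustration under stated assumptions, not the enrichment's actual implementation: `upper_case_attribute` and the `cases` list are hypothetical, cases are modeled as dictionaries, and `str.upper()` is assumed to match the enrichment for the ASCII values shown (its handling of non-ASCII characters may differ from the product).

```python
def upper_case_attribute(cases, attribute_name):
    """Return copies of the cases with one attribute uppercased.

    Only string values are changed; None (null) and empty strings pass
    through untouched, and all other attributes are left as they are.
    """
    result = []
    for case in cases:
        updated = dict(case)  # copy, so key order and other attributes are kept
        value = updated.get(attribute_name)
        if isinstance(value, str):
            updated[attribute_name] = value.upper()
        result.append(updated)
    return result

# Hypothetical cases, loosely based on Example 2.
cases = [
    {"Case ID": "QC-001", "Product_Code": "prd-A1234", "Shift": "Day"},
    {"Case ID": "QC-002", "Product_Code": None, "Shift": "Night"},
    {"Case ID": "QC-003", "Product_Code": "", "Shift": "Evening"},
]

converted = upper_case_attribute(cases, "Product_Code")
print(converted[0]["Product_Code"])  # PRD-A1234 (digits and the hyphen are unchanged)
print(converted[1]["Product_Code"])  # None
print(repr(converted[2]["Product_Code"]))  # ''
```

The sketch returns new dictionaries rather than mutating its input, which makes it easy to compare values before and after. The enrichment itself is described as modifying the attribute in-place.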

After applying this enrichment, the standardized uppercase text enables reliable case-insensitive operations throughout mindzie Studio. You can confidently use the transformed attribute in conformance checking, where consistent text matching is critical. The uppercase values work seamlessly with other text-based enrichments like Trim Text or Replace Text, and support accurate grouping in calculators and filters.

See Also

  • Trim Text - Remove leading and trailing whitespace from text attributes
  • Text Start - Extract a specified number of characters from the beginning of text values
  • Text End - Extract a specified number of characters from the end of text values
  • Replace Text - Replace specific text patterns within attribute values
  • Limit Text Length - Truncate text attributes to a maximum character length
  • Categorize Attribute Values - Group text values into categories based on patterns or rules

This documentation is part of the mindzie Studio process mining platform.
