Overview
The Limit Text Length enrichment is a data cleanup operator that automatically truncates text values in your dataset to a specified maximum number of characters. This essential data standardization tool helps manage text fields that exceed desired length limits, ensuring consistency across your process mining dataset and preventing issues with downstream analysis, visualization, and system integrations. When working with data from various sources, text fields often contain excessively long values that can impact performance, readability, and compatibility with other systems.
This enrichment works on both case-level and event-level text attributes. It keeps the leading characters of each value and discards everything beyond the configured limit. Because the same rule is applied across your entire dataset, the results are consistent in a way that ad hoc manual truncation is not. The enrichment is particularly valuable when preparing data for dashboards where long text values can disrupt layouts, or when integrating with systems that have strict character limits for certain fields.
Common Uses
- Standardize description fields that contain verbose text from ERP systems or ticketing platforms
- Prepare data for visualization in dashboards where long text values break table layouts or chart readability
- Enforce character limits before exporting data to systems with strict field length requirements
- Truncate lengthy comment fields while preserving the most important initial information
- Standardize product names, customer names, or reference codes to consistent maximum lengths
- Improve performance of process mining analysis by reducing memory usage from excessively long text values
- Create uniform text fields for better alignment in reports and exported documents
Settings
Attribute Name: Select the text attribute you want to limit. The dropdown displays all available text attributes from both case-level and event-level data. Only string/text type attributes are shown as valid selections. This is a required field that determines which column in your dataset will have its values truncated.
Maximum Length: Specify the maximum number of characters to retain. Any text value exceeding this length will be truncated to exactly this number of characters. The value must be greater than 0. Default value is 100 characters. Common values include:
- 50 characters for short descriptions or codes
- 100 characters for standard text fields
- 255 characters for compatibility with many database systems
- 500 characters for longer descriptions while still maintaining readability
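The rule these two settings describe can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not mindzie Studio's implementation: the function name `limit_text_length` is hypothetical, and Python slicing counts Unicode code points, which may differ from how the product counts characters.

```python
from typing import Optional

DEFAULT_MAX_LENGTH = 100  # default given for the Maximum Length setting


def limit_text_length(value: Optional[str],
                      max_length: int = DEFAULT_MAX_LENGTH) -> Optional[str]:
    """Hypothetical helper: keep at most max_length leading characters."""
    if max_length <= 0:
        raise ValueError("Maximum Length must be greater than 0")
    if value is None:
        return None  # null values are left as they are
    # Slicing never pads short values and never appends an ellipsis.
    return value[:max_length]


print(limit_text_length("Standard steel bracket, zinc plated", 10))  # Standard s
print(len(limit_text_length("x" * 300)))  # 100
```

Values that already fit are returned unchanged, so running the helper twice with the same limit gives the same result as running it once.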
Examples
Example 1: Standardizing Product Descriptions in Manufacturing
Scenario: A manufacturing company's product catalog contains detailed technical descriptions that can exceed 1000 characters, causing issues in their process mining dashboards and making reports difficult to read.
Settings:
- Attribute Name: Product_Description
- Maximum Length: 150
Before Enrichment:

| Case ID | Product_Description | Order_Value |
|---------|---------------------|-------------|
| ORD-001 | "High-precision CNC machined aluminum component with aerospace-grade 7075-T6 alloy, featuring complex 5-axis milling patterns, anodized finish in matte black, tolerances within 0.001 inches, designed for critical aviation applications requiring maximum strength-to-weight ratio and corrosion resistance in extreme environmental conditions including salt spray, temperature variations from -60C to 150C, and high vibration environments typical of turbine engine mounting applications" | $12,500 |
| ORD-002 | "Standard steel bracket, zinc plated" | $45 |
| ORD-003 | "Custom fabricated stainless steel assembly with multiple welded joints, polished to mirror finish, designed for pharmaceutical clean room applications with full FDA compliance and documentation package included" | $3,200 |

After Enrichment:

| Case ID | Product_Description | Order_Value |
|---------|---------------------|-------------|
| ORD-001 | "High-precision CNC machined aluminum component with aerospace-grade 7075-T6 alloy, featuring complex 5-axis milling patterns, anodized finish in matte" | $12,500 |
| ORD-002 | "Standard steel bracket, zinc plated" | $45 |
| ORD-003 | "Custom fabricated stainless steel assembly with multiple welded joints, polished to mirror finish, designed for pharmaceutical clean room applications" | $3,200 |
Output: Descriptions longer than 150 characters are cut to exactly 150 characters. Shorter descriptions, such as ORD-002, remain unchanged.
Insights: After standardizing description lengths, dashboard performance improved by 40%, and product categorization reports became more readable. The team discovered that 85% of critical product information appeared in the first 150 characters, making this truncation suitable for analysis while maintaining full descriptions in the source system.
Example 2: Managing Customer Feedback Comments in Service Processes
Scenario: A telecommunications company's customer service system captures detailed customer complaints that can be several paragraphs long, making it difficult to analyze patterns in their service process mining.
Settings:
- Attribute Name: Customer_Feedback
- Maximum Length: 200
Event Data Before:

| Case ID | Activity | Customer_Feedback | Timestamp |
|---------|----------|-------------------|-----------|
| TICKET-001 | Create Ticket | "Internet connection has been extremely unreliable for the past three weeks. Speed drops to almost nothing during evening hours between 7-10 PM. Have restarted modem multiple times, checked all cables, even replaced the router with my own but problem persists. This is affecting my ability to work from home and my children cannot complete their online homework. Previous technician visit on March 15 did not resolve the issue. Need immediate resolution as I'm considering switching providers if this continues. Very frustrated with the lack of consistent service despite paying for the premium package." | 2024-03-20 14:30 |
| TICKET-002 | Create Ticket | "Bill incorrect - charged twice" | 2024-03-20 15:15 |

Event Data After:

| Case ID | Activity | Customer_Feedback | Timestamp |
|---------|----------|-------------------|-----------|
| TICKET-001 | Create Ticket | "Internet connection has been extremely unreliable for the past three weeks. Speed drops to almost nothing during evening hours between 7-10 PM. Have restarted modem multiple times, checked all cables," | 2024-03-20 14:30 |
| TICKET-002 | Create Ticket | "Bill incorrect - charged twice" | 2024-03-20 15:15 |
Output: Customer feedback is limited to 200 characters, preserving the beginning of each message where the main issue is typically stated.
Insights: Text mining on the truncated feedback revealed that 92% of issues could be categorized from the first 200 characters. Process analysis showed that tickets with feedback over 200 characters had 35% longer resolution times, indicating complex issues requiring escalation.
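An analysis that compares tickets by original feedback length needs to know which values exceeded the limit, and that information is no longer in the attribute once it has been truncated in place. The following is a minimal sketch of recording a flag before truncating, assuming the event data can be pre-processed outside mindzie Studio. The function `flag_and_truncate`, the attribute name `Feedback_Over_Limit`, and the sample rows are all hypothetical.

```python
MAX_LENGTH = 200

# Synthetic sample rows shaped like the event data in this example.
events = [
    {"Case ID": "TICKET-001",
     "Customer_Feedback": "Internet connection has been extremely unreliable. " * 5},
    {"Case ID": "TICKET-002",
     "Customer_Feedback": "Bill incorrect - charged twice"},
    {"Case ID": "TICKET-003", "Customer_Feedback": None},
]


def flag_and_truncate(rows, attribute, max_length, flag_name):
    """Return copies of rows with a flag recorded before truncation."""
    result = []
    for row in rows:
        value = row.get(attribute)
        new_row = dict(row)
        # Record whether the original value was longer than the limit.
        new_row[flag_name] = value is not None and len(value) > max_length
        if value is not None:
            new_row[attribute] = value[:max_length]
        result.append(new_row)
    return result


processed = flag_and_truncate(
    events, "Customer_Feedback", MAX_LENGTH, "Feedback_Over_Limit"
)
for row in processed:
    print(row["Case ID"], row["Feedback_Over_Limit"])
```

The flag can then be used as an ordinary attribute, for example to compare resolution times between tickets with long and short feedback.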
Example 3: Preparing Purchase Order Data for System Integration
Scenario: A procurement department needs to export purchase order data to a legacy accounting system that has a 50-character limit for vendor names, but their current data contains full legal company names that can exceed 200 characters.
Settings:
- Attribute Name: Vendor_Name
- Maximum Length: 50
Before Enrichment:

| Case ID | Vendor_Name | PO_Amount |
|---------|-------------|-----------|
| PO-2024-001 | "International Business Machines Corporation (IBM) Global Technology Services Division" | $125,000 |
| PO-2024-002 | "Acme Inc." | $3,500 |
| PO-2024-003 | "Johnson & Johnson Consumer Healthcare Products Manufacturing and Distribution Limited Partnership" | $45,750 |

After Enrichment:

| Case ID | Vendor_Name | PO_Amount |
|---------|-------------|-----------|
| PO-2024-001 | "International Business Machines Corporation (IBM) " | $125,000 |
| PO-2024-002 | "Acme Inc." | $3,500 |
| PO-2024-003 | "Johnson & Johnson Consumer Healthcare Products Man" | $45,750 |
Output: Vendor names are truncated to 50 characters to meet system requirements while maintaining enough information for identification.
Insights: The truncation allowed successful integration with the legacy system while maintaining vendor identifiability. Analysis showed that 78% of vendor names were already under 50 characters, and the truncated names still retained enough information for unique identification in procurement reports.
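A figure such as the share of names already under the limit can be estimated before the enrichment is applied. The sketch below profiles a list of values against a candidate limit. It is an illustration only: `length_profile` is a hypothetical helper, and it assumes the values have been exported to a plain Python list.

```python
def length_profile(values, max_length):
    """Summarize how a candidate limit would affect a list of text values."""
    texts = [v for v in values if v]  # skip null and empty values
    over = [v for v in texts if len(v) > max_length]
    return {
        "total": len(texts),
        "truncated": len(over),
        "share_unchanged": (len(texts) - len(over)) / len(texts) if texts else 1.0,
        "longest": max((len(v) for v in texts), default=0),
    }


vendor_names = [
    "International Business Machines Corporation (IBM) Global Technology Services Division",
    "Acme Inc.",
    "Johnson & Johnson Consumer Healthcare Products Manufacturing and Distribution Limited Partnership",
]

profile = length_profile(vendor_names, 50)
print(profile["truncated"], "of", profile["total"], "values would be truncated")
```

Running the same profile with several candidate limits shows how much data each limit would cut before anything is changed.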
Example 4: Optimizing Activity Names in Process Mining
Scenario: An insurance claims process has activity names that include detailed sub-process information, making process maps cluttered and difficult to read.
Settings:
- Attribute Name: Activity_Name
- Maximum Length: 30
Event Data Before:

| Case ID | Activity_Name | Resource | Timestamp |
|---------|---------------|----------|-----------|
| CLAIM-001 | "Initial Claim Review and Documentation Verification by Senior Adjuster" | John Smith | 2024-03-15 09:00 |
| CLAIM-001 | "Medical Records Request Sent to Healthcare Provider via Secure Portal" | Sarah Johnson | 2024-03-15 10:30 |
| CLAIM-001 | "Approve" | Mark Davis | 2024-03-15 14:00 |

Event Data After:

| Case ID | Activity_Name | Resource | Timestamp |
|---------|---------------|----------|-----------|
| CLAIM-001 | "Initial Claim Review and Docum" | John Smith | 2024-03-15 09:00 |
| CLAIM-001 | "Medical Records Request Sent t" | Sarah Johnson | 2024-03-15 10:30 |
| CLAIM-001 | "Approve" | Mark Davis | 2024-03-15 14:00 |
Output: Activity names are limited to 30 characters, creating more concise labels for process visualization.
Insights: The shortened activity names improved process map readability by 60% while retaining the essential information about each step. Process analysts could now identify bottlenecks more quickly, and the standardized lengths made activity frequency analysis more accurate.
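One risk with short limits on activity names is that two distinct activities sharing a long prefix become the same label after truncation, which would merge them in a process map. A minimal sketch of a check for this follows. The helper `find_collisions` is hypothetical, and the second activity name is invented for illustration.

```python
from collections import defaultdict


def find_collisions(values, max_length):
    """Map each truncated label to the distinct originals that share it."""
    groups = defaultdict(set)
    for value in values:
        if value is not None:
            groups[value[:max_length]].add(value)
    # Keep only labels produced by more than one distinct original value.
    return {label: sorted(originals)
            for label, originals in groups.items()
            if len(originals) > 1}


activities = [
    "Initial Claim Review and Documentation Verification by Senior Adjuster",
    "Initial Claim Review and Documentation Upload by Claimant",  # invented
    "Approve",
]

collisions = find_collisions(activities, 30)
for label, originals in collisions.items():
    print(repr(label), "<-", originals)
```

If the check reports collisions, a higher limit or a renaming step before truncation keeps the activities distinct.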
Example 5: Standardizing Reference Numbers Across Systems
Scenario: A logistics company consolidates shipment data from multiple carriers, each using different reference number formats with varying lengths, causing issues in their unified tracking dashboard.
Settings:
- Attribute Name: Tracking_Reference
- Maximum Length: 25
Before Enrichment:

| Case ID | Tracking_Reference | Carrier | Status |
|---------|--------------------|---------|--------|
| SHIP-001 | "UPS1Z9999999999999999-EXPEDITED-INTERNATIONAL-PRIORITY" | UPS | In Transit |
| SHIP-002 | "FEDEX777888999000" | FedEx | Delivered |
| SHIP-003 | "DHL-EXPR-WORLDWIDE-DOC-999888777666555-PREPAID-MORNING-DELIVERY" | DHL | Processing |

After Enrichment:

| Case ID | Tracking_Reference | Carrier | Status |
|---------|--------------------|---------|--------|
| SHIP-001 | "UPS1Z9999999999999999-EXP" | UPS | In Transit |
| SHIP-002 | "FEDEX777888999000" | FedEx | Delivered |
| SHIP-003 | "DHL-EXPR-WORLDWIDE-DOC-99" | DHL | Processing |
Output: Tracking references are limited to a maximum of 25 characters. References that already fit, such as SHIP-002, are unchanged.
Insights: Standardizing reference lengths enabled creation of a unified tracking dashboard that could display all carriers' information consistently. This works when the core tracking number falls within the first 25 characters, as it does for SHIP-001 and SHIP-002. In SHIP-003 the numeric identifier is cut off after two digits, so formats with long prefixes need a higher limit or a different extraction step before truncation.
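Whether a given limit keeps the identifying part of a reference can be checked before the enrichment is applied. The sketch below assumes the core number inside each reference is known. The split into reference and core shown here is an assumption made for illustration, and both helper names are hypothetical.

```python
def core_survives(reference: str, core: str, max_length: int) -> bool:
    """True if the identifying substring is still intact after truncation."""
    return core in reference[:max_length]


def minimum_limit(reference: str, core: str) -> int:
    """Smallest limit that keeps the first occurrence of core intact."""
    start = reference.find(core)
    if start < 0:
        raise ValueError("core not found in reference")
    return start + len(core)


shipments = [
    ("FEDEX777888999000", "777888999000"),
    ("DHL-EXPR-WORLDWIDE-DOC-999888777666555-PREPAID-MORNING-DELIVERY",
     "999888777666555"),
]

for reference, core in shipments:
    print(core, core_survives(reference, core, 25), minimum_limit(reference, core))
```

With these sample values, the DHL reference needs a limit of at least 38 characters to keep its numeric identifier, while the FedEx reference is already shorter than 25 characters.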
Output
The Limit Text Length enrichment modifies text attribute values directly in your dataset without creating new attributes. The enrichment operates on the selected attribute whether it's a case attribute or an event attribute:
For Case Attributes: Each unique case in your dataset has its selected text attribute value checked and truncated if it exceeds the specified maximum length. The truncation happens at exactly the character limit specified, potentially cutting words mid-way.
For Event Attributes: Every event row in your dataset has its selected text attribute value checked and truncated if necessary. This means the same attribute might be truncated differently across different events depending on the original values.
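The same rule applies to both kinds of attribute; only the table being walked differs. The following sketch models case data as one dictionary per case and event data as one dictionary per event. That representation and the helper `apply_limit` are assumptions for illustration, not the platform's internal data model.

```python
def apply_limit(rows, attribute, max_length):
    """Truncate one attribute in place and return how many rows changed."""
    changed = 0
    for row in rows:
        value = row.get(attribute)
        if isinstance(value, str) and len(value) > max_length:
            row[attribute] = value[:max_length]  # same attribute, no new column
            changed += 1
    return changed


# Case-level data: one row per case.
cases = [
    {"Case ID": "PO-2024-002", "Vendor_Name": "Acme Inc."},
    {"Case ID": "PO-2024-003",
     "Vendor_Name": "Johnson & Johnson Consumer Healthcare Products "
                    "Manufacturing and Distribution Limited Partnership"},
]

# Event-level data: several rows may belong to the same case.
events = [
    {"Case ID": "CLAIM-001",
     "Activity_Name": "Initial Claim Review and Documentation Verification "
                      "by Senior Adjuster"},
    {"Case ID": "CLAIM-001", "Activity_Name": "Approve"},
]

print(apply_limit(cases, "Vendor_Name", 50), "case value(s) truncated")
print(apply_limit(events, "Activity_Name", 30), "event value(s) truncated")
```

Each row is checked independently, which is why events in the same case can end up with one value truncated and another left as it was.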
Important Characteristics:
- Original attribute names remain unchanged
- Data type remains as string/text
- Values shorter than or equal to the maximum length remain completely unchanged
- Null or empty values are not affected
- Truncation occurs at the exact character position without considering word boundaries
- Special characters, spaces, and punctuation count toward the character limit
- No ellipsis (...) or other indicators are added to show truncation
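The characteristics listed above can be written as a handful of checks against a simple truncation function. This is a sketch of the documented behavior using Python slicing, under the assumption that characters are counted the way Python counts them. The function `truncate` is a hypothetical stand-in for the enrichment.

```python
def truncate(value, max_length):
    """Hypothetical stand-in for the enrichment's truncation rule."""
    if value is None:
        return None
    return value[:max_length]


checks = {
    "null stays null": truncate(None, 5) is None,
    "empty stays empty": truncate("", 5) == "",
    "value at the limit is unchanged": truncate("12345", 5) == "12345",
    "shorter value is unchanged": truncate("abc", 5) == "abc",
    # Comma and space each use up one position; the word is cut mid-way.
    "punctuation and spaces count": truncate("Hello, world", 8) == "Hello, w",
    "no ellipsis is added": not truncate("Hello, world", 8).endswith("..."),
    "result is still text": isinstance(truncate("Hello, world", 8), str),
}

for name, passed in checks.items():
    print(f"{name}: {passed}")
```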
The modified attribute values are immediately available for use in filters, calculators, and other enrichments. This in-place modification ensures that all subsequent operations in your process mining analysis use the standardized text lengths.
See Also
- Trim Text - Remove leading and trailing whitespace from text attributes
- Upper Case - Convert text attributes to uppercase for standardization
- Text Start - Extract a specified number of characters from the beginning of text values
- Text End - Extract a specified number of characters from the end of text values
- Find and Replace - Replace specific text patterns within attribute values
- Concatenate Attributes - Combine multiple text attributes into a single field
This documentation is part of the mindzie Studio process mining platform.