Overview
Categorize Duration is a performance categorization enrichment that transforms continuous duration values into discrete performance categories, enabling immediate visual identification of process bottlenecks and performance patterns. This enrichment takes any duration attribute in your event log and classifies each value into one of five performance categories: Fast, Normal, Slow, Extreme, or Negative. By converting numeric duration data into meaningful business categories, teams can quickly identify outliers, filter on performance criteria, and create performance-based visualizations without requiring complex mathematical filters.
This enrichment is essential for process mining analysis because it translates technical timing measurements into business-relevant performance indicators. Rather than analyzing raw duration numbers, users can immediately see which cases are performing well, which need attention, and which represent extreme outliers requiring immediate investigation. The default thresholds are derived from the 20th, 80th, and 90th percentiles of your actual data, so the categories reflect the real performance distribution in your processes.
The enrichment works with any duration attribute, whether it measures case-level throughput times, activity-level processing times, or custom duration calculations. This flexibility makes it a fundamental tool for performance analysis across all process mining scenarios, from order-to-cash cycle time analysis to manufacturing lead time categorization and service request handling time assessment.
Common Uses
- Label order fulfillment cases as fast, normal, or slow based on total cycle time to prioritize shipping operations and identify delayed orders requiring expedited handling
- Categorize invoice approval durations to identify exceptional approval times that may indicate missing documentation, escalation requirements, or bottlenecks in the approval workflow
- Classify manufacturing production times by performance category to separate standard production runs from delayed batches requiring root cause analysis and corrective action
- Segment customer service ticket resolution times into performance categories for SLA monitoring, enabling quick identification of tickets at risk of SLA breach
- Analyze procurement cycle times by categorizing purchase order processing durations to identify both efficient procurement cycles and problematic delays requiring vendor follow-up
- Create performance-based process variants by categorizing activity durations, enabling comparison between fast-path and slow-path executions to understand what drives performance differences
- Monitor healthcare patient wait times by categorizing durations between appointment scheduling and treatment to identify capacity issues and optimize resource allocation
Settings
Attribute Name: Select the duration attribute you want to categorize. This must be a TimeSpan attribute in your dataset, such as case duration, time between activities, or any custom duration calculation. The enrichment will analyze this attribute and assign performance categories to each case or event based on the duration values. Common selections include Case Duration for overall case performance analysis, or specific activity pair durations for targeted bottleneck identification.
New Attribute Name: Specify the name for the new categorical attribute that will be created. The default format is "[Attribute Name] - Category", which clearly indicates the source duration and the categorical nature of the new attribute. This new attribute will contain text values (Fast, Normal, Slow, Extreme, or Negative) and can be used in filters, color coding, and dashboard visualizations. Choose a descriptive name that makes the performance categorization immediately recognizable in your analysis.
Fast Duration: Define the upper threshold for the "Fast" performance category. Any non-negative duration less than or equal to this value will be labeled as "Fast"; negative durations are labeled "Negative" instead. You can specify the threshold value and select the time unit (Days, Hours, Minutes, or Seconds). When you first select an attribute, this threshold is automatically calculated as the 20th percentile of all duration values in your dataset, representing the fastest 20% of cases or events. Adjust this threshold based on your business requirements and performance targets.
Normal Duration: Define the upper threshold for the "Normal" performance category. Durations greater than the Fast threshold but less than or equal to this value will be labeled as "Normal". This represents typical, expected performance in your process. The default is automatically calculated as the 80th percentile of your data, meaning 80% of cases fall within the Fast and Normal categories combined. This threshold should align with your standard operating procedures and expected service levels.
Slow Duration: Define the upper threshold for the "Slow" performance category. Durations greater than the Normal threshold but less than or equal to this value will be labeled as "Slow". These cases require attention but may not yet be critical outliers. The default is automatically calculated as the 90th percentile, so with default settings the Slow category covers roughly the 10% of values that fall between the 80th and 90th percentiles. Slow cases often indicate process inefficiencies, resource constraints, or minor complications that add delay without being exceptional.
Extreme Category: Any duration greater than the Slow Duration threshold is automatically categorized as "Extreme". This category represents exceptional outliers that require immediate investigation. Extreme cases often indicate process failures, system errors, extended waiting periods, or unusual circumstances. No threshold setting is required as this category captures all durations beyond the Slow threshold. With the default thresholds this is roughly the slowest 10% of cases, which often account for significant process variation.
Negative Category: Any duration with a negative value is automatically categorized as "Negative". Negative durations typically indicate data quality issues, timestamp errors, or cases where activities occurred out of expected sequence. This category helps identify data anomalies that may require data cleanup or process investigation. No threshold setting is required as this category is automatically applied to all negative duration values.
Reset Button: Click this button to recalculate the Fast, Normal, and Slow duration thresholds based on the current dataset using the default percentile method (20th, 80th, and 90th percentiles). This is useful when you've manually adjusted thresholds and want to return to statistically derived defaults, or when analyzing a new dataset with different performance characteristics. After a reset, the thresholds again reflect the actual distribution of the current data.
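The default percentile method can be illustrated outside the product with a short Python sketch. It is not mindzieStudio code: the function name `default_thresholds` and the sample durations are hypothetical, and the interpolation method (the standard library's "inclusive" quantiles) is an assumption, since the documentation does not say how percentiles are interpolated or whether negative values are excluded first.

```python
from datetime import timedelta
from statistics import quantiles


def default_thresholds(durations):
    """Return (fast, normal, slow) at the 20th, 80th, and 90th percentiles.

    Assumption: linear interpolation via statistics.quantiles; the product's
    exact percentile method is not documented here.
    """
    seconds = [d.total_seconds() for d in durations]
    # n=10 yields the nine cut points at 10%, 20%, ..., 90%.
    deciles = quantiles(seconds, n=10, method="inclusive")
    fast, normal, slow = deciles[1], deciles[7], deciles[8]
    return tuple(timedelta(seconds=s) for s in (fast, normal, slow))


# Hypothetical case durations of 1, 2, ..., 11 hours.
sample = [timedelta(hours=h) for h in range(1, 12)]
fast, normal, slow = default_thresholds(sample)
print(fast, normal, slow)  # 3:00:00 9:00:00 10:00:00
```

With thresholds chosen this way, about 20% of values land at or below the Fast threshold, about 60% between Fast and Normal, and about 10% each in Slow and Extreme.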
Examples
Example 1: Order Fulfillment Performance Classification
Scenario: An e-commerce company wants to categorize order fulfillment performance to identify slow orders requiring expedited shipping and fast orders that can serve as best practice examples. They have calculated case duration representing time from order placement to shipment completion and need to classify 10,000 daily orders into performance categories for operational dashboards and automated alerting.
Settings:
- Attribute Name: Case Duration
- New Attribute Name: Fulfillment Performance
- Fast Duration: 4 Hours
- Normal Duration: 12 Hours
- Slow Duration: 24 Hours
Output:
The enrichment creates a new case attribute named "Fulfillment Performance" with text values:
- "Fast" for orders completed within 4 hours (approximately 2,000 orders per day representing same-day processing)
- "Normal" for orders completed between 4 and 12 hours (approximately 7,000 orders representing standard overnight processing)
- "Slow" for orders completed between 12 and 24 hours (approximately 800 orders requiring attention)
- "Extreme" for orders taking more than 24 hours (approximately 200 orders requiring immediate investigation)
- "Negative" for any orders with timestamp errors (rare data quality issues)
Sample data:
| Order ID | Case Duration | Fulfillment Performance |
|---|---|---|
| ORD-10234 | 2h 15m | Fast |
| ORD-10235 | 8h 30m | Normal |
| ORD-10236 | 18h 45m | Slow |
| ORD-10237 | 36h 20m | Extreme |
| ORD-10238 | 3h 50m | Fast |
Insights: The categorization reveals that 90% of orders meet expected performance targets (Fast and Normal), while 8% are slow and need attention. The 2% of Extreme cases can be immediately filtered for root cause analysis, often revealing inventory issues, shipping carrier delays, or address verification problems. Fast orders can be analyzed to identify success patterns such as specific product types, warehouse locations, or order characteristics that enable rapid fulfillment.
Example 2: Invoice Approval Cycle Time Analysis
Scenario: A finance department processes 5,000 invoices monthly and wants to understand approval performance. They have calculated the duration between invoice receipt and final approval, with concerns that slow approvals cause late payment penalties and vendor relationship issues. The team needs to categorize approval times to create performance-based filters and identify exceptional delays requiring escalation.
Settings:
- Attribute Name: Approval Duration
- New Attribute Name: Approval Performance Category
- Fast Duration: 2 Days
- Normal Duration: 5 Days
- Slow Duration: 10 Days
Output:
A new case attribute "Approval Performance Category" is created with performance classifications:
- "Fast" for invoices approved within 2 days (approximately 1,000 invoices representing pre-approved vendors or low-value purchases)
- "Normal" for invoices approved between 2 and 5 days (approximately 3,500 invoices meeting payment terms)
- "Slow" for invoices requiring 5-10 days (approximately 400 invoices approaching payment deadlines)
- "Extreme" for invoices taking more than 10 days (approximately 100 invoices at risk of late payment penalties)
Sample data:
| Invoice ID | Amount | Approval Duration | Approval Performance Category |
|---|---|---|---|
| INV-45001 | $1,250 | 1d 8h | Fast |
| INV-45002 | $45,000 | 4d 12h | Normal |
| INV-45003 | $8,500 | 7d 18h | Slow |
| INV-45004 | $125,000 | 15d 6h | Extreme |
| INV-45005 | $950 | 1d 2h | Fast |
Insights: The categorization enables the finance team to create performance dashboards showing real-time approval status distribution. Extreme cases are automatically escalated to senior management for investigation, often revealing missing purchase orders, multi-department approval requirements, or disputed invoice amounts. Analysis of Fast cases reveals that vendor master data quality and pre-approved vendor status are key drivers of rapid approval, leading to a vendor onboarding improvement initiative.
Example 3: Manufacturing Production Batch Timing
Scenario: A pharmaceutical manufacturing facility produces batches of medication with strict quality control requirements. Production planners need to categorize batch production times to identify both efficiency wins and problematic delays. With 200 batches per month, they want to classify production duration from batch start to final quality approval to optimize production scheduling and capacity planning.
Settings:
- Attribute Name: Batch Production Time
- New Attribute Name: Production Performance
- Fast Duration: 18 Hours
- Normal Duration: 26 Hours
- Slow Duration: 36 Hours
Output:
The enrichment creates "Production Performance" with categories applied to each batch:
- "Fast" for batches completed within 18 hours (approximately 40 batches representing optimal conditions)
- "Normal" for batches completed in 18-26 hours (approximately 140 batches meeting production targets)
- "Slow" for batches requiring 26-36 hours (approximately 15 batches with minor delays)
- "Extreme" for batches exceeding 36 hours (approximately 5 batches with significant issues)
Sample data:
| Batch ID | Product Code | Batch Production Time | Production Performance |
|---|---|---|---|
| B-2024-0456 | MED-XR-500 | 16h 45m | Fast |
| B-2024-0457 | MED-AB-250 | 24h 30m | Normal |
| B-2024-0458 | MED-XR-500 | 32h 15m | Slow |
| B-2024-0459 | MED-CD-100 | 48h 20m | Extreme |
| B-2024-0460 | MED-AB-250 | 22h 10m | Normal |
Insights: The production performance categories reveal that 90% of batches meet expected timelines, providing confidence in capacity planning. Slow and Extreme batches undergo detailed root cause analysis, identifying equipment maintenance issues, raw material quality variations, and environmental control problems as primary delay factors. Fast batches are studied to understand optimal production conditions, leading to improved standard operating procedures and an 8% reduction in average production time.
Example 4: Customer Service Ticket Resolution Time
Scenario: A software company's support team handles 8,000 support tickets monthly with various SLA commitments based on customer tier. Support management wants to categorize ticket resolution times to monitor SLA performance, identify tickets at risk of breach, and analyze resolution efficiency patterns. They need performance categories that align with their 48-hour standard SLA target.
Settings:
- Attribute Name: Resolution Time
- New Attribute Name: Resolution Performance
- Fast Duration: 12 Hours
- Normal Duration: 36 Hours
- Slow Duration: 72 Hours
Output:
A new "Resolution Performance" attribute classifies each ticket:
- "Fast" for tickets resolved within 12 hours (approximately 3,200 tickets representing excellent service)
- "Normal" for tickets resolved between 12 and 36 hours (approximately 4,000 tickets meeting SLA)
- "Slow" for tickets requiring 36-72 hours (approximately 600 tickets approaching or exceeding the 48-hour SLA target)
- "Extreme" for tickets exceeding 72 hours (approximately 200 tickets far beyond the SLA target)
Sample data:
| Ticket ID | Priority | Resolution Time | Resolution Performance |
|---|---|---|---|
| TKT-89234 | High | 4h 25m | Fast |
| TKT-89235 | Medium | 28h 15m | Normal |
| TKT-89236 | Low | 52h 40m | Slow |
| TKT-89237 | High | 96h 30m | Extreme |
| TKT-89238 | Medium | 8h 10m | Fast |
Insights: Performance categorization enables automated alerting when tickets enter the Slow category, allowing proactive escalation before SLA breach. Analysis reveals that Fast resolution is strongly correlated with clear problem descriptions and availability of diagnostic information, leading to improved ticket submission templates. Extreme cases are reviewed weekly, identifying knowledge gaps and training opportunities for support engineers, resulting in a 15% improvement in average resolution time over six months.
Example 5: Healthcare Patient Wait Time Monitoring
Scenario: A hospital emergency department treats 500 patients daily and needs to monitor wait times between patient registration and initial physician assessment. Department leadership wants to categorize wait times to ensure compliance with quality standards, optimize staffing levels, and identify capacity bottlenecks during peak hours. Performance categories will drive real-time dashboards and historical trend analysis.
Settings:
- Attribute Name: Registration to Assessment Duration
- New Attribute Name: Wait Time Category
- Fast Duration: 15 Minutes
- Normal Duration: 45 Minutes
- Slow Duration: 90 Minutes
Output:
The enrichment creates "Wait Time Category" for each patient visit:
- "Fast" for patients assessed within 15 minutes (approximately 150 patients with immediate triage)
- "Normal" for patients assessed between 15 and 45 minutes (approximately 280 patients meeting standards)
- "Slow" for patients waiting 45-90 minutes (approximately 60 patients experiencing delays)
- "Extreme" for patients waiting over 90 minutes (approximately 10 patients with unacceptable delays)
Sample data:
| Visit ID | Triage Level | Registration to Assessment Duration | Wait Time Category |
|---|---|---|---|
| ED-20240615-001 | Critical | 3m 15s | Fast |
| ED-20240615-002 | Urgent | 28m 45s | Normal |
| ED-20240615-003 | Standard | 62m 20s | Slow |
| ED-20240615-004 | Standard | 125m 10s | Extreme |
| ED-20240615-005 | Urgent | 12m 30s | Fast |
Insights: Real-time performance monitoring reveals that 86% of patients receive assessment within acceptable timeframes, but Slow and Extreme cases cluster during evening shift changes and weekend peak hours. This leads to adjusted staffing models with overlapping shift coverage during high-risk periods. Analysis of Fast assessments identifies efficient triage protocols and optimal physician-to-patient ratios, which are implemented as department-wide standards, reducing average wait times by 22%.
Output
The Categorize Duration enrichment creates a single new attribute in your dataset with a text data type containing performance category labels. This attribute appears as a case attribute if you selected a case-level duration attribute, or as an event attribute if you selected an event-level duration attribute. The new attribute is automatically categorized under the Performance attribute type in mindzieStudio, ensuring it appears in performance-related visualizations and analysis tools.
The output attribute contains one of five possible text values for each case or event:
Fast: Duration is less than or equal to the Fast Duration threshold. Represents best-in-class performance, often suitable as benchmark examples for process improvement initiatives. Fast cases typically represent 15-25% of your dataset when using default percentile settings.
Normal: Duration is greater than the Fast Duration threshold but less than or equal to the Normal Duration threshold. Represents typical, expected performance that meets business standards and service level agreements. Normal cases typically represent 55-65% of your dataset, forming the core of your standard process execution.
Slow: Duration is greater than the Normal Duration threshold but less than or equal to the Slow Duration threshold. Represents below-average performance requiring attention, investigation, or process improvement. Slow cases typically represent 8-12% of your dataset and often indicate minor bottlenecks or inefficiencies.
Extreme: Duration is greater than the Slow Duration threshold. Represents exceptional outliers requiring immediate investigation and potentially representing process failures, system errors, or unusual circumstances. Extreme cases typically represent 2-10% of your dataset but often account for significant process variation and customer dissatisfaction.
Negative: Duration has a negative value, indicating data quality issues such as timestamp errors, out-of-sequence events, or data extraction problems. Negative cases should trigger data validation and cleanup processes. These cases are rare in well-maintained event logs but provide important data quality indicators.
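The five rules above can be written as a small decision function. The Python sketch below is an illustration only, not the product's implementation: `threshold` and `categorize` are hypothetical names, and treating a zero duration as Fast is an assumption that follows from reading "negative" strictly. It uses the Example 1 thresholds (4, 12, and 24 hours) to show that each upper boundary is inclusive.

```python
from datetime import timedelta

# Setting units as named in this document, mapped to timedelta keyword
# arguments. The mapping itself is illustrative.
UNITS = {"Days": "days", "Hours": "hours", "Minutes": "minutes", "Seconds": "seconds"}


def threshold(value, unit):
    """Turn a (value, unit) setting such as (4, "Hours") into a timedelta."""
    return timedelta(**{UNITS[unit]: value})


def categorize(duration, fast, normal, slow):
    """Apply the five category rules; each upper bound is inclusive."""
    if duration < timedelta(0):
        return "Negative"  # checked first, so no negative value is ever "Fast"
    if duration <= fast:
        return "Fast"  # assumption: a zero duration counts as Fast
    if duration <= normal:
        return "Normal"
    if duration <= slow:
        return "Slow"
    return "Extreme"


fast = threshold(4, "Hours")
normal = threshold(12, "Hours")
slow = threshold(24, "Hours")

samples = [
    timedelta(hours=-1),            # Negative
    timedelta(hours=4),             # exactly on the Fast threshold: Fast
    timedelta(hours=4, seconds=1),  # just past it: Normal
    timedelta(hours=24),            # exactly on the Slow threshold: Slow
    timedelta(hours=36, minutes=20),  # Extreme
]
for d in samples:
    print(d, categorize(d, fast, normal, slow))
```

Checking the negative case before the threshold comparisons matters: without it, every negative duration would satisfy "less than or equal to the Fast threshold" and be mislabeled.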
The categorical attribute can be used throughout mindzieStudio for:
Filtering: Create filters to isolate specific performance categories, such as showing only Extreme cases for detailed investigation or excluding Slow cases from benchmark analysis.
Color Coding: Apply color-based visualizations in process maps, dashboards, and charts where Fast appears green, Normal appears blue, Slow appears yellow, and Extreme appears red, providing immediate visual performance identification.
Variant Analysis: Segment process variants by performance category to understand how case paths differ between fast and slow executions, identifying bottleneck activities and inefficient routing patterns.
Dashboard Metrics: Display performance distribution showing percentage of cases in each category, trend analysis showing category changes over time, and real-time monitoring of cases entering Slow or Extreme categories.
Drill-Down Analysis: Use performance categories as entry points for detailed case analysis, enabling quick navigation from high-level performance summaries to specific case details requiring investigation.
Automated Alerting: Configure alerts when cases enter specific performance categories, such as notifications when Extreme cases exceed threshold counts or when Slow case percentages increase beyond acceptable limits.
Comparative Analysis: Compare performance categories across process dimensions such as organizational units, product types, customer segments, or time periods to identify performance patterns and improvement opportunities.
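For readers who work with the categorized data outside mindzieStudio, for instance in an exported file, several of the uses listed above reduce to simple operations on the category column. The following Python sketch assumes hypothetical case records shaped like the Example 1 sample table; the helper names and the 10% alert limit are illustrative and are not mindzieStudio features or settings.

```python
from collections import Counter

# Hypothetical records mirroring the Example 1 sample table.
ATTRIBUTE = "Fulfillment Performance"
cases = [
    {"id": "ORD-10234", ATTRIBUTE: "Fast"},
    {"id": "ORD-10235", ATTRIBUTE: "Normal"},
    {"id": "ORD-10236", ATTRIBUTE: "Slow"},
    {"id": "ORD-10237", ATTRIBUTE: "Extreme"},
    {"id": "ORD-10238", ATTRIBUTE: "Fast"},
]


def category_shares(records, attribute):
    """Dashboard-style metric: fraction of records in each category."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def filter_by_category(records, attribute, wanted):
    """Filtering: keep only records whose category is in `wanted`."""
    return [record for record in records if record[attribute] in wanted]


shares = category_shares(cases, ATTRIBUTE)
extreme_cases = filter_by_category(cases, ATTRIBUTE, {"Extreme"})

# Alerting: flag when the Extreme share passes an illustrative 10% limit.
EXTREME_LIMIT = 0.10
alert = shares.get("Extreme", 0.0) > EXTREME_LIMIT

print(shares)
print([record["id"] for record in extreme_cases], alert)
```

Grouping the same records by another attribute, such as organizational unit or product type, before calling `category_shares` gives the comparative view described above.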
The categorical nature of the output makes it significantly more user-friendly than working with raw duration numbers, enabling business users to quickly understand process performance without requiring technical expertise in duration calculations or statistical analysis. The attribute integrates seamlessly with all mindzieStudio calculators, enrichments, and visualization components.
This documentation is part of the mindzie Studio process mining platform.