Overview
The Remove Repeated Activities enrichment simplifies your process by consolidating consecutive duplicate activities into single occurrences while preserving how many times each activity was repeated. It is useful for analyzing processes where the same activity is executed multiple times in succession because of system behavior, user actions, or process design.
When activities repeat consecutively in a case, they can obscure the true process flow and make it difficult to identify meaningful patterns. This enrichment removes the noise by collapsing repeated activities while creating a count attribute that tracks how many times the activity occurred. You can also choose to preserve event-level attribute values from the repeated activities by concatenating them, ensuring no critical information is lost during the consolidation.
The enrichment offers two modes of operation: strict consecutive repetition (where activities must follow each other directly) or flexible repetition (where all instances of an activity are collapsed regardless of intervening activities). This flexibility allows you to tailor the enrichment to your specific process analysis needs.
Common Uses
- Simplify process flows by removing stutter patterns caused by automated retry logic
- Clean up event logs where users repeatedly click buttons or refresh pages
- Consolidate polling or status-check activities that occur consecutively
- Reduce process complexity when analyzing high-frequency monitoring activities
- Prepare data for process discovery by eliminating repetitive noise
- Track how many times activities were repeated before progressing to the next step
- Preserve attribute values from repeated activities through concatenation for audit trails
Settings
Activity Name: Select the activity you want to consolidate when it repeats consecutively. The enrichment will identify all instances where this activity occurs multiple times and collapse them into a single event. Choose activities that are known to repeat in your process, such as retry attempts, status checks, or user interactions.
Count Column Name: Specify the name of the new attribute that will store the count of how many times the activity was repeated. This attribute is automatically populated with the number of consecutive occurrences that were consolidated. The default naming pattern is "[Activity Name]_Count", but you can customize this to match your organization's naming conventions. For example, if you're removing repeated "Payment Retry" activities, you might name this "Payment_Retry_Attempts".
Concatenate Attributes (Optional): Select one or more event-level string attributes whose values you want to preserve from the repeated activities. When multiple instances are collapsed, the values from these attributes will be concatenated together with comma separation. This is particularly useful when each repetition contains different contextual information, such as error messages, timestamps, or user IDs. Only string-type event attributes that are not calculated and not hidden are available for concatenation.
Must Follow Directly: Control how the enrichment identifies repeated activities:
- Enabled (default): Only removes activities that occur consecutively without any intervening activities. For example, in the sequence "A, B, B, B, C", it would collapse the three consecutive B's into one. This is the most common and conservative approach.
- Disabled: Collapses every occurrence of the selected activity in the case into the first one, regardless of whether other activities occur in between. For example, in the sequence "A, B, C, B, D, B", it would keep only the first B and remove the others, giving "A, B, C, D". Use this mode with caution as it fundamentally changes the process flow.
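The two modes can be sketched in a few lines of Python. This is an illustrative sketch only, not the product's implementation: the function name collapse_activity and the (activity, count) output shape are hypothetical, and it assumes that with Must Follow Directly disabled the count covers every occurrence in the case, which this page does not state explicitly.

```python
from typing import List, Optional, Tuple

def collapse_activity(
    trace: List[str],
    activity: str,
    must_follow_directly: bool = True,
) -> List[Tuple[str, Optional[int]]]:
    """Collapse repeats of `activity` in one case.

    Returns (activity_name, count) pairs in order. The count is set only
    on the selected activity; other activities carry None.
    """
    result: List[Tuple[str, Optional[int]]] = []
    first_index: Optional[int] = None  # where the first kept occurrence sits in `result`

    for name in trace:
        if name != activity:
            result.append((name, None))
            continue

        if must_follow_directly:
            # Merge only if the immediately preceding kept event is the same activity.
            if result and result[-1][0] == activity:
                result[-1] = (activity, result[-1][1] + 1)
            else:
                result.append((activity, 1))
        else:
            # Assumption: every later occurrence is folded into the first one.
            if first_index is None:
                first_index = len(result)
                result.append((activity, 1))
            else:
                result[first_index] = (activity, result[first_index][1] + 1)

    return result


print(collapse_activity(["A", "B", "B", "B", "C"], "B"))
print(collapse_activity(["A", "B", "C", "B", "D", "B"], "B", must_follow_directly=False))
```

With the default mode, a non-consecutive sequence such as "A, B, C, B, D, B" is left unchanged, because no two B events follow each other directly.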
Examples
Example 1: Payment Processing Retry Logic
Scenario: An e-commerce platform has automated retry logic for payment processing. When a payment fails due to network issues or temporary card authorization problems, the system automatically retries up to 5 times before giving up. These retry attempts clutter the process map and make it difficult to see the actual customer journey.
Settings:
- Activity Name: "Process Payment"
- Count Column Name: "Payment_Retry_Count"
- Concatenate Attributes: "Error_Message", "Gateway_Response"
- Must Follow Directly: Enabled
Output: The enrichment consolidates consecutive payment processing attempts into a single "Process Payment" activity with additional context:
- New attribute: "Payment_Retry_Count" containing values like 1 (no retries), 2 (one retry), or 5 (four retries)
- Event attribute "Error_Message" contains all error messages concatenated: "Network timeout, Network timeout, Card declined"
- Event attribute "Gateway_Response" contains all responses: "503, 503, 402"
Sample case transformation:
- Before: Process Payment (failed) -> Process Payment (failed) -> Process Payment (failed) -> Process Payment (success)
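- After: a single Process Payment event with Payment_Retry_Count = 4 (four attempts in total, that is, three retries). Because the enrichment keeps the first occurrence, the remaining event carries the timestamp of the first attempt.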
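The transformation of a single run can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's code: the helper consolidate_run, the dictionary-based event shape, and the timestamps are hypothetical, and the sketch assumes that empty attribute values are skipped during concatenation.

```python
from typing import Dict, List

def consolidate_run(
    events: List[Dict[str, str]],
    concat_attrs: List[str],
    count_name: str,
) -> Dict[str, object]:
    """Merge one run of repeated events (in chronological order) into one event."""
    merged: Dict[str, object] = dict(events[0])  # keep the first occurrence and its timestamp
    merged[count_name] = len(events)             # total occurrences in the run
    for attr in concat_attrs:
        # Assumption: empty values are left out of the concatenated string.
        values = [e[attr] for e in events if e.get(attr)]
        merged[attr] = ", ".join(values)
    return merged


# Hypothetical sample data: three failed attempts followed by a success.
run = [
    {"activity": "Process Payment", "timestamp": "2024-01-01T10:00:00", "Error_Message": "Network timeout"},
    {"activity": "Process Payment", "timestamp": "2024-01-01T10:00:05", "Error_Message": "Network timeout"},
    {"activity": "Process Payment", "timestamp": "2024-01-01T10:00:10", "Error_Message": "Card declined"},
    {"activity": "Process Payment", "timestamp": "2024-01-01T10:00:15", "Error_Message": ""},
]

merged = consolidate_run(run, ["Error_Message"], "Payment_Retry_Count")
print(merged["Payment_Retry_Count"])  # 4
print(merged["Error_Message"])        # Network timeout, Network timeout, Card declined
```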
Insights: The business can now analyze payment success rates more accurately by seeing how many retry attempts were needed. Cases with high retry counts may indicate integration issues with specific payment gateways or problems during peak traffic periods.
Example 2: Customer Service Status Checks
Scenario: A customer service ticketing system has an automated process that checks ticket status every 5 minutes while waiting for customer response. These status checks create hundreds of events in long-running cases, making process analysis nearly impossible.
Settings:
- Activity Name: "Check Ticket Status"
- Count Column Name: "Status_Check_Count"
- Concatenate Attributes: (none selected)
- Must Follow Directly: Enabled
Output: Consecutive status check activities are consolidated into single events. A case that had 50 status checks between "Send Email to Customer" and "Customer Response Received" now shows just one "Check Ticket Status" activity with Status_Check_Count = 50.
Insights: Analysts can now see the actual customer interaction flow without the noise of automated polling. The status check count reveals how long tickets typically wait for customer response, which can be correlated with ticket resolution times and customer satisfaction.
Example 3: Manufacturing Quality Inspection Retests
Scenario: In a pharmaceutical manufacturing process, quality inspection failures trigger immediate retests up to 3 times before the batch is rejected. The company wants to track how many retests occur while maintaining clean process flows for analysis.
Settings:
- Activity Name: "Quality Inspection"
- Count Column Name: "Inspection_Attempts"
- Concatenate Attributes: "Inspector_ID", "Test_Results", "Failure_Reason"
- Must Follow Directly: Enabled
Output: Multiple consecutive quality inspections are consolidated with complete audit information:
- Inspection_Attempts: Number of times the batch was inspected (1-4)
- Inspector_ID concatenated: "INSP_001, INSP_001, INSP_002" (shows if different inspectors were involved)
- Test_Results concatenated: "FAIL, FAIL, PASS" (shows the progression)
- Failure_Reason concatenated: "pH out of range, pH out of range, " (shows what was wrong)
Insights: The company can analyze first-pass yield rates (Inspection_Attempts = 1) versus rework rates (Inspection_Attempts > 1) while maintaining complete traceability of who inspected and why tests failed.
Example 4: IT Support Ticket Reassignment
Scenario: An IT helpdesk has a problem with tickets being reassigned multiple times between support agents before resolution. Each reassignment creates a "Reassign Ticket" activity, making it hard to analyze the actual resolution steps.
Settings:
- Activity Name: "Reassign Ticket"
- Count Column Name: "Reassignment_Count"
- Concatenate Attributes: "Assigned_To", "Reassignment_Reason"
- Must Follow Directly: Enabled
Output: Multiple consecutive reassignments are consolidated:
- Reassignment_Count: Total number of reassignments (indicates ticket bouncing)
- Assigned_To concatenated: "Agent_A, Agent_B, Agent_C, Agent_D" (shows the escalation path)
- Reassignment_Reason concatenated: "Wrong department, Requires senior agent, Requires system admin" (shows why)
Insights: High reassignment counts indicate poor initial ticket routing or unclear responsibility assignments. The concatenated agent names reveal common escalation patterns, helping optimize ticket distribution rules.
Example 5: Document Approval Workflow Revisions
Scenario: A document management system allows reviewers to send documents back for revision multiple times. The organization wants to track revision cycles while keeping process maps focused on the overall approval workflow.
Settings:
- Activity Name: "Request Revisions"
- Count Column Name: "Revision_Cycles"
- Concatenate Attributes: "Reviewer_Comments"
- Must Follow Directly: Enabled
Output: Consecutive revision requests are consolidated:
- Revision_Cycles: Number of times the document was sent back (quality indicator)
- Reviewer_Comments concatenated: "Fix formatting, Update references, Correct calculations" (complete feedback history)
Insights: Documents requiring many revision cycles may indicate unclear requirements or inadequate initial quality checks. The concatenated comments provide a complete audit trail of the review process while keeping the process map clean and analyzable.
Output
The Remove Repeated Activities enrichment modifies your event log in the following ways:
Event Reduction: Consecutive occurrences of the selected activity are consolidated into a single event. The enrichment keeps the first occurrence and hides all subsequent repetitions, reducing the number of visible events in your dataset. This consolidation happens at the case level, so different cases may have different numbers of events hidden depending on their repetition patterns.
New Count Attribute: A new event-level integer attribute is created with the name you specified in "Count Column Name". This attribute is populated on the consolidated event with the total number of occurrences that were collapsed together. For events that had no repetition, the value is 1. For consolidated events, the value indicates how many times the activity occurred consecutively (for example, 4 means the activity happened 4 times in a row).
Concatenated Attribute Values: If you selected attributes to concatenate, the values from all repeated events are combined into a single comma-separated string and stored in the consolidated event. This preserves important contextual information that might differ between repetitions, such as error messages, user IDs, or timestamps. The concatenation occurs in chronological order, so you can see the progression of values across repetitions.
Process View Impact: After applying this enrichment, your process maps and variants will show simplified flows without the repetitive loops caused by consecutive identical activities. Cases that previously showed loops like "A -> B -> B -> B -> C" will now display as "A -> B -> C", making it easier to identify the core process structure. However, you retain the ability to analyze repetition patterns using the count attribute in filters and calculators.
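The effect on variants can be illustrated with a small sketch. The case data and the helper collapse_consecutive are hypothetical and only model the default (Must Follow Directly enabled) behavior described above; they are not taken from the product.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, List

def collapse_consecutive(trace: List[str], activity: str) -> List[str]:
    """Keep one event per consecutive run of `activity`; leave other events as they are."""
    collapsed: List[str] = []
    for name, run in groupby(trace):  # groupby yields runs of equal consecutive items
        if name == activity:
            collapsed.append(name)
        else:
            collapsed.extend(run)
    return collapsed


# Hypothetical cases that differ only in how often B repeats.
cases: Dict[str, List[str]] = {
    "case-1": ["A", "B", "B", "B", "C"],
    "case-2": ["A", "B", "C"],
    "case-3": ["A", "B", "B", "C"],
}

variants_before = Counter(" -> ".join(t) for t in cases.values())
variants_after = Counter(" -> ".join(collapse_consecutive(t, "B")) for t in cases.values())

print(len(variants_before))  # 3
print(dict(variants_after))  # {'A -> B -> C': 3}
```

In this sketch, three distinct variants become one, which is the kind of simplification to expect in the variant view when repetition is the only difference between cases.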
Use Cases for the Count Attribute: The new count attribute can be used in:
- Filters: "Show only cases where Payment_Retry_Count > 3" to find problematic payment processing
- Calculators: Average or sum the count across cases to measure overall retry rates
- Performance analysis: Correlate high counts with longer processing times
- Quality metrics: Track first-pass success rates by counting events where count = 1
- Visualizations: Create histograms showing the distribution of retry attempts
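The uses listed above amount to simple operations on the count values. The sketch below uses hypothetical per-case values of Payment_Retry_Count, as if exported from the tool; inside mindzie Studio you would use the built-in filters and calculators instead.

```python
from collections import Counter
from statistics import mean

# Hypothetical exported values of Payment_Retry_Count, one per case.
retry_counts = {"case-1": 1, "case-2": 4, "case-3": 1, "case-4": 5, "case-5": 2}

# Filter: cases where Payment_Retry_Count > 3.
problem_cases = sorted(c for c, n in retry_counts.items() if n > 3)

# Calculator: average count across cases.
average_count = mean(retry_counts.values())

# Quality metric: share of cases that succeeded without repetition (count = 1).
first_pass_rate = sum(1 for n in retry_counts.values() if n == 1) / len(retry_counts)

# Visualization input: distribution of counts for a histogram.
histogram = Counter(retry_counts.values())

print(problem_cases)    # ['case-2', 'case-4']
print(average_count)    # 2.6
print(first_pass_rate)  # 0.4
print(sorted(histogram.items()))  # [(1, 2), (2, 1), (4, 1), (5, 1)]
```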
Data Integrity: The consolidated event keeps the timestamp of the first occurrence, and concatenation lets you retain attribute values from the repetitions. No data is permanently deleted; instead, repeated events are marked as hidden and can be revealed if needed by removing the enrichment.
This documentation is part of the mindzie Studio process mining platform.