Overview
The Remove Activity Loops filter simplifies process traces by eliminating duplicate activity occurrences within cases. This filter helps clean your event data by removing repetitive activities that can make process analysis more complex. You can choose to remove only consecutive duplicates (activities that repeat back-to-back) or all duplicates throughout the entire case, keeping only the first occurrence of each unique activity.
This filter is particularly valuable when analyzing processes where activities are recorded multiple times due to system logging behavior, data quality issues, or genuine process loops that you want to simplify for clearer analysis.
Common Uses
- Process Visualization: Simplify process maps by removing activity loops that make process flows difficult to understand and analyze.
- Data Quality Improvement: Clean event logs where system errors or integration issues cause duplicate activity recordings.
- Variant Analysis: Reduce the number of process variants by eliminating loop-based variations and focusing on core process paths.
- Performance Measurement: Get more accurate duration metrics by removing duplicate activities that artificially inflate processing times.
- Compliance Analysis: Identify the essential sequence of activities without repetitions to check conformance to standard processes.
- Process Mining Preparation: Prepare cleaner datasets for process discovery and conformance checking algorithms.
Settings
Deduplication Method: Choose how duplicate activities should be identified and removed:
Directly Follows Deduplication: Removes only consecutive duplicate activities. If the same activity appears multiple times in a row, only the first occurrence is kept. Non-consecutive duplicates are preserved. For example, "A, B, B, C, D, D, D" becomes "A, B, C, D".
Global Deduplication: Removes all duplicate activities throughout the entire case, keeping only the first occurrence of each unique activity. For example, "A, C, B, C, D" becomes "A, C, B, D" (the second "C" is removed even though it wasn't consecutive).
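The two methods can be sketched in Python on plain lists of activity names. This is an illustration of the behavior described above, not mindzieStudio's implementation; the function names `directly_follows_dedup` and `global_dedup` are hypothetical.

```python
from itertools import groupby


def directly_follows_dedup(trace):
    """Keep one activity for every run of identical consecutive activities."""
    # groupby collapses each run of equal neighbors into a single key.
    return [activity for activity, _run in groupby(trace)]


def global_dedup(trace):
    """Keep only the first occurrence of each activity in the trace."""
    seen = set()
    result = []
    for activity in trace:
        if activity not in seen:
            seen.add(activity)
            result.append(activity)
    return result


print(directly_follows_dedup(["A", "B", "B", "C", "D", "D", "D"]))  # ['A', 'B', 'C', 'D']
print(global_dedup(["A", "C", "B", "C", "D"]))  # ['A', 'C', 'B', 'D']
```

Both functions preserve the original order of the remaining activities; they differ only in whether a repeat separated by other activities counts as a duplicate.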
Examples
Example 1: Simplifying Consecutive System Logs
Scenario: Your order fulfillment system logs the "Check Inventory" activity multiple times consecutively when processing large orders, creating noise in your process analysis. You want to simplify these consecutive duplicates while preserving the overall process flow.
Settings:
- Deduplication Method: Directly Follows Deduplication
Result: Cases with consecutive duplicate activities are simplified. For example, if a case had the sequence "Create Order, Check Inventory, Check Inventory, Check Inventory, Pick Items, Pack Order", it becomes "Create Order, Check Inventory, Pick Items, Pack Order". Non-consecutive duplicates remain unchanged.
Insights: This approach is ideal when you want to clean up logging artifacts while preserving genuine process loops. If "Check Inventory" appears again later in the process (not consecutively), it is kept, because a repeat separated by other activities may represent a genuine, separate process step.
Example 2: Finding True Process Paths
Scenario: Your customer service process has multiple review activities, and cases often loop back through the same activities. You want to identify the unique activities performed in each case without considering how many times they were repeated.
Settings:
- Deduplication Method: Global Deduplication
Result: All cases are reduced to their unique activity sequences. A case like "Open Ticket, Assign Agent, Review, Escalate, Review, Resolve" becomes "Open Ticket, Assign Agent, Review, Escalate, Resolve". Every duplicate activity is removed, keeping only the first occurrence.
Insights: This is useful for understanding which activities were performed in each case without considering frequency. It helps identify the essential process steps and can reveal the minimal path through your process.
Example 3: Cleaning Data Entry Errors
Scenario: Manual data entry in your approval process sometimes results in the same approval activity being recorded twice in succession due to user error or system refresh issues.
Settings:
- Deduplication Method: Directly Follows Deduplication
Result: Consecutive duplicate approvals are removed while preserving cases where genuine re-approvals occurred after other activities. For example, "Submit, Approve, Approve" becomes "Submit, Approve", but "Submit, Approve, Modify, Approve" remains unchanged.
Insights: This cleans data quality issues without distorting the process. Legitimate multiple approvals (separated by other activities) are preserved, while obvious errors are removed.
Example 4: Variant Reduction for Analysis
Scenario: Your process has many variants due to activity loops, making it difficult to identify core process patterns. You want to reduce variant count by focusing on which activities occur rather than how many times.
Settings:
- Deduplication Method: Global Deduplication
Result: Process variants are consolidated based on unique activity sequences. Variants that differ only in the number of loop iterations collapse into a single variant. In logs with heavy looping, this can reduce the variant count substantially, for example from hundreds to dozens.
Insights: Simplifying to unique activities helps identify core process patterns and makes process discovery more meaningful. Because this filter removes events rather than cases, you can analyze the original, unfiltered data separately to understand loop behavior.
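The variant consolidation in this example can be illustrated with a small Python sketch. The case data and the helper `first_occurrences` are made up for illustration, and a variant is assumed to be the ordered tuple of activity names in a case.

```python
from collections import Counter


def first_occurrences(trace):
    """Global deduplication: keep the first occurrence of each activity."""
    # dict preserves insertion order, so each activity keeps its first position.
    return tuple(dict.fromkeys(trace))


# Hypothetical cases that differ mainly in how often they loop through "Review".
cases = {
    "case-1": ["Open Ticket", "Review", "Resolve"],
    "case-2": ["Open Ticket", "Review", "Escalate", "Review", "Resolve"],
    "case-3": ["Open Ticket", "Review", "Escalate", "Review", "Escalate", "Review", "Resolve"],
    "case-4": ["Open Ticket", "Review", "Review", "Resolve"],
}

variants_before = Counter(tuple(trace) for trace in cases.values())
variants_after = Counter(first_occurrences(trace) for trace in cases.values())

print(len(variants_before))  # 4
print(len(variants_after))   # 2
```

In this sample, four distinct variants collapse into two: one with an escalation and one without.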
Output
The filter returns a modified dataset containing the same cases with deduplicated event sequences. Each case maintains its original case-level attributes and metadata, but the events within each case are filtered according to the selected deduplication method.
Directly Follows Deduplication: Returns cases with consecutive duplicates removed. Events are preserved unless they immediately follow an event with the same activity name.
Global Deduplication: Returns cases with only the first occurrence of each unique activity. All subsequent events with the same activity name are removed, regardless of their position in the case.
Removed events do not appear in the result. The remaining events keep their original timestamps and attributes.
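As a rough sketch of the output behavior, the following Python example applies both methods to event records rather than bare activity names. The `Event` record, its field names, and the `filter_events` function are assumptions made for illustration, not part of mindzieStudio; events are assumed to be sorted by time within each case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    case_id: str
    activity: str
    timestamp: str  # ISO 8601 string, kept unchanged by the filter


def filter_events(events, method):
    """Return the events that survive deduplication, untouched and in order."""
    kept = []
    previous = {}  # case_id -> activity of the preceding event in that case
    seen = {}      # case_id -> set of activities already encountered
    for event in events:
        case_seen = seen.setdefault(event.case_id, set())
        if method == "directly_follows":
            duplicate = previous.get(event.case_id) == event.activity
        elif method == "global":
            duplicate = event.activity in case_seen
        else:
            raise ValueError(f"unknown method: {method}")
        previous[event.case_id] = event.activity
        case_seen.add(event.activity)
        if not duplicate:
            kept.append(event)  # the same object: timestamp and fields unchanged
    return kept


events = [
    Event("1", "Submit", "2024-01-01T09:00:00"),
    Event("1", "Approve", "2024-01-01T09:05:00"),
    Event("1", "Approve", "2024-01-01T09:05:02"),
    Event("1", "Modify", "2024-01-01T10:00:00"),
    Event("1", "Approve", "2024-01-01T10:30:00"),
]

for e in filter_events(events, "directly_follows"):
    print(e.activity, e.timestamp)
```

With "directly_follows", only the 09:05:02 approval is dropped; with "global", the 10:30:00 approval is dropped as well. In both cases the surviving events carry their original timestamps.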
This documentation is part of the mindzieStudio process mining platform.