Data Architecture

The data architecture shows how process data flows into mindzieStudio, gets transformed through enrichment, and can be exported or used to trigger automated actions. Understanding these data pathways helps you design effective data integration strategies.

Overview

mindzieStudio combines multiple data input methods, a centralized API layer, and several output options. This flexible architecture allows you to integrate process mining into your existing data ecosystem.


Data Input Sources

There are several ways to bring process data into mindzieStudio:

Manual Upload

The simplest approach is to upload files directly through the mindzieStudio interface. Supported formats include:

  • CSV files: Standard comma-separated values
  • Excel files: .xlsx spreadsheets
  • Parquet files: Columnar storage format for large datasets
  • ZIP archives: Compressed packages containing multiple files

Manual upload is ideal for ad-hoc analysis, proof-of-concept projects, or when data is already exported from source systems.

mindzie Data Designer

mindzie Data Designer is a visual tool that connects directly to your source databases and systems. It allows you to:

  • Define data schemas visually
  • Map source columns to event log format
  • Schedule automated data refreshes
  • Transform data during extraction

Data Designer supports connections to major databases including SQL Server, Oracle, PostgreSQL, MySQL, SAP HANA, and many others.

Third-Party ETL Tools

If your organization already has ETL (Extract, Transform, Load) infrastructure, you can integrate with mindzieStudio through standard data pipelines. This approach leverages your existing data engineering capabilities and governance processes.

Developer Data Upload

For programmatic access, the mindzieStudio API allows developers to:

  • Upload datasets via HTTP endpoints
  • Automate data refresh from custom applications
  • Integrate with CI/CD pipelines
  • Build custom data connectors

This is ideal for organizations building automated data pipelines or integrating process mining into larger systems.
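As a rough illustration of what an HTTP upload might look like, the sketch below builds a POST request that carries a CSV event log and a bearer token. The base URL, the endpoint path, and the `text/csv` payload format are all assumptions made for this example; the real values come from the API documentation for your mindzieStudio deployment. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical values: the real base URL, endpoint path, and payload format
# come from the mindzieStudio API documentation for your deployment.
BASE_URL = "https://mindzie.example.com"
UPLOAD_PATH = "/api/datasets/upload"  # assumed path, for illustration only


def build_upload_request(token: str, csv_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a CSV event log."""
    return urllib.request.Request(
        url=BASE_URL + UPLOAD_PATH,
        data=csv_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # bearer-token authentication
            "Content-Type": "text/csv",
        },
    )


request = build_upload_request("YOUR_TOKEN", b"Case ID,Activity,Timestamp\n")
# Once the placeholders above are replaced with real values, sending is one call:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes the connector easy to test in a CI/CD pipeline without touching a live server.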

MuleSoft Integration

Enterprise organizations that use MuleSoft for integration can connect mindzieStudio as an API endpoint in their integration flows. This enables process data to flow as part of your broader enterprise integration strategy.


API

The API serves as the central gateway for all data movement in mindzieStudio. All data, whether uploaded manually or via automation, flows through the API layer.

The API provides:

  • Authentication: Secure access through bearer tokens
  • Validation: Data format and schema validation
  • Routing: Directing data to the appropriate processing components
  • Access Control: Tenant and project-level permissions

The API is available in Enterprise Server and SaaS editions of mindzieStudio.


Dataset

Once data enters mindzieStudio, it is stored as a Dataset. Datasets are:

  • Compressed: Efficient binary storage format
  • Validated: Checked for required columns and data types
  • Versioned: Previous uploads can be retained for comparison

Every dataset must include the three core event log columns:

  • Case ID (identifier for each process instance)
  • Activity (name of each step)
  • Timestamp (when each step occurred)

Additional attribute columns can include any business-relevant data.
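Before uploading, it can help to check a file for the three core columns yourself. The following is a minimal pre-upload check, not mindzieStudio's own validation logic. It assumes the columns are literally named `Case ID`, `Activity`, and `Timestamp` and that timestamps are in ISO 8601 format; both are assumptions for this example, since your files may use different labels and formats.

```python
import csv
import io
from datetime import datetime

# Assumed header names: the documentation names the three concepts, not the
# exact column labels a given file must use.
CORE_COLUMNS = ("Case ID", "Activity", "Timestamp")


def check_event_log(csv_text: str) -> list[str]:
    """Return a list of problems found; an empty list means the log looks usable."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in CORE_COLUMNS if name not in header]
    if missing:
        return [f"missing column: {name}" for name in missing]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for name in CORE_COLUMNS:
            if not (row[name] or "").strip():
                problems.append(f"line {line_no}: empty {name}")
        try:
            # Assumes ISO 8601 timestamps such as 2024-01-05T09:30:00
            datetime.fromisoformat(row["Timestamp"] or "")
        except ValueError:
            problems.append(f"line {line_no}: unparseable Timestamp")
    return problems
```

Extra columns such as an order amount pass through untouched, which matches the idea that additional attributes are optional.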


Enrichment with Python

The enrichment layer transforms raw datasets into analysis-ready data. Enrichments can include:

  • Built-in transformation operators
  • Custom Python scripts for advanced logic
  • Business rule calculations
  • Data quality corrections

Python integration allows you to:

  • Write custom transformation logic
  • Leverage Python data science libraries
  • Create reusable transformation scripts
  • Handle complex data manipulation scenarios

Enrichments run in the background and cache their results for fast access during analysis.
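To give a feel for the kind of logic a custom enrichment might contain, here is a sketch in plain Python that derives a case-level duration attribute from event timestamps. It does not use mindzieStudio's enrichment interface, which this page does not describe; the function name, the `Case Duration (h)` attribute, and the column labels are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def add_case_duration(events: list[dict]) -> list[dict]:
    """Return copies of the events with a case-level 'Case Duration (h)' attribute."""
    times_by_case = defaultdict(list)
    for event in events:
        times_by_case[event["Case ID"]].append(datetime.fromisoformat(event["Timestamp"]))

    # Duration of a case = time between its first and last event, in hours
    duration_hours = {
        case_id: (max(times) - min(times)).total_seconds() / 3600
        for case_id, times in times_by_case.items()
    }
    return [{**event, "Case Duration (h)": duration_hours[event["Case ID"]]} for event in events]


events = [
    {"Case ID": "A1", "Activity": "Create Order", "Timestamp": "2024-01-05T09:00:00"},
    {"Case ID": "A1", "Activity": "Ship Order", "Timestamp": "2024-01-06T15:00:00"},
    {"Case ID": "B2", "Activity": "Create Order", "Timestamp": "2024-01-05T10:00:00"},
]
enriched = add_case_duration(events)
print(enriched[0]["Case Duration (h)"])  # 30.0
```

The function returns new dictionaries rather than modifying its input, which keeps the raw dataset intact and makes the script easier to reuse.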


Investigation and Analysis

The Investigation layer is where analysis happens. Within investigations, you:

  • Apply investigation filters to focus on specific data subsets
  • Create analysis notebooks with ordered blocks
  • Generate insights through calculators
  • Build visualizations

Analysis results are cached and can be refreshed when source data updates.
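Conceptually, a filter narrows the event log to the cases of interest before any calculation runs. The sketch below shows one common kind of case-level filter: keep every event of each case that contains a given activity. It is an illustration in plain Python under assumed column names, not a description of how mindzieStudio implements its filters.

```python
def cases_with_activity(events: list[dict], activity: str) -> list[dict]:
    """Keep every event of each case that contains the given activity at least once."""
    matching_cases = {e["Case ID"] for e in events if e["Activity"] == activity}
    return [e for e in events if e["Case ID"] in matching_cases]


log = [
    {"Case ID": "A1", "Activity": "Create Order"},
    {"Case ID": "A1", "Activity": "Cancel Order"},
    {"Case ID": "B2", "Activity": "Create Order"},
]
cancelled = cases_with_activity(log, "Cancel Order")
print([e["Activity"] for e in cancelled])  # ['Create Order', 'Cancel Order']
```

Note that the filter keeps whole cases, not just the matching events, so later blocks still see each selected case from start to finish.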


Output and Integration

mindzieStudio provides multiple ways to export data and integrate with external systems:

Actions

Actions are automated workflows that execute based on schedules or triggers. Actions can:

  • Run Python scripts for custom processing
  • Call external HTTP APIs
  • Export data to external systems
  • Chain multiple steps together
  • Handle errors with fallback actions

Actions enable operational integration, where process insights trigger real-world responses.
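The idea of chaining steps with a fallback can be sketched in a few lines of Python. This is a conceptual model only: `run_action`, the step functions, and the context dictionary are hypothetical names invented for the example and are not part of the mindzieStudio Actions feature.

```python
from typing import Callable

Step = Callable[[dict], dict]


def run_action(steps: list[Step], fallback: Step, context: dict) -> dict:
    """Run steps in order, passing each result on; call fallback if a step raises."""
    current = dict(context)
    for step in steps:
        try:
            current = step(current)
        except Exception as error:  # broad on purpose: any failure triggers the fallback
            return fallback({**current, "error": str(error), "failed_step": step.__name__})
    return current


def export_rows(ctx: dict) -> dict:
    return {**ctx, "exported": len(ctx["rows"])}


def notify_api(ctx: dict) -> dict:
    raise ConnectionError("endpoint unreachable")  # simulated failure


def log_failure(ctx: dict) -> dict:
    return {**ctx, "status": "fallback"}


result = run_action([export_rows, notify_api], log_failure, {"rows": [1, 2, 3]})
print(result["status"], result["failed_step"])  # fallback notify_api
```

Passing the accumulated context to the fallback means the error handler can see which steps already succeeded, for example that three rows were exported before the notification failed.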

API Export

External systems can query mindzieStudio via API to:

  • Retrieve analysis results programmatically
  • Pull dashboard data into other applications
  • Integrate process metrics into reporting systems
  • Power operational dashboards in external tools

CSV Export

For simple data export, you can download analysis results as CSV files. This is useful for:

  • Sharing data with stakeholders who don't have mindzieStudio access
  • Loading data into spreadsheet tools
  • Creating backup copies of analysis results
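An exported CSV can also be read by a short script instead of a spreadsheet. The sketch below counts rows per value of one column using only the standard library. The column name `Activity` and the sample content are assumptions; the actual columns depend on which analysis result you export.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_column(stream: TextIO, column: str) -> Counter:
    """Count how many rows of a CSV stream share each value of the given column."""
    return Counter(row[column] for row in csv.DictReader(stream))


# An in-memory stand-in for a downloaded file; with a real export you would use
# open("export.csv", newline="", encoding="utf-8") instead (file name hypothetical).
sample = io.StringIO("Case ID,Activity\nA1,Create Order\nA1,Ship Order\nB2,Create Order\n")
counts = count_by_column(sample, "Activity")
print(counts["Create Order"])  # 2
```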

Data Flow Summary

The complete data flow through mindzieStudio:

  1. Input: Data enters via Manual Upload, Data Designer, ETL tools, the developer API, or MuleSoft
  2. Gateway: The API authenticates the request, validates the data, and routes it to the appropriate component
  3. Storage: Data is stored as compressed Datasets
  4. Transformation: Enrichments (with optional Python) prepare the data
  5. Analysis: Investigations and notebooks generate insights
  6. Output: Results flow to Actions, API consumers, or CSV exports

This architecture supports interactive analysis as well as automated operational workflows, so mindzieStudio suits both ad-hoc exploration and production process monitoring.