Datasets

Data Management API

Upload, manage, and update datasets with support for multiple file formats including CSV, ZIP packages, and binary formats.

Features

Dataset Creation

Create new datasets from CSV, ZIP packages, or binary files.

Data Import

Import data with column mapping for process mining analysis.

Dataset Updates

Update existing datasets with new data while preserving configurations.

File Formats

Supported file formats and data structures.

Available Endpoints

Connectivity Testing

  • GET /api/{tenantId}/{projectId}/dataset/unauthorized-ping - Public connectivity test (no auth required)
  • GET /api/{tenantId}/{projectId}/dataset/ping - Authenticated connectivity test

Dataset Operations

  • GET /api/{tenantId}/{projectId}/dataset - List all datasets in a project

Dataset Creation

  • POST /api/{tenantId}/{projectId}/dataset/csv - Create dataset from CSV file
  • POST /api/{tenantId}/{projectId}/dataset/package - Create dataset from ZIP package
  • POST /api/{tenantId}/{projectId}/dataset/binary - Create dataset from binary file

Dataset Updates

  • PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/csv - Update dataset from CSV
  • PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/package - Update dataset from ZIP package
  • PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/binary - Update dataset from binary file
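
The path templates above all share the prefix /api/{tenantId}/{projectId}/dataset, so a small helper can assemble them. The following is a minimal sketch, not an official client: BASE_URL, the dataset_url helper, and the placeholder tenant and project IDs are assumptions made for illustration. Only the path segments come from the endpoint list.

```python
from urllib.parse import quote

BASE_URL = "https://your-server.example.com"  # hypothetical host, not from the docs


def dataset_url(tenant_id: str, project_id: str, *parts: str) -> str:
    """Build a Dataset API URL from the path templates listed above."""
    segments = ["api", tenant_id, project_id, "dataset", *parts]
    # Percent-encode each segment so unusual IDs cannot break the path.
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


# (method, URL) pairs for a few of the documented endpoints.
tenant, project = "my-tenant", "my-project"  # placeholder IDs
dataset_id = "550e8400-e29b-41d4-a716-446655440000"
list_datasets = ("GET", dataset_url(tenant, project))
create_csv = ("POST", dataset_url(tenant, project, "csv"))
update_csv = ("PUT", dataset_url(tenant, project, dataset_id, "csv"))
```

Creation endpoints take the format name directly after dataset, while update endpoints place the dataset ID before it.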

Supported File Formats

mindzieStudio supports multiple data formats for process mining:

CSV Files

Comma-separated values with flexible column mapping.

  • Event logs with case ID, activity, timestamp
  • Custom culture settings for date/number parsing
  • UTF-8 encoding support
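
As an illustration of the CSV shape described above, the sketch below writes a tiny event log with case ID, activity, and timestamp columns and encodes it as UTF-8. The column names follow the response example in this document. The sample rows, the ISO 8601 timestamp format, and the build_event_log_csv helper are assumptions for illustration only.

```python
import csv
import io

# Illustrative rows only; real data would come from an ERP, CRM, or BPM export.
ROWS = [
    {"CaseID": "PO-1001", "Activity": "Create Purchase Order", "Timestamp": "2024-01-15T10:30:00Z"},
    {"CaseID": "PO-1001", "Activity": "Approve Purchase Order", "Timestamp": "2024-01-15T14:45:00Z"},
    {"CaseID": "PO-1002", "Activity": "Create Purchase Order", "Timestamp": "2024-01-16T09:00:00Z"},
]


def build_event_log_csv(rows) -> bytes:
    """Serialize event rows to CSV text and return it as UTF-8 bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["CaseID", "Activity", "Timestamp"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


payload = build_event_log_csv(ROWS)
```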

ZIP Packages

Compressed packages containing multiple related files.

  • Complex datasets with multiple tables
  • Metadata and configuration files
  • Packaged according to mindzie dataset packaging standards

Binary Files

Native binary format for efficient data transfer.

  • Pre-processed event log data
  • Optimized for large datasets
  • Column mappings required

Dataset Structure

Process mining analysis expects event data with the following columns:

Required Columns

Column      Description
Case ID     Unique identifier for each process instance
Activity    Name of the activity or event
Timestamp   When the activity occurred

Optional Columns

Column          Description
Resource        User or system that performed the activity
Start Time      Activity start time (for duration calculations)
Expected Order  Sequence ordering column
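
Before uploading, it can help to check locally that a file actually contains the three required columns. The sketch below reads only the CSV header and reports which required roles are missing. The missing_required_columns helper and the role-to-column mapping are hypothetical, with column names borrowed from the response example in this document. This check is not part of the API.

```python
import csv
import io

# Required role -> column name expected in the file (names are illustrative).
REQUIRED = {"Case ID": "CaseID", "Activity": "Activity", "Timestamp": "Timestamp"}


def missing_required_columns(csv_text: str, mapping=REQUIRED) -> list:
    """Return the required roles whose mapped column is absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [role for role, column in mapping.items() if column not in header]


complete = "CaseID,Activity,Timestamp,Resource\nPO-1001,Create Purchase Order,2024-01-15T10:30:00Z,alice\n"
incomplete = "CaseID,Activity\nPO-1001,Create Purchase Order\n"
```

An empty result means every required role has a matching column. Optional columns such as Resource are ignored by this check.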

Response Structure

Dataset metadata is returned as a JSON object with the following fields:

{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "datasetName": "Purchase Order Process",
  "datasetDescription": "Event log from SAP procurement",
  "projectId": "660e8400-e29b-41d4-a716-446655440000",
  "caseIdColumnName": "CaseID",
  "activityColumnName": "Activity",
  "timeColumnName": "Timestamp",
  "resourceColumnName": "Resource",
  "beginTimeColumnName": "StartTime",
  "useDateOnlySorting": false,
  "useOnlyEventColumns": false,
  "dateCreated": "2024-01-15T10:30:00Z",
  "dateModified": "2024-01-15T14:45:00Z",
  "createdBy": "user@example.com",
  "modifiedBy": "user@example.com"
}
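
A client will typically want the column names and timestamps out of this object. The sketch below parses the sample above and collects the fields ending in ColumnName into a role-to-column mapping. The pairing of roles with fields (for example Start Time with beginTimeColumnName) is a reading of the field names rather than something the documentation states, and the helper names are hypothetical.

```python
import json
from datetime import datetime

# Sample body copied from the response structure above.
SAMPLE = json.loads("""
{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "datasetName": "Purchase Order Process",
  "datasetDescription": "Event log from SAP procurement",
  "projectId": "660e8400-e29b-41d4-a716-446655440000",
  "caseIdColumnName": "CaseID",
  "activityColumnName": "Activity",
  "timeColumnName": "Timestamp",
  "resourceColumnName": "Resource",
  "beginTimeColumnName": "StartTime",
  "useDateOnlySorting": false,
  "useOnlyEventColumns": false,
  "dateCreated": "2024-01-15T10:30:00Z",
  "dateModified": "2024-01-15T14:45:00Z",
  "createdBy": "user@example.com",
  "modifiedBy": "user@example.com"
}
""")

# Assumed pairing of column roles with response fields.
COLUMN_FIELDS = {
    "Case ID": "caseIdColumnName",
    "Activity": "activityColumnName",
    "Timestamp": "timeColumnName",
    "Resource": "resourceColumnName",
    "Start Time": "beginTimeColumnName",
}


def column_mapping(dataset: dict) -> dict:
    """Collect the configured column name for each role that is set."""
    return {role: dataset[f] for role, f in COLUMN_FIELDS.items() if dataset.get(f)}


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-01-15T10:30:00Z."""
    # Replace the trailing Z so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


mapping = column_mapping(SAMPLE)
modified = parse_timestamp(SAMPLE["dateModified"])
```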

Upload Response Structure

Dataset creation and update endpoints return import statistics:

{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "caseCount": 5200,
  "eventCount": 150000,
  "invalidValueCount": 12,
  "skippedRowsCount": 3,
  "errors": [],
  "rowIssues": [],
  "statusCode": 200
}
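
The import statistics can be condensed into a quick health check after each upload. The sketch below is one possible interpretation, not documented behavior: it treats statusCode 200 with an empty errors list as success, and flags the import for review when any values were invalid or rows were skipped. The summarize_import helper and its output keys are hypothetical.

```python
import json

# Sample body copied from the upload response structure above.
UPLOAD_RESPONSE = json.loads("""
{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "caseCount": 5200,
  "eventCount": 150000,
  "invalidValueCount": 12,
  "skippedRowsCount": 3,
  "errors": [],
  "rowIssues": [],
  "statusCode": 200
}
""")


def summarize_import(stats: dict) -> dict:
    """Condense import statistics into a few flags (assumed semantics)."""
    case_count = stats["caseCount"]
    return {
        # Assumption: success means HTTP-style 200 and no reported errors.
        "ok": stats["statusCode"] == 200 and not stats["errors"],
        "eventsPerCase": stats["eventCount"] / case_count if case_count else 0.0,
        # Any invalid values, skipped rows, or row issues deserve a look.
        "needsReview": (
            stats["invalidValueCount"] > 0
            or stats["skippedRowsCount"] > 0
            or bool(stats["rowIssues"])
        ),
    }


summary = summarize_import(UPLOAD_RESPONSE)
```

For the sample above, the import counts as successful but is flagged for review because 12 values were invalid and 3 rows were skipped.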

Common Use Cases

  • Event Log Import: Upload process event data from ERP, CRM, or BPM systems
  • Data Refresh: Update existing datasets with new data while preserving analysis configurations
  • Multi-Format Support: Import data from CSV exports or proprietary binary formats
  • Batch Processing: Upload large datasets (up to 1 GB) with progress tracking

File Size Limits

All upload endpoints accept files up to 1 GB. For larger datasets, consider:

  • Breaking data into multiple uploads
  • Using the binary format for efficiency
  • Contacting support for enterprise data solutions
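
The first option, breaking data into multiple uploads, can be sketched for CSV as follows. This is a simplified illustration under several assumptions: the limit is read as 10^9 bytes (the documentation says only "1GB"), each line is one record (no quoted fields with embedded line breaks), and the header is repeated in every chunk. Whether events of one case must stay in the same upload is not stated here, so this sketch does not enforce it. The split_csv_lines helper is hypothetical.

```python
MAX_UPLOAD_BYTES = 10**9  # assumption: decimal reading of "1GB", the stricter one


def split_csv_lines(lines, max_bytes=MAX_UPLOAD_BYTES):
    """Yield lists of CSV lines (header repeated) whose UTF-8 size fits max_bytes."""
    header, *body = lines
    header_size = len(header.encode("utf-8"))
    chunk, size = [header], header_size
    for line in body:
        line_size = len(line.encode("utf-8"))
        # Start a new chunk when adding this line would exceed the limit.
        if len(chunk) > 1 and size + line_size > max_bytes:
            yield chunk
            chunk, size = [header], header_size
        chunk.append(line)
        size += line_size
    if len(chunk) > 1:
        yield chunk


# Tiny demonstration with an artificially small limit of 12 bytes.
demo = ["a,b\n", "1,2\n", "3,4\n", "5,6\n"]
chunks = list(split_csv_lines(demo, max_bytes=12))
```

Each yielded chunk can be joined and uploaded as its own file. A single line larger than the limit still produces an oversized chunk, which this sketch does not handle.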

Authentication

All Dataset API endpoints (except unauthorized-ping) require valid authentication with appropriate permissions for the target project and tenant.
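
The two ping endpoints make it easy to separate network problems from authentication problems. The sketch below builds, but does not send, a request for either endpoint. The host, the ping_request helper, and above all the bearer-token Authorization header are assumptions: this page says only that authentication is required, not which scheme is used.

```python
from urllib.request import Request

BASE_URL = "https://your-server.example.com"  # hypothetical host, not from the docs


def ping_request(tenant_id: str, project_id: str, token=None) -> Request:
    """Build a GET request for the public ping, or the authenticated one if a token is given."""
    name = "ping" if token else "unauthorized-ping"
    url = f"{BASE_URL}/api/{tenant_id}/{project_id}/dataset/{name}"
    request = Request(url, method="GET")
    if token:
        # Assumption: bearer-token scheme; check the authentication docs for the real one.
        request.add_header("Authorization", f"Bearer {token}")
    return request


public = ping_request("my-tenant", "my-project")
authed = ping_request("my-tenant", "my-project", token="example-token")
```

Passing either request to urllib.request.urlopen would send it. If the public ping succeeds and the authenticated ping fails, the credentials or permissions are the likely cause rather than connectivity.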

Getting Started

Begin with Dataset Creation to learn how to create datasets, then explore Data Import for column mapping details.