Datasets
Data Management API
Upload, manage, and update datasets with support for multiple file formats including CSV, ZIP packages, and binary formats.
Features
Dataset Creation
Create new datasets from CSV, ZIP packages, or binary files.
Data Import
Import data with column mapping for process mining analysis.
Dataset Updates
Update existing datasets with new data while preserving configurations.
File Formats
Supported file formats and data structures.
Available Endpoints
Connectivity Testing
- GET /api/{tenantId}/{projectId}/dataset/unauthorized-ping - Public connectivity test (no auth required)
- GET /api/{tenantId}/{projectId}/dataset/ping - Authenticated connectivity test
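As a rough illustration of the two connectivity endpoints, the sketch below builds a GET request for either one without sending it. The `ping_request` helper and the `base_url` and `token` parameters are hypothetical, and the bearer-token header is an assumption: this page does not say which authentication scheme the API uses.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def ping_request(base_url: str, tenant_id: str, project_id: str,
                 token: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for one of the ping endpoints."""
    # Without a token, target the public endpoint; with one, the authenticated one.
    leaf = "ping" if token else "unauthorized-ping"
    url = (f"{base_url.rstrip('/')}/api/{quote(tenant_id, safe='')}/"
           f"{quote(project_id, safe='')}/dataset/{leaf}")
    request = Request(url, method="GET")
    if token:
        # ASSUMPTION: bearer-token auth; the documentation does not name the scheme.
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing the returned object to `urllib.request.urlopen` would send the request. Trying the public endpoint first helps separate network problems from credential problems.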
Dataset Operations
- GET /api/{tenantId}/{projectId}/dataset - List all datasets in a project
Dataset Creation
- POST /api/{tenantId}/{projectId}/dataset/csv - Create dataset from CSV file
- POST /api/{tenantId}/{projectId}/dataset/package - Create dataset from ZIP package
- POST /api/{tenantId}/{projectId}/dataset/binary - Create dataset from binary file
Dataset Updates
- PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/csv - Update dataset from CSV
- PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/package - Update dataset from ZIP package
- PUT /api/{tenantId}/{projectId}/dataset/{datasetId}/binary - Update dataset from binary file
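The creation and update endpoints follow one pattern: POST to a format-specific path to create, PUT to the same path under a dataset ID to update. A minimal sketch of that mapping follows; the `dataset_endpoint` helper is hypothetical and only assembles the method and path listed above.

```python
from typing import Optional, Tuple

# Path suffixes taken from the endpoint list above.
FORMATS = ("csv", "package", "binary")


def dataset_endpoint(tenant_id: str, project_id: str, file_format: str,
                     dataset_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP method, path) for creating or updating a dataset."""
    if file_format not in FORMATS:
        raise ValueError(f"unsupported format: {file_format!r}")
    base = f"/api/{tenant_id}/{project_id}/dataset"
    if dataset_id is None:
        # No dataset ID: create a new dataset with POST.
        return "POST", f"{base}/{file_format}"
    # Existing dataset ID: update that dataset with PUT.
    return "PUT", f"{base}/{dataset_id}/{file_format}"
```

For example, `dataset_endpoint("t1", "p1", "csv")` yields the CSV creation endpoint, while passing `dataset_id` switches to the corresponding update endpoint.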
Supported File Formats
mindzieStudio supports multiple data formats for process mining:
CSV Files
Comma-separated values with flexible column mapping.
- Event logs with case ID, activity, timestamp
- Custom culture settings for date/number parsing
- UTF-8 encoding support
ZIP Packages
Compressed packages containing multiple related files.
- Complex datasets with multiple tables
- Metadata and configuration files
- mindzie dataset packaging standards
Binary Files
Native binary format for efficient data transfer.
- Pre-processed event log data
- Optimized for large datasets
- Column mappings required
Dataset Structure
Understanding the expected data structure for process mining analysis:
Required Columns
| Column | Description |
|---|---|
| Case ID | Unique identifier for each process instance |
| Activity | Name of the activity or event |
| Timestamp | When the activity occurred |
Optional Columns
| Column | Description |
|---|---|
| Resource | User or system that performed the activity |
| Start Time | Activity start time (for duration calculations) |
| Expected Order | Sequence ordering column |
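Before uploading a CSV event log, it can be useful to confirm locally that the three required columns are present and populated. The sketch below is one way to do that with the standard library. The `check_event_log` helper is hypothetical, and the column names are supplied by the caller because actual header names vary by export.

```python
import csv
import io
from typing import Dict, List


def check_event_log(csv_text: str, case_col: str, activity_col: str,
                    time_col: str) -> Dict[str, List]:
    """Report missing required columns and rows with blank required values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    required = [case_col, activity_col, time_col]
    missing = [name for name in required if name not in header]
    blank_rows: List[int] = []
    if not missing:
        # Data rows are numbered from 2 because row 1 is the header.
        for row_number, row in enumerate(reader, start=2):
            # Short rows yield None for absent fields, so treat None as blank.
            if any(not (row[name] or "").strip() for name in required):
                blank_rows.append(row_number)
    return {"missing_columns": missing, "blank_rows": blank_rows}
```

This is a local sanity check only. It does not parse timestamps and does not reproduce whatever validation the server performs during import.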
Response Structure
{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "datasetName": "Purchase Order Process",
  "datasetDescription": "Event log from SAP procurement",
  "projectId": "660e8400-e29b-41d4-a716-446655440000",
  "caseIdColumnName": "CaseID",
  "activityColumnName": "Activity",
  "timeColumnName": "Timestamp",
  "resourceColumnName": "Resource",
  "beginTimeColumnName": "StartTime",
  "useDateOnlySorting": false,
  "useOnlyEventColumns": false,
  "dateCreated": "2024-01-15T10:30:00Z",
  "dateModified": "2024-01-15T14:45:00Z",
  "createdBy": "user@example.com",
  "modifiedBy": "user@example.com"
}
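A client will usually want a typed view of this payload. The sketch below maps a subset of the fields onto a dataclass. The `DatasetInfo` class and `parse_dataset` function are hypothetical names, and treating every listed field as always present is an assumption based on the sample above.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DatasetInfo:
    dataset_id: str
    dataset_name: str
    case_id_column: str
    activity_column: str
    time_column: str
    date_modified: datetime


def parse_dataset(payload: str) -> DatasetInfo:
    """Parse a dataset JSON document into a DatasetInfo instance."""
    data = json.loads(payload)
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset.
    modified = datetime.fromisoformat(data["dateModified"].replace("Z", "+00:00"))
    return DatasetInfo(
        dataset_id=data["datasetId"],
        dataset_name=data["datasetName"],
        case_id_column=data["caseIdColumnName"],
        activity_column=data["activityColumnName"],
        time_column=data["timeColumnName"],
        date_modified=modified,
    )
```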
Upload Response Structure
Dataset creation and update endpoints return import statistics:
{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "caseCount": 5200,
  "eventCount": 150000,
  "invalidValueCount": 12,
  "skippedRowsCount": 3,
  "errors": [],
  "rowIssues": [],
  "statusCode": 200
}
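An import can return a success status and still report invalid values or skipped rows, as in the sample above, so the statistics are worth inspecting after every upload. A minimal sketch follows. The `import_warnings` helper is hypothetical, and reading `statusCode` 200 as success is an assumption drawn from the sample rather than from a documented contract.

```python
from typing import Any, Dict, List


def import_warnings(response: Dict[str, Any]) -> List[str]:
    """Collect human-readable warnings from an upload response."""
    warnings: List[str] = []
    # ASSUMPTION: 200 indicates success, as in the sample response.
    if response.get("statusCode") != 200:
        warnings.append(f"unexpected status code: {response.get('statusCode')}")
    if response.get("errors"):
        warnings.append(f"{len(response['errors'])} errors reported")
    invalid = response.get("invalidValueCount", 0)
    if invalid > 0:
        warnings.append(f"{invalid} invalid values")
    skipped = response.get("skippedRowsCount", 0)
    if skipped > 0:
        warnings.append(f"{skipped} skipped rows")
    return warnings
```

Applied to the sample response, this reports the 12 invalid values and the 3 skipped rows even though the `errors` list is empty.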
Common Use Cases
- Event Log Import: Upload process event data from ERP, CRM, or BPM systems
- Data Refresh: Update existing datasets with new data while preserving analysis configurations
- Multi-Format Support: Import data from CSV exports, ZIP packages, or the native binary format
- Batch Processing: Upload large datasets up to 1GB with progress tracking
File Size Limits
All upload endpoints support files up to 1GB in size. For larger datasets, consider:
- Breaking data into multiple uploads
- Using the binary format for efficiency
- Contacting support for enterprise data solutions
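Checking the file size locally avoids starting an upload that will be rejected. The sketch below shows one such check. The helper names are hypothetical, and reading "1GB" as 10**9 bytes is an assumption, since this page does not say whether the limit is decimal or binary.

```python
import math
import os

# ASSUMPTION: "1GB" is read as 10**9 bytes; the documentation does not say
# whether the limit is decimal (10**9) or binary (2**30).
MAX_UPLOAD_BYTES = 10**9


def uploads_needed(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return the minimum number of uploads needed to stay within the limit."""
    if size_bytes < 0 or limit <= 0:
        raise ValueError("size must be non-negative and limit positive")
    # Even an empty file takes one upload.
    return max(1, math.ceil(size_bytes / limit))


def fits_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at `path` is no larger than the limit."""
    return os.path.getsize(path) <= limit
```

The part count is only a lower bound based on raw bytes. How a log should actually be divided, for example whether one case may span several uploads, is not covered on this page.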
Authentication
All Dataset API endpoints (except unauthorized-ping) require valid authentication with appropriate permissions for the target project and tenant.
Getting Started
Begin with Dataset Creation to learn how to create datasets, then explore Data Import for column mapping details.