Dataset Creation

Create New Datasets

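Create datasets from CSV files, ZIP packages, or binary files. CSV and binary uploads require column mappings (case ID, activity name, and activity time) for process mining analysis; ZIP package uploads take only a dataset name and an optional culture setting.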

Connectivity Testing

Unauthorized Ping

GET /api/{tenantId}/{projectId}/dataset/unauthorized-ping

Test endpoint that does not require authentication.

Response

Ping Successful

Authenticated Ping

GET /api/{tenantId}/{projectId}/dataset/ping

Authenticated ping endpoint to verify API access.

Response (200 OK)

Ping Successful (tenant id: {tenantId})
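
The two ping endpoints can be exercised from a script before attempting an upload. The following is a minimal sketch using only the Python standard library. It assumes the placeholder host used in the examples in this document and bearer-token authentication as shown in the cURL examples; the helper names ping_url and ping are illustrative and not part of the API.

```python
import urllib.request
from typing import Optional

# Placeholder host, matching the examples elsewhere in this document.
BASE_URL = "https://your-mindzie-instance.com"


def ping_url(tenant_id: str, project_id: str, authenticated: bool) -> str:
    """Return the URL of the authenticated or unauthenticated ping endpoint."""
    endpoint = "ping" if authenticated else "unauthorized-ping"
    return f"{BASE_URL}/api/{tenant_id}/{project_id}/dataset/{endpoint}"


def ping(tenant_id: str, project_id: str, token: Optional[str] = None,
         timeout: float = 10.0) -> str:
    """Call a ping endpoint and return the response body as text.

    Without a token the unauthenticated endpoint is called. With a token the
    authenticated endpoint is called using a bearer Authorization header
    (an assumption based on the cURL examples in this document).
    """
    url = ping_url(tenant_id, project_id, authenticated=token is not None)
    request = urllib.request.Request(url)
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling ping without a token first and then with one helps separate network or routing problems from credential problems: if only the second call fails, the token is the likely cause.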

List All Datasets

GET /api/{tenantId}/{projectId}/dataset

Retrieves all datasets within the specified project.

Response (200 OK)

{
  "datasets": [
    {
      "datasetId": "550e8400-e29b-41d4-a716-446655440000",
      "datasetName": "Purchase Order Process",
      "datasetDescription": "Event log from SAP",
      "projectId": "660e8400-e29b-41d4-a716-446655440000",
      "caseIdColumnName": "CaseID",
      "activityColumnName": "Activity",
      "timeColumnName": "Timestamp",
      "resourceColumnName": "Resource",
      "beginTimeColumnName": null,
      "expectedOrderColumnName": null,
      "useDateOnlySorting": false,
      "useOnlyEventColumns": false,
      "dateCreated": "2024-01-15T10:30:00Z",
      "dateModified": "2024-01-15T14:45:00Z",
      "createdBy": "user@example.com",
      "modifiedBy": "user@example.com"
    }
  ]
}
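
As an illustration of working with this response, the sketch below prints one line per dataset showing only the column mappings that are set. The field names come from the response above; the role labels and the function name summarize_datasets are illustrative choices, not part of the API.

```python
# Response keys for the column mappings, paired with short role labels.
MAPPING_KEYS = {
    "case": "caseIdColumnName",
    "activity": "activityColumnName",
    "time": "timeColumnName",
    "resource": "resourceColumnName",
    "begin": "beginTimeColumnName",
    "order": "expectedOrderColumnName",
}


def summarize_datasets(payload: dict) -> list:
    """Return one summary line per dataset, listing only the mapped columns."""
    lines = []
    for ds in payload.get("datasets", []):
        mapped = [
            f"{role}={ds[key]}"
            for role, key in MAPPING_KEYS.items()
            if ds.get(key) is not None  # null mappings are skipped
        ]
        lines.append(f"{ds['datasetName']} ({ds['datasetId']}): {', '.join(mapped)}")
    return lines


# A trimmed copy of the sample response shown above.
sample = {
    "datasets": [
        {
            "datasetId": "550e8400-e29b-41d4-a716-446655440000",
            "datasetName": "Purchase Order Process",
            "caseIdColumnName": "CaseID",
            "activityColumnName": "Activity",
            "timeColumnName": "Timestamp",
            "resourceColumnName": "Resource",
            "beginTimeColumnName": None,
            "expectedOrderColumnName": None,
        }
    ]
}

for line in summarize_datasets(sample):
    print(line)
```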

Create Dataset from CSV

POST /api/{tenantId}/{projectId}/dataset/csv

Creates a new dataset from a CSV file upload with column mappings.

Request (multipart/form-data)

Field               Type    Required  Description
file                file    Yes       CSV file to upload (max 1GB)
datasetName         string  Yes       Name for the new dataset
caseIdColumn        string  Yes       Column name containing case IDs
activityNameColumn  string  Yes       Column name containing activity names
activityTimeColumn  string  Yes       Column name containing timestamps
resourceColumn      string  No        Column name containing resource/performer
startTimeColumn     string  No        Column name for activity start times
cultureInfo         string  No        Culture for parsing (default: "en-US")

Response (200 OK)

{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "caseCount": 5200,
  "eventCount": 150000,
  "invalidValueCount": 0,
  "skippedRowsCount": 0,
  "errors": [],
  "rowIssues": [],
  "statusCode": 200
}

Error Response (422 Unprocessable Entity)

{
  "errors": ["Column 'CaseID' not found in CSV file"],
  "statusCode": 422
}
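
A client can surface the messages in the errors array rather than a raw response body. The sketch below assumes that error responses carry the errors array shown above and that any non-2xx status indicates failure; the class DatasetImportError and the function parse_import_response are hypothetical names introduced for illustration.

```python
import json


class DatasetImportError(Exception):
    """Raised when a dataset creation request does not succeed."""

    def __init__(self, status_code: int, errors: list):
        super().__init__(f"Import failed ({status_code}): " + "; ".join(errors))
        self.status_code = status_code
        self.errors = errors


def parse_import_response(status_code: int, body: str) -> dict:
    """Return the parsed body on success, or raise DatasetImportError."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}  # non-JSON body, e.g. a proxy error page
    if not isinstance(payload, dict):
        payload = {}

    if 200 <= status_code < 300:
        return payload

    # Prefer the API's own messages; fall back to the raw body text.
    errors = payload.get("errors") or [body.strip() or "no error details returned"]
    raise DatasetImportError(status_code, errors)
```

With the 422 example above, the raised exception's errors attribute would contain the single message about the missing 'CaseID' column.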

Create Dataset from ZIP Package

POST /api/{tenantId}/{projectId}/dataset/package

Creates a new dataset from a ZIP package containing data files.

Request (multipart/form-data)

Field        Type    Required  Description
file         file    Yes       ZIP package file (max 1GB)
datasetName  string  Yes       Name for the new dataset
cultureInfo  string  No        Culture for parsing (default: "en-US")

Response (200 OK)

{
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "caseCount": 5200,
  "eventCount": 150000,
  "invalidValueCount": 0,
  "skippedRowsCount": 0,
  "errors": [],
  "rowIssues": [],
  "statusCode": 200
}

Create Dataset from Binary

POST /api/{tenantId}/{projectId}/dataset/binary

Creates a new dataset from a binary format file with column mappings.

Request (multipart/form-data)

Field               Type    Required  Description
file                file    Yes       Binary file to upload (max 1GB)
datasetName         string  Yes       Name for the new dataset
caseIdColumn        string  Yes       Column name containing case IDs
activityNameColumn  string  Yes       Column name containing activity names
activityTimeColumn  string  Yes       Column name containing timestamps
resourceColumn      string  No        Column name containing resource/performer
startTimeColumn     string  No        Column name for activity start times

Response (200 OK)

Same structure as CSV creation response.
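
A binary upload might look like the following sketch. It uses the third-party requests library, sends only the fields listed in the table above (the binary endpoint lists no cultureInfo field), and assumes bearer-token authentication as in the cURL examples. The content type application/octet-stream and the function names are assumptions; the expected content type is not documented here.

```python
import os

# Placeholder host, matching the examples elsewhere in this document.
BASE_URL = "https://your-mindzie-instance.com"


def binary_upload_fields(dataset_name, case_id_col, activity_col, time_col,
                         resource_col=None, start_time_col=None):
    """Build the non-file form fields for the binary upload endpoint."""
    fields = {
        "datasetName": dataset_name,
        "caseIdColumn": case_id_col,
        "activityNameColumn": activity_col,
        "activityTimeColumn": time_col,
    }
    # Optional fields are sent only when a column name is supplied.
    if resource_col:
        fields["resourceColumn"] = resource_col
    if start_time_col:
        fields["startTimeColumn"] = start_time_col
    return fields


def create_from_binary(tenant_id, project_id, token, file_path, **mapping):
    """Upload a binary file and return the parsed JSON response."""
    import requests  # third-party; imported here so the helper above has no dependencies

    url = f"{BASE_URL}/api/{tenant_id}/{project_id}/dataset/binary"
    with open(file_path, "rb") as f:
        # The content type is an assumption, not a documented requirement.
        files = {"file": (os.path.basename(file_path), f, "application/octet-stream")}
        response = requests.post(
            url,
            headers={"Authorization": f"Bearer {token}"},
            files=files,
            data=binary_upload_fields(**mapping),
            timeout=300,
        )
    response.raise_for_status()
    return response.json()
```

The keyword arguments passed to create_from_binary are forwarded to binary_upload_fields, so a call needs at least dataset_name, case_id_col, activity_col, and time_col.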

Implementation Examples

cURL - CSV Upload

curl -X POST "https://your-mindzie-instance.com/api/12345678-1234-1234-1234-123456789012/87654321-4321-4321-4321-210987654321/dataset/csv" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -F "file=@event_log.csv" \
  -F "datasetName=Purchase Orders" \
  -F "caseIdColumn=CaseID" \
  -F "activityNameColumn=Activity" \
  -F "activityTimeColumn=Timestamp" \
  -F "resourceColumn=User" \
  -F "cultureInfo=en-US"

cURL - ZIP Package Upload

curl -X POST "https://your-mindzie-instance.com/api/12345678-1234-1234-1234-123456789012/87654321-4321-4321-4321-210987654321/dataset/package" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -F "file=@data_package.zip" \
  -F "datasetName=SAP Export" \
  -F "cultureInfo=en-US"

Python

import requests

TENANT_ID = '12345678-1234-1234-1234-123456789012'
PROJECT_ID = '87654321-4321-4321-4321-210987654321'
BASE_URL = 'https://your-mindzie-instance.com'

class DatasetUploader:
    def __init__(self, token):
        self.headers = {'Authorization': f'Bearer {token}'}

    def create_from_csv(self, file_path, dataset_name, case_id_col, activity_col, time_col,
                        resource_col=None, start_time_col=None, culture='en-US'):
        """Create dataset from CSV file."""
        url = f'{BASE_URL}/api/{TENANT_ID}/{PROJECT_ID}/dataset/csv'

        with open(file_path, 'rb') as f:
            files = {'file': (file_path, f, 'text/csv')}
            data = {
                'datasetName': dataset_name,
                'caseIdColumn': case_id_col,
                'activityNameColumn': activity_col,
                'activityTimeColumn': time_col,
                'cultureInfo': culture
            }
            if resource_col:
                data['resourceColumn'] = resource_col
            if start_time_col:
                data['startTimeColumn'] = start_time_col

            response = requests.post(url, headers=self.headers, files=files, data=data)

        if response.ok:
            return response.json()
        else:
            raise Exception(f'Upload failed: {response.text}')

    def create_from_package(self, file_path, dataset_name, culture='en-US'):
        """Create dataset from ZIP package."""
        url = f'{BASE_URL}/api/{TENANT_ID}/{PROJECT_ID}/dataset/package'

        with open(file_path, 'rb') as f:
            files = {'file': (file_path, f, 'application/zip')}
            data = {
                'datasetName': dataset_name,
                'cultureInfo': culture
            }

            response = requests.post(url, headers=self.headers, files=files, data=data)

        if response.ok:
            return response.json()
        else:
            raise Exception(f'Upload failed: {response.text}')

    def list_datasets(self):
        """List all datasets in the project."""
        url = f'{BASE_URL}/api/{TENANT_ID}/{PROJECT_ID}/dataset'
        response = requests.get(url, headers=self.headers)
        response.raise_for_status()  # Raise on 4xx/5xx instead of parsing an error body
        return response.json()

# Usage
uploader = DatasetUploader('your-auth-token')

# Create from CSV
result = uploader.create_from_csv(
    'event_log.csv',
    'Purchase Order Process',
    'CaseID',
    'Activity',
    'Timestamp',
    resource_col='User'
)
print(f"Created dataset: {result['datasetId']}")
print(f"Cases: {result['caseCount']}, Events: {result['eventCount']}")

# List all datasets
datasets = uploader.list_datasets()
for ds in datasets['datasets']:
    print(f"- {ds['datasetName']} ({ds['datasetId']})")

JavaScript/Node.js

const TENANT_ID = '12345678-1234-1234-1234-123456789012';
const PROJECT_ID = '87654321-4321-4321-4321-210987654321';
const BASE_URL = 'https://your-mindzie-instance.com';

class DatasetUploader {
  constructor(token) {
    this.token = token;
  }

  async createFromCsv(file, datasetName, caseIdCol, activityCol, timeCol, options = {}) {
    const url = `${BASE_URL}/api/${TENANT_ID}/${PROJECT_ID}/dataset/csv`;

    const formData = new FormData();
    formData.append('file', file);
    formData.append('datasetName', datasetName);
    formData.append('caseIdColumn', caseIdCol);
    formData.append('activityNameColumn', activityCol);
    formData.append('activityTimeColumn', timeCol);
    formData.append('cultureInfo', options.culture || 'en-US');

    if (options.resourceColumn) {
      formData.append('resourceColumn', options.resourceColumn);
    }
    if (options.startTimeColumn) {
      formData.append('startTimeColumn', options.startTimeColumn);
    }

    const response = await fetch(url, {
      method: 'POST',
      headers: { 'Authorization': `Bearer ${this.token}` },
      body: formData
    });

    if (response.ok) {
      return await response.json();
    } else {
      throw new Error(`Upload failed: ${await response.text()}`);
    }
  }

  async listDatasets() {
    const url = `${BASE_URL}/api/${TENANT_ID}/${PROJECT_ID}/dataset`;
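    const response = await fetch(url, {
      headers: { 'Authorization': `Bearer ${this.token}` }
    });
    // Fail loudly on 4xx/5xx instead of parsing an error body as a dataset list
    if (!response.ok) {
      throw new Error(`Listing datasets failed: ${await response.text()}`);
    }
    return await response.json();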
  }
}

// Usage (browser)
const uploader = new DatasetUploader('your-auth-token');
const fileInput = document.getElementById('csvFile');

fileInput.addEventListener('change', async (e) => {
  const file = e.target.files[0];

  const result = await uploader.createFromCsv(
    file,
    'My Dataset',
    'CaseID',
    'Activity',
    'Timestamp',
    { resourceColumn: 'User' }
  );

  console.log(`Created: ${result.datasetId}`);
  console.log(`Cases: ${result.caseCount}, Events: ${result.eventCount}`);
});

Response Fields

Field              Type     Description
datasetId          GUID     ID of the created dataset
caseCount          integer  Number of unique cases imported
eventCount         integer  Total number of events imported
invalidValueCount  integer  Number of invalid values encountered
skippedRowsCount   integer  Number of rows skipped due to errors
errors             array    List of error messages
rowIssues          array    Detailed information about row-level issues
statusCode         integer  HTTP status code
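
One way to work with these fields in client code is a small typed wrapper. The sketch below maps the response fields listed above onto a Python dataclass; the class name ImportResult and the has_quality_issues property are illustrative additions, not part of the API.

```python
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    """Typed view of the response returned by the dataset creation endpoints."""
    dataset_id: str
    case_count: int
    event_count: int
    invalid_value_count: int = 0
    skipped_rows_count: int = 0
    errors: list = field(default_factory=list)
    row_issues: list = field(default_factory=list)
    status_code: int = 200

    @classmethod
    def from_json(cls, payload: dict) -> "ImportResult":
        """Build an ImportResult from the parsed JSON response."""
        return cls(
            dataset_id=payload["datasetId"],
            case_count=payload["caseCount"],
            event_count=payload["eventCount"],
            invalid_value_count=payload.get("invalidValueCount", 0),
            skipped_rows_count=payload.get("skippedRowsCount", 0),
            errors=list(payload.get("errors") or []),
            row_issues=list(payload.get("rowIssues") or []),
            status_code=payload.get("statusCode", 200),
        )

    @property
    def has_quality_issues(self) -> bool:
        """True if the import reported invalid values, skipped rows, or issues."""
        return bool(self.invalid_value_count or self.skipped_rows_count
                    or self.errors or self.row_issues)
```

A caller could then check result.has_quality_issues after an upload and inspect row_issues only when it is true.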

Best Practices

  • Validate Column Names: Ensure column names match exactly (case-sensitive)
  • Check Culture Settings: Use appropriate culture for date/number formats
  • Handle Large Files: Monitor upload progress for files approaching 1GB
  • Review Row Issues: Check rowIssues array for data quality problems
  • Unique Dataset Names: Dataset names must be unique within a project
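
The first and third practices can be checked locally before uploading. The sketch below reads only the header row of a CSV file and compares it, case-sensitively, with the column names that will be sent in the request; it also checks the file size. It assumes a comma-delimited, UTF-8 encoded file and interprets the 1GB limit as 1024**3 bytes, neither of which is confirmed by this document. The function name check_csv_before_upload is hypothetical.

```python
import csv
import os

# The documented limit is 1GB; treating it as 1024**3 bytes is an assumption.
MAX_UPLOAD_BYTES = 1024 ** 3


def check_csv_before_upload(path, required_columns, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"File is {size} bytes, above the {max_bytes}-byte limit")

    # utf-8-sig strips a byte order mark that would otherwise corrupt the first column name.
    with open(path, newline="", encoding="utf-8-sig") as f:
        header = next(csv.reader(f), [])

    for column in required_columns:
        if column in header:
            continue
        # Point out near misses that differ only in letter case.
        near = [h for h in header if h.lower() == column.lower()]
        hint = f" (found '{near[0]}', which differs only in case)" if near else ""
        problems.append(f"Column '{column}' not found in header{hint}")

    return problems
```

For example, check_csv_before_upload('event_log.csv', ['CaseID', 'Activity', 'Timestamp']) would report a header named 'CaseId' as a case mismatch before any data is sent.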