Python Notebooks

Jupyter Notebook Integration

Integrate Jupyter notebooks for custom enrichments, data analysis, and machine learning workflows.

Get Notebook Details

GET /api/{tenantId}/{projectId}/notebook/{notebookId}

Retrieves detailed information about a Jupyter notebook, including its cells, execution status, and integration parameters.

Parameters

Parameter    Type   Location   Description
tenantId     GUID   Path       The tenant identifier
projectId    GUID   Path       The project identifier
notebookId   GUID   Path       The notebook identifier
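
Because all three path parameters are GUIDs, a client can reject malformed values before sending a request. The following is a minimal sketch; the function name notebook_url and the base URL used in the usage note are illustrative assumptions, not part of the API.

```python
from uuid import UUID


def notebook_url(base_url: str, tenant_id: str, project_id: str, notebook_id: str) -> str:
    """Build the notebook detail URL, rejecting values that are not valid GUIDs."""
    parts = []
    for name, value in (
        ("tenantId", tenant_id),
        ("projectId", project_id),
        ("notebookId", notebook_id),
    ):
        try:
            # UUID() parses the string; str() gives the lowercase hyphenated form.
            parts.append(str(UUID(value)))
        except ValueError as exc:
            raise ValueError(f"{name} is not a valid GUID: {value!r}") from exc
    tenant, project, notebook = parts
    return f"{base_url.rstrip('/')}/api/{tenant}/{project}/notebook/{notebook}"
```

Calling notebook_url("https://your-mindzie-instance.com", ...) with three valid GUIDs returns the full GET path; a placeholder such as "tenant-guid" raises ValueError naming the offending parameter.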

Response

{
  "notebookId": "aa0e8400-e29b-41d4-a716-446655440000",
  "projectId": "660e8400-e29b-41d4-a716-446655440000",
  "notebookName": "Process Mining Analysis",
  "notebookDescription": "Custom analysis for customer journey optimization",
  "notebookVersion": "1.3.2",
  "kernelType": "python3",
  "status": "Ready",
  "integration": {
    "enrichmentMode": true,
    "datasetBinding": "880e8400-e29b-41d4-a716-446655440000",
    "outputFormat": "enriched_dataframe",
    "autoExecution": false
  },
  "cells": [
    {
      "cellId": "cell-001",
      "cellType": "code",
      "executionCount": 15,
      "hasOutput": true,
      "lastExecuted": "2024-01-20T10:30:00Z",
      "executionStatus": "Success"
    },
    {
      "cellId": "cell-002",
      "cellType": "markdown",
      "lastModified": "2024-01-19T14:20:00Z"
    }
  ],
  "environment": {
    "pythonVersion": "3.9.18",
    "packages": ["pandas", "numpy", "scikit-learn", "mindzie-sdk"],
    "customLibraries": ["process_mining_utils", "customer_analytics"]
  },
  "dateCreated": "2024-01-15T10:30:00Z",
  "dateModified": "2024-01-20T10:30:00Z",
  "createdBy": "user123",
  "lastExecutionDate": "2024-01-20T10:30:00Z",
  "executionCount": 47
}
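
A response like the one above can be condensed into a short summary on the client side. This sketch relies only on field names shown in the sample response; the function name and the keys of the returned dictionary are illustrative.

```python
from collections import Counter


def summarize_notebook(notebook: dict) -> dict:
    """Condense a notebook detail response into a few headline facts."""
    cells = notebook.get("cells", [])
    # Count cells per cellType ("code", "markdown", ...).
    by_type = Counter(cell.get("cellType", "unknown") for cell in cells)
    # Code cells whose executionStatus is anything other than "Success".
    not_ok = [
        cell.get("cellId")
        for cell in cells
        if cell.get("cellType") == "code" and cell.get("executionStatus") != "Success"
    ]
    return {
        "name": notebook.get("notebookName"),
        "status": notebook.get("status"),
        "cellsByType": dict(by_type),
        "codeCellsNotSuccessful": not_ok,
        "enrichmentMode": notebook.get("integration", {}).get("enrichmentMode", False),
    }
```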

List All Notebooks

GET /api/{tenantId}/{projectId}/notebooks

Retrieves a list of all Jupyter notebooks in the project, with basic metadata and execution status.

Query Parameters

Parameter        Type      Description
status           string    Filter by status: Ready, Running, Error, Kernel_Dead
kernelType       string    Filter by kernel type: python3, r, scala
enrichmentMode   boolean   Filter notebooks configured for data enrichment
page             integer   Page number for pagination (default: 1)
pageSize         integer   Number of items per page (default: 20, max: 100)

Response

{
  "notebooks": [
    {
      "notebookId": "aa0e8400-e29b-41d4-a716-446655440000",
      "notebookName": "Process Mining Analysis",
      "kernelType": "python3",
      "status": "Ready",
      "enrichmentMode": true,
      "cellCount": 12,
      "lastExecutionDate": "2024-01-20T10:30:00Z",
      "dateCreated": "2024-01-15T10:30:00Z"
    }
  ],
  "totalCount": 8,
  "page": 1,
  "pageSize": 20,
  "hasNextPage": false
}
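
To walk through every page of the list, a client can keep requesting pages until hasNextPage is false. The sketch below takes the HTTP call as a parameter (fetch_page, a hypothetical callable returning the parsed JSON body for a given page and pageSize) so the paging logic stays independent of any particular HTTP library. The lower bound of 1 for pageSize is an assumption; only the maximum of 100 is documented.

```python
from typing import Callable, Iterator


def iter_notebooks(fetch_page: Callable[[int, int], dict], page_size: int = 20) -> Iterator[dict]:
    """Yield every notebook across all pages, following hasNextPage."""
    if not 1 <= page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body.get("notebooks", [])
        # Stop as soon as the server reports that there is no further page.
        if not body.get("hasNextPage"):
            break
        page += 1
```

Because this is a generator, the pageSize check runs when iteration starts, not when iter_notebooks is called.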

Create New Notebook

POST /api/{tenantId}/{projectId}/notebook

Creates a new Jupyter notebook with the specified configuration and an optional template. The notebook is automatically configured for mindzie data integration.

Request Body

{
  "notebookName": "Advanced Customer Analytics",
  "notebookDescription": "Machine learning models for customer behavior prediction",
  "kernelType": "python3",
  "template": "process_mining_starter",
  "integration": {
    "enrichmentMode": true,
    "datasetBinding": "880e8400-e29b-41d4-a716-446655440000",
    "outputFormat": "enriched_dataframe",
    "autoExecution": false
  },
  "environment": {
    "packages": ["pandas", "numpy", "scikit-learn", "matplotlib", "seaborn"],
    "customLibraries": ["process_mining_utils"]
  },
  "initialCells": [
    {
      "cellType": "markdown",
      "content": "# Customer Analytics Notebook\n\nThis notebook analyzes customer journey data using process mining techniques."
    },
    {
      "cellType": "code",
      "content": "import pandas as pd\nimport numpy as np\nfrom mindzie_sdk import ProcessMiningClient\n\n# Initialize mindzie client\nclient = ProcessMiningClient()"
    }
  ]
}

Response

Returns 201 Created with the complete notebook object, including the generated notebook ID and initial session information.

Execute Notebook

POST /api/{tenantId}/{projectId}/notebook/{notebookId}/execute

Executes all cells in the notebook or a specified cell range. Execution is asynchronous, and the results are stored for later retrieval.

Request Body

{
  "executionMode": "all",
  "cellRange": {
    "startCell": "cell-001",
    "endCell": "cell-010"
  },
  "parameters": {
    "dataset_id": "880e8400-e29b-41d4-a716-446655440000",
    "analysis_period": "2024-01",
    "include_weekends": false
  },
  "outputOptions": {
    "captureOutputs": true,
    "saveIntermediateResults": true,
    "generateReport": true
  },
  "timeout": 1800,
  "priority": "Normal"
}

Response

{
  "executionId": "bb0e8400-e29b-41d4-a716-446655440000",
  "notebookId": "aa0e8400-e29b-41d4-a716-446655440000",
  "status": "Running",
  "startTime": "2024-01-20T10:30:00Z",
  "estimatedDuration": "15-20 minutes",
  "currentCell": "cell-003",
  "progress": {
    "totalCells": 12,
    "completedCells": 2,
    "currentCellIndex": 3,
    "percentComplete": 17
  },
  "parameters": {
    "dataset_id": "880e8400-e29b-41d4-a716-446655440000",
    "analysis_period": "2024-01",
    "include_weekends": false
  }
}
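
The progress object in this response is enough to print a one-line status while an execution runs. The sketch below uses percentComplete when it is present and otherwise derives it from completedCells and totalCells; rounding to the nearest integer is an assumption that happens to match the sample (2 of 12 cells gives 17).

```python
def format_progress(execution: dict) -> str:
    """Render a one-line progress summary from an execution response."""
    progress = execution.get("progress", {})
    total = progress.get("totalCells", 0)
    done = progress.get("completedCells", 0)
    percent = progress.get("percentComplete")
    if percent is None:
        # Fall back to a locally computed percentage (assumed rounding rule).
        percent = round(100 * done / total) if total else 0
    status = execution.get("status", "Unknown")
    current = execution.get("currentCell", "n/a")
    return f"{status}: {done}/{total} cells ({percent}%), current cell {current}"
```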

Get Execution Status

GET /api/{tenantId}/{projectId}/notebook/{notebookId}/execution/{executionId}

Retrieves the current status and progress of a notebook execution, including cell-by-cell execution details and any errors.

Response

{
  "executionId": "bb0e8400-e29b-41d4-a716-446655440000",
  "notebookId": "aa0e8400-e29b-41d4-a716-446655440000",
  "status": "Completed",
  "startTime": "2024-01-20T10:30:00Z",
  "endTime": "2024-01-20T10:47:00Z",
  "totalDuration": "17 minutes",
  "progress": {
    "totalCells": 12,
    "completedCells": 12,
    "successfulCells": 11,
    "failedCells": 1,
    "percentComplete": 100
  },
  "cellResults": [
    {
      "cellId": "cell-001",
      "status": "Success",
      "executionTime": "0.5 seconds",
      "hasOutput": false
    },
    {
      "cellId": "cell-002",
      "status": "Success",
      "executionTime": "3.2 seconds",
      "hasOutput": true,
      "outputType": "display_data"
    },
    {
      "cellId": "cell-003",
      "status": "Error",
      "executionTime": "1.1 seconds",
      "errorType": "KeyError",
      "errorMessage": "'customer_id' column not found in dataset"
    }
  ],
  "outputs": {
    "dataFrames": 3,
    "plots": 5,
    "models": 2,
    "enrichedData": {
      "recordCount": 15420,
      "newColumns": ["customer_segment", "journey_score", "anomaly_flag"]
    }
  },
  "resources": {
    "peakMemoryUsage": "2.3 GB",
    "cpuTime": "8.5 minutes",
    "diskUsage": "450 MB"
  }
}
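
Note that the sample above reports status "Completed" even though one cell failed, so a client may want to check failedCells and cellResults rather than the top-level status alone. A minimal sketch, using only field names from the sample response (the function names are illustrative):

```python
from typing import List


def failed_cell_messages(execution: dict) -> List[str]:
    """Return one readable line per cell whose status is 'Error'."""
    lines = []
    for cell in execution.get("cellResults", []):
        if cell.get("status") == "Error":
            error_type = cell.get("errorType", "Error")
            message = cell.get("errorMessage", "no message")
            lines.append(f"{cell.get('cellId', '?')}: {error_type}: {message}")
    return lines


def fully_successful(execution: dict) -> bool:
    """True only if the run completed and no cell is reported as failed."""
    progress = execution.get("progress", {})
    return execution.get("status") == "Completed" and progress.get("failedCells", 0) == 0
```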

Get Execution Results

GET /api/{tenantId}/{projectId}/notebook/{notebookId}/execution/{executionId}/results

Retrieves the outputs and results of a completed notebook execution, including generated data, plots, and enriched datasets.

Query Parameters

Parameter    Type     Description
outputType   string   Filter by output type: all, data, plots, models, reports
format       string   Response format: summary, detailed, download
cellId       string   Retrieve results from a specific cell only

Response

{
  "executionId": "bb0e8400-e29b-41d4-a716-446655440000",
  "status": "Completed",
  "outputs": [
    {
      "cellId": "cell-002",
      "outputType": "display_data",
      "contentType": "text/html",
      "title": "Dataset Overview",
      "content": "<div>Dataset contains 15,420 records...</div>",
      "downloadUrl": "https://api.mindzie.com/downloads/cell-002-bb0e8400.html"
    },
    {
      "cellId": "cell-005",
      "outputType": "image/png",
      "title": "Customer Journey Flow Chart",
      "dimensions": {"width": 800, "height": 600},
      "downloadUrl": "https://api.mindzie.com/downloads/cell-005-bb0e8400.png"
    },
    {
      "cellId": "cell-008",
      "outputType": "application/json",
      "title": "Process Mining Metrics",
      "content": {
        "avgCycleTime": "4.2 hours",
        "bottleneckActivities": ["Review Application", "Manager Approval"],
        "processEfficiency": 78.5,
        "customerSatisfactionScore": 8.2
      },
      "downloadUrl": "https://api.mindzie.com/downloads/cell-008-bb0e8400.json"
    }
  ],
  "enrichedDatasets": [
    {
      "name": "customer_journey_enhanced",
      "recordCount": 15420,
      "newColumns": ["customer_segment", "journey_score", "anomaly_flag"],
      "format": "pandas_dataframe",
      "downloadUrl": "https://api.mindzie.com/downloads/enriched-bb0e8400.csv"
    }
  ],
  "models": [
    {
      "name": "customer_churn_predictor",
      "modelType": "RandomForestClassifier",
      "accuracy": 0.87,
      "features": ["journey_score", "cycle_time", "touchpoint_count"],
      "downloadUrl": "https://api.mindzie.com/downloads/model-bb0e8400.pkl"
    }
  ],
  "reports": [
    {
      "name": "Customer Analytics Summary",
      "format": "html",
      "downloadUrl": "https://api.mindzie.com/downloads/report-bb0e8400.html"
    }
  ]
}
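
Every section of this response (outputs, enrichedDatasets, models, reports) carries downloadUrl entries, so a client can gather them in one pass. The sketch below returns a label and a local filename derived from the URL path for each item; the function name and the tuple layout are illustrative choices.

```python
from pathlib import PurePosixPath
from typing import List, Tuple
from urllib.parse import urlparse

SECTIONS = ("outputs", "enrichedDatasets", "models", "reports")


def collect_downloads(results: dict) -> List[Tuple[str, str, str]]:
    """Return (section, label, filename) for every item that has a downloadUrl."""
    found = []
    for section in SECTIONS:
        for item in results.get(section, []):
            url = item.get("downloadUrl")
            if not url:
                continue
            # Items are labelled by name, title, or cellId, whichever is present.
            label = item.get("name") or item.get("title") or item.get("cellId", "")
            # The last path segment of the URL serves as the local filename.
            filename = PurePosixPath(urlparse(url).path).name
            found.append((section, label, filename))
    return found
```

Model files in the sample are served as .pkl; Python's pickle format can execute arbitrary code when loaded, so such files should only be unpickled when they come from a trusted source.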

Update Notebook

PUT /api/{tenantId}/{projectId}/notebook/{notebookId}

Updates the notebook configuration, cells, or integration settings. Changes to cells create a new notebook version.

Request Body

{
  "notebookName": "Advanced Customer Analytics v2",
  "notebookDescription": "Enhanced ML models with real-time prediction capabilities",
  "integration": {
    "enrichmentMode": true,
    "datasetBinding": "880e8400-e29b-41d4-a716-446655440000",
    "outputFormat": "enriched_dataframe",
    "autoExecution": true,
    "scheduleExecution": "0 2 * * *"
  },
  "environment": {
    "packages": ["pandas", "numpy", "scikit-learn", "tensorflow", "matplotlib"],
    "customLibraries": ["process_mining_utils", "ml_models"]
  }
}

Response

Returns the updated notebook object with an incremented version number and modification timestamps.
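
The scheduleExecution value in the request body ("0 2 * * *") looks like a standard five-field cron expression, which would mean a daily run at 02:00. That reading is an assumption, since the field format and time zone are not specified here. Under that assumption, a client could sanity-check the value before sending it:

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields (no range validation)."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))
```

For "0 2 * * *" this yields minute "0" and hour "2", with wildcards for the remaining three fields.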

Delete Notebook

DELETE /api/{tenantId}/{projectId}/notebook/{notebookId}

Permanently deletes a notebook and its entire execution history. This action cannot be undone and stops any running executions.

Response Codes

  • 204 No Content - Notebook deleted successfully
  • 404 Not Found - Notebook not found or access denied
  • 409 Conflict - Notebook is currently executing and cannot be deleted
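
The response codes above can be mapped onto client-side outcomes roughly as follows. The outcome labels and the function name are illustrative and not part of the API; any status code not listed above is treated as unexpected.

```python
DELETE_OUTCOMES = {
    204: "deleted",
    404: "not_found_or_access_denied",
    409: "still_executing",
}


def interpret_delete_status(status_code: int) -> str:
    """Translate the HTTP status of a DELETE call into a short outcome label."""
    try:
        return DELETE_OUTCOMES[status_code]
    except KeyError:
        raise RuntimeError(f"unexpected status code for notebook delete: {status_code}") from None
```

A caller that receives "still_executing" could wait for the running execution to finish and retry the delete afterwards.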

Upload Existing Notebook

POST /api/{tenantId}/{projectId}/notebook/upload

Uploads an existing Jupyter notebook (.ipynb file) and configures it for mindzie integration. The notebook is parsed and its cells are validated.

Request (Multipart Form Data)

Content-Type: multipart/form-data

--boundary
Content-Disposition: form-data; name="file"; filename="analysis.ipynb"
Content-Type: application/json

{notebook content}
--boundary
Content-Disposition: form-data; name="notebookName"

Customer Journey Analysis
--boundary
Content-Disposition: form-data; name="enrichmentMode"

true
--boundary
Content-Disposition: form-data; name="datasetBinding"

880e8400-e29b-41d4-a716-446655440000
--boundary--

Response

Returns 201 Created with the uploaded notebook object, including parsing results and any validation warnings.
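
Since the server parses and validates the uploaded file, a quick local check can catch obviously broken files before the upload. The sketch below only checks that the file is JSON with the top-level "cells" list and "nbformat" field used by notebook format version 4; it is not a substitute for the server-side validation, and the function name is illustrative.

```python
import json
from pathlib import Path
from typing import List


def check_ipynb(path: str) -> List[str]:
    """Return a list of problems found in a notebook file (empty list means none found)."""
    problems = []
    file_path = Path(path)
    if file_path.suffix != ".ipynb":
        problems.append("file extension is not .ipynb")
    try:
        notebook = json.loads(file_path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # ValueError covers both invalid JSON and undecodable bytes.
        return problems + [f"cannot read notebook JSON: {exc}"]
    if not isinstance(notebook, dict):
        return problems + ["top-level JSON value is not an object"]
    if not isinstance(notebook.get("cells"), list):
        problems.append("missing top-level 'cells' list")
    if "nbformat" not in notebook:
        problems.append("missing 'nbformat' version field")
    return problems
```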

Example: Complete Notebook Workflow

This example shows how to create a Jupyter notebook, execute it, and retrieve the results:

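// Replace {tenantId} and {projectId} in the URLs with your own IDs;
// `token` is assumed to hold a valid bearer token.

// 1. Create a new notebook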
const createNotebook = async () => {
  const response = await fetch('/api/{tenantId}/{projectId}/notebook', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${token}`
    },
    body: JSON.stringify({
      notebookName: 'Process Mining Analysis',
      notebookDescription: 'Advanced analytics for process optimization',
      kernelType: 'python3',
      template: 'process_mining_starter',
      integration: {
        enrichmentMode: true,
        datasetBinding: '880e8400-e29b-41d4-a716-446655440000',
        outputFormat: 'enriched_dataframe',
        autoExecution: false
      },
      environment: {
        packages: ['pandas', 'numpy', 'scikit-learn', 'matplotlib'],
        customLibraries: ['process_mining_utils']
      },
      initialCells: [
        {
          cellType: 'markdown',
          content: '# Process Mining Analysis\n\nAnalyzing process efficiency and bottlenecks.'
        },
        {
          cellType: 'code',
          content: 'import pandas as pd\nfrom mindzie_sdk import ProcessMiningClient\n\nclient = ProcessMiningClient()'
        }
      ]
    })
  });

  return await response.json();
};

// 2. Execute the notebook
const executeNotebook = async (notebookId) => {
  const response = await fetch(`/api/{tenantId}/{projectId}/notebook/${notebookId}/execute`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${token}`
    },
    body: JSON.stringify({
      executionMode: 'all',
      parameters: {
        dataset_id: '880e8400-e29b-41d4-a716-446655440000',
        analysis_period: '2024-01',
        include_weekends: false
      },
      outputOptions: {
        captureOutputs: true,
        saveIntermediateResults: true,
        generateReport: true
      },
      timeout: 1800,
      priority: 'High'
    })
  });

  return await response.json();
};

// 3. Monitor execution progress
const monitorNotebookExecution = async (notebookId, executionId) => {
  // Poll every 15 seconds; the returned promise resolves only once polling stops
  while (true) {
    const response = await fetch(`/api/{tenantId}/{projectId}/notebook/${notebookId}/execution/${executionId}`, {
      headers: {
        'Authorization': `Bearer ${token}`
      }
    });

    const execution = await response.json();
    console.log(`Status: ${execution.status}, Progress: ${execution.progress.percentComplete}%`);

    if (execution.status === 'Running') {
      await new Promise(resolve => setTimeout(resolve, 15000));
      continue;
    }

    if (execution.status === 'Completed') {
      console.log('Notebook execution completed!');
      await getNotebookResults(notebookId, executionId);
    } else if (execution.status === 'Error') {
      console.log('Execution failed:', execution.cellResults.filter(c => c.status === 'Error'));
    }
    return execution;
  }
};

// 4. Retrieve the results
const getNotebookResults = async (notebookId, executionId) => {
  const response = await fetch(`/api/{tenantId}/{projectId}/notebook/${notebookId}/execution/${executionId}/results?format=detailed`, {
    headers: {
      'Authorization': `Bearer ${token}`
    }
  });

  const results = await response.json();
  console.log('Execution results:', results);
  console.log('Enriched datasets:', results.enrichedDatasets);
  console.log('Generated models:', results.models);

  return results;
};

// Run the workflow
createNotebook()
  .then(notebook => {
    console.log(`Notebook created: ${notebook.notebookId}`);
    return executeNotebook(notebook.notebookId);
  })
  .then(execution => {
    console.log(`Execution started: ${execution.executionId}`);
    return monitorNotebookExecution(execution.notebookId, execution.executionId);
  })
  .catch(error => console.error('Notebook workflow failed:', error));

Python Example

import requests
import time
import json
from pathlib import Path

class NotebookManager:
    def __init__(self, base_url, tenant_id, project_id, token):
        self.base_url = base_url
        self.tenant_id = tenant_id
        self.project_id = project_id
        self.headers = {
            'Authorization': f'Bearer {token}',
            'Content-Type': 'application/json'
        }

    def create_notebook(self, name, description, kernel_type="python3", template=None, integration=None):
        """Create a new Jupyter notebook"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebook"
        payload = {
            'notebookName': name,
            'notebookDescription': description,
            'kernelType': kernel_type,
            'template': template,
            'integration': integration or {
                'enrichmentMode': True,
                'outputFormat': 'enriched_dataframe',
                'autoExecution': False
            }
        }
        response = requests.post(url, json=payload, headers=self.headers)
        return response.json()

    def upload_notebook(self, file_path, name, dataset_binding=None):
        """Upload an existing notebook file"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebook/upload"

        with open(file_path, 'rb') as file:
            files = {'file': (Path(file_path).name, file, 'application/json')}
            data = {
                'notebookName': name,
                'enrichmentMode': 'true',
                'datasetBinding': dataset_binding or ''
            }
            # Drop the JSON Content-Type header so requests can set the multipart boundary itself
            headers = {k: v for k, v in self.headers.items() if k != 'Content-Type'}
            response = requests.post(url, files=files, data=data, headers=headers)

        return response.json()

    def execute_notebook(self, notebook_id, parameters=None, timeout=1800):
        """Execute all cells of a notebook"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebook/{notebook_id}/execute"
        payload = {
            'executionMode': 'all',
            'parameters': parameters or {},
            'outputOptions': {
                'captureOutputs': True,
                'saveIntermediateResults': True,
                'generateReport': True
            },
            'timeout': timeout,
            'priority': 'Normal'
        }
        response = requests.post(url, json=payload, headers=self.headers)
        return response.json()

    def get_execution_status(self, notebook_id, execution_id):
        """Get the execution status of a notebook"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebook/{notebook_id}/execution/{execution_id}"
        response = requests.get(url, headers=self.headers)
        return response.json()

    def wait_for_completion(self, notebook_id, execution_id, poll_interval=15, timeout=3600):
        """Wait until the notebook execution has finished"""
        start_time = time.time()

        while time.time() - start_time < timeout:
            status = self.get_execution_status(notebook_id, execution_id)
            print(f"Notebook {notebook_id}: {status['status']} ({status['progress']['percentComplete']}%)")

            if status['status'] in ['Completed', 'Error', 'Cancelled']:
                return status

            time.sleep(poll_interval)

        raise TimeoutError(f"Notebook execution {execution_id} did not complete within {timeout} seconds")

    def get_execution_results(self, notebook_id, execution_id, output_type="all", format_type="detailed"):
        """Get the execution results of a notebook"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebook/{notebook_id}/execution/{execution_id}/results"
        params = {
            'outputType': output_type,
            'format': format_type
        }
        response = requests.get(url, params=params, headers=self.headers)
        return response.json()

    def list_notebooks(self, status=None, enrichment_mode=None, page=1, page_size=20):
        """List all notebooks with optional filtering"""
        url = f"{self.base_url}/api/{self.tenant_id}/{self.project_id}/notebooks"
        params = {'page': page, 'pageSize': page_size}

        if status:
            params['status'] = status
        if enrichment_mode is not None:
            params['enrichmentMode'] = str(enrichment_mode).lower()

        response = requests.get(url, params=params, headers=self.headers)
        return response.json()

# Usage example
manager = NotebookManager(
    'https://your-mindzie-instance.com',
    'tenant-guid',
    'project-guid',
    'your-auth-token'
)

try:
    # Create a process mining notebook
    notebook = manager.create_notebook(
        'Advanced Process Analytics',
        'Machine learning-based process analysis with anomaly detection',
        'python3',
        'process_mining_starter',
        {
            'enrichmentMode': True,
            'datasetBinding': 'dataset-guid',
            'outputFormat': 'enriched_dataframe',
            'autoExecution': False
        }
    )

    print(f"Notebook created: {notebook['notebookId']}")

    # Execute with custom parameters
    execution_params = {
        'dataset_id': 'dataset-guid',
        'analysis_type': 'full_analysis',
        'time_window': '30_days',
        'ml_models': ['anomaly_detection', 'process_prediction'],
        'generate_visualizations': True
    }

    execution = manager.execute_notebook(
        notebook['notebookId'],
        execution_params,
        timeout=2400  # 40 minutes
    )

    print(f"Execution started: {execution['executionId']}")
    print(f"Estimated duration: {execution['estimatedDuration']}")

    # Wait for completion
    final_status = manager.wait_for_completion(
        notebook['notebookId'],
        execution['executionId']
    )

    if final_status['status'] == 'Completed':
        # Retrieve detailed results
        results = manager.get_execution_results(
            notebook['notebookId'],
            execution['executionId'],
            'all',
            'detailed'
        )

        print("Notebook execution completed successfully!")
        print(f"Generated outputs: {len(results['outputs'])}")
        print(f"Enriched datasets: {len(results['enrichedDatasets'])}")
        print(f"ML models created: {len(results['models'])}")

        # List the download URLs for the enriched data
        for dataset in results['enrichedDatasets']:
            print(f"Download enriched data: {dataset['downloadUrl']}")

        # List the download URLs for the models
        for model in results['models']:
            print(f"Download model '{model['name']}': {model['downloadUrl']}")

    else:
        print(f"Notebook execution failed with status: {final_status['status']}")
        failed_cells = [cell for cell in final_status['cellResults'] if cell['status'] == 'Error']
        for cell in failed_cells:
            print(f"Cell {cell['cellId']} failed: {cell['errorMessage']}")

except Exception as e:
    print(f"Error in notebook workflow: {e}")