Overview

Data Hub

Purpose: Centralize data ingestion, processing, and management so that connected systems stay synchronized and downstream analytics work from consistent, well-organized data.

Data Connectors

  • Seamless Integration: Import data from and export data to enterprise systems such as databases, CRM systems, and ERP platforms (see the sketch after this list).
  • Data Consistency: Keep your enterprise data synchronized across all sources.
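
Vue.AI's own connector configuration is not shown in this overview, so the following is a generic illustration of the pattern a database connector automates, using the open-source SQLAlchemy and pandas libraries. The connection string, table, and column names are hypothetical.

```python
# Generic illustration only: Vue.AI's connector configuration is not
# documented here. This shows the general pattern of pulling tabular
# data from an enterprise database with SQLAlchemy and pandas.
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string; replace with your database's DSN.
engine = create_engine("postgresql://user:password@db.example.com:5432/crm")

# Import: read a CRM table into a DataFrame for downstream processing.
customers = pd.read_sql("SELECT id, name, segment FROM customers", engine)

# Export: sync a processed table back to the source system.
customers.to_sql("customers_synced", engine, if_exists="replace", index=False)
```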

Document Processing

  • Unstructured Data Extraction: Parse and extract content from documents such as PDFs, images, and scanned files (see the sketch after this list).
  • Automation: Enhance workflows with intelligent data extraction capabilities.
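
As a minimal sketch of the extraction step, the snippet below pulls text from a PDF with the open-source pypdf library. It illustrates the kind of unstructured-data extraction a document pipeline performs, not Vue.AI's internal implementation; the input file name is hypothetical.

```python
# Minimal sketch of text extraction from a PDF using the open-source
# pypdf library; illustrative only, not Vue.AI's internal pipeline.
from pypdf import PdfReader

reader = PdfReader("invoice.pdf")  # hypothetical input file

# Concatenate text from every page for downstream parsing.
text = "\n".join(page.extract_text() or "" for page in reader.pages)

print(text[:500])  # preview the first 500 extracted characters
```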

Dataset Management

  • Effortless Organization: Upload, group, and manage datasets efficiently.
  • Flexible Formats: Support for CSV, Delta, and other formats, with grouping into logical categories (see the sketch after this list).
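
To make the two named formats concrete, here is an illustrative round trip between CSV and Delta using pandas and the open-source deltalake package. Paths and dataset names are hypothetical; this is not Vue.AI's dataset API.

```python
# Illustrative only: loading a CSV and storing it as a Delta table,
# to show the two formats named above. Paths are hypothetical.
import pandas as pd
from deltalake import DeltaTable, write_deltalake

sales = pd.read_csv("sales_2024.csv")      # CSV ingestion
write_deltalake("datasets/sales", sales)   # persist as a Delta table

# Read the Delta table back to verify the round trip.
roundtrip = DeltaTable("datasets/sales").to_pandas()
print(len(roundtrip), "rows")
```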

Automation Hub

Purpose: Streamline the design and execution of workflows with advanced automation, enabling scalable and efficient data pipelines and computational processes.

Transform Node Workflows

  • Versatile Data Processing: Filter, transform, enrich, and aggregate data beyond traditional SQL capabilities (see the sketch after this list).
  • Complex Data Manipulations: Handle reshaping, validation, and multi-source data integration.
  • Advanced Features: Includes partitioning, ranking, dynamic enrichment, and support for batch and real-time processing.
  • Key Benefits: Streamline ETL pipelines, improve data quality, and drive scalable analytics.
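
The pandas sketch below walks through the four operations the list names: filter, enrich via a multi-source join, aggregate, and rank. Table and column names are hypothetical; this illustrates the operations, not the transform node's actual API.

```python
# Pandas sketch of a transform node's operations: filter, enrich via
# a join, aggregate, and rank. Names are hypothetical.
import pandas as pd

orders = pd.read_csv("orders.csv")
regions = pd.read_csv("regions.csv")

# Filter: keep only completed orders.
completed = orders[orders["status"] == "completed"]

# Enrich: join region metadata onto each order (multi-source integration).
enriched = completed.merge(regions, on="region_id", how="left")

# Aggregate: total revenue per region.
revenue = enriched.groupby("region_name", as_index=False)["amount"].sum()

# Rank: order regions by revenue, highest first.
revenue["rank"] = revenue["amount"].rank(ascending=False, method="dense")
print(revenue.sort_values("rank"))
```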

Custom Code Node Workflows

  • Custom Logic Integration: Execute tailored workflows with configurable custom code nodes (illustrated after this list).
  • Seamless Development: Leverage an integrated VS Code server for code creation and version control.
  • Automated Builds: Trigger Docker builds managed by GitHub Actions for efficiency and reliability.
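
The actual contract for a custom code node is not specified in this overview; the sketch below simply assumes a plain Python entry point that receives upstream records and returns transformed ones. The function name, signature, and sample data are all hypothetical.

```python
# Hypothetical shape of a custom code node: a plain Python entry point
# that receives upstream records and returns transformed ones. The
# actual node contract is not specified in this overview.
from typing import Any


def run(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Tailored logic: normalize emails and drop records without one."""
    out = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if email:
            out.append({**rec, "email": email})
    return out


if __name__ == "__main__":
    sample = [{"email": " A@Example.com "}, {"email": None}]
    print(run(sample))  # [{'email': 'a@example.com'}]
```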

Compute Node Workflows

  • Document Processing Power: Automate document analysis with advanced machine learning techniques.
  • Core Features: Perform classification, feature extraction, and embedding generation for structured and unstructured data (see the sketch after this list).
  • Optimized IDP: Accelerate Intelligent Document Processing with scalable workflows.
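
As a toy illustration of the tasks named above, the snippet below uses scikit-learn to extract features (TF-IDF document vectors, standing in for learned embeddings) and train a document classifier. The texts and labels are made-up sample data, not Vue.AI's models.

```python
# Sketch of compute-node tasks using scikit-learn: feature extraction
# (TF-IDF vectors, standing in for embeddings) and document
# classification. Texts and labels are toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["invoice total due", "resume work experience", "invoice amount paid"]
labels = ["invoice", "resume", "invoice"]

vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(docs)   # sparse document vectors

clf = LogisticRegression().fit(features, labels)
print(clf.predict(vectorizer.transform(["total amount due"])))  # ['invoice']
```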

Spark Node Workflows

  • Distributed Computing: Harness Apache Spark for high-performance batch and stream processing (example after this list).
  • Big Data Analytics: Enable ETL, machine learning (via MLlib), graph analytics (GraphX), and SQL queries.
  • In-Memory Speed: Process structured, semi-structured, and unstructured data rapidly.
  • Real-Time Insights: Power data science, real-time analytics, and robust data integration workflows.
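
A minimal PySpark example of the batch ETL and SQL capabilities listed above: read a file, aggregate it, and run a Spark SQL query over the same data. File paths and column names are hypothetical.

```python
# Minimal PySpark sketch of a Spark node's batch ETL: read, aggregate,
# and query with Spark SQL. Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

events = spark.read.csv("events.csv", header=True, inferSchema=True)

# Aggregate: daily event counts per type.
daily = events.groupBy("event_date", "event_type").agg(
    F.count("*").alias("events")
)
daily.show()

# Spark SQL over the same data.
events.createOrReplaceTempView("events")
spark.sql(
    "SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type"
).show()

spark.stop()
```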

Developer Hub

Purpose: Equip developers with tools for advanced data science and machine learning operations.

Notebooks

  • Develop and test machine learning models.
  • Share and manage Jupyter notebooks for collaborative projects.

MLOps

  • Register, monitor, analyze, and optimize ML models (see the sketch below).
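
This overview does not name the underlying model registry, so the sketch below uses the open-source MLflow library as a stand-in to show what registering and monitoring a model looks like in practice. Treat the library choice, experiment name, and model name as assumptions.

```python
# Sketch of model registration and monitoring using MLflow as a
# stand-in; the underlying registry is not named in this overview,
# so treat the library choice as an assumption.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

mlflow.set_experiment("iris-demo")
with mlflow.start_run() as run:
    mlflow.log_metric("train_accuracy", model.score(X, y))  # monitor
    mlflow.sklearn.log_model(model, "model")                # store artifact

# Register: promote the logged model into the registry.
mlflow.register_model(f"runs:/{run.info.run_id}/model", "iris-classifier")
```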
