Workflows

What is a Workflow?

A Workflow in YuzeData is a data integration process that orchestrates the flow of data between systems. Workflows connect datapoints, connectors, and master data to automate your data operations.

Each workflow is built from workflow steps: reusable units that perform specific operations, such as fetching data from an API, transforming values, or producing datapoints.

Workflows and Workflow Steps

Workflows and workflow steps work together:

[Figure: Workflow with workflow steps]

Concept	Description
Workflow	The container that defines the overall data integration process. Created from scratch for your specific needs.
Workflow step	A single step within a workflow. Each step is created from a template that defines what the step can do.

Workflow Step Templates and Instances

Workflow steps follow a template/instance pattern:

Workflow step template: A blueprint defining a type of operation
Workflow step instance: A configured step in your workflow, created from a template with your specific settings

When you add a step to a workflow, you select a workflow step template and configure it to create an instance. This allows you to reuse the same template multiple times with different configurations.
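The template/instance relationship can be pictured with a small Python model. This is only an illustrative sketch: the class names `StepTemplate` and `StepInstance`, the `instantiate` method, and the setting names are assumptions made for this example and are not part of any YuzeData API.

```python
from dataclasses import dataclass, field
from typing import Any


# Hypothetical model of the template/instance pattern (not a YuzeData API).
@dataclass(frozen=True)
class StepTemplate:
    name: str
    allowed_settings: frozenset  # names of the settings the template exposes

    def instantiate(self, label: str, **settings: Any) -> "StepInstance":
        # Reject settings that the template does not define.
        unknown = set(settings) - self.allowed_settings
        if unknown:
            raise ValueError(f"settings not defined by template: {sorted(unknown)}")
        return StepInstance(template=self, label=label, settings=dict(settings))


@dataclass
class StepInstance:
    template: StepTemplate
    label: str
    settings: dict = field(default_factory=dict)


# One template, reused twice with different configurations.
filter_template = StepTemplate("Filter", frozenset({"field_name", "operator", "value"}))
hot = filter_template.instantiate("hot readings", field_name="temperature", operator=">", value=30)
cold = filter_template.instantiate("cold readings", field_name="temperature", operator="<", value=5)
```

Both instances share the same template object but carry their own settings, which mirrors how one template can back several differently configured steps.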

Example Templates

Common workflow step templates include:

Template	Description
Pull data feed	Pulls data via a configured connector into the platform as a feed
Push data feed upstream	Pushes a data feed upstream via a configured connector
Pull and push data	Pulls data from a source system and immediately pushes it to a target system without storing it on the platform
Filter	Filters datapoints and creates a new stream from them
Datapoint aggregation	Consumes datapoints and aggregates them
Deduplication	Takes data from a source bucket and adds it to a target bucket while ensuring no duplicate records
Lookup	Retrieves data from an external system using a configured connector
YuzeScript	Executes a YuzeScript expression for custom transformations
Execute Connector	Executes a connector operation as fire-and-forget (useful for one-time queries or scheduled operations)
Import master data	Imports data from a connector and stores results as master data
Export master data	Exports master data to a connector
Map Master data items	Maps master data items between different systems by adding mapping identifiers from external systems
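To make the Deduplication template's description more concrete, the sketch below copies records from a source bucket to a target bucket and skips any record whose key fields already exist in the target. It is a simplified illustration, not the platform's implementation: buckets are modeled as plain Python lists, and the function name `deduplicate` and the `key_fields` parameter are assumptions.

```python
def deduplicate(source: list, target: list, key_fields: list) -> int:
    """Append records from source to target, skipping duplicates.

    Two records count as duplicates when all key_fields have equal values.
    Returns the number of records actually added.
    """
    # Keys already present in the target bucket.
    seen = {tuple(record[f] for f in key_fields) for record in target}
    added = 0
    for record in source:
        key = tuple(record[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            target.append(record)
            added += 1
    return added


source_bucket = [
    {"order_id": 1, "status": "new"},
    {"order_id": 2, "status": "new"},
    {"order_id": 1, "status": "new"},  # duplicate inside the source
]
target_bucket = [{"order_id": 2, "status": "new"}]  # already stored earlier

added_count = deduplicate(source_bucket, target_bucket, key_fields=["order_id"])
```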

Triggers

Workflow steps can be triggered in three ways:

Trigger	Description
Schedule	Runs automatically at specified intervals (e.g., every hour, daily at midnight)
Data Feed	Triggered when new datapoints arrive in a consumed feed
Parent workflow step	Triggered by another workflow step, enabling workflow chaining

Schedule Trigger

Schedule-based triggers run a workflow step at regular intervals. You can configure schedules such as:

Every 15 minutes
Every hour
Daily at a specific time
Weekly on specific days
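The schedule types listed above differ mainly in how the next run time is derived. The following sketch computes that time for interval, daily, and weekly schedules using Python's standard library. The function names are hypothetical, and the format in which YuzeData actually stores schedules is not described here, so treat this purely as an illustration of the semantics.

```python
from datetime import datetime, time, timedelta


def next_interval_run(last_run: datetime, every: timedelta) -> datetime:
    """Interval schedules (every 15 minutes, every hour)."""
    return last_run + every


def next_daily_run(now: datetime, at: time) -> datetime:
    """Daily schedules: today at the given time, or tomorrow if already past."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_weekly_run(now: datetime, weekdays: set, at: time) -> datetime:
    """Weekly schedules: weekdays uses Monday=0 ... Sunday=6."""
    for offset in range(8):  # today plus the following seven days
        candidate = datetime.combine(now.date() + timedelta(days=offset), at)
        if candidate > now and candidate.weekday() in weekdays:
            return candidate
    raise ValueError("weekdays must contain at least one value in 0..6")
```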

Data Feed Trigger

When a workflow step is configured to consume a datapoint feed, it can be set to trigger automatically when new datapoints arrive. This creates a reactive pipeline where data flows through workflow steps as it becomes available.

Parent Workflow Step Trigger

A workflow step can be configured to run after another workflow step completes. This enables chaining multiple steps together, where the output of one step feeds into the next.
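As a rough illustration of parent-based triggering, the sketch below runs a step and then every step that names it as its parent. The `CHILDREN` mapping, the step names, and `run_with_children` are all hypothetical; the real platform handles this dispatch internally.

```python
# Hypothetical parent -> children mapping; step names are illustrative only.
CHILDREN = {
    "pull-orders": ["filter-orders"],
    "filter-orders": ["aggregate-orders", "export-orders"],
}


def run_with_children(step: str, executed: list) -> None:
    """Run a step, then trigger every step configured with it as parent."""
    executed.append(step)  # stand-in for the step's real work
    for child in CHILDREN.get(step, []):
        run_with_children(child, executed)


order = []
run_with_children("pull-orders", order)
```

Here `order` records the sequence in which the steps ran, starting with the parent and followed by each triggered child.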

Consuming and Producing Data

Workflow steps process data by consuming input and producing output. This is how data flows through your workflows.

Consuming Data

A workflow step can consume data from several sources:

Source	Description
Datapoint Feed	Reads datapoints from a bucket with a specific schema
Connector	Fetches data directly from an external system via a connector operation
Master Data	Reads reference data items for enrichment or processing
Nothing	The workflow step does not consume any data

When consuming from a datapoint feed, you configure:

Schema: Which schema's datapoints to subscribe to
Bucket: Which datapoint bucket to read from
Batch Size: How many datapoints to process per run

When consuming from a connector, you can apply schema mappings to transform the connector's input and output data to match your workflow's schemas.
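The effect of the batch size setting on a datapoint feed can be sketched as follows: each run takes at most that many pending datapoints, so a backlog is drained over several runs. The `FeedConsumption` class, the `take_batch` function, and the schema and bucket names are assumptions for illustration, with the bucket modeled as an in-memory queue.

```python
from collections import deque
from dataclasses import dataclass


# Hypothetical container for the three feed settings described above.
@dataclass(frozen=True)
class FeedConsumption:
    schema: str
    bucket: str
    batch_size: int


def take_batch(pending: deque, config: FeedConsumption) -> list:
    """Remove and return at most batch_size datapoints for one run."""
    batch = []
    while pending and len(batch) < config.batch_size:
        batch.append(pending.popleft())
    return batch


config = FeedConsumption(schema="temperature-reading", bucket="metrics", batch_size=2)
pending = deque({"celsius": c} for c in (18, 21, 25, 30, 34))

# Each loop iteration stands in for one run of the workflow step.
runs = []
while pending:
    runs.append(take_batch(pending, config))
```

With five pending datapoints and a batch size of 2, the backlog is processed in three runs.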

Producing Data

A workflow step can either produce datapoints or produce nothing:

Strategy	Description
Datapoint Feed	Writes datapoints to a bucket with a specific schema
Nothing	The workflow step does not produce output (e.g., when writing directly to an external system)

When producing to a datapoint feed, you configure:

Schema: The structure of the output data
Bucket: Which datapoint bucket to write to

Chaining Workflow Steps

Consuming and producing data enables chaining: connecting workflow steps so that data flows through multiple processing stages.

[Figure: Workflow step chaining]

Workflow step A produces datapoints to bucket "metrics"
Workflow step B consumes from bucket "metrics"
Workflow step B processes the data and produces to bucket "alerts"
Workflow step C consumes from bucket "alerts" and sends notifications

This pattern allows you to build complex data pipelines from simple, focused steps.
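The A, B, C chain described above can be mimicked in a few lines of Python. In this sketch the buckets are an in-memory dictionary of lists, and the step functions, the `cpu` field, and the threshold are invented for the example; it only illustrates how one step's output bucket becomes the next step's input.

```python
from collections import defaultdict

# In-memory stand-ins for datapoint buckets and a notification channel.
buckets = defaultdict(list)
notifications = []


def step_a() -> None:
    """Produce datapoints to bucket "metrics"."""
    buckets["metrics"].extend([{"cpu": 0.42}, {"cpu": 0.97}])


def step_b(threshold: float = 0.9) -> None:
    """Consume from "metrics" and produce alerts to bucket "alerts"."""
    consumed, buckets["metrics"] = buckets["metrics"], []
    for datapoint in consumed:
        if datapoint["cpu"] > threshold:
            buckets["alerts"].append({"message": f"CPU at {datapoint['cpu']:.0%}"})


def step_c() -> None:
    """Consume from "alerts" and send notifications (produces nothing)."""
    consumed, buckets["alerts"] = buckets["alerts"], []
    notifications.extend(datapoint["message"] for datapoint in consumed)


for step in (step_a, step_b, step_c):
    step()
```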

Settings

Each workflow step instance has settings that control its behavior. Settings are defined by the template and configured when you create an instance.

Template-Defined Settings

The workflow step template defines which settings are available. Common settings include:

Setting Type	Examples
Filter conditions	Field comparisons, AND/OR logic
Aggregation rules	Group by fields, aggregation operations
Deduplication fields	Which fields to use for duplicate detection
Script expressions	YuzeScript code for custom transformations
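As an example of what a filter condition with field comparisons and AND/OR logic might look like, the sketch below evaluates a nested condition against a datapoint. The dictionary layout (`all`, `any`, `field`, `op`, `value`) is an assumption made for this illustration and does not reflect how YuzeData stores filter settings.

```python
import operator

# Supported comparison operators for this sketch.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def matches(datapoint: dict, condition: dict) -> bool:
    """Evaluate a (possibly nested) condition against one datapoint."""
    if "all" in condition:  # AND: every sub-condition must hold
        return all(matches(datapoint, c) for c in condition["all"])
    if "any" in condition:  # OR: at least one sub-condition must hold
        return any(matches(datapoint, c) for c in condition["any"])
    compare = OPS[condition["op"]]
    return compare(datapoint[condition["field"]], condition["value"])


# status == "active" AND (temperature > 30 OR humidity > 80)
condition = {
    "all": [
        {"field": "status", "op": "==", "value": "active"},
        {
            "any": [
                {"field": "temperature", "op": ">", "value": 30},
                {"field": "humidity", "op": ">", "value": 80},
            ]
        },
    ]
}
```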

Connector Settings

When a workflow step uses a connector, you configure which connector instance and operation to use. This links the workflow step to your deployed connectors.

Capacity Settings

For workflow steps that process large amounts of data or require more resources, you can configure capacity settings to control how the workflow step executes.

Execution Preference

Mode	Description
In-Process	Runs within the standard processing infrastructure (default)
Out-of-Process	Runs in a dedicated container with configurable resources

Compute Specifications

When using out-of-process execution, you can select a compute specification:

Specification	CPU	Memory
Small	0.25 cores	0.5 GB
Medium	1 core	2 GB
Large	2 cores	4 GB
Extra Large	4 cores	8 GB

Use higher capacity settings for workflow steps that:

Process large batches of datapoints
Perform complex transformations
Call external systems with slow response times
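If you have a rough estimate of a step's resource needs, choosing a specification amounts to picking the smallest row of the table above that covers them. The sketch below encodes the table and performs that lookup; the `ComputeSpec` class and the `smallest_fit` helper are hypothetical and only illustrate the reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeSpec:
    name: str
    cpu_cores: float
    memory_gb: float


# Values copied from the compute specification table, smallest first.
SPECS = (
    ComputeSpec("Small", 0.25, 0.5),
    ComputeSpec("Medium", 1, 2),
    ComputeSpec("Large", 2, 4),
    ComputeSpec("Extra Large", 4, 8),
)


def smallest_fit(cpu_cores: float, memory_gb: float) -> ComputeSpec:
    """Return the smallest listed spec that meets both requirements."""
    for spec in SPECS:
        if spec.cpu_cores >= cpu_cores and spec.memory_gb >= memory_gb:
            return spec
    raise ValueError("no listed specification is large enough")
```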
