Pentaho Data Integration
Jobs

A process flow consisting of one or more entries that execute tasks such as transformations, scripts, email notifications, and file transfers.



In most ETL projects you need to perform maintenance tasks, orchestrate the execution of transformations, and handle errors and retries. These tasks are handled by jobs. A job consists of one or more job entries that are executed in a specific order; that order is determined by the job hops between the entries and by the results of the entries themselves. Job entries differ from transformation steps in several ways:

  • You can create shadow copies of a job entry. This allows you to place the same job entry at multiple locations in a job.

  • Job entries pass a result object between each other. Once a job entry has completed, all result rows are handed over at once rather than in a streaming fashion.

  • Job entries are executed in sequence (unless they are set to run in parallel).

Besides the execution order, a hop also specifies the condition under which the next job entry is executed. You can set the evaluation mode by right-clicking the job hop. A job hop is simply a flow of control: hops link job entries and, based on the result of the previous job entry, determine what happens next.
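The same pattern is visible when a job is run from code. Below is a minimal sketch using the PDI (Kettle) Java API, assuming the Kettle engine libraries are on the classpath; the .kjb path is a placeholder. The Result it prints at the end is the same kind of object each job entry hands to the next one, and it is what hop conditions evaluate.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.Result;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobMeta;

public class RunJob {
    public static void main(String[] args) throws Exception {
        // Initialise the Kettle environment (plugins, kettle.properties).
        KettleEnvironment.init();

        // Load the job definition from a .kjb file (path is a placeholder).
        JobMeta jobMeta = new JobMeta("/path/to/my_job.kjb", null);

        // Create and start the job, then wait for it to finish.
        Job job = new Job(null, jobMeta);
        job.start();
        job.waitUntilFinished();

        // The final Result carries the success flag and error count that
        // hop conditions evaluate between entries.
        Result result = job.getResult();
        System.out.println("Success: " + result.getResult()
                + ", errors: " + result.getNrErrors());
    }
}
```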

Workshops

Start off with an overview of the components that define a Job.

Create a Job that executes the 'hello world' transformation.

  • START

  • Job Entry

In Pentaho, a job is a sequence of steps that are executed in a specific order. A job can contain one or more transformations, which are executed sequentially or in parallel.
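For illustration, this is roughly what a job's Transformation entry does under the hood: it loads a .ktr file and executes it. The sketch below uses the Kettle Java API; the file name stands in for the 'hello world' transformation built in the earlier workshop.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunHelloWorldTrans {
    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();

        // Load the 'hello world' transformation (path is a placeholder).
        TransMeta transMeta = new TransMeta("/path/to/hello_world.ktr");

        // Execute it and wait for completion, as the job entry would.
        Trans trans = new Trans(transMeta);
        trans.execute(null);          // null = no command-line arguments
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            throw new RuntimeException("hello_world.ktr finished with errors");
        }
    }
}
```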

Backward chaining is a technique in which the execution of a transformation depends on the successful execution of another transformation. In other words, transformations are executed in reverse order of their dependencies.
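The dependency can be pictured as plain conditional execution: run the prerequisite first and only continue if it succeeded, which is what a job hop set to follow the success evaluation does. A sketch against the Kettle Java API; both .ktr paths are hypothetical.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class ChainedTransformations {

    // Run a single .ktr and report whether it finished without errors.
    static boolean runTrans(String path) throws Exception {
        Trans trans = new Trans(new TransMeta(path));
        trans.execute(null);
        trans.waitUntilFinished();
        return trans.getErrors() == 0;
    }

    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();

        // The dependent transformation only runs if its prerequisite
        // succeeded (paths are placeholders for this example).
        if (runTrans("/path/to/load_staging.ktr")) {
            runTrans("/path/to/load_warehouse.ktr");
        } else {
            System.err.println("Prerequisite failed; dependent step skipped.");
        }
    }
}
```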

Running Pentaho jobs in parallel can help improve performance and efficiency in data integration and ETL processes.
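As a rough illustration, two independent jobs can be launched concurrently from the Kettle Java API and then joined; the .kjb file names here are placeholders.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobMeta;

public class ParallelJobs {
    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();

        // start() returns immediately and each job runs in its own thread,
        // so the two jobs execute concurrently.
        Job jobA = new Job(null, new JobMeta("/path/to/load_customers.kjb", null));
        Job jobB = new Job(null, new JobMeta("/path/to/load_orders.kjb", null));

        jobA.start();
        jobB.start();

        jobA.waitUntilFinished();
        jobB.waitUntilFinished();

        System.out.println("Customers errors: " + jobA.getResult().getNrErrors());
        System.out.println("Orders errors:    " + jobB.getResult().getNrErrors());
    }
}
```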

  • Job - Hello World
  • Backward Chaining
  • Parallel