Monitoring & Scheduling

Schedule Pentaho Jobs / Transformations and Monitor results ..


In this guided demonstration, you will:

  • Configure a Repository connection.

  • Monitor a Job / Transformation.

  • Schedule a Job / Transformation.

One way to monitor Pentaho Transformations / Jobs is to use the PDI Status page, which shows the details of remotely executed and scheduled transformations, such as the date and time they were run, their status, and their results.

To access the PDI Status page, navigate to the /pentaho/kettle/status page on your Pentaho Server, changing the host name and port to match your configuration.
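
If you prefer to check the status page from a script rather than a browser, a minimal sketch along the following lines can fetch it over HTTP. It assumes the local URL and default admin / password credentials used in this lab, that the Python requests library is installed, and that the status servlet accepts an xml=Y parameter for machine-readable output - adjust host, port, and credentials to match your configuration.

```python
# Minimal sketch: fetch the PDI Status page over HTTP.
# Assumptions: local Pentaho Server on port 8080, default admin / password
# credentials, and an xml=Y parameter for machine-readable output.
import requests

STATUS_URL = "http://localhost:8080/pentaho/kettle/status"

resp = requests.get(STATUS_URL, params={"xml": "Y"}, auth=("admin", "password"))
resp.raise_for_status()

# Print the raw server status XML; a browser would render the HTML view instead.
print(resp.text)
```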

Another way to monitor a Pentaho transformation is to enable logging and step performance monitoring in the PDI client. Logging provides you with summarized and detailed information about a transformation, such as the number of records inserted, the total elapsed time, and any errors or exceptions.

Step performance monitoring allows you to see how each step in your transformation is performing in terms of speed, memory usage, and input/output rates.

Monitor

Now that you have executed the transformation against the Pentaho Server Kettle engine, you can remotely log into the service to monitor the tasks.

  1. Click on the following URL: http://localhost:8080/pentaho/kettle/status

  2. Log into the service.

     Username: admin
     Password: password

From here you can perform a number of operations (a scripted equivalent is sketched after this list):

• RUN the Transformation / Job.

• Stop the running Transformation / Job.

• View Transformation / Job details.

• Remove Transformation / Job.
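
The operations listed above can also be driven from a script, assuming the Carte-style servlets (transStatus, startTrans, stopTrans, removeTrans) are exposed under /pentaho/kettle on your server - the servlet names and parameters used below are assumptions to verify against your installation.

```python
# Sketch: drive the status-page operations over HTTP.
# The servlet names (transStatus, startTrans, stopTrans, removeTrans) and the
# name / xml parameters are assumptions -- confirm them on your server.
import requests

BASE = "http://localhost:8080/pentaho/kettle"
AUTH = ("admin", "password")
TRANS = "tr_hello_world"

def kettle(servlet: str) -> str:
    """Call a kettle servlet for the named transformation and return its reply."""
    resp = requests.get(f"{BASE}/{servlet}/", params={"name": TRANS, "xml": "Y"}, auth=AUTH)
    resp.raise_for_status()
    return resp.text

print(kettle("transStatus"))   # view Transformation details
# kettle("startTrans")         # RUN the Transformation
# kettle("stopTrans")          # stop the running Transformation
# kettle("removeTrans")        # remove the Transformation from the list
```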

For monitoring remote servers, log in with the following URL format (a polling sketch follows this list):

• http://[IP address or FQDN]:[Port]/kettle/status

• Default Username / Password: cluster/cluster
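
For a scripted check of a remote node, a sketch such as the following polls the status URL with the default cluster credentials and lists the transformations the node reports. The host name is hypothetical, and the status XML layout (transstatus elements with transname and status_desc fields) is an assumption - inspect the ?xml=Y output on your own server first.

```python
# Sketch: poll a remote kettle/Carte node and list the transformations it reports.
# The host name is hypothetical; the XML element names are assumptions based on
# typical kettle status output -- inspect the ?xml=Y response yourself first.
import requests
import xml.etree.ElementTree as ET

HOST, PORT = "carte-node.example.com", 8081   # hypothetical remote node
AUTH = ("cluster", "cluster")                 # default credentials noted above

resp = requests.get(f"http://{HOST}:{PORT}/kettle/status/", params={"xml": "Y"}, auth=AUTH)
resp.raise_for_status()

root = ET.fromstring(resp.text)
for trans in root.iter("transstatus"):
    print(trans.findtext("transname"), "-", trans.findtext("status_desc"))
```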

If you have a transformation or a job stored in the Pentaho Repository, you can use the Schedule perspective in the PDI client to create and manage schedules. You can specify the start and end date and time, the repeat frequency, the log level, and the safe mode for the transformation or job.

You can also edit, delete, enable, disable, or stop the schedules from the Schedule perspective.
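
Schedules created this way are held by the Pentaho server, so they can also be inspected outside the PDI client. The sketch below lists them over HTTP, assuming the server exposes a scheduler REST resource under /pentaho/api/scheduler - the endpoint path and response format are assumptions to check against your server's API documentation.

```python
# Sketch: list the schedules held by the Pentaho server over its REST API.
# The /pentaho/api/scheduler/jobs path and JSON response are assumptions --
# verify them against your server's API documentation before relying on this.
import requests

BASE = "http://localhost:8080/pentaho/api/scheduler"
AUTH = ("admin", "password")

resp = requests.get(f"{BASE}/jobs", auth=AUTH, headers={"Accept": "application/json"})
resp.raise_for_status()
print(resp.text)   # scheduled jobs known to the server
```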

Ensure that the Pentaho server is up and running ..

To add a connection to the Pentaho Repository:

  1. Click on the Connect button (top right on canvas).

  2. Click Add and enter the following details:

  3. Save & Close.

To connect to the Repository:

  1. Click on the Connect button.

  2. Select the connection (Pentaho).

  3. Enter credentials:

     Username: admin
     Password: password

  4. Click on Login.

Let's upload tr_hello_world.ktr and RUN it.

As you're connected to the 'Pentaho' Repository, you will need to browse for the transformation locally.

  1. Select File -> Import from an XML file ..

  2. Browse to:

     /home/pentaho/Workshop--Data-Integration/Labs/Module 5 - Enterprise Solution/Topic 4 - Monitor

  3. Select tr_hello_world.ktr & Open.

     Change the File type to *.ktr.

  4. Click Save.

  5. Create a Public / Demo folder.

  6. Enter demo and Save.

Let's now RUN the transformation on the Pentaho server.

  1. Log into the Pentaho User Console at http://localhost:8080/pentaho/Login

     Username: admin
     Password: password

  2. Select Browse Files -> Public -> Demo.

  3. Highlight tr_hello_world and, under File Actions, click Open.

  4. Close the window.

This indicates that the transformation has been successfully executed.

Ensure that the Pentaho server is up and running and that you have connected to the Pentaho Repository - see the previous Monitor section.

Transformations / Jobs need to be uploaded into the Pentaho Repository.

  1. Connect to the 'Pentaho' Repository.

  2. Open tr_hello_world.ktr.

  3. From the main menu select: Action -> Schedule

  4. Enter the following settings:

  5. Monitor the status (periodically refresh the browser, or poll it from a script as sketched after this list).

  6. To manage the Schedule, switch to the Schedule perspective.

  7. Highlight the tr_hello_world schedule.

  8. Disable the schedule and switch back to the Data Integration perspective.
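
Step 5 above suggests refreshing the browser; a small polling loop such as this one checks the transformation's status on a timer instead. It reuses the transStatus servlet path and XML field names assumed in the earlier sketches, so verify those against your server.

```python
# Sketch: poll the scheduled transformation's status instead of refreshing
# the browser. The servlet path and the status_desc field are assumptions
# carried over from the earlier sketches -- verify them on your server.
import time
import requests
import xml.etree.ElementTree as ET

URL = "http://localhost:8080/pentaho/kettle/transStatus/"
AUTH = ("admin", "password")

for _ in range(10):                      # ten checks, 30 seconds apart
    resp = requests.get(URL, params={"name": "tr_hello_world", "xml": "Y"}, auth=AUTH)
    resp.raise_for_status()
    status = ET.fromstring(resp.text).findtext("status_desc")
    print("tr_hello_world:", status)
    time.sleep(30)
```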
