Pentaho Data Integration
InstallationBusiness AnalyticsCToolsData CatalogData QualityLLMs
  • Overview
    • Pentaho Data Integration ..
  • Data Integration
    • Getting Started
      • Configuring PDI UI
      • KETTLE Variables
    • Concepts & Terminolgy
      • Hello World
      • Logging
      • Error Handling
    • Data Sources
      • Flat Files
        • Text
          • Text File Input
          • Text File Output
        • Excel
          • Excel Writer
        • XML
          • Read XML
        • JSON
          • Read JSON
      • Databases
        • CRUID
          • Database Connections
          • Create DB
          • Read DB
          • Update DB
          • Insert / Update DB
          • Delete DB
        • SCDs
          • SCDs
      • Object Stores
        • MinIO
      • SMB
      • Big Data
        • Hadoop
          • Apache Hadoop
    • Enrich Data
      • Merge
        • Merge Streams
        • Merge Rows (diff)
      • Joins
        • Cross Join
        • Merge Join
        • Database Join
        • XML Join
      • Lookups
        • Database Lookups
      • Scripting
        • Formula
        • Modified JavaScript Value
        • User Defined Java Class
    • Enterprise Solution
      • Jobs
        • Job - Hello World
        • Backward Chaining
        • Parallel
      • Parameters & Variables
        • Parameters
        • Variables
      • Scalability
        • Run Configurations
        • Partition
      • Monitoring & Scheduling
        • Monitoring & Scheduling
      • Logging
        • Logging
      • Dockmaker
        • BA & DI Servers
      • Metadata Injection
        • MDI
    • Plugins
      • Hierarchical Data Type
  • Use Cases
    • Streaming Data
      • MQTT
        • Mosquitto
        • HiveMQ
      • AMQP
        • RabbitMQ
      • Kafka
        • Kafka
    • Machine Learning
      • Prerequiste Tasks
      • AutoML
      • Credit Card
    • RESTful API
    • Jenkins
    • GenAI
  • Reference
    • Page 1
Powered by GitBook
On this page
  1. Data Integration
  2. Data Sources
  3. Databases
  4. CRUID

Update DB

PreviousRead DBNextInsert / Update DB

Last updated 1 month ago

Workshop - Update DB

The Update step first looks up a row in a table using one or more lookup keys. If it can be found and the fields to update are all the same, no action is taken, otherwise the record is updated.

In this guided Lab, you will:

  • Update the Employees Table.

1002, Murphy, Diana,x5800,dmurphy@classicmodelcars.com,1,1000,CEO

1102, Bondur, Gerard,x5408,athompson@classicmodelcars.com,4,1056,Regional Sales Manager (EMEA)


Create a new Transformation

Any one of these actions opens a new Transformation tab for you to begin designing your transformation.

  • By clicking File > New > Transformation

  • By using the CTRL-N hot key

Text file input

The Text File Input step is used to read data from a variety of different text-file types. The most commonly used formats include Comma Separated Values (CSV files) generated by spreadsheets and fixed width flat files.

The Text File Input step provides you with the ability to specify a list of files to read, or a list of directories with wild cards in the form of regular expressions. In addition, you can accept filenames from a previous step making filename handling more even more generic.

  1. Drag the Text File Input step onto the canvas.

  2. Open the Text File properties dialog box.

  3. Ensure the following details are configured, as outlined below:

Because the sample file is located in the same directory where the transformation resides, a good approach to naming the file in a way that is location independent is to use a system variable to parameterize the directory name where the file is located. In our case, the complete filename is:

${Internal.Transformation.Filename.Directory}/employees_update.txt

  1. Click on the ‘Content’ tab and configure the following properties:

  1. Click on ‘Get Fields’ button.

Click on the ‘Fields’ tab and notice the following properties:

  1. Close the step.

Update

The Update step first looks up a row in a table using one or more lookup keys. If the row matches the lookups, then it updates the record.

  1. Drag the Update step onto the canvas.

  2. Open the Update properties dialog box.

  3. Ensure the following details are configured, as outlined below:

Set the Key to lookup the values

Select EMPLOYEENUMBER from the table to equal EMPOLYEE_NUMBER from the data stream.

You can set any number of constraints to ensure unique values are look up.

Get Update fields

Ensure the Table and Stream fields are correctly mapped.

  1. Click Ok.

RUN

Database operations rely on Keys - unique identifier - to maintain referential integrity.

If no matching record is found based on the key fields, the Update step simply skips that row without performing any action. Unlike the "Insert/Update" step, it doesn't create new records when matches aren't found.

The Update step is particularly useful for data maintenance operations, synchronizing systems, or applying changes to existing database records without disturbing their structure or unaffected fields.

  1. Click the Run button in the Canvas Toolbar.

  2. Click on the Step Metrics tab.

Notice that 2 records were updated with the Update employees step.

  1. Check the records in the database.

select * from EMPLOYEES
where EMPLOYEENUMBER in ('1002','1102');
Update Employees
Set path to employess.txt
Text file input - Content
Text File input - Fields
Update fields
Step metrics
Update Employees