Pentaho Data Integration

XML Join

Join XML streams ..

Workshop - XML Join

XML Join in Pentaho Data Integration

XML Join is a specialized step in Pentaho Data Integration designed to incorporate XML content into your data stream based on values from another stream.

This step accepts two input streams: the main data stream and an XML stream. It merges them by adding the XML content as a new field in the main data stream.

The XML stream must contain well-formed XML data that will be integrated into your transformation. The main stream contains the records you want to enhance with this XML content.
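
Because the XML stream must be well-formed, it can help to validate it before it reaches the join. The following is a minimal sketch in Python, outside of PDI, of what "well-formed" means in practice. The function name `is_well_formed` is hypothetical and is not part of PDI.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a single well-formed XML document.

    Illustrative helper only; PDI performs its own parsing inside the step.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, multiple roots, etc.
        return False


if __name__ == "__main__":
    print(is_well_formed("<order><item/></order>"))  # properly nested
    print(is_well_formed("<order><item></order>"))   # <item> is never closed
```

A check like this only confirms well-formedness; it does not validate the content against a schema.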

For each row in the main stream, PDI matches it with corresponding XML content based on a specified join key. The XML content is then added as a new field to the main stream row.
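
The row-by-row matching described above can be pictured with a small Python sketch. It is a conceptual model only, not PDI's implementation: rows are modeled as dictionaries, and the names `xml_join`, `customer_id`, `xml`, and `xml_content` are assumptions chosen for illustration.

```python
def xml_join(main_rows, xml_rows, key, xml_field="xml_content"):
    """Attach XML content to each main row whose join key matches.

    main_rows: list of dicts representing the main stream.
    xml_rows:  list of dicts, each with the join key and an "xml" string.
    Rows without a match receive None in the new field.
    """
    # Index the XML stream by join key for constant-time lookup.
    xml_by_key = {row[key]: row["xml"] for row in xml_rows}

    joined = []
    for row in main_rows:
        out = dict(row)  # copy, so the input rows are left untouched
        out[xml_field] = xml_by_key.get(row[key])
        joined.append(out)
    return joined


if __name__ == "__main__":
    main_stream = [
        {"customer_id": 1, "name": "Ada"},
        {"customer_id": 2, "name": "Grace"},
    ]
    xml_stream = [
        {"customer_id": 1, "xml": "<orders><order id='10'/></orders>"},
    ]
    for row in xml_join(main_stream, xml_stream, key="customer_id"):
        print(row)
```

In this sketch an unmatched row keeps its original fields and gets an empty XML field; how the actual step treats unmatched rows should be confirmed in a preview.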

XML Join is particularly useful when dealing with web services, XML databases, or when you need to construct complex XML documents from relational data sources.
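
As an illustration of constructing a nested XML document from separate pieces, the sketch below appends XML fragments under an element of a target document. It is written in plain Python and only approximates the idea; the function name `insert_fragments` and the sample element names are hypothetical, and the path syntax is the limited subset supported by Python's `ElementTree`, not necessarily what PDI accepts.

```python
import xml.etree.ElementTree as ET


def insert_fragments(target_xml: str, fragments: list, path: str) -> str:
    """Append each XML fragment under the element found at `path`.

    target_xml: the document that receives the fragments.
    fragments:  well-formed XML strings, one element each.
    path:       ElementTree path relative to the root ("." means the root).
    """
    root = ET.fromstring(target_xml)
    parent = root if path == "." else root.find(path)
    if parent is None:
        raise ValueError(f"no element matches path {path!r}")

    for fragment in fragments:
        parent.append(ET.fromstring(fragment))

    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    target = "<customer><name>Ada</name><orders/></customer>"
    order_fragments = ['<order id="10"/>', '<order id="11"/>']
    print(insert_fragments(target, order_fragments, "orders"))
```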

The step offers options to specify the target XML field name and the join comparison field, and to set the encoding of the XML content if needed for further processing.
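
The effect of choosing an encoding can be pictured as selecting the character encoding used when the resulting XML is serialized. The Python sketch below is an analogy, not PDI code: the `serialize` function and its `omit_header` flag are assumptions introduced for illustration, and it requires Python 3.8 or later for the `xml_declaration` argument.

```python
import xml.etree.ElementTree as ET


def serialize(element, encoding: str = "UTF-8", omit_header: bool = False) -> bytes:
    """Serialize an element to bytes in the requested encoding.

    When omit_header is False, the output begins with an XML declaration
    that names the encoding, so downstream consumers can decode it.
    """
    return ET.tostring(
        element,
        encoding=encoding,
        xml_declaration=not omit_header,
    )


if __name__ == "__main__":
    element = ET.fromstring("<name>José</name>")
    print(serialize(element, "UTF-8"))
    print(serialize(element, "ISO-8859-1"))
    print(serialize(element, "UTF-8", omit_header=True))
```

Whichever encoding is chosen, the consumer of the XML field needs to decode it the same way, which is why keeping the declaration is usually the safer default.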

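When configuring XML Join, you must specify which stream provides the XML content and which one serves as the main stream. In the step dialog these roles are assigned by step name (the Source XML step and the Target XML step), so assigning each incoming stream to the correct role is critical to the step functioning properly.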



Last updated 1 month ago