JSON

JavaScript Object Notation


JSON (JavaScript Object Notation) is a lightweight data interchange format that's easy for humans to read and write, and simple for machines to parse and generate. It uses a text-based structure with key-value pairs and arrays to represent data. JSON is language-independent and widely used for transmitting data in web applications.

To extract key-value pairs from a JSON object such as the one below in Pentaho Data Integration, you would typically use the "JSON Input" step.

{
  "customer": {
    "id": 1001,
    "name": "John Doe",
    "email": "[email protected]",
    "active": true
  }
}

In the JSON Input step, you define the name, JSON Path, and data type of each field that is added to the data stream:

Name      Path                 Type
id        $.customer.id        Integer
name      $.customer.name      String
email     $.customer.email     String
active    $.customer.active    Boolean

Here's a brief explanation of the JSON Path notation used:

$ represents the root of the JSON document

.customer navigates to the "customer" object

.id, .name, .email, and .active access the respective fields within the "customer" object
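
The path notation above can be tried outside PDI to see what each expression selects. The following is a minimal Python sketch, not how the JSON Input step is implemented. `resolve_path` is a hypothetical helper written for this illustration, and it only handles simple dotted paths such as `$.customer.id` (no wildcards, filters, or array indexes).

```python
import json

# The sample document from above (email value kept exactly as shown on the page).
DOC = json.loads("""
{
  "customer": {
    "id": 1001,
    "name": "John Doe",
    "email": "[email protected]",
    "active": true
  }
}
""")


def resolve_path(document, path):
    """Resolve a simple dotted JSON Path like '$.customer.id'.

    Hypothetical helper for illustration only.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$' (the document root)")
    node = document
    # Drop the leading '$' and descend one key per '.segment'.
    for key in path[1:].split("."):
        if key:  # skip the empty segment before the first '.'
            node = node[key]
    return node


# Field name -> JSON Path, mirroring the Name and Path columns above.
fields = {
    "id": "$.customer.id",
    "name": "$.customer.name",
    "email": "$.customer.email",
    "active": "$.customer.active",
}

row = {name: resolve_path(DOC, path) for name, path in fields.items()}
print(row)
```

Each entry in `fields` plays the role of one line in the step's field list: the key is the stream field name and the value is the path used to find it.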

Workshops

Pentaho Data Integration offers several specialized steps for working with JSON data in your ETL processes.

The JSON Input step reads JSON data from files or fields, supporting complex nested structures and JSON Path expressions for precise data extraction. It handles arrays and provides options for managing missing values.
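
As a rough illustration of what handling arrays and missing values means for the resulting rows, the Python sketch below turns a JSON array into one row per element and fills absent keys with a default. The `orders` document, its field names, and the `rows_from_array` helper are all made up for this example; in PDI this behavior is configured in the step rather than coded.

```python
import json

# Hypothetical input: an array of orders where the second element lacks "amount".
text = '{"orders": [{"id": 1, "amount": 25.5}, {"id": 2}]}'
doc = json.loads(text)


def rows_from_array(items, keys, default=None):
    """Build one row (dict) per array element, using `default` for missing keys."""
    return [{key: item.get(key, default) for key in keys} for item in items]


rows = rows_from_array(doc["orders"], ["id", "amount"])
for row in rows:
    print(row)
```

Every row ends up with the same set of fields, which is what a downstream step in a data stream generally expects.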

JSON Output converts your transformation data into JSON format, with control over formatting, file output options, and the ability to create both objects and arrays.
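
To make the difference between object and array output concrete, here is a small Python sketch that serializes the same rows either as a single JSON array or as one JSON object per row. It uses only the standard `json` module and is not the JSON Output step itself; the row data is invented for illustration.

```python
import json

# Hypothetical rows, as they might arrive from earlier steps in a transformation.
rows = [
    {"id": 1001, "name": "John Doe", "active": True},
    {"id": 1002, "name": "Jane Roe", "active": False},
]

# One JSON array containing every row, pretty-printed.
as_array = json.dumps(rows, indent=2)

# One compact JSON object per row.
as_objects = [json.dumps(row) for row in rows]

print(as_array)
for line in as_objects:
    print(line)
```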

The REST Client step connects with REST APIs that typically use JSON, handling authentication, headers, and processing the returned JSON responses for further transformation.

Common workflows include API integration, JSON file processing, and complex JSON transformations, often using these steps in combination for effective data handling.

Read JSON

In this workshop, our standard customer orders file is in JSON format. We will simply onboard the required fields into our data stream.

