In this Lab, we're going to create the classic “Hello World” Transformation. The process will help you define your own workflow for building data pipelines.
• Learn how to create a new Transformation.
• Add Transformation 'Steps'.
• Connect the Steps with a 'Hop'.
• And finally, add a 'Note'.
To create a new transformation
In Spoon, use either of the following methods:
• Click File > New > Transformation.
• Press the CTRL-N hot key.
Either action opens a new Transformation tab for you to begin designing your transformation.
Generate rows outputs a specified number of rows. By default, the rows are empty; however, they can also contain several static fields. This step is used primarily for testing purposes. It may be useful for generating a fixed number of rows, for example, if you require exactly 12 rows for 12 months.
Sometimes you may use Generate Rows to generate one row that is an initiating point for your transformation.
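The behaviour described above can be modelled in a few lines of Python. This is only a conceptual sketch, not how Spoon implements the step: the function name `generate_rows` and the use of dictionaries as rows are assumptions made for illustration.

```python
from typing import Dict, Iterator, Optional


def generate_rows(limit: int,
                  fields: Optional[Dict[str, object]] = None) -> Iterator[Dict[str, object]]:
    """Yield `limit` rows; every row carries the same static fields."""
    static = dict(fields or {})   # no fields configured means empty rows
    for _ in range(limit):
        yield dict(static)        # a fresh copy per row, so rows stay independent


# Exactly 12 empty rows, for example one per month.
empty_rows = list(generate_rows(12))

# A single row with one static field, used as the starting point of a pipeline.
seed_row = list(generate_rows(1, {"message": "hello world"}))
```

The step has no input: the row count and the field values are fixed by its configuration alone, which is why it suits testing and seeding a transformation.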
To add the Generate Rows step, expand the ‘Input’ category in the Design tab, and drag the step onto the canvas.
💡Alternatively, enter ‘Generate Rows’ into the search bar.
Double-click the Generate Rows step to open its properties.
Ensure the following details are configured:
• Step name: gr_hello-world
• Limit: 10
• Name: message
• Type: string
• Value: hello world
Before we close this dialog and continue creating the transformation, let’s make certain the Step generates the data we expect.
Click the Preview button. The ‘Enter preview size’ dialog is displayed.
In the ‘Enter preview size’ dialog, click the [OK] button.
Verify that 10 rows containing the message you entered are displayed, and then click the [OK] button to close the ‘Examine preview data’ dialog.
Click the [OK] button to close the ‘Generate Rows’ dialog.
The Dummy step does not do anything with the records it receives. Its primary function is to be a placeholder for testing purposes. For example, to have a transformation, you need at least two steps connected to each other.
To add the Dummy step, expand the ‘Flow’ category in the Design tab, and drag the Dummy step onto the canvas.
This final part of creating a transformation focuses exclusively on the local execution option.
In Spoon, select Action > Run This Transformation.
Alternatively, click the Run button in the toolbar.
The Execute a transformation window appears. You can run a transformation locally, remotely, or in a clustered environment. For the purposes of this exercise, keep the default as Local Execution.
Click the Run icon and select Run Options.
In the Run Options panel you can:
• set the run configuration, i.e. the server pattern (single server or across a cluster)
• set the logging level
• save the Transformation locally.
The transformation executes.
A green tick confirms the transformation's execution, but doesn't guarantee the success of the underlying operations.
Execution Results
The Execution Results section of the window contains several different tabs that help you to see how the transformation executed, pinpoint errors, and monitor performance.
The Logging tab displays logging information for each of the steps in the transformation.
The Step Metrics tab provides statistics for each step in your transformation, including how many records were read, how many were written, how many caused an error, the processing speed (rows per second) and more. This tab also indicates whether an error occurred in a transformation step.
These metrics help identify any back pressure on the Steps. In this example the transformation took 30ms to execute. Notice that the steps gr_hello-world and Dummy are initialised at the same time. The steps are executed in parallel: each runs in its own thread, independently of the others.
The Preview tab displays the records.
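To make the parallel execution and back pressure mentioned under Step Metrics more concrete, the sketch below models the two steps of this Lab as Python threads joined by a bounded queue. It is a conceptual illustration only, not Spoon's implementation: the bounded queue standing in for the Hop, the `hop_size` parameter, the `None` end-of-stream marker and the metrics dictionaries are all assumptions of this sketch.

```python
import queue
import threading

END_OF_STREAM = None  # hypothetical marker telling the next step that no more rows follow


def generate_rows(out_hop, limit, metrics):
    """Producer step: writes `limit` identical rows to the outgoing hop."""
    for _ in range(limit):
        # put() blocks while the hop is full: this is the back pressure
        out_hop.put({"message": "hello world"})
        metrics["written"] += 1
    out_hop.put(END_OF_STREAM)


def dummy(in_hop, metrics):
    """Placeholder step: reads every row and does nothing with it."""
    while True:
        row = in_hop.get()
        if row is END_OF_STREAM:
            break
        metrics["read"] += 1


def run_transformation(limit=10, hop_size=4):
    hop = queue.Queue(maxsize=hop_size)  # bounded buffer standing in for the Hop
    gen_metrics = {"written": 0}
    dummy_metrics = {"read": 0}
    steps = [
        threading.Thread(target=generate_rows, args=(hop, limit, gen_metrics)),
        threading.Thread(target=dummy, args=(hop, dummy_metrics)),
    ]
    for step in steps:  # every step is started before any row flows
        step.start()
    for step in steps:  # the run ends when all steps have finished
        step.join()
    return gen_metrics, dummy_metrics


gen_metrics, dummy_metrics = run_transformation()
print(gen_metrics, dummy_metrics)
```

In this model, a slow downstream step lets the queue fill up, so the upstream `put()` call waits. A step that spends most of its time waiting on a full or empty hop is what the read and written counters in Step Metrics help you spot.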
Viewing the transformation structure
If you click the View icon in the upper left corner of the screen, the tree will change to show the structure of the transformation currently being edited.