SMB

File sharing ..

Workshop - SMB/CIFS

SMB Server

Pentaho Data Integration

Pentaho Data Integration uses a Virtual File System (VFS), built on Apache Commons VFS, as an abstraction layer that exposes different filesystems (local, SMB and others) through a common interface.

In PDI, you can add a VFS connection and then reference that connection whenever you want to access files or folders on your Virtual File System.

  1. (Optional) Download the latest jcifs driver.

Link to latest jcifs driver
  2. (Optional) Copy the jcifs JAR file into the Pentaho Data Integration "lib" folder.

Download CIFS driver

  1. Start Pentaho Data Integration.

cd ~/Pentaho/design-tools/data-integration
./spoon.sh
  1. Create a new Transformation.

  2. Click on the 'View' tab.

  3. Highlight 'VFS Connections' and select 'New'.

  1. Configure with the following details:

  1. Click 'Test'.

Test connection
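
The optional driver step earlier in this procedure can also be done from a terminal before Spoon is started. The following is a minimal sketch, not an official installer: the `install_jcifs` function is a hypothetical helper, and the download location and JAR file name in the example call are assumptions. Only the PDI path comes from this workshop.

```shell
# Sketch: copy a downloaded jcifs JAR into the PDI "lib" folder.
# PDI_HOME matches the path used in this workshop; override it if yours differs.
PDI_HOME="${PDI_HOME:-$HOME/Pentaho/design-tools/data-integration}"

# install_jcifs <jar-file> <lib-folder>  (hypothetical helper)
install_jcifs() {
  local jar="$1" lib_dir="$2"
  if [ ! -f "$jar" ]; then
    echo "JAR not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$lib_dir" ]; then
    echo "lib folder not found: $lib_dir" >&2
    return 1
  fi
  cp "$jar" "$lib_dir/" && echo "Copied $(basename "$jar") to $lib_dir"
}

# Example call (assumed download location and file name):
# install_jcifs "$HOME/Downloads/jcifs.jar" "$PDI_HOME/lib"
```

The function refuses to copy when either path is missing, so a mistyped folder does not silently create a stray file.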

Transformation - SMB File Retrieval

Let's create a simple Transformation to onboard data via an SMB VFS connection.

  1. Create the following transformation:

tr_SMB_File_Retrieval
  1. Double-click the Text file input step and open the File tab.

  2. Click on Browse and ensure you select:

VFS Connections > SMB > Pentaho/design-tools/data-integration/samples/transformations/files/sales_data.csv

  1. Add the path.

  1. Click the Content tab and configure the following settings:

Content
  1. Click the Fields tab and click 'Get Fields'.

Get Fields
  1. Preview the rows.

Preview rows
  1. Click OK.
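
If the preview in the steps above returns nothing, it can help to check the sample file directly on the machine that hosts the share. The sketch below is a rough command-line counterpart to 'Preview rows'. It assumes the SMB share exposes your home directory, so that the path chosen in the Browse step maps to `~/Pentaho/...`; `preview_rows` is a hypothetical helper.

```shell
# Path taken from the Browse step; the mapping to $HOME is an assumption.
SAMPLE="${SAMPLE:-$HOME/Pentaho/design-tools/data-integration/samples/transformations/files/sales_data.csv}"

# preview_rows <file> [rows]  (hypothetical helper)
# Prints the header line plus the first N data rows (default 5).
preview_rows() {
  local file="$1" rows="${2:-5}"
  if [ ! -r "$file" ]; then
    echo "Cannot read: $file" >&2
    return 1
  fi
  head -n "$((rows + 1))" "$file"
}

preview_rows "$SAMPLE" 5 || echo "Sample file not available on this machine."
```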

Add further steps to format or rename some of the fields, then write the output as a .txt file in the same directory as your Transformation.
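
Once the Transformation is saved, it can also be run without the Spoon UI using Pan, the command-line runner shipped in the same folder as `spoon.sh`. This is a sketch under assumptions: the save location of the `.ktr` file is hypothetical, `run_transformation` is an illustrative helper, and whether the VFS connection defined in Spoon is visible to Pan depends on your setup.

```shell
PDI_HOME="${PDI_HOME:-$HOME/Pentaho/design-tools/data-integration}"
# Assumed save location; the file name follows the Transformation name above.
KTR="${KTR:-$HOME/tr_SMB_File_Retrieval.ktr}"

# run_transformation <pdi-home> <ktr-file>  (hypothetical helper)
run_transformation() {
  local pdi_home="$1" ktr="$2"
  if [ ! -x "$pdi_home/pan.sh" ]; then
    echo "pan.sh not found in $pdi_home" >&2
    return 1
  fi
  if [ ! -f "$ktr" ]; then
    echo "Transformation file not found: $ktr" >&2
    return 1
  fi
  # -file selects the transformation, -level sets the logging detail.
  "$pdi_home/pan.sh" -file="$ktr" -level=Basic
}

run_transformation "$PDI_HOME" "$KTR" || echo "Transformation was not run."
```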
