What is Spoon ?
is graphical utility in Kettle suite to do these functionalities :
- design and run job / transformation
- job / transformation profiling
- set up database connection, variable/object sharing, configuration, etc
This article will serve as introductory tutorial for those who are new to Spoon and Kettle.
- Spoon in Windows can be executed using these 2 files :
- Binary : kettle.exe
- Batch script : spoon.bat
- Execute one of the two files
- Upon starting, you will be presented a dialog to choose a repository - a relasional database system repository to store job/transfformation object detail. Clik on No repository button for now to choose our filesystem as our storage.
- A tips dialog will pop up shortly. Click on close to continue.
- You will be presented a "Welcome to Pentaho Data Integration: Kettle Project" page in Welcome tag.
- Pulldown Menu : a main navigation for spoon. Comprising of all needed operations on designing, previewing and executing job / transformation.
- Toolbar : contain icons for basic file operation, printing, repository view and job / transformation operations.
- Left panel : contain configuration, steps and favorite steps used splitted in Main Tree, Core Objects dan Favorite Steps sections.
- Workspace Tabs : act as divider between your workspaces. Picture below show the tag collections of 1 welcome page, 2 transformations and 1 job.
Creating a Transformation File
Here I show you an example how to create a transformation that read a csv (comma separated value) file containing 33 provinces of Indonesia and dump it to another csv file. The propinsi.csv
sample file can be downloaded at the end of this article or directly through this link
. Save the file in C:\contoh_kettle
Now here are the steps to create a transformation handling the csv files :
- To create a transformation you can do it with several ways :
- Choose File | New | Transformation from the pulldown menu
- Click on New | Transformation on the toolbar
- With a CTRL + N shortcut key
- A new Transformation 1 workspace tab will show up, you rename it by saving our newly created transformation. Press CTRL + S and save file as c:\contoh_kettle\baca_propinsi.ktr.
- Notice that now our tab is changed to baca_propinsi.
- Transformation will consist of several steps. Now we will drop a step visually in our workspace to read the content of our C:\contoh_kettle\propinsi.csv file.
- In the left panel open up Core Objects | Input category. Here you will find several steps to read input from several file formats.
- For our need, we will use CSV file input step.
- Click on CSV file input icon and drag it into our workspace.
- Double click on the [CSV file input] step until a dialog shows up.
- In Filename section fill in c:\contoh_kettle\propinsi.csv. Left the other as they are now.
- Click on Get Fields to retrieve our known fields. Click OK and Close subsequently for another dialogs that pop up.
- You will a configuration with values shown like picture below. Click OK to return to our workspace.
- Save our transfomation file
- We can preview several records read from our step using preview facility in Spoon.
- To demonstrate this, click on [CSV file input] and click Preview icon in the toolbar then click on Quick Launch button that shows up.
- In seconds you will have Examine Preview Data dialog with a number data of records previewing in this window. Close it for now by clicking Close button.
Joining Step with a Hop
- From the left panel open Core Objects | Scripting and drag Modified Java Script Value step into your transformation workspace.
- Hold CTRL key, click on both [CSV File Input] step and [Modified Java Script Value] then right click and choose New Hop. Click OK on dialog that show up.
- We just created a hop that bridging the two steps we created before.
Transform our Data
- One of Modified Java Script Value
step functionality is to change our data using programmatically using several built in operator and functions.
If you know Java very well, you can also embedded Java code in this step. Double click on the step and type following code in the editor that appears.
- Make sure that Compatibility mode ? is unchecked
- Click on Get variables button.
- You will have a dialog look as below.
- Click OK.
- Evaluate this step by previewing data on it.
- Now we will dump our result from [Modified Java Script Value] to a text file, C:\contoh_kettle\propinsi.txt.
- Again, from the left panel open Core Objects | Output and drag Text file output type step to workspace.
- Joining [Modified Java Script Value] and [Text file output] with a hop.
- Double click on [Text file output]
- In the pop up Text file output dialog click on file tab and type C:\contoh_kettle\propinsi in Filename section.
- Still on the dialog, click on Fields tab and click Get Fields button to have 4 fields show up (Kode_Propinsi, Deskripsi, Deskripsi_lengkap, Text file output).
- Click OK.
- Now our transformation already has our goals : read a csv text file, change some value and put it into another 2 fields, and save the combining fields into a new csv text file.
- Run the transformation by click Run button on the toolbar.
- Click Launch on the dialog.
- You will be redirected to a log workspace with running steps detail information, for example how many rows that are read and written in the step. I will not going further by explaining parts of this workspace but please notice at the bottom panel where there are detailed logs output there. You see in the last lines that our transformation has been successfully executed.
- Now take a look at our C:\contoh_kettle folder, we will have 1 more file there. A newly created propinsi.txt. Open the file with your favorite text editor and see the change from the original file.
Congratulations, you have just created a simple transfomation designed and executed in Spoon. For more reading on transformation you can check on Pentaho wiki
. We will also frequently update our samples in Kettle section. So,
always stay tuned :)
If you have any question at this article, feel free to drop us a note at email@example.com
You can download all the sample files attached below.