Conventions UsedBold Text : represent new / important concept such as type of step and also representing interface components such as button, menu item, etc.
[Bold in Bracket] : represent step name
What is Spoon ?
Spoon is graphical utility in Kettle suite to do these functionalities :
- design and run job / transformation
- job / transformation profiling
- set up database connection, variable/object sharing, configuration, etc
This article will serve as introductory tutorial for those who are new to Spoon and Kettle.
Running Spoon
- Spoon in Windows can be executed using these 2 files :
- Binary : kettle.exe
- Batch script : spoon.bat
- Execute one of the two files
- Upon starting, you will be presented a dialog to choose a repository - a relasional database system repository to store job/transfformation object detail. Clik on No repository button for now to choose our filesystem as our storage.
- A tips dialog will pop up shortly. Click on close to continue.
- You will be presented a "Welcome to Pentaho Data Integration: Kettle Project" page in Welcome tag.
- Done
Spoon Interface
- Pulldown Menu : a main navigation for spoon. Comprising of all needed operations on designing, previewing and executing job / transformation.
- Toolbar : contain icons for basic file operation, printing, repository view and job / transformation operations.
- Left panel : contain configuration, steps and favorite steps used splitted in Main Tree, Core Objects dan Favorite Steps sections.
- Workspace Tabs : act as divider between your workspaces. Picture below show the tag collections of 1 welcome page, 2 transformations and 1 job.
Creating a Transformation File
Here I show you an example how to create a transformation that read a csv (comma separated value) file containing 33 provinces of Indonesia and dump it to another csv file. The propinsi.csv sample file can be downloaded at the end of this article or directly through this link. Save the file in C:\contoh_kettle folder.
propinsi.csvKode_Propinsi,Deskripsi P01,Bali P02,Bengkulu P03,Banten P04,Gorontalo P05,Irian Jaya Barat P06,Papua P07,Jambi P08,Jawa Barat P09,Jawa Tengah P10,Jawa Timur P11,Kalimantan Barat P12,Kalimantan Tengah P13,Kalimantan Timur P14,Kalimantan Selatan P15,Kepulauan Bangka Belitung P16,Kepulauan Riau P17,Lampung P18,Maluku P19,Maluku Utara P20,Nusa Tenggara Barat P21,Nusa Tenggara Timur P22,Riau P23,Sulawesi Barat P24,Sulawesi Tengah P25,Sulawesi Tenggara P26,Sulawesi Selatan P27,Sulawesi Utara P28,Sumatra Barat P29,Sumatra Selatan P30,Sumatra Utara P31,DI Yogyakarta P32,DKI Jakarta P33,Nanggroe Aceh Darussalam
Now here are the steps to create a transformation handling the csv files :
- To create a transformation you can do it with several ways :
- Choose File | New | Transformation from the pulldown menu
- Click on New | Transformation on the toolbar
- With a CTRL + N shortcut key
- A new Transformation 1 workspace tab will show up, you rename it by saving our newly created transformation. Press CTRL + S and save file as c:\contoh_kettle\baca_propinsi.ktr.
- Notice that now our tab is changed to baca_propinsi.
Transformation Steps
- Transformation will consist of several steps. Now we will drop a step visually in our workspace to read the content of our C:\contoh_kettle\propinsi.csv file.
- In the left panel open up Core Objects | Input category. Here you will find several steps to read input from several file formats.
- For our need, we will use CSV file input step.
- Click on CSV file input icon and drag it into our workspace.
- Double click on the [CSV file input] step until a dialog shows up.
- In Filename section fill in c:\contoh_kettle\propinsi.csv. Left the other as they are now.
- Click on Get Fields to retrieve our known fields. Click OK and Close subsequently for another dialogs that pop up.
- You will a configuration with values shown like picture below. Click OK to return to our workspace.
- Save our transfomation file
Data Preview
- We can preview several records read from our step using preview facility in Spoon.
- To demonstrate this, click on [CSV file input] and click Preview icon in the toolbar then click on Quick Launch button that shows up.
- In seconds you will have Examine Preview Data dialog with a number data of records previewing in this window. Close it for now by clicking Close button.
Joining Step with a Hop
- From the left panel open Core Objects | Scripting and drag Modified Java Script Value step into your transformation workspace.
- Hold CTRL key, click on both [CSV File Input] step and [Modified Java Script Value] then right click and choose New Hop. Click OK on dialog that show up.
- We just created a hop that bridging the two steps we created before.
Transform our Data
- One of Modified Java Script Value
step functionality is to change our data using programmatically using several built in operator and functions.
If you know Java very well, you can also embedded Java code in this step. Double click on the step and type following code in the editor that appears.
var Deskripsi_lengkap = "Propinsi " + Deskripsi; var No_urut = getProcessCount("r");
- Make sure that Compatibility mode ? is unchecked
- Click on Get variables button.
- You will have a dialog look as below.
- Click OK.
- Evaluate this step by previewing data on it.
Output Step
- Now we will dump our result from [Modified Java Script Value] to a text file, C:\contoh_kettle\propinsi.txt.
- Again, from the left panel open Core Objects | Output and drag Text file output type step to workspace.
- Joining [Modified Java Script Value] and [Text file output] with a hop.
- Double click on [Text file output]
- In the pop up Text file output dialog click on file tab and type C:\contoh_kettle\propinsi in Filename section.
- Still on the dialog, click on Fields tab and click Get Fields button to have 4 fields show up (Kode_Propinsi, Deskripsi, Deskripsi_lengkap, Text file output).
- Click OK.
Running Transformation
- Now our transformation already has our goals : read a csv text file, change some value and put it into another 2 fields, and save the combining fields into a new csv text file.
- Run the transformation by click Run button on the toolbar.
- Click Launch on the dialog.
- You will be redirected to a log workspace with running steps detail information, for example how many rows that are read and written in the step. I will not going further by explaining parts of this workspace but please notice at the bottom panel where there are detailed logs output there. You see in the last lines that our transformation has been successfully executed.
Execution logs2008/08/24 18:31:22 - baca_propinsi - Dispatching started for transformation [baca_propinsi] 2008/08/24 18:31:23 - Spoon - The transformation has finished!! 2008/08/24 18:52:54 - Spoon - Launching transformation [baca_propinsi]... 2008/08/24 18:52:54 - Spoon - Started the transformation execution. 2008/08/24 18:52:54 - baca_propinsi - Dispatching started for transformation [baca_propinsi] 2008/08/24 18:52:55 - Spoon - The transformation has finished!!
- Now take a look at our C:\contoh_kettle folder, we will have 1 more file there. A newly created propinsi.txt. Open the file with your favorite text editor and see the change from the original file.
propinsi.txtKode_Propinsi;Deskripsi;Deskripsi_lengkap;No_urut P01;Bali ;Propinsi Bali;00000000000001.00 P02;Bengkulu ;Propinsi Bengkulu;00000000000002.00 P03;Banten ;Propinsi Banten;00000000000003.00 P04;Gorontalo ;Propinsi Gorontalo;00000000000004.00 P05;Irian Jaya Barat ;Propinsi Irian Jaya Barat;00000000000005.00 P06;Papua ;Propinsi Papua;00000000000006.00 P07;Jambi ;Propinsi Jambi;00000000000007.00 P08;Jawa Barat ;Propinsi Jawa Barat;00000000000008.00 P09;Jawa Tengah ;Propinsi Jawa Tengah;00000000000009.00 P10;Jawa Timur ;Propinsi Jawa Timur;00000000000010.00 P11;Kalimantan Barat ;Propinsi Kalimantan Barat;00000000000011.00 P12;Kalimantan Tengah ;Propinsi Kalimantan Tengah;00000000000012.00 P13;Kalimantan Timur ;Propinsi Kalimantan Timur;00000000000013.00 P14;Kalimantan Selatan ;Propinsi Kalimantan Selatan;00000000000014.00 P15;Kepulauan Bangka Belitung;Propinsi Kepulauan Bangka Belitung;00000000000015.00 P16;Kepulauan Riau ;Propinsi Kepulauan Riau;00000000000016.00 P17;Lampung ;Propinsi Lampung;00000000000017.00 P18;Maluku ;Propinsi Maluku;00000000000018.00 P19;Maluku Utara ;Propinsi Maluku Utara;00000000000019.00 P20;Nusa Tenggara Barat ;Propinsi Nusa Tenggara Barat;00000000000020.00 P21;Nusa Tenggara Timur ;Propinsi Nusa Tenggara Timur;00000000000021.00 P22;Riau ;Propinsi Riau;00000000000022.00 P23;Sulawesi Barat ;Propinsi Sulawesi Barat;00000000000023.00 P24;Sulawesi Tengah ;Propinsi Sulawesi Tengah;00000000000024.00 P25;Sulawesi Tenggara ;Propinsi Sulawesi Tenggara;00000000000025.00 P26;Sulawesi Selatan ;Propinsi Sulawesi Selatan;00000000000026.00 P27;Sulawesi Utara ;Propinsi Sulawesi Utara;00000000000027.00 P28;Sumatra Barat ;Propinsi Sumatra Barat;00000000000028.00 P29;Sumatra Selatan ;Propinsi Sumatra Selatan;00000000000029.00 P30;Sumatra Utara ;Propinsi Sumatra Utara;00000000000030.00 P31;DI Yogyakarta ;Propinsi DI Yogyakarta;00000000000031.00 P32;DKI Jakarta ;Propinsi DKI Jakarta;00000000000032.00 P33;Nanggroe Aceh Darussalam ;Propinsi Nanggroe Aceh Darussalam;00000000000033.00
- Done
Conclusion
Congratulations, you have just created a simple transfomation designed and executed in Spoon. For more reading on transformation you can check on Pentaho wiki. We will also frequently update our samples in Kettle section. So , always stay tuned :)
If you have any question at this article, feel free to drop us a note at info@phi-integration.com.
You can download all the sample files attached below.
|
 Updating...
Feris Thia, Sep 1, 2008, 5:52 AM
Feris Thia, Sep 1, 2008, 5:52 AM
Feris Thia, Sep 1, 2008, 5:52 AM
|