Kettle‎ > ‎

Time Difference Calculation Based on Text Columns


Introduction


This article try to answer problem based on question from Pentaho forum - http://forums.pentaho.org/showthread.php?t=67206.

In this thread, the poster have incoming rows which have text based start time and end time information columns. And then we need a third column which is a calculation of the difference between the two columns. All columns will be in hh:mm:ss format.

Example Result

Input Meta Data Scheme

 Field Name
 Description  Data Type
 emp_id  Employee ID
 Integer / int(10)
 str_time_in
 Absence time in - text format
 Text / varchar(20)
 str_time_out
 Absense time out - text format
 Text / varchar(30)

Output Meta Data Scheme

 Field Name
 Description  Data Type
 emp_id  Employee ID
 Integer / int(10)
 str_time_in
 Absence time in - text format
 Date / hh:mm:ss
 str_time_out
 Absense time out - text format
 Date / hh:mm:ss
 delta  Difference between str_time_in & str_time_out
 Date / hh:mm:ss


Proposed Solution

Here I give a ktr sample which data is generated from a "Generate Rows" step.



  • Generate a row sample based on the input meta data scheme above using "Generate Rows" step


  • Change the meta data of str_time_in & str_time_out columns to date data type using "Select Values" step



  • Doing the calculations and formatting needed in "Modified Java Script Value" step



  • Change the resulting column data type to date again using "Select Values" step. Note that I also rename - which is not necessary - the resulting column from deltaDateStr to deltaDate.


  • Done


You can download the ktr transformation file at the end of this article

Drawbacks

  • This input scheme has a better calculation if it is based on complete date/time type rather then only time information, for some employee who has a night shift which absence is start from near midnight - for example, arrive in 11:00 pm and going home at 7:00 am - the delta calculation will give a negative value. In that case, we need to have another workaround in our Javascript step.
  • If there is an invalid character that violate the time format - this can be happen since it is text based which can accept any characters - the overall flow will stopped. You need to have an error handling in our changing meta data step which is still not available in current version.

Resources

  • answer_convert_string_to_date.ktr : sample .ktr file, can be downloaded at the attachments list below.
ċ
answer_convert_string_to_date.ktr
(16k)
Feris Thia,
Jan 10, 2009, 11:36 PM