How to convert integer/seconds to HH:MM:SS format in a Spark Dataframe using spark SQL

Ioudom Foubi Jephte
1 min readMar 22, 2021

--

At the time when this solution was developed as part of an ETL pipeline on Spark in Databricks, Spark did not provide a time data type that enables the conversion of integers directly into a time format (hh:mm:ss). The solution below was made to enable anyone with integer values to transform them in a time format.

Disclaimer: The solution provided below is in no way the only method someone can use to convert seconds to hh:mm:ss in a Spark dataframe. This solution was implemented in azure databricks on a Spark dataframe.

The target column was of integer data type containing seconds that had to be converted to time format. The first step consisted of converting the integer column to timestamp data type;

cast(column_with_seconds as timestamp)

For example, using the above, 900 seconds will become 1970–01–01T00:15:00.000+0000.

The next step consisted of parsing the timestamp column to extract the time section;

SUBSTRING(cast(column_with_seconds as timestamp), 11)

For example, 1970–01–01T00:15:00.000+0000 will become 00:15:00.

Combining the above in an SQL select statement will give;

SELECT SUBSTRING(cast(load_duration_seconds as timestamp), 11) as time_hh_mm_ss

FROM your_table

Conclusion

This solution was implemented to convert integer values to time hh:mm:ss. Feel free to use and contribute more ideas.

--

--