Skip to content

Migrating Hive Queries into Trino(Presto)

Trino(Presto) follows the ANSI SQL syntax and semantics, while Hive uses a different SQL syntax dialect. Differences include:

  • Quotation — In Trino(Presto), use double quotations for column and table names (e.g., "column1"), single quotations for strings (for example, ‘hello presto’).
  • Implicit type conversion — In Trino(Presto), SELECT CAST(5 AS DOUBLE) / 2 gives 2.5, but SELECT 5 / 2 gives 2. In Hive, both return 2.5.

To migrate Hive queries into Trino(Presto):

  1. Review the SQL syntax requirements for Trino(Presto) at Migrating from Hive.
  2. Determine which of your queries might need to have syntax updates.
  3. Make and test the changes to your queries. Or, create copies of the queries you want to migrate to Trino(Presto) and make changes in the new queries.
  4. On the Query page of the TD Console for your query, make sure that the Type is set to Trino(Presto).

image

Supported Time Formats in TD TIME FORMAT UDF

This is a summary of the time formats supported by the TD_TIME_FORMAT_UDF udf.

Formats

SyntaxDate or Time ComponentPresentationExamples
GEra designatorTextAD

yyyyYearYear1996
yyYearYear (2 digits)96
MMMMMonth in yearMonth long nameJuly
MMMMonth in yearMonth short nameJul
MM, MMonth in yearMonth number07
ww, wWeek in yearNumber06
DDD, DD, DDay in yearNumber189
dd, dDay in monthNumber10
EEEEDay in weekTextTuesday
E, EEEDay in weekText (short form)Tue
aAm/pm markerTextPM
HH, HHour in day (0-23)Number00
kk, kHour in day (1-24)Number24
KK, KHour in AM/PM (0-11)Number00
hh, hHour in AM/PM (1-12)Number12
mm, mMinute in hourNumber30
ss, sSecond in minuteNumber55
SSS, SS, SMillisecondNumber978
wyearWeek of the yearNumberTD_TIME_FORMAT(1582859260, 'wyear'
TD_TIME_FORMAT(1582859260, 'wyear', 'UTC'))
zzzzTime zoneZone long namePacific Standard Time, or GMT+01:00
zTime zoneZone short namePST, or GMT+01:00
ZTime zoneZone offset-0800
uDay number of week (1-7)Number1 (for Monday)

Example

This example formats a UNIX timestamp into a date formatted string. You can use the sample_datasets database for example to try it out:

SQL

SELECT 
  time,
  TD_TIME_FORMAT(time,
    'yyyy-MM-dd HH:mm:ss z')
FROM
  www_access LIMIT 1
;

Result

2012-01-01 00: 00: 00 UTC

SQL DDL Command Differences in Hive and Trino(Presto)

Data Definition Language (DDL) is a concept in SQL that can be used to create and modify the structure of database objects or schema objects. You can use DDL statements to create, manage, and delete tables, indexes, users, and other schema objects.

As it pertains to data-processing, Trino(Presto) is optimized for quickly processing smaller volumes of information while Hive is better for querying more expansive databases requiring large amounts of memory. Both Hive and Trino(Presto) use DDL commands but in slightly different ways.

The following table illustrates DDL commands required for Hive and Trino(Presto).

SQL DDL Commands and Their Equivalents

SQL DDL CommandsHive SQL DDLTrino(Presto) SQL DDLQuery Definition
ALTERALTERALTERModifies the structure of an existing table.
CREATECREATECREATECreates a database or objects including tables, views, stored procedures, etc.
DROPDROPDROPDeletes an existing database or an object within a database.
TRUNCATETRUNCATEDELETERemoves all the records from a table.