Hive Functions Cheat-sheet, by Qubole
How to create and use Hive Functions, Listing of Built-In Functions that are supported in Hive
www.qubole.com QUESTIONS? CALL US 855-HADOOP-HELP
Mathematical Functions
| Name (Signature) | Return Type | Description |
| --- | --- | --- |
| round(double a) | BIGINT | Returns the rounded BIGINT value of the double |
| round(double a, int d) | DOUBLE | Returns the double rounded to d decimal places |
| floor(double a) | BIGINT | Returns the maximum BIGINT value that is equal to or less than the double |
| ceil(double a), ceiling(double a) | BIGINT | Returns the minimum BIGINT value that is equal to or greater than the double |
| rand(), rand(int seed) | double | Returns a random number (that changes from row to row) that is distributed uniformly from 0 to 1. Specifying the seed makes the generated random number sequence deterministic. |
| exp(double a) | double | Returns e^a, where e is the base of the natural logarithm |
| ln(double a) | double | Returns the natural logarithm of the argument |
| log10(double a) | double | Returns the base-10 logarithm of the argument |
| log2(double a) | double | Returns the base-2 logarithm of the argument |
| log(double base, double a) | double | Returns the base-"base" logarithm of the argument |
| pow(double a, double p), power(double a, double p) | double | Returns a^p |
| sqrt(double a) | double | Returns the square root of a |
| bin(BIGINT a) | string | Returns the number in binary format |
| hex(BIGINT a), hex(string a) | string | If the argument is an int, hex returns the number as a string in hex format. Otherwise, if the argument is a string, it converts each character into its hex representation and returns the resulting string. |
| unhex(string a) | string | Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts it to the character represented by that number. |
| conv(BIGINT num, int from_base, int to_base), conv(STRING num, int from_base, int to_base) | string | Converts a number from a given base to another |
| abs(double a) | double | Returns the absolute value |
| pmod(int a, int b), pmod(double a, double b) | int, double | Returns the positive value of a mod b |
| sin(double a) | double | Returns the sine of a (a is in radians) |
| asin(double a) | double | Returns the arc sine of a if -1<=a<=1 or null otherwise |
| cos(double a) | double | Returns the cosine of a (a is in radians) |
| acos(double a) | double | Returns the arc cosine of a if -1<=a<=1 or null otherwise |
| tan(double a) | double | Returns the tangent of a (a is in radians) |
| atan(double a) | double | Returns the arctangent of a |
| degrees(double a) | double | Converts the value of a from radians to degrees |
| radians(double a) | double | Converts the value of a from degrees to radians |
| positive(int a), positive(double a) | int, double | Returns a |
| negative(int a), negative(double a) | int, double | Returns -a |
| sign(double a) | float | Returns the sign of a as '1.0' or '-1.0' |
| e() | double | Returns the value of e |
| pi() | double | Returns the value of pi |
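The one-line descriptions of pmod, conv, bin, hex, and unhex are easy to misread, so here is a minimal plain-JDK sketch of what they describe. It is not Hive's own code: the class and method names are hypothetical, and it assumes non-negative values that fit in a long and ASCII/UTF-8 string input.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: mirrors the table's descriptions using only the JDK.
final class HiveMathSketch {

    // pmod(a, b): like a % b, but shifted so the result is non-negative for positive b.
    static int pmod(int a, int b) {
        return ((a % b) + b) % b;
    }

    // conv(num, from_base, to_base): assumes a non-negative value that fits in a long.
    static String conv(String num, int fromBase, int toBase) {
        long value = Long.parseLong(num, fromBase);
        return Long.toString(value, toBase).toUpperCase();
    }

    // bin(a): binary digits of the number as a string.
    static String bin(long a) {
        return Long.toBinaryString(a);
    }

    // hex(string): every byte of the UTF-8 encoding becomes two hex digits.
    static String hex(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            out.append(String.format("%02X", b & 0xFF));
        }
        return out.toString();
    }

    // unhex(string): inverse of hex; expects an even number of hex digits.
    static String unhex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(pmod(-7, 3));          // 2 (plain -7 % 3 would give -1)
        System.out.println(conv("255", 10, 16));  // FF
        System.out.println(bin(5));               // 101
        System.out.println(hex("abc"));           // 616263
        System.out.println(unhex("616263"));      // abc
    }
}
```

The pmod line is the one most worth remembering: it differs from the plain remainder only when a is negative.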
Date Functions
| Name (Signature) | Return Type | Description |
| --- | --- | --- |
| from_unixtime(bigint unixtime[, string format]) | string | Converts the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone, in the format "1970-01-01 00:00:00" |
| unix_timestamp() | bigint | Gets the current Unix timestamp using the default time zone |
| unix_timestamp(string date) | bigint | Converts a time string in the format yyyy-MM-dd HH:mm:ss to a Unix timestamp; returns 0 on failure: unix_timestamp('2009-03-20 11:30:01') = 1237573801 |
| unix_timestamp(string date, string pattern) | bigint | Converts a time string with the given pattern to a Unix timestamp; returns 0 on failure: unix_timestamp('2009-03-20', 'yyyy-MM-dd') = 1237532400 |
| to_date(string timestamp) | string | Returns the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01" |
| year(string date) | int | Returns the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970 |
| month(string date) | int | Returns the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11 |
| day(string date), dayofmonth(date) | int | Returns the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1 |
| hour(string date) | int | Returns the hour of the timestamp: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12 |
| minute(string date) | int | Returns the minute of the timestamp |
| second(string date) | int | Returns the second of the timestamp |
| weekofyear(string date) | int | Returns the week number of a timestamp string: weekofyear("1970-11-01 00:00:00") = 44, weekofyear("1970-11-01") = 44 |
| datediff(string enddate, string startdate) | int | Returns the number of days from startdate to enddate: datediff('2009-03-01', '2009-02-27') = 2 |
| date_add(string startdate, int days) | string | Adds a number of days to startdate: date_add('2008-12-31', 1) = '2009-01-01' |
| date_sub(string startdate, int days) | string | Subtracts a number of days from startdate: date_sub('2008-12-31', 1) = '2008-12-30' |
| from_utc_timestamp(timestamp, string timezone) | timestamp | Assumes the given timestamp is UTC and converts it to the given time zone (as of Hive 0.8.0) |
| to_utc_timestamp(timestamp, string timezone) | timestamp | Assumes the given timestamp is in the given time zone and converts it to UTC (as of Hive 0.8.0) |
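The sample values given for unix_timestamp depend on the time zone in which the string is interpreted: 1237573801 and 1237532400 both correspond to a zone seven hours behind UTC on that date, such as US Pacific time. The sketch below reproduces the table's examples with java.time so the dependence is visible. It is not Hive's implementation, and the class name, method names, and the explicit zone parameter are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.time.temporal.IsoFields;

// Hypothetical helper class: plain-JDK equivalents of a few date functions.
final class HiveDateSketch {
    static final DateTimeFormatter TS = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // unix_timestamp(string date): wall-clock time in the given zone -> epoch seconds.
    static long unixTimestamp(String timestamp, String zoneId) {
        return LocalDateTime.parse(timestamp, TS).atZone(ZoneId.of(zoneId)).toEpochSecond();
    }

    // from_unixtime(bigint unixtime): epoch seconds -> wall-clock string in the given zone.
    static String fromUnixtime(long epochSeconds, String zoneId) {
        return LocalDateTime.ofInstant(Instant.ofEpochSecond(epochSeconds), ZoneId.of(zoneId))
                .format(TS);
    }

    // datediff(enddate, startdate): note the argument order, end date first.
    static long dateDiff(String endDate, String startDate) {
        return ChronoUnit.DAYS.between(LocalDate.parse(startDate), LocalDate.parse(endDate));
    }

    // date_add(startdate, days); a negative count behaves like date_sub.
    static String dateAdd(String startDate, int days) {
        return LocalDate.parse(startDate).plusDays(days).toString();
    }

    // weekofyear(date): ISO week number.
    static int weekOfYear(String date) {
        return LocalDate.parse(date).get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        System.out.println(unixTimestamp("2009-03-20 11:30:01", "America/Los_Angeles")); // 1237573801
        System.out.println(unixTimestamp("2009-03-20 11:30:01", "UTC"));                 // 1237548601
        System.out.println(fromUnixtime(0L, "UTC"));           // 1970-01-01 00:00:00
        System.out.println(dateDiff("2009-03-01", "2009-02-27")); // 2
        System.out.println(dateAdd("2008-12-31", 1));          // 2009-01-01
        System.out.println(weekOfYear("1970-11-01"));          // 44
    }
}
```

When comparing results across clusters, it is worth checking the session time zone first, since the same input string yields different epoch values in different zones.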
Hive Function Meta commands
SHOW FUNCTIONS: lists Hive functions and operators
DESCRIBE FUNCTION [function name]: displays a short description of the function
DESCRIBE FUNCTION EXTENDED [function name]: displays an extended description of the function
Types of Hive Functions
UDF: a function that takes one or more columns from a row as arguments and returns a single value or object. E.g., concat(col1, col2)
UDAF: aggregates column values across multiple rows and returns a single value. E.g., sum(c1)
UDTF: takes zero or more inputs and produces multiple columns or rows of output. E.g., explode()
Macro: a function that uses other Hive functions.
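The difference between the first three kinds is mostly about cardinality: one value per row, one value per group of rows, or several rows per input. As a rough analogy only (this is Java streams, not the Hive API, and all names are hypothetical), the three shapes look like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical analogy class: shows the row-in / row-out shape of each function kind.
final class FunctionKindsSketch {

    // UDF-like: one output value per input row, as with concat(col1, col2).
    static List<String> concatPerRow(List<String[]> rows) {
        return rows.stream()
                .map(row -> row[0] + row[1])
                .collect(Collectors.toList());
    }

    // UDAF-like: many rows in, a single value out, as with sum(c1).
    static long sum(List<Long> column) {
        return column.stream().mapToLong(Long::longValue).sum();
    }

    // UDTF-like: each input can produce zero or more output rows, as with explode().
    static List<String> explode(List<List<String>> arrays) {
        return arrays.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(concatPerRow(Arrays.asList(
                new String[] {"a", "b"}, new String[] {"c", "d"})));   // [ab, cd]
        System.out.println(sum(Arrays.asList(1L, 2L, 3L)));            // 6
        System.out.println(explode(Arrays.asList(
                Arrays.asList("x", "y"), Arrays.asList("z"))));        // [x, y, z]
    }
}
```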
How To Develop UDFs
package org.apache.hadoop.hive.contrib.udf.example;

import java.util.Date;
import java.text.SimpleDateFormat;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

// Template only: YourUDFName and InputDataType are placeholders to replace.
@Description(name = "YourUDFName",
    value = "_FUNC_(InputDataType) - using the input datatype X argument, "
        + "returns YYY.",
    extended = "Example:\n"
        + " > SELECT _FUNC_(InputDataType) FROM tablename;")
public class YourUDFName extends UDF {
    ..
    // Hive creates the UDF by reflection, so keep a public no-argument constructor.
    public YourUDFName() {
        ..;
    }

    // Hive calls evaluate() for each row; return null for null input.
    public String evaluate(InputDataType InputValue) {
        ..;
    }
}
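The template leaves the body of evaluate() open. As one possible filling, suggested by the Date and SimpleDateFormat imports, the sketch below formats epoch seconds as a date string. The class name EpochToDateSketch, the Long argument type, and the fixed UTC zone are all assumptions for illustration. In a real UDF the class would extend org.apache.hadoop.hive.ql.exec.UDF; that is left out here so the sketch compiles with only the JDK.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical example: the evaluate() logic of a simple UDF, without the Hive base class.
final class EpochToDateSketch {

    // Converts epoch seconds to a yyyy-MM-dd string in UTC; null in, null out.
    public String evaluate(Long epochSeconds) {
        if (epochSeconds == null) {
            return null;
        }
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(new Date(epochSeconds * 1000L));
    }

    public static void main(String[] args) {
        EpochToDateSketch udf = new EpochToDateSketch();
        System.out.println(udf.evaluate(0L));           // 1970-01-01
        System.out.println(udf.evaluate(1237573801L));  // 2009-03-20
        System.out.println(udf.evaluate(null));         // null
    }
}
```

Keeping the logic in a method that handles null explicitly makes it straightforward to unit-test before packaging it into a JAR.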
How To Develop UDFs, GenericUDFs, UDAFs, and UDTFs
public class YourUDFName extends UDF {..}
public class YourGenericUDFName extends GenericUDF {..}
public class YourGenericUDAFName extends AbstractGenericUDAFResolver {..}
public class YourGenericUDTFName extends GenericUDTF {..}
How To Deploy / Drop UDFs
At start of each session:
ADD JAR /full_path_to_jar/YourUDFName.jar;
CREATE TEMPORARY FUNCTION YourUDFName AS 'org.apache.hadoop.hive.contrib.udf.example.YourUDFName';
At the end of each session:
DROP TEMPORARY FUNCTION IF EXISTS YourUDFName;
