Although Hive has a large set of built-in functions, some use cases call for user-defined functions (UDFs) written in Java.
Hive provides two APIs for writing UDFs: the simple UDF abstract class (org.apache.hadoop.hive.ql.exec.UDF), which this article uses, and GenericUDF.
Steps to create a Hive UDF
Step 1:
Open Eclipse and create a new Java class for the UDF.
Step 2:
Add the required Hive and Hadoop JAR files to the project's build path.
Step 3:
Extend the UDF abstract class:
public class ClassName extends UDF
The class returns its result from the evaluate() method described in the next step.
Step 4:
Implement the evaluate() method. Hive calls this method once for every row of data being processed (the general shape is shown in the sketch after these steps).
Step 5:
Compile the class and package it into a JAR file.
Step 6:
Add the JAR file to the Hive classpath. In the Hive terminal:
add jar <jar file path>;
Step 7:
Create a temporary function in the Hive terminal:
CREATE TEMPORARY FUNCTION Convert AS 'udf.Convert';
Here udf is the package name and Convert is the class name.
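Putting Steps 3 and 4 together, a simple UDF has the following general shape. This is only an illustrative sketch (the StringLength class and its body are hypothetical and not part of the example below); Hive locates evaluate() by reflection, so it can be overloaded and can accept and return either plain Java types or Hadoop Writable types such as Text.

package udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

// Illustrative sketch: returns the length of the input string.
public class StringLength extends UDF {
    private final IntWritable result = new IntWritable();

    // Hive calls evaluate() once per row; returning null produces a NULL output.
    public IntWritable evaluate(Text input) {
        if (input == null) {
            return null;
        }
        result.set(input.toString().length());
        return result;
    }
}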
For example, here is the complete Convert UDF:
package udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Convert extends UDF {
    private Text result = new Text();

    // Called once per row: parses the input string as an int,
    // converts it to a float, and returns it as Text.
    public Text evaluate(String str) {
        int number = Integer.parseInt(str);
        float fno = (float) number;
        String res = Float.toString(fno);
        result.set(res);
        return result;
    }
}
Here we have extended the UDF abstract class.
This code parses the input as an int, converts it to a float, and returns the result.
Assuming a Hive table Demo contains a column ID with the following data:
1
2
3
5
SELECT Convert(ID) FROM Demo gives the following output:
1.0
2.0
3.0
5.0
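Note that evaluate() in the Convert example assumes every row holds a valid integer: a NULL or non-numeric value would make Integer.parseInt() throw a NumberFormatException and fail the query. A more defensive variant (again only a sketch, not part of the original example) could return NULL for such rows instead:

package udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Convert extends UDF {
    private final Text result = new Text();

    // Returns the input integer formatted as a float, or NULL for
    // missing or non-numeric input instead of failing the whole query.
    public Text evaluate(String str) {
        if (str == null) {
            return null;
        }
        try {
            float fno = Integer.parseInt(str.trim());
            result.set(Float.toString(fno));
            return result;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}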