- Hive version and download location
- Building the project
- Configuring Hive
- Importing and debugging the project in IntelliJ IDEA
1. Hive version and download location
http://archive.apache.org/dist/hive/hive-1.2.1/
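2. Building the project
Dependencies: Apache Maven 3.6.0, JDK 1.8.0_144, Hadoop 2.X
Build command: mvn clean package -Phadoop-2 -DskipTests -Pdist
clean deletes the output of any previous build (the target directories, including packaging/target under the source root); -Pdist activates the profile named dist in pom.xml; -Phadoop-2 builds with Hadoop 2 support; -DskipTests skips the tests. When the command finishes, the complete built distribution can be found in "apache-hive-1.2.1-src/packaging/target/apache-hive-1.2.1-bin/apache-hive-1.2.1-bin". To use the freshly built Hive, we configure it next.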
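A quick way to confirm the build produced the distribution is to compute the expected output path and check for the hive launcher. The following is a minimal sketch, not part of Hive: the function name dist_dir and the variables SRC_DIR and HIVE_VERSION are illustrative, and the default source location is an assumption to adjust.

```shell
# Sketch: locate the unpacked binary distribution produced by -Pdist.
# dist_dir, SRC_DIR and HIVE_VERSION are illustrative names, not Hive settings.

# dist_dir SRC_DIR VERSION: print the expected distribution directory
dist_dir() {
  printf '%s/packaging/target/apache-hive-%s-bin/apache-hive-%s-bin\n' \
    "$1" "$2" "$2"
}

# Assumed defaults; point SRC_DIR at wherever the source tree was unpacked.
SRC_DIR="${SRC_DIR:-$HOME/Projects/apache-hive-1.2.1-src}"
HIVE_VERSION="${HIVE_VERSION:-1.2.1}"
OUT="$(dist_dir "$SRC_DIR" "$HIVE_VERSION")"

if [ -x "$OUT/bin/hive" ]; then
  echo "build output found: $OUT"
else
  echo "no build output at $OUT yet; run the mvn command from the source root first"
fi
```

The printed directory is the value used for HIVE_HOME in the next section.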
3. Configuring Hive
To make the hive command available in the terminal, append the following to .bashrc (.bash_profile on Mac OS):
export HIVE_HOME=/Users/pwrliang/Projects/apache-hive-1.2.1-src/packaging/target/apache-hive-1.2.1-bin/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
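Appending these lines by hand more than once leaves duplicate entries in the profile. The sketch below shows one way to append a line only when it is not already present. The helper append_once and the variable RC_FILE are hypothetical names; RC_FILE defaults to a scratch file so the snippet can be tried without touching the real profile.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# (append_once is an illustrative helper, not a standard command.)
append_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Defaults to a scratch file; set RC_FILE=~/.bashrc (or ~/.bash_profile on
# Mac OS) to modify the real profile.
RC_FILE="${RC_FILE:-$(mktemp)}"

append_once "$RC_FILE" 'export HIVE_HOME=/Users/pwrliang/Projects/apache-hive-1.2.1-src/packaging/target/apache-hive-1.2.1-bin/apache-hive-1.2.1-bin'
# Single quotes keep $PATH and $HIVE_HOME unexpanded until the profile is sourced.
append_once "$RC_FILE" 'export PATH=$PATH:$HIVE_HOME/bin'

echo "profile lines written to $RC_FILE"
```

Running it a second time against the same file leaves the file unchanged.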
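Add the following to conf/hive-site.xml to configure where the metadata is stored; the configuration below keeps the metastore in a Derby database. Setting mapreduce.framework.name to local tells Hive to run the jobs for a SQL query in local mode wherever possible.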
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse-testing</value>
<description>Local or HDFS directory where Hive keeps table contents.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/Users/pwrliang/Projects/apache-hive-1.2.1-src/packaging/target/apache-hive-1.2.1-bin/apache-hive-1.2.1-bin/metastore_db;create=true</value>
<description>The JDBC connection URL.</description>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>local</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/hive-log/${user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
</configuration>
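To double-check which value a property ended up with, the file can be inspected from the shell. The following is a rough sketch under the assumption that each <name> element is immediately followed by its <value> on the next line, as in the file above; it is a line-based lookup and not a real XML parser. The helper name get_prop is hypothetical.

```shell
# get_prop FILE NAME: print the <value> on the line following <name>NAME</name>.
# Assumes the one-element-per-line layout shown above.
get_prop() {
  grep -A1 -F "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Falls back to ./conf/hive-site.xml when HIVE_HOME is not set.
CONF="${HIVE_HOME:-.}/conf/hive-site.xml"
if [ -f "$CONF" ]; then
  echo "mapreduce.framework.name = $(get_prop "$CONF" mapreduce.framework.name)"
  echo "warehouse dir            = $(get_prop "$CONF" hive.metastore.warehouse.dir)"
else
  echo "no hive-site.xml at $CONF"
fi
```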
Next, edit hive-env.sh and set the locations of HADOOP_HOME and HIVE_CONF_DIR in it:
# Set HADOOP_HOME to point to a specific hadoop install directory
# HADOOP_HOME=${bin}/../../hadoop
HADOOP_HOME=/Users/pwrliang/hadoop-2.7.7
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/Users/pwrliang/Projects/apache-hive-1.2.1-src/packaging/target/apache-hive-1.2.1-bin/apache-hive-1.2.1-bin/conf
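Once the configuration above is in place, run source ~/.bashrc so that the environment variables take effect in the current terminal. Then run hive and create a new table to check that the environment is configured correctly.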
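The table-creation check can also be scripted. The sketch below writes a small HiveQL file and passes it to hive -f when hive is on the PATH. The table name smoke_test and its columns are made up for illustration; any throwaway table serves the same purpose.

```shell
# Write a throwaway HiveQL script; smoke_test is a hypothetical table name.
HQL_FILE="$(mktemp)"
cat > "$HQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS smoke_test (id INT, name STRING);
SHOW TABLES;
DROP TABLE smoke_test;
EOF

# Run the script only if the hive launcher can be found.
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
else
  echo "hive not found on PATH; check HIVE_HOME and re-source the shell profile"
fi
```

If the setup is working, the SHOW TABLES output should list smoke_test before the table is dropped again.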
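4. Importing and debugging the project in IntelliJ IDEA
Click File->Close Project to close the current project, then click Import Project and import "apache-hive-1.2.1-src". When the import finishes, open the Maven panel on the right side of IDEA, expand Profiles, and tick hadoop-2.
Click Run->Edit Configurations, select Remote under Templates, and create a new remote-debugging configuration. Then start a hive process with remote debugging enabled using the following command: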
hive --debug -hiveconf hive.root.logger=DEBUG,console
Next, run the remote configuration just created in IDEA to attach to the waiting hive process. From there we can set breakpoints in IDEA and single-step through Hive to analyze it.
References: