Hive SQL Basic Operations
Published: 2019-05-25


Create a table

hive

View the local file

# Local commands can be executed within the CLI by placing the command between ! and ;
!cat data/text.txt;

File contents:

user1,Funny Story,1343182026191
user2,Cool Deal,1343182133839
user4,Interesting Post,1343182154633
user5,Yet Another Blog,13431839394

Create the table:

CREATE TABLE posts (user STRING, post STRING, time BIGINT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

hive> show tables;

hive> describe posts;

Load data

Load the file into the table:

LOAD DATA LOCAL INPATH 'text.txt'
OVERWRITE INTO TABLE posts;

-- Load data from an HDFS location

hive> LOAD DATA INPATH '/training/hive/user-posts.txt'
> OVERWRITE INTO TABLE posts;

-- Load data from the local file system

hive> LOAD DATA LOCAL INPATH 'data/user-posts.txt'
> OVERWRITE INTO TABLE posts;
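
Note the difference between the two forms: with LOCAL the file is copied from the local file system into the table's directory, while without LOCAL the file is moved from its current HDFS location. A quick sanity check after loading (a generic sketch, not tied to any particular dataset):

hive> SELECT * FROM posts LIMIT 3;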

If HDFS already contains a large number of files in the same format, you can create an external table over the existing files:

hive> CREATE EXTERNAL TABLE posts
> (user STRING, post STRING, time BIGINT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> LOCATION '/training/hive/';

Check that the data is in HDFS:
hdfs dfs -cat /user/hive/warehouse/posts/text.txt

Query data

select count (1) from posts;
select * from posts where user="user2";
select * from posts where time<=1343182133839 limit 2;

DROP TABLE posts;

For a managed table, dropping it also deletes its data directory in HDFS:
hdfs dfs -ls /user/hive/warehouse/
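
Note that this applies to managed tables. For an external table such as the one created above with LOCATION '/training/hive/', DROP TABLE removes only the table metadata and leaves the underlying files untouched; a quick check (a sketch):

hive> DROP TABLE posts;
hdfs dfs -ls /training/hive/

The data files should still be listed; only the Hive metadata was removed.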

Partitioned tables

To improve query performance you can also use partitions. If a table is partitioned, a partition must be specified whenever data is loaded into it.

 

CREATE TABLE posts1 (user STRING, post STRING, time BIGINT)
PARTITIONED BY(country STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

describe posts1;

A partition must be specified when loading data:

LOAD DATA LOCAL INPATH 'text.txt'
OVERWRITE INTO TABLE posts1 PARTITION(country='US');

LOAD DATA LOCAL INPATH 'text.txt'
OVERWRITE INTO TABLE posts1 PARTITION(country='AUSTRALIA');

Show the existing partitions:

show partitions posts1;
hdfs dfs -ls -R /user/hive/warehouse/posts1

Query:

select * from posts1 where country='US' limit 10;
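
Because country is a partition column, this query only scans files under the matching partition directory; each partition is stored as its own subdirectory under the table's path. Assuming the default warehouse location, you can confirm the layout with:

hdfs dfs -ls /user/hive/warehouse/posts1/country=US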

Joining tables

CREATE TABLE posts_likes (user STRING, post STRING, likes_count INT);
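
The join examples below also rely on a likes table, which is never created in this post. A minimal sketch of what it might look like, with the column names inferred from the queries below and a hypothetical data file name:

-- hypothetical table; columns inferred from the join queries below
CREATE TABLE likes (user STRING, post STRING, count INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'data/user-likes.txt'
OVERWRITE INTO TABLE likes;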

INSERT OVERWRITE TABLE posts_likes
SELECT p.user, p.post, l.count
FROM posts p JOIN likes l ON (p.user = l.user);

Outer joins

SELECT p.*, l.*
FROM posts p LEFT OUTER JOIN likes l ON (p.user = l.user)
LIMIT 10;

SELECT p.*, l.*
FROM posts p RIGHT OUTER JOIN likes l ON (p.user = l.user)
LIMIT 10;

SELECT p.*, l.*
FROM posts p FULL OUTER JOIN likes l ON (p.user = l.user)
LIMIT 10;

JDBC Program

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveCreateTable {
    private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        // Register driver and create driver instance
        Class.forName(driverName);

        // Get connection
        Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", "");

        // Create statement
        Statement stmt = con.createStatement();

        // Execute statement
        stmt.execute("CREATE TABLE IF NOT EXISTS "
            + " employee ( eid int, name String, "
            + " salary String, designation String)"
            + " COMMENT 'Employee details'"
            + " ROW FORMAT DELIMITED"
            + " FIELDS TERMINATED BY '\t'"
            + " LINES TERMINATED BY '\n'"
            + " STORED AS TEXTFILE");

        System.out.println("Table employee created.");
        con.close();
    }
}

 

$ javac HiveCreateTable.java

$ java HiveCreateTable

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveQLWhere {
    private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        // Register driver and create driver instance
        Class.forName(driverName);

        // Get connection
        Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", "");

        // Create statement
        Statement stmt = con.createStatement();

        // Execute statement
        ResultSet res = stmt.executeQuery("SELECT * FROM employee WHERE salary>30000");

        System.out.println("Result:");
        System.out.println(" ID \t Name \t Salary \t Designation \t Dept ");
        while (res.next()) {
            System.out.println(res.getInt(1) + " " + res.getString(2) + " " + res.getDouble(3)
                + " " + res.getString(4) + " " + res.getString(5));
        }
        con.close();
    }
}

 

$ javac HiveQLWhere.java

$ java HiveQLWhere
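
To compile and run these programs, the Hive JDBC driver and its dependencies need to be on the classpath. The exact jars depend on the Hive installation; the paths below are assumptions, not part of the original post:

$ javac -cp "$HIVE_HOME/lib/*" HiveQLWhere.java

$ java -cp ".:$HIVE_HOME/lib/*:$HADOOP_HOME/share/hadoop/common/*" HiveQLWhere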

