20-SparkSQL01


Spark SQL

IOE

SQL:schema + file

select ... from xxx where.....

SQL on Hadoop

Hive

Impala

Presto

Shark

Drill

Phoenix

Spark SQL

Hive on Spark

MapReduce

Tez

Spark

Spark API

SQL

DataFrame/Dataset

start-thriftserver.sh

Spark SQL is not about SQL

Spark SQL is about more than SQL

===>

ETL: DataSource API

V1

V2
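A minimal sketch of what ETL through the DataSource API can look like from plain SQL, assuming a Spark SQL session (for example spark-sql, or beeline connected to the thriftserver). The view name dept_json, the table name dept_parquet and the JSON path are hypothetical and only serve the illustration; the USING / OPTIONS clauses are standard Spark SQL syntax.

```sql
-- Extract: expose a JSON file as a temporary view via the json data source.
-- The path below is an assumption, not a file from these notes.
CREATE TEMPORARY VIEW dept_json
USING json
OPTIONS (path '/home/hadoop/data/dept.json');

-- Transform + Load: write the selected columns back out as a Parquet table.
CREATE TABLE dept_parquet
USING parquet
AS SELECT deptno, dname, loc FROM dept_json;
```

Swapping the format name after USING (json, parquet, csv, jdbc, ...) is the only change needed to move between sources, which is the point of going through one DataSource API.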

Frontend

Catalyst: the core of Spark SQL

Backend

create table dept(
  deptno int, dname string, loc string
) row format delimited fields terminated by '\t';

load data local inpath '/home/hadoop/data/dept.txt' overwrite into table dept;

select e.empno,e.ename,d.dname from emp e join dept d on e.deptno=d.deptno;
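The join above reads an emp table whose definition does not appear in these notes. A hedged sketch of a matching table, mirroring the dept statements, could look like the following. The column list is reduced to the three columns the join actually uses, and the emp.txt path is an assumption.

```sql
-- Hypothetical emp table: only the columns referenced by the join.
create table emp(
  empno int, ename string, deptno int
) row format delimited fields terminated by '\t';

-- Hypothetical input file, named by analogy with dept.txt.
load data local inpath '/home/hadoop/data/emp.txt' overwrite into table emp;
```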

create table ruoze_test(key string, value string);

explain extended
select a.key*(5+6), b.value
from ruoze_test a join ruoze_test b
on a.key=b.key and a.key>10;
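explain extended prints the parsed, analyzed and optimized logical plans followed by the physical plan. As a rough illustration of what the optimizer does with this query, the optimized plan corresponds approximately to the SQL below. This is a hand-written equivalent, not Catalyst's actual output, and exact plans vary by Spark version. The constant 5+6 is folded to 11, and the filter on key is pushed below the join. Because a.key=b.key, the same filter may also be inferred for the b side.

```sql
-- Hand-written equivalent of the optimized plan (illustrative only).
select a.key * 11, b.value                      -- 5+6 folded into a constant
from (select key from ruoze_test
      where key > 10) a                         -- filter pushed below the join
join (select key, value from ruoze_test
      where key > 10) b                         -- filter inferred from a.key = b.key
on a.key = b.key;
```

Comparing this with the "Optimized Logical Plan" section of the explain extended output is a quick way to see which rewrites were actually applied.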

The simplest way to deal with big data is to ignore it
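One way to read this in SQL terms: let the engine skip the partitions and columns a query does not need. A minimal sketch, assuming a Spark SQL session; the table emp_by_dept is hypothetical and not part of these notes.

```sql
-- Hypothetical Parquet table, partitioned by department.
CREATE TABLE emp_by_dept (empno int, ename string, deptno int)
USING parquet
PARTITIONED BY (deptno);

-- Only the deptno=10 partition and the ename column are needed here,
-- so everything else can be ignored at scan time.
EXPLAIN SELECT ename FROM emp_by_dept WHERE deptno = 10;
```

The physical plan for the last statement should show the deptno condition as a partition filter and a read schema limited to ename, although the exact wording depends on the Spark version.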

What is the difference between thriftserver and spark-sql or spark-shell?
