
Reading and Writing Elasticsearch with Spark

2018-12-26


Versions

Spark: 2.3.1

Elasticsearch: 6.4.0

1 Reading and Writing Elasticsearch with Spark in Scala

1.1 Dependencies

1.1.1 Spark dependency

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>${spark.version}</version>
    <exclusions>
        <exclusion>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
        </exclusion>
    </exclusions>
</dependency>

1.1.2 Elasticsearch dependency

<!--elasticsearch-->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>6.4.0</version>
</dependency>

1.2 Reading and Writing ES with RDDs

Reading and writing ES with RDDs mainly uses SparkContext's esRDD() method and the RDD's saveToEs() method. Both methods only become available after importing the ES package:

import org.elasticsearch.spark._

1.2.1 Writing data to ES

First, define a case class used to create the RDD:

case class Course(name: String, credit: Int)

val conf = new SparkConf().setAppName(this.getClass.getSimpleName).setMaster("local")
conf.set("es.nodes", "192.168.1.188")
conf.set("es.port", "9200")
conf.set("es.index.auto.create", "true")
val sc = new SparkContext(conf)

// Write the RDD to ES: index "course", type "rdd"
val courseRdd = sc.makeRDD(Seq(Course("Hadoop", 4), Course("Spark", 3), Course("Kafka", 4)))
courseRdd.saveToEs("/course/rdd")

1.2.2 Reading data from ES
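The source cuts off here. As a minimal sketch of the read side: after the same `import org.elasticsearch.spark._`, esRDD() on SparkContext returns an RDD of (document id, field map) pairs. The node address and the "/course/rdd" resource path below simply mirror the write example above; adjust them for your cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object ReadFromEs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReadFromEs").setMaster("local")
    conf.set("es.nodes", "192.168.1.188")
    conf.set("es.port", "9200")
    val sc = new SparkContext(conf)

    // esRDD gives an RDD[(String, Map[String, AnyRef])]:
    // the key is the document _id, the value is the document's fields
    val courseRdd = sc.esRDD("/course/rdd")
    courseRdd.collect().foreach { case (id, fields) =>
      println(s"$id -> $fields")
    }

    sc.stop()
  }
}
```

Each tuple's map holds the fields written earlier, e.g. `Map(name -> Spark, credit -> 3)`; a query string can also be passed as a second argument to esRDD() to filter documents on the ES side.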