
About Spark SQL

印度阿三17 > Development

2019.03.26


1. Read a JSON file and run queries and other operations on it

The JAR packages used were shown in an image that is not preserved here.

Contents of the JSON file:

{ "id":1, "name":" Ella", "age":36 }
{ "id":2, "name":"Bob", "age":29 }
{ "id":3, "name":"Jack", "age":29 }
{ "id":4, "name":"Jim", "age":28 }
{ "id":5, "name":"Damon" }
{ "id":5, "name":"Damon" }

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("DataFrameTest").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// read.json replaces jsonFile, which was deprecated in Spark 1.4 and removed in 2.0
val df = sqlContext.read.json("H:\\文件\\數據集\\test1\\1.json")

df.show()                                    // show all rows
df.distinct.show()                           // remove duplicate rows
df.filter(df.col("age") > 20).show()         // rows with age > 20
df.groupBy("name").count().show()            // group by name and count each group
df.sort(df("name").asc).show()               // sort by name in ascending order
df.head(3).foreach(println)                  // take the first 3 rows
df.select(df("name").as("username")).show()  // name column of every record, aliased as username
df.agg("age" -> "avg").show()                // average age
df.agg("age" -> "min").show()                // minimum age
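The queries above can also be tried without a file on disk. The following is a minimal sketch, assuming Spark 2.2 or later (where `SparkSession` and `read.json` on a `Dataset[String]` are available); the object name `JsonQuerySketch` and the `run` method are hypothetical names introduced only for this illustration, and the six records are inlined from the JSON file shown earlier.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, min}

// Hypothetical wrapper object, not part of the original post.
object JsonQuerySketch {
  // The six records from the JSON file, inlined so no file path is needed.
  val records: Seq[String] = Seq(
    """{"id":1,"name":" Ella","age":36}""",
    """{"id":2,"name":"Bob","age":29}""",
    """{"id":3,"name":"Jack","age":29}""",
    """{"id":4,"name":"Jim","age":28}""",
    """{"id":5,"name":"Damon"}""",
    """{"id":5,"name":"Damon"}"""
  )

  // Returns (distinct row count, rows with age > 20, average age, minimum age).
  def run(): (Long, Long, Double, Long) = {
    val spark = SparkSession.builder()
      .appName("DataFrameTest")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // JSON integers are inferred as LongType; missing ages become null.
    val df = spark.read.json(records.toDS())

    val distinctRows = df.distinct().count()        // the duplicate Damon row collapses
    val olderThan20 = df.filter($"age" > 20).count() // null ages do not match the filter
    val stats = df.agg(avg("age"), min("age")).head()

    val result = (distinctRows, olderThan20, stats.getDouble(0), stats.getLong(1))
    spark.stop()
    result
  }
}
```

Calling `JsonQuerySketch.run()` returns the four values as a tuple, so they can be checked without reading console output. Rows with no `age` field are skipped by both the filter and the aggregates.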

2. Programmatically convert an RDD to a DataFrame

File contents:

1,Ella,36

2,Bob,29

3,Jack,29


import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val conf = new SparkConf().setMaster("local").setAppName("Testsql")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Step 1: read the file and turn each line into a Row
// Alternative input locations: hdfs://192.168.6.134:9000/wys/1.txt, H:\文件\數據集
val aRDD = sc.textFile("H:\\文件\\數據集\\test1\\2.txt", 1).map { line =>
  val fields = line.split(",")
  Row(fields(0), fields(1), fields(2))
}

// Step 2: build the schema (metadata) programmatically
val structType = StructType(Array(
  StructField("id", StringType, true),
  StructField("name", StringType, true),
  StructField("age", StringType, true)))

// Step 3: convert the RDD to a DataFrame
val aDF = sqlContext.createDataFrame(aRDD, structType)

// Continue using it like any other DataFrame
aDF.registerTempTable("A")
val teenagerDF4 = sqlContext.sql("select id,name,age from A")
teenagerDF4.rdd.map(t => "id:" + t(0) + "," + "name:" + t(1) + "," + "age:" + t(2)).foreach(println)
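The conversion above builds the schema by hand. As a point of comparison, here is a minimal sketch of the other common route, where Spark infers the schema from a case class through reflection. It assumes Spark 2.x or later with `SparkSession`; the case class `Person`, the object `RddToDataFrameSketch`, and typing `id` and `age` as `Int` (the original keeps all three columns as strings) are choices made only for this illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type; field names mirror the columns id, name, age.
// It must be defined at the top level so Spark can derive an encoder for it.
case class Person(id: Int, name: String, age: Int)

// Hypothetical wrapper object, not part of the original post.
object RddToDataFrameSketch {
  def run(): Seq[String] = {
    val spark = SparkSession.builder()
      .appName("Testsql")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._ // brings toDF() into scope for RDDs of case classes

    // The three lines of the input file, inlined so no file path is needed.
    val lines = spark.sparkContext.parallelize(
      Seq("1,Ella,36", "2,Bob,29", "3,Jack,29"), 1)

    // Split each line once, build a Person, and let Spark infer the schema.
    val df = lines
      .map(_.split(","))
      .map(f => Person(f(0).toInt, f(1), f(2).toInt))
      .toDF()

    df.createOrReplaceTempView("A")
    val formatted = spark.sql("select id, name, age from A order by id")
      .collect()
      .map(r => s"id:${r.getInt(0)},name:${r.getString(1)},age:${r.getInt(2)}")
      .toSeq

    spark.stop()
    formatted
  }
}
```

The programmatic `StructType` route remains the better fit when the columns are only known at run time; the case-class route is shorter when they are fixed in advance.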

3. Programmatically read and write MySQL data with a DataFrame

import java.util.Properties
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val conf = new SparkConf().setMaster("local").setAppName("Testsql")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Two new employee records: id, name, gender, age
val employeeRDD = sc.parallelize(Array("3 Mary F 26", "4 Tom M 23")).map(_.split(" "))
val schema = StructType(List(
  StructField("id", IntegerType, true),
  StructField("name", StringType, true),
  StructField("gender", StringType, true),
  StructField("age", IntegerType, true)))
val rowRDD = employeeRDD.map(p => Row(p(0).toInt, p(1).trim, p(2).trim, p(3).toInt))
val employeeDF = sqlContext.createDataFrame(rowRDD, schema)

// Append the rows to the table sparktest.spark over JDBC
val prop = new Properties()
prop.put("user", "root")
prop.put("password", "root")
prop.put("driver", "com.mysql.jdbc.Driver")
employeeDF.write.mode("append").jdbc("jdbc:mysql://localhost:3306/sparktest", "sparktest.spark", prop)

// Read the table back and compute the maximum and the sum of age
val jdbcDF = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/sparktest")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "spark")
  .option("user", "root")
  .option("password", "root")
  .load()
jdbcDF.agg("age" -> "max", "age" -> "sum").show()
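The final aggregation can be tried without a running MySQL instance. The sketch below builds the same two employee rows in memory and aggregates them with the `max` and `sum` column functions. It assumes Spark 2.x or later with `SparkSession`, and the object name `EmployeeAggSketch` is hypothetical. The numbers it produces cover only these two rows, so a real `spark` table that already holds other rows would give different results.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{max, sum}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical wrapper object, not part of the original post.
object EmployeeAggSketch {
  // Returns (maximum age, sum of ages) for the two in-memory rows.
  def run(): (Int, Long) = {
    val spark = SparkSession.builder()
      .appName("Testsql")
      .master("local[1]")
      .getOrCreate()

    // Same schema as the employee table: id, name, gender, age.
    val schema = StructType(List(
      StructField("id", IntegerType, true),
      StructField("name", StringType, true),
      StructField("gender", StringType, true),
      StructField("age", IntegerType, true)))

    val rowRDD = spark.sparkContext
      .parallelize(Seq("3 Mary F 26", "4 Tom M 23"))
      .map(_.split(" "))
      .map(p => Row(p(0).toInt, p(1).trim, p(2).trim, p(3).toInt))

    val employeeDF = spark.createDataFrame(rowRDD, schema)

    // max keeps the column's IntegerType; sum over integers widens to LongType.
    val stats = employeeDF.agg(max("age"), sum("age")).head()
    val result = (stats.getInt(0), stats.getLong(1))

    spark.stop()
    result
  }
}
```

Swapping `employeeDF` for a DataFrame loaded through the JDBC reader leaves the aggregation line unchanged.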


Source: http://www.icode9.com/content-2-149201.html
