Basic usage of the pyspark DataFrame
```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Fri Mar 8 19:10:57 2019

@author: lg
"""
from pyspark.sql import SparkSession

upper = '/opt/spark/spark-2.4.0-bin-hadoop2.7/'
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

# spark is an existing SparkSession
df = spark.read.json(upper + "examples/src/main/resources/people.json")
# Displays the content of the DataFrame to stdout
df.show()

df.printSchema()
df.select("name").show()
df.select(df['name'], df['age'] + 1).show()
df.filter(df['age'] > 21).show()
df.groupBy("age").count().show()

# Register the DataFrame as a SQL temporary view
df.createOrReplaceTempView("people")
sqlDF = spark.sql("SELECT * FROM people")
sqlDF.show()

# Register the DataFrame as a global temporary view
df.createGlobalTempView("people")
# Global temporary views are tied to a system-preserved database `global_temp`
spark.sql("SELECT * FROM global_temp.people").show()
# Global temporary views are cross-session
spark.newSession().sql("SELECT * FROM global_temp.people").show()
# +----+-------+
# | age|   name|
# +----+-------+
# |null|Michael|
# |  30|   Andy|
# |  19| Justin|
# +----+-------+

spark.stop()
```

Posted on 2019-03-08 19:24 by luoganttcc
Summary

That is the full content of this walkthrough of basic pyspark DataFrame usage; hopefully it helps you solve the problems you run into.