
SparkContext and SparkSession


SparkContext

SparkContext is the entry point to Spark. Defined in the org.apache.spark package since the 1.x releases, it is used to programmatically create Spark RDDs, accumulators, and broadcast variables on the cluster. Since Spark 2.0, most of the functionality (methods) available in SparkContext is also available in SparkSession. Its object, sc, is available by default in spark-shell, and it can also be created programmatically using the SparkContext class.
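The sketch below shows one way to create a SparkContext programmatically and use it for an RDD, an accumulator, and a broadcast variable. The app name and master URL are illustrative values, not fixed requirements.

```python
from pyspark import SparkConf, SparkContext

# Illustrative configuration; "local[*]" runs Spark locally using all cores.
conf = SparkConf().setAppName("example-app").setMaster("local[*]")
sc = SparkContext(conf=conf)

# Create an RDD, an accumulator, and a broadcast variable via the context.
rdd = sc.parallelize([1, 2, 3, 4])
acc = sc.accumulator(0)
factor = sc.broadcast(10)

rdd.foreach(lambda x: acc.add(x))   # updates from workers are merged on the driver
print(acc.value)                    # 10
print(factor.value)                 # 10

sc.stop()
```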

SparkSession

SparkSession was introduced in version 2.0 and is the entry point to the underlying Spark functionality for programmatically creating Spark RDDs, DataFrames, and Datasets. Its object, spark, is available by default in spark-shell, and it can be created programmatically using the SparkSession builder pattern.
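A minimal sketch of the builder pattern follows; the app name and master URL are again illustrative. Note that the underlying SparkContext remains reachable through the session.

```python
from pyspark.sql import SparkSession

# Build (or reuse) a session via the builder pattern.
spark = (
    SparkSession.builder
    .appName("example-app")
    .master("local[*]")
    .getOrCreate()
)

# DataFrame created from local data.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()

# The SparkContext is still available for RDD work.
rdd = spark.sparkContext.parallelize([1, 2, 3])
print(rdd.count())

spark.stop()
```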

SparkConf

SparkConf is used to set various Spark parameters as key-value pairs. Most of the time, you create a SparkConf object with SparkConf(), which also loads values from any spark.* Java system properties.

pyspark.SparkConf methods:

- contains(key): Does this configuration contain a given key?
- setAppName(value): Set application name.
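The sketch below shows these key-value settings in use; the configuration keys and values are illustrative examples, not required settings.

```python
from pyspark import SparkConf

# Set parameters as key-value pairs; setAppName/setMaster are shortcuts
# for spark.app.name and spark.master.
conf = (
    SparkConf()
    .setAppName("example-app")
    .setMaster("local[*]")
    .set("spark.executor.memory", "2g")
)

print(conf.contains("spark.executor.memory"))  # True
print(conf.get("spark.app.name"))              # example-app
print(conf.toDebugString())                    # all configured key-value pairs
```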
