Spark Error Handling
1. Problem: org.apache.spark.SparkException: Exception thrown in awaitResult
Analysis: this happens when Spark was started using a hostname; connections then fail because DNS cannot resolve that hostname.
Solutions:
Method 1: make sure the master URL is spark://<server IP>:7077 rather than spark://<hostname>:7077, i.e. specify the IP address with -h when starting the master (see the sketch after this list).
Method 2 (recommended): add a resolution record for the host to the hosts file on each machine, in the form
            <IP>    <hostname>
Method 3: set hive.metastore.try.direct.sql to false (in hive-site.xml).
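As a sketch of methods 1 and 2 (the IP address 192.168.1.100, the hostname spark-master, and the relative script path are placeholders for your own environment):

# Method 1: bind the master to its IP explicitly when starting it
../sbin/start-master.sh -h 192.168.1.100
# Method 2 (recommended): add a resolution record on every node so the hostname resolves
echo "192.168.1.100   spark-master" >> /etc/hosts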
2. Using Hive with Spark 2.x, i.e. after copying a hive-site.xml file into Spark 2.x's conf directory:
Every time spark-sql under Spark's bin directory is used to enter the terminal, it prints the following warning:
Thu Jun 15 12:56:05 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Solution:
Modify the MySQL connection URL in hive-site.xml to set useSSL=false. Because hive-site.xml is an XML file, a literal & cannot be used to join the parameters; it must be written as the entity &amp;:
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
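For reference, in hive-site.xml this URL normally sits in the javax.jdo.option.ConnectionURL property; a minimal sketch of that block, with the host and database name taken from the example above:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>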
Then restart Spark:
#../sbin/stop-all.sh
#../sbin/start-all.sh
3. Problem:
After Spark had been running for a while and the data volume grew, the following error appeared:
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/10/26 20:29:00 ERROR Executor: Exception in task 39.1 in stage 8.0 (TID 1122)
java.io.FileNotFoundException: /tmp/spark-2de5fa03-a7cb-47a2-9540-403de85d0371/executor-eebecccb-4cdb-4b85-80a3-73c4baa4c7bd/blockmgr-fc644c14-23e8-401c-aee8-00bc108bf607/2b/temp_shuffle_75eb7338-be41-41b4-bed4-5dcb0c1d0fdf (No space left on device)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:102)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:115)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:235)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
The log says there is no space left on the device. By default Spark puts its temporary shuffle files under /tmp, so this location needs to be moved to a filesystem with plenty of storage.

Solution:
Edit spark-env.sh:
export SPARK_DRIVER_MEMORY=5g
export SPARK_LOCAL_DIRS=/data/sparktmp

Do not add this to spark-defaults.conf, because since Spark 1.0 the spark.local.dir parameter has been deprecated and is overridden by the value set by the cluster manager.
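Before restarting, it is worth confirming that the new directory exists on every node and sits on a partition with enough free space; a minimal check, using the path from the example above:

# create the local directory on every worker node
mkdir -p /data/sparktmp
# verify the partition has enough free space for shuffle spill files
df -h /data/sparktmp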
Source code analysis:
(1) The following method in the DiskBlockManager class
This is where the log message lets us pinpoint the error:
/**
 * Create local directories for storing block data. These directories are
 * located inside configured local directories and won't
 * be deleted on JVM exit when using the external shuffle service.
 */
private def createLocalDirs(conf: SparkConf): Array[File] = {
  Utils.getConfiguredLocalDirs(conf).flatMap { rootDir =>
    try {
      val localDir = Utils.createDirectory(rootDir, "blockmgr")
      logInfo(s"Created local directory at $localDir")
      Some(localDir)
    } catch {
      case e: IOException =>
        logError(s"Failed to create local dir in $rootDir. Ignoring this directory.", e)
        None
    }
  }
}
(2) The following method in SparkConf.scala
This method tells us that configuring the spark.local.dir parameter in spark-defaults.conf has been deprecated since Spark 1.0:
/** Checks for illegal or deprecated config settings. Throws an exception for the former. Not
 * idempotent - may mutate this conf object to convert deprecated settings to supported ones. */
private[spark] def validateSettings() {
  if (contains("spark.local.dir")) {
    val msg = "In Spark 1.0 and later spark.local.dir will be overridden by the value set by " +
      "the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN)."
    logWarning(msg)
  }

  val executorOptsKey = "spark.executor.extraJavaOptions"
  val executorClasspathKey = "spark.executor.extraClassPath"

  // ... (rest of the method omitted)
}
(3) The getConfiguredLocalDirs method in Utils.scala
Analyzing the code below shows that when SPARK_LOCAL_DIRS is not set in spark-env.sh, Spark falls back to conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")).split(",") and creates its directories there, i.e. under /tmp, which is what caused the error above.
So setting SPARK_LOCAL_DIRS in spark-env.sh is enough to fix it:
export SPARK_LOCAL_DIRS=/home/hadoop/data/sparktmp
/**
 * Return the configured local directories where Spark can write files. This
 * method does not create any directories on its own, it only encapsulates the
 * logic of locating the local directories according to deployment mode.
 */
def getConfiguredLocalDirs(conf: SparkConf): Array[String] = {
  val shuffleServiceEnabled = conf.getBoolean("spark.shuffle.service.enabled", false)
  if (isRunningInYarnContainer(conf)) {
    // If we are in yarn mode, systems can have different disk layouts so we must set it
    // to what Yarn on this system said was available. Note this assumes that Yarn has
    // created the directories already, and that they are secured so that only the
    // user has access to them.
    getYarnLocalDirs(conf).split(",")
  } else if (conf.getenv("SPARK_EXECUTOR_DIRS") != null) {
    conf.getenv("SPARK_EXECUTOR_DIRS").split(File.pathSeparator)
  } else if (conf.getenv("SPARK_LOCAL_DIRS") != null) {
    conf.getenv("SPARK_LOCAL_DIRS").split(",")
  } else if (conf.getenv("MESOS_DIRECTORY") != null && !shuffleServiceEnabled) {
    // Mesos already creates a directory per Mesos task. Spark should use that directory
    // instead so all temporary files are automatically cleaned up when the Mesos task ends.
    // Note that we don't want this if the shuffle service is enabled because we want to
    // continue to serve shuffle files after the executors that wrote them have already exited.
    Array(conf.getenv("MESOS_DIRECTORY"))
  } else {
    if (conf.getenv("MESOS_DIRECTORY") != null && shuffleServiceEnabled) {
      logInfo("MESOS_DIRECTORY available but not using provided Mesos sandbox because " +
        "spark.shuffle.service.enabled is enabled.")
    }
    // In non-Yarn mode (or for the driver in yarn-client mode), we cannot trust the user
    // configuration to point to a secure directory. So create a subdirectory with restricted
    // permissions under each listed directory.
    conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")).split(",")
  }
}
4. Problem: Join condition is missing or trivial. Use the CROSS JOIN syntax to allow cartesian products between these relations.
Solution:
spark.sql.crossJoin.enabled: true
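For example, the flag can be passed per session when launching spark-sql (a sketch; the same --conf also works with spark-submit, and the query can alternatively be rewritten with an explicit CROSS JOIN as the message suggests):

# enable cartesian products for this session
../bin/spark-sql --conf spark.sql.crossJoin.enabled=true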
5. Problem: Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "eval(Lorg/apache/spark/sql/catalyst/InternalRow;)Z" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate" grows beyond 64 KB
Solution:
spark.sql.codegen.wholeStage: false
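This can likewise be set per session on the command line (a sketch):

# disable whole-stage code generation so generated methods stay under the JVM's 64 KB method limit
../bin/spark-sql --conf spark.sql.codegen.wholeStage=false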
6. Problem: java.lang.OutOfMemoryError: Java heap space
Solution:
spark.driver.memory: 10g                 (raise to a higher value)
spark.sql.ui.retainedExecutions: 5       (lower this value)
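As a sketch, both settings can be passed at submission time (your-app.jar is a placeholder for the actual application):

# give the driver a larger heap and keep fewer finished SQL executions in the UI
../bin/spark-submit --driver-memory 10g --conf spark.sql.ui.retainedExecutions=5 your-app.jar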
Reposted from: https://www.cnblogs.com/cuishuai/p/7744233.html