When I run a Spark Streaming job with spark-submit, the batch dies roughly every 2 hours with the error below.
The job runs fine in local mode, but the error occurs in client mode or cluster mode.
[2021-08-09 02:19:05,746] {bash_operator.py:128} INFO - 21/08/09 02:19:05 ERROR cluster.YarnScheduler: Lost executor 1 on xxx-Xxxx: Slave lost
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - 21/08/09 02:19:05 ERROR client.TransportClient: Failed to send RPC RPC 8387377159996559940 to /xxx.xxx.xxx.xxx:59188: java.nio.channels.ClosedChannelException
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - java.nio.channels.ClosedChannelException
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.AbstractChannel$AbstractUnsafe.newClosedChannelException(AbstractChannel.java:957)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:865)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1367)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:764)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1104)
[2021-08-09 02:19:05,911] {bash_operator.py:128} INFO - at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[2021-08-09 02:19:05,912] {bash_operator.py:128} INFO - at java.lang.Thread.run(Thread.java:748)
The "Slave lost" and ClosedChannelException messages indicate that the executor's container went away on the YARN side; YARN kills containers that exceed their physical or virtual memory limits, which is a common cause. A post I referenced (see Reference below) suggests disabling those memory checks by changing the following options in yarn-site.xml:
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
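These flags only take effect once the NodeManagers pick up the new configuration. A minimal sketch of applying the change, assuming a Hadoop 3.x layout (the config path and daemon commands are assumptions about the cluster, not from the original post; on Hadoop 2.x use yarn-daemon.sh stop/start nodemanager instead):

# On every NodeManager host, edit the config (path is an assumption;
# check $HADOOP_CONF_DIR on your cluster).
vi $HADOOP_CONF_DIR/yarn-site.xml

# Restart the NodeManager so the disabled memory checks take effect.
yarn --daemon stop nodemanager
yarn --daemon start nodemanager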
Attempt 1.
When running spark-submit, I changed the option below to 2g, but the same error occurred. (The option's documentation entry follows; a command sketch comes after the table.)
| Property Name | Default | Meaning | Since Version |
| --- | --- | --- | --- |
| spark.yarn.am.memory | 512m | Amount of memory to use for the YARN Application Master in client mode, in the same format as JVM memory strings (e.g. 512m, 2g). In cluster mode, use spark.driver.memory instead. Use lower-case suffixes, e.g. k, m, g, t, and p, for kibi-, mebi-, gibi-, tebi-, and pebibytes, respectively. | 1.3.0 |
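For clarity, this is roughly how the option is passed on the command line. A sketch only: the script name is a placeholder, not the actual job from this post.

# Sketch: raising the YARN Application Master memory in client mode.
# app.py is a placeholder for the actual streaming job.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.yarn.am.memory=2g \
  app.py

Note that in cluster mode the driver runs inside the Application Master, so per the table above you would raise spark.driver.memory (i.e. pass --driver-memory 2g) instead.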
Reference