DSE Spark: Setting SPARK_LOCAL_IP may fix something annoying by breaking something necessary

Ryan Svihla
May 8, 2016


There is a bug in Spark, SPARK-12963, that I hit on a couple of trouble tickets: when SPARK_LOCAL_IP is set in spark-env.sh and you submit with --deploy-mode cluster, the executor log fills with messages like this:

Exception in thread "main" java.net.BindException: Failed to bind to: /10.1.1.7:0: Service 'Driver' failed after 16 retries!
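
For reference, the trigger is a line along these lines in spark-env.sh (the address here is illustrative, not from the tickets):

  # conf/spark-env.sh
  export SPARK_LOCAL_IP=10.1.1.7   # the address Spark binds to on this machine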

The workaround for this is to comment out any setting of SPARK_LOCAL_IP, but on clouds using public IPs, when you try to use dse spark-shell or submit your jobs in client mode, you'll get errors like this instead:

Exception in thread "main" java.net.BindException: Failed to bind to: /104.99.99.99:0: Service 'sparkDriver' failed after 16 retries!
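
Concretely, the spark-env.sh side of the fix is just this (a sketch, reusing the illustrative address from above):

  # conf/spark-env.sh
  # export SPARK_LOCAL_IP=104.99.99.99   # leave this unset / commented out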

So the workaround for all of this is pretty simple:

  1. Never ever set SPARK_LOCAL_IP until there is a fix for SPARK-12963.
  2. For all other commands, pass the appropriate IP address explicitly; this is more effort, but it just works (note: I believe this applies to OSS Spark as well, just take the dse off). A concrete sketch follows the list:
  • dse spark-submit --deploy-mode client --conf spark.driver.host=<routable ip>
  • dse spark --master spark://<master ip you want>:7077
  • dse spark-sql --master spark://<master ip you want>:7077
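
As a worked sketch (all addresses and the job jar here are made up for illustration), on a node whose routable address is 10.1.1.7 with a Spark master on 10.1.1.5:

  # client-mode submit: tell the driver which address to bind and advertise
  dse spark-submit --deploy-mode client --conf spark.driver.host=10.1.1.7 my-job.jar

  # interactive shells: point them at the master you actually want
  dse spark --master spark://10.1.1.5:7077
  dse spark-sql --master spark://10.1.1.5:7077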
