An all-in-one, comprehensive custom Docker image for the data engineering developer on Apache Spark

Load the Docker image:

$ docker load < bigdata.tgz
$ docker image ls
REPOSITORY          TAG       IMAGE ID       CREATED       SIZE
jentekllc/bigdata   latest    b2b671d197f7   4 hours ago   5.51GB
Extract the docker-compose files, then bring up the cluster with three Spark workers:

$ tar -xf bigdata_docker.tar
$ nohup docker-compose -p j up --scale spark-worker=3 &
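The compose file itself is not reproduced in this article, but the containers listed by `docker ps` suggest its rough service layout. The sketch below is a hypothetical reconstruction: the image names, entrypoint scripts, and published ports come from the `docker ps` output; everything else is an assumption.

```yaml
# Hypothetical docker-compose sketch -- not the article's actual file.
# Image names, commands, and ports are taken from the `docker ps` output.
services:
  spark-master:
    image: jentekllc/bigdata:latest
    command: /run_sshd_master.sh
    ports:
      - "7077:7077"            # Spark master RPC
      - "8080:8080"            # Spark master web UI
      - "8888-8889:8888-8889"  # notebook ports
      - "20022:22"             # SSH into the master container
  spark-worker:                # scaled to 3 via --scale spark-worker=3
    image: jentekllc/bigdata:latest
    command: /run_sshd_worker.sh
    ports:
      - "38080"                # worker web UI; random host port per replica
  hadoop-hive:
    image: jentekllc/bigdata:latest
    command: /run_sshd_hive.sh
    ports:
      - "9000:9000"            # HDFS namenode
      - "9083:9083"            # Hive metastore
      - "30022:22"             # SSH into the Hive container
  nginx-lb:
    image: nginx:latest
    ports:
      - "5000:5000"            # load balancer in front of the worker UIs
```

Because each scaled worker replica publishes port 38080 on a random host port, the nginx container acts as a single stable entry point (port 5000) in front of the worker web UIs.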
$ docker ps
CONTAINER ID   IMAGE                      COMMAND                  CREATED              STATUS              PORTS                                                                                                     NAMES
f23c2863e235   nginx:latest               "/docker-entrypoint.…"   About a minute ago   Up 56 seconds       80/tcp, 0.0.0.0:5000->5000/tcp                                                                            nginx-lb
1cb418088d2c   jentekllc/bigdata:latest   "/run_sshd_worker.sh"    About a minute ago   Up 57 seconds       22/tcp, 0.0.0.0:49851->38080/tcp                                                                          j-spark-worker-3
997537fb1887   jentekllc/bigdata:latest   "/run_sshd_worker.sh"    About a minute ago   Up 57 seconds       22/tcp, 0.0.0.0:49852->38080/tcp                                                                          j-spark-worker-1
61bd4afc30a0   jentekllc/bigdata:latest   "/run_sshd_worker.sh"    About a minute ago   Up 58 seconds       22/tcp, 0.0.0.0:49850->38080/tcp                                                                          j-spark-worker-2
16a493eb513d   jentekllc/bigdata:latest   "/run_sshd_master.sh"    About a minute ago   Up About a minute   0.0.0.0:7077->7077/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8888-8889->8888-8889/tcp, 0.0.0.0:20022->22/tcp   spark-master
2707ab560407   jentekllc/bigdata:latest   "/run_sshd_hive.sh"      About a minute ago   Up About a minute   0.0.0.0:9000->9000/tcp, 0.0.0.0:9083->9083/tcp, 0.0.0.0:30022->22/tcp                                     hadoop-hive
To shut the cluster down:

$ docker-compose -p j down
Spark master web UI: http://localhost:8080
Spark worker web UIs (load balanced by nginx): http://localhost:5000
Jupyter Notebook: http://localhost:8888
SSH into the Spark master container and submit the example job:

$ ssh -p 20022 hadoop@localhost
$ cd $SPARK_HOME
$ bin/spark-submit /spark/examples/src/main/python/pi.py
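Spark's bundled pi.py estimates π by Monte Carlo sampling: it throws random points into a square, counts how many land inside the inscribed circle, and distributes the sampling across the workers. Stripped of the Spark boilerplate, the core computation is equivalent to this plain-Python sketch (a local, single-process illustration, not the distributed script itself):

```python
import random

def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Monte Carlo estimate of pi: sample points uniformly in the
    2x2 square centered at the origin and count those that fall
    inside the unit circle (the same test pi.py performs)."""
    rng = random.Random(seed)  # seeded for reproducibility
    inside = 0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    # ratio of areas: circle / square = pi / 4
    return 4.0 * inside / num_samples

print("Pi is roughly", estimate_pi(100_000))
```

In the actual pi.py, the same per-point test runs inside a `map` over a parallelized range, and the worker counts are combined with a `reduce`, which is what makes the job a useful smoke test of the whole cluster.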

George Jen

I am the founder of Jen Tek LLC, a startup in the East Bay, California, developing AI-powered, cloud-based documentation and publishing software as a service.