This repository was archived by the owner on Dec 26, 2018. It is now read-only.
## TensorSpark productionalized in yarn-cluster mode
This version contains modifications and improvements aimed mainly at taking TensorSpark to production in yarn-cluster mode. It was tested on a Hortonworks distribution (HDP 2.4) with CPU machines. For other deployment modes and machine types, the earlier version as of [Commit #62](https://github.com/adatao/tensorspark/tree/2eae6732709884f08e800efa24653340f2f7997b) might still be a better option.
tensorspark.py: Reads the test set from HDFS, so it no longer has to be placed on local disk. The training and test sets are stored at the same HDFS location.
parameterwebsocketclient.py: Finds the machine that hosts the Spark driver in yarn-cluster mode (either way, some configuration is still needed here).
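In yarn-cluster mode YARN decides which node runs the driver, so the workers cannot rely on a fixed address. The following is only a minimal sketch of one way to handle this, not the code in parameterwebsocketclient.py: the driver looks up its own host name at run time and builds a websocket URL that can then be passed to the workers. The helper names `websocket_url` and `driver_websocket_url` and the port `55555` are assumptions made for illustration.

```python
import socket


def websocket_url(host, port, path="/"):
    """Format a ws:// URL for the given host and port."""
    return "ws://{}:{}{}".format(host, port, path)


def driver_websocket_url(port):
    """Build the URL from the fully qualified name of the current machine.

    Call this inside the driver process: in yarn-cluster mode YARN picks the
    driver node, so the host name is only known once the driver is running.
    """
    return websocket_url(socket.getfqdn(), port)


# Hypothetical usage in the driver; the port number is an assumption.
# The resulting string can be captured in the closure of the function
# that Spark ships to the workers.
PARAMETER_SERVER_PORT = 55555
url = driver_websocket_url(PARAMETER_SERVER_PORT)
print(url)
```

Resolving the name inside the driver avoids hard-coding a host, but the chosen port still has to be reachable from every executor node, which is a cluster configuration matter.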
### To run
    zip pyfiles.zip ./parameterwebsocketclient.py ./parameterservermodel.py ./mnistcnn.py ./mnistdnn.py ./moleculardnn.py ./higgsdnn.py

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --queue default \
      --num-executors 3 \
      --driver-memory 20g \
      --executor-memory 60g \
      --executor-cores 8 \
      --py-files ./pyfiles.zip \
      ./tensorspark.py
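The packaging step can also be scripted, which helps catch a missing module before the job is submitted. This is a minimal sketch using Python's standard `zipfile` module. The function name `build_pyfiles_zip` is hypothetical, and the module list simply mirrors the `zip` command above.

```python
import os
import zipfile

# Same modules as in the zip command; tensorspark.py itself is passed to
# spark-submit directly and is therefore not part of the archive.
MODULES = [
    "parameterwebsocketclient.py",
    "parameterservermodel.py",
    "mnistcnn.py",
    "mnistdnn.py",
    "moleculardnn.py",
    "higgsdnn.py",
]


def build_pyfiles_zip(source_dir, zip_path, modules=MODULES):
    """Write the given modules from source_dir into a zip archive.

    Files are stored at the archive root so that they can be imported by
    module name once the archive is on the Python path.
    """
    missing = [m for m in modules
               if not os.path.isfile(os.path.join(source_dir, m))]
    if missing:
        raise FileNotFoundError("missing modules: " + ", ".join(missing))

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for module in modules:
            archive.write(os.path.join(source_dir, module), arcname=module)
    return zip_path


if __name__ == "__main__":
    # Assumes the script is run from the directory that holds the modules.
    build_pyfiles_zip(".", "pyfiles.zip")
```

The resulting `pyfiles.zip` is then passed to `spark-submit` through `--py-files` exactly as in the command above.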
Partial project layout:
tensorspark/gpu_install.sh - script to build TensorFlow from source with GPU support for AWS
tensorspark/simple_websocket_*.py - simple Tornado websocket example
tensorspark/parameterservermodel.py - "abstract" model class that implements all the methods TensorSpark requires
tensorspark/*dnn.py - fully connected models for specific datasets
tensorspark/mnistcnn.py - convolutional model for MNIST
tensorspark/parameterwebsocketclient.py - Spark worker code
tensorspark/tensorspark.py - entry point and Spark driver code