A convenient way to receive data from other processes. This small library can:
- Send a list of NumPy arrays from Python; serialization is written in C++ for efficiency.
  - Copies involved: one to merge all the buffers, one from pybind11 overhead (TODO), and one in ZMQ send.
- Receive a list of tensors in TensorFlow.
  - Copies involved: one in ZMQ recv, and one to split the buffer into tensors.
  - The op is stateful and safe to evaluate multiple times in one sess.run call.
- Serialize with a custom protocol for efficiency.
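The send and receive paths above amount to merging several array buffers into one message and splitting that message back into arrays on the other side. A minimal pure-Python sketch of the idea follows. The wire layout and the names pack_arrays and unpack_arrays are hypothetical, chosen only for illustration; they are not the library's actual protocol or API, and this version makes more copies than the C++ implementation described above.

```python
import math
import struct

import numpy as np


def pack_arrays(arrays):
    """Merge a list of arrays into one bytes object (hypothetical layout).

    Layout: uint32 array count, then for each array:
    uint8 dtype-string length, dtype string, uint8 ndim,
    ndim int64 dimensions, raw array data.
    """
    parts = [struct.pack("<I", len(arrays))]
    for arr in arrays:
        arr = np.ascontiguousarray(arr)
        dtype = arr.dtype.str.encode("ascii")  # e.g. b"<f4"
        parts.append(struct.pack("<B", len(dtype)))
        parts.append(dtype)
        parts.append(struct.pack("<B", arr.ndim))
        parts.append(struct.pack("<%dq" % arr.ndim, *arr.shape))
        parts.append(arr.tobytes())
    # Joining the parts is the "merge all the buffers" step.
    return b"".join(parts)


def unpack_arrays(buf):
    """Split a buffer produced by pack_arrays back into a list of arrays."""
    view = memoryview(buf)
    (count,) = struct.unpack_from("<I", view, 0)
    offset = 4
    arrays = []
    for _ in range(count):
        (dtype_len,) = struct.unpack_from("<B", view, offset)
        offset += 1
        dtype = np.dtype(bytes(view[offset:offset + dtype_len]).decode("ascii"))
        offset += dtype_len
        (ndim,) = struct.unpack_from("<B", view, offset)
        offset += 1
        shape = struct.unpack_from("<%dq" % ndim, view, offset)
        offset += 8 * ndim
        n = math.prod(shape)
        # Copy the slice out so each array owns its memory,
        # mirroring the "split the buffer into tensors" copy.
        arr = np.frombuffer(view, dtype=dtype, count=n, offset=offset)
        arrays.append(arr.reshape(shape).copy())
        offset += n * dtype.itemsize
    return arrays


if __name__ == "__main__":
    batch = [np.zeros((2, 3), dtype=np.float32), np.arange(5, dtype=np.uint8)]
    restored = unpack_arrays(pack_arrays(batch))
    print([(a.dtype, a.shape) for a in restored])
```

The resulting bytes object could then be sent as a single ZMQ message, which is why merging into one buffer is attractive despite the extra copy.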
Why:
Sometimes for complicated large-scale tasks you would really want data processing to be separate from TensorFlow.
However, TensorFlow offers no good way to receive data from other processes.
Build:
Requires gcc>=5.3, tensorflow>=1.4, zeromq>=4.
Requires the zmq.hpp header from cppzmq in
your compiler's include path, or under the src directory.
Add /path/to/git/clone/zmq_ops to PYTHONPATH to be able to import it,
or run pip install . to install it.
The ops are compiled the first time the module is imported.
Note that recompilation is usually required after reinstalling TensorFlow.
Use:
See benchmark.py for usage.
On my machine this script achieves about 1.3GB/s throughput, equivalent to about 2.3k float32 (or 9.2k uint8) ImageNet images per second.
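The images-per-second figures can be reproduced with a short calculation. The sketch below assumes 224x224x3 images and reads "GB" as GiB (1024^3 bytes); neither is stated above, but both are consistent with the quoted numbers.

```python
GIB = 1024 ** 3
throughput = 1.3 * GIB  # bytes per second, assuming "GB" means GiB


def images_per_second(throughput_bytes, height, width, channels, bytes_per_value):
    """Number of images of the given size that fit in one second of throughput."""
    return throughput_bytes / (height * width * channels * bytes_per_value)


# Assumed image size: 224 x 224 pixels, 3 channels.
float32_rate = images_per_second(throughput, 224, 224, 3, 4)  # 4 bytes per value
uint8_rate = images_per_second(throughput, 224, 224, 3, 1)    # 1 byte per value

print(f"float32: {float32_rate:.0f} images/s")
print(f"uint8:   {uint8_rate:.0f} images/s")
```

Under these assumptions the float32 rate comes out near 2.3k images per second, and the uint8 rate is four times that because each value takes a quarter of the bytes.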