RunIt

Note

This tool still has some limitations. If you run into any problems while using it, feel free to ask.

A simple program scheduler for your code on different devices.

Let the machine move!

Letting the machine sleep is disrespectful of time.

Usage

Note

2024-8-14: The config file now contains the information about your GPUs and jobs. More details can be found in config.py.
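To illustrate what "information about your GPUs and jobs" could look like once loaded, here is a minimal sketch. All field names (`id`, `memory_mb`, `name`, `command`, `num_gpus`) and the `parse_config` helper are hypothetical and only serve as an illustration; the real schema is the one defined in config.py.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GPUSpec:
    # Hypothetical fields: a device index and its total memory in MB.
    id: int
    memory_mb: int


@dataclass
class JobSpec:
    # Hypothetical fields: a label, the shell command, and resource needs.
    name: str
    command: str
    num_gpus: int = 1
    memory_mb: int = 0


@dataclass
class Config:
    gpus: list[GPUSpec] = field(default_factory=list)
    jobs: list[JobSpec] = field(default_factory=list)


def parse_config(raw: dict) -> Config:
    """Turn a plain dict (e.g. the result of loading a YAML file) into typed specs."""
    return Config(
        gpus=[GPUSpec(**gpu) for gpu in raw.get("gpus", [])],
        jobs=[JobSpec(**job) for job in raw.get("jobs", [])],
    )


# An assumed in-memory equivalent of a config file with two GPUs and two jobs.
raw = {
    "gpus": [{"id": 0, "memory_mb": 24000}, {"id": 1, "memory_mb": 24000}],
    "jobs": [
        {"name": "train_a", "command": "python train.py --seed 0", "memory_mb": 8000},
        {"name": "train_b", "command": "python train.py --seed 1", "num_gpus": 2},
    ],
}
config = parse_config(raw)
```

Typed specs like these make a malformed entry fail early (an unknown key raises `TypeError`) instead of failing when the job is launched.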

Dependency

PyYAML==6.0
nvidia-ml-py (provides pynvml; only needed for runit_based_on_detected_memory.py)

Scripts

We provide three scripts, each running jobs in a different way.

runit_with_exclusive_gpu.py: Each GPU can be used by only one job at a time.
runit_based_on_memory.py: One GPU can be shared by several jobs at a time, based on their memory usage.
runit_based_on_detected_memory.py: Uses pynvml to detect the total memory usage of each GPU. This may not be suitable when the memory used by a running GPU application is unstable.
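The memory-based strategies above come down to one question: which GPU, if any, has enough free memory for the next job? The following is a minimal sketch of that decision, not the scripts' actual code. The helper names `query_free_memory_mb` and `pick_gpu` are assumptions made for illustration; the pynvml calls themselves are the standard NVML bindings shipped with nvidia-ml-py.

```python
from typing import Dict, Optional


def query_free_memory_mb() -> Dict[int, int]:
    """Ask NVML for the free memory of each GPU, in MiB (requires an NVIDIA driver)."""
    import pynvml  # provided by the nvidia-ml-py package

    pynvml.nvmlInit()
    try:
        free = {}
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # bytes: total, used, free
            free[index] = info.free // (1024 * 1024)
        return free
    finally:
        pynvml.nvmlShutdown()


def pick_gpu(free_mb: Dict[int, int], required_mb: int) -> Optional[int]:
    """Return the GPU with the most free memory that still fits the job, or None."""
    candidates = [(mb, gpu) for gpu, mb in free_mb.items() if mb >= required_mb]
    if not candidates:
        return None  # nothing fits right now; the caller should wait and retry
    return max(candidates)[1]
```

A scheduler would call `pick_gpu(query_free_memory_mb(), required_mb)` before each launch. Because the reading is a snapshot, a job whose memory grows after startup can still exhaust a GPU that looked free, which is the instability caveat noted for runit_based_on_detected_memory.py.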

Demo

$ python run_it.py --config ./examples/config.yaml
$ python run_it.py --max-workers 3 --config ./examples/config.yaml

graph TD
    A[Start] --> B[Read Configuration and Command Pool]
    B --> C[Initialize Shared Resources]
    C --> |Maximum number of requirements met| D[Loop Until All Jobs Done]
    D --> E[Check Available GPUs]
    E -->|Enough GPUs| F[Run Job in Separate Process]
    E -->|Not Enough GPUs| G[Wait and Retry]
    F --> H[Job Completes]
    F --> I[Job Fails]
    H --> J[Update Job Status and Return GPUs]
    I --> J
    G --> D
    J -->|All Jobs Done| K[End]
    C -->|Maximum number of requirements not met| L[Terminate Workers]
    L --> M[Shutdown Manager and Join Pool]
    M --> K
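The flow in the diagram can be approximated in a few lines for the exclusive-GPU case. This is a simplified sketch under stated assumptions, not the project's implementation: it swaps the manager and process pool from the diagram for a thread pool that launches each job with `subprocess`, and a blocking `queue.Queue` stands in for the "Wait and Retry" branch. The names `run_job` and `schedule` are hypothetical.

```python
import os
import queue
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(command, gpu_pool):
    """Take one free GPU, run the command on it, then return the GPU to the pool."""
    gpu_id = gpu_pool.get()  # blocks until some GPU is free
    try:
        # Restrict the child process to the GPU it was assigned.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        result = subprocess.run(command, env=env, capture_output=True, text=True)
        return gpu_id, result.returncode, result.stdout.strip()
    finally:
        gpu_pool.put(gpu_id)  # released whether the job completed or failed


def schedule(commands, gpu_ids, max_workers=None):
    """Run every command, with at most one job per GPU at any time."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    gpu_pool = queue.Queue()
    for gpu_id in gpu_ids:
        gpu_pool.put(gpu_id)
    workers = max_workers or len(gpu_ids)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(run_job, cmd, gpu_pool) for cmd in commands]
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Each demo job only reports which GPU it was assigned; no real GPU is needed.
    show_gpu = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    for gpu_id, returncode, output in schedule([show_gpu] * 4, gpu_ids=[0, 1]):
        print(gpu_id, returncode, output)
```

Returning the GPU in a `finally` block mirrors the diagram: both "Job Completes" and "Job Fails" lead to "Update Job Status and Return GPUs", so a crashed job never leaks a device.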

Thanks

@BitCalSaul: Thanks for the positive feedback!
- #3
- #2
- #1
https://www.jb51.net/article/142787.htm
https://docs.python.org/zh-cn/3/library/subprocess.html
https://stackoverflow.com/a/23616229
https://stackoverflow.com/a/14533902


lartpang/RunIt
