The NeMo Framework focuses on foundation model training for generative AI models.
Large language model (LLM) pretraining typically requires substantial compute, and model parallelism to scale training efficiently.
NeMo Framework includes the latest large-scale training techniques:
Model parallelism
  Tensor
  Pipeline
  Sequence
Distributed Optimizer
Mixed precision training
  FP8
  BF16
Distributed Checkpointing
Community Models
  LLAMA-2
NeMo Framework model training scales to thousands of GPUs and can be used for training LLMs on trillions of tokens.
The Launcher is designed to be a simple, easy-to-use tool for launching NeMo FW training jobs
on CSPs or on-prem clusters. The Launcher is typically used from a head node and only requires
a minimal Python installation.
The Launcher generates and launches submission scripts for the cluster scheduler, and organizes
and stores job results. Tested configuration files are included with the Launcher, but anything
in a configuration file can be modified by the user.
The NeMo FW Launcher is tested with the NeMo FW Container, which can be applied for here.
Access is automatic.
Users can also configure the Launcher to use any container image they provide.
The NeMo Framework Launcher should be installed on a head node or a local machine, in a Python virtual environment.
git clone https://github.com/NVIDIA/NeMo-Framework-Launcher.git
cd NeMo-Framework-Launcher
pip install -r requirements.txt
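The install commands above assume a Python virtual environment is already active. As a minimal sketch, one way to create and activate one with the standard `venv` module is shown here; the directory name in `ENV_DIR` is a hypothetical choice, not something the Launcher requires. Run this before the `pip install` step.

```shell
# ENV_DIR is a hypothetical location for the virtual environment;
# any writable path works.
ENV_DIR="${ENV_DIR:-./nemo-launcher-venv}"

# Create the environment with Python's built-in venv module.
python3 -m venv "$ENV_DIR"

# Activate it in the current shell so that `python` and `pip`
# resolve to the environment's copies.
. "$ENV_DIR/bin/activate"

# Print the active environment's prefix as a quick sanity check.
python -c 'import sys; print(sys.prefix)'

# Then, from inside the cloned NeMo-Framework-Launcher directory:
# pip install -r requirements.txt
```

Keeping the environment outside the cloned repository, or adding its directory to a local ignore file, avoids it showing up as untracked files in git.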
Usage
The best way to get started with the NeMo Framework Launcher is to go through
the NeMo Framework Playbooks.
After everything is configured in the .yaml files, the Launcher can be run with:
python main.py
Since the Launcher uses Hydra,
any configuration can be overridden directly in the .yaml file or via the command line.
See Hydra's override grammar for more information.
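To illustrate command-line overrides, here is a small sketch using Hydra's basic override forms: `key=value` changes an existing value, `+key=value` adds a key that is not in the config, and `~key` removes one. The key names below are hypothetical placeholders chosen only to show the syntax; the real names are whatever appears in the Launcher's .yaml files.

```shell
# Hypothetical config keys, used only to illustrate Hydra's override syntax.
# Replace them with keys that exist in the launcher's .yaml files.
OVERRIDES="some_group.some_key=100 +some_group.new_key=demo ~some_group.old_key"

# Print the full command first so it can be reviewed (dry run).
echo "python main.py $OVERRIDES"

# To actually launch, run the same command from the
# NeMo-Framework-Launcher directory:
# python main.py $OVERRIDES
```

Command-line overrides apply to a single run and leave the .yaml files unchanged, which is convenient for quick experiments; settings that should persist are better edited in the .yaml file itself.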
Contributing
Contributions are welcome!
To contribute to the NeMo Framework Launcher, create a pull request with your changes on GitHub.
Once a NeMo FW developer has reviewed and approved the pull request, and it passes the unit and CI tests,
it will be merged.