```python
from transformers import AutoModelForCausalLM
from MoD import apply_mod_to_hf

# Initialize your model from an available hf model
model = AutoModelForCausalLM.from_pretrained("some-repo/some-model")

# Convert the model to include the mixture of depths layers
model = apply_mod_to_hf(model)

# train the model
# ...

# save the model
model.save_pretrained('some_local_directory')
```
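The `# train the model` step above is left open by the snippet. As one illustration (an assumption, not a prescription from this repo), a standard PyTorch training loop works the same way on a converted model. The sketch below substitutes a plain `nn.Linear` and random placeholder data so it is self-contained; in practice you would iterate over your own dataloader and call the MoD-converted model instead.

```python
import torch
from torch import nn

# Hypothetical stand-in for the MoD-converted model (illustration only)
model = nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

x = torch.randn(8, 16)       # placeholder input batch
target = torch.randn(8, 16)  # placeholder targets

model.train()                # make sure training-mode behavior is active
losses = []
for _ in range(10):          # a few illustrative optimization steps
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The loop is the usual zero-grad / forward / backward / step cycle; only the model and the batch source change when training the real converted model.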
Loading the converted Model
To use the converted model, load it through the provided AutoClass. The example below shows how to load the model from a local directory:
```python
from MoD import AutoMoDModelForCausalLM

# Replace 'path_to_your_model' with the actual path to your model's directory
model = AutoMoDModelForCausalLM.from_pretrained('path_to_your_model')
```
Using generate()
Before calling the Hugging Face generate() method, explicitly put the model in evaluation mode by calling eval() on it.
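One common reason for this requirement (stated here as an assumption, not something this repo documents) is that training-mode layers such as dropout are stochastic, while `eval()` switches them to deterministic inference behavior. The sketch below demonstrates the effect with plain PyTorch; `TinyBlock` is a made-up toy module, not part of MoD.

```python
import torch
from torch import nn

# Made-up toy module used only to show what eval() changes:
# in train mode Dropout is stochastic, in eval mode it is a no-op.
class TinyBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)
        self.drop = nn.Dropout(p=0.5)

    def forward(self, x):
        return self.drop(self.proj(x))

block = TinyBlock()
x = torch.randn(1, 8)

block.eval()  # switch every submodule to inference behavior
with torch.no_grad():
    out1 = block(x)
    out2 = block(x)
# out1 and out2 are identical: dropout is disabled in eval mode
```

With the converted model the call order is the same idea: `model.eval()` first, then `model.generate(...)`.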
🫱🏼🫲🏽 Contributing
We welcome contributions from the community, whether it's adding new features, improving documentation, or reporting bugs. Please refer to our contribution guidelines before making a pull request.
📜 License
This repo is open-sourced under the Apache-2.0 license.
Citation
If you use our code in your research, please cite it using the following BibTeX entry:
@article{MoD2024,
title={Unofficial implementation for the paper "Mixture-of-Depths"},
author={AstraMind AI},
journal={https://github.com/astramind-ai/Mixture-of-depths},
year={2024}
}
Support
For questions, issues, or support, please open an issue on our GitHub repository.
About
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"