Codebase for investigating approximate two-layer MLPs in Transformers
The official repository for our paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" (https://arxiv.org/abs/2310.10837).
Please note that this repository is a cleaned-up version of our internal research repository. If you encounter any problems with it, please don't hesitate to contact me.
If you are interested in a plug-and-play MoE layer
Please check out https://github.com/robertcsordas/moe_layer. Development and support of the layer will happen there; this repository is for reproducing the experiments from our paper.
For plotting, LaTeX is required (to avoid Type 3 fonts and to render symbols). Installation is OS-specific.
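As a sketch, on Debian/Ubuntu something like the following should suffice (the package selection is an assumption and differs across distributions; cm-super provides Type 1 fonts, and dvipng is commonly needed when plotting libraries render text through LaTeX):

```bash
# Debian/Ubuntu example; package names vary across distributions
sudo apt-get install texlive texlive-latex-extra cm-super dvipng
```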
Usage
The code uses Weights & Biases for experiment tracking. The "sweeps" directory contains sweep configurations for all experiments we have performed. Sweeps are primarily intended for hyperparameter optimization, but we use them to run multiple configurations of our models.
To reproduce our results, start a sweep for each YAML file in the "sweeps" directory and run wandb agent for each of them from the repository root. This will run all the experiments and display them on the W&B dashboard. For example:
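The following is a minimal sketch of this workflow using the standard W&B CLI; "sweeps/moe.yaml" is a placeholder name, so substitute the actual YAML files from the "sweeps" directory and your own entity/project:

```bash
# Create the sweep on W&B; this prints a sweep ID
# ("sweeps/moe.yaml" is a placeholder; use the actual files in "sweeps")
wandb sweep sweeps/moe.yaml

# From the repository root, launch an agent for the printed sweep ID
wandb agent <entity>/<project>/<sweep_id>
```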
Re-creating plots from the paper
Edit the config file "paper/config.json" and enter your project name in the "wandb_project" field (e.g. "username/modules").
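A minimal config might then look like the following (only the "wandb_project" field is documented here; keep any other fields your copy of the file already contains):

```json
{
  "wandb_project": "username/modules"
}
```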
Run the script of interest from within the "paper" directory. For example:
```bash
cd paper
python3 moe_ablations.py
```