The Transformer model from "Attention Is All You Need": a Keras implementation.
A Keras+TensorFlow implementation of the Transformer: "Attention Is All You Need" (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arXiv, 2017)
The code achieves results close to those of the reference repository: about 70% validation accuracy.
With smaller model parameters, such as layers=2 and d_model=256, the validation accuracy is even better, since the task is quite small.
For your own data
Just preprocess your source and target sequences into the same format as en2de.s2s.txt and pinyin.corpus.examples.txt, as sketched below.
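For illustration, a minimal preprocessing sketch under the assumption that each line holds a whitespace-tokenized source/target pair separated by a tab (verify against the two example files before relying on it):

```python
# Assumed corpus format: one example per line, source and target separated
# by a tab, tokens separated by spaces.  Check en2de.s2s.txt /
# pinyin.corpus.examples.txt for the exact format.
pairs = [
    ("I love you .", "Ich liebe dich ."),
    ("How are you ?", "Wie geht es dir ?"),
]
with open('my.corpus.txt', 'w', encoding='utf-8') as f:
    for src, tgt in pairs:
        f.write(src + '\t' + tgt + '\n')
```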
Some notes
For a larger number of layers, the special learning-rate scheduler reported in the paper is necessary (a sketch is given below).
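A minimal sketch of the paper's schedule, lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5), written as a Keras callback. The class name and warmup value are illustrative; transformer.py may already provide an equivalent scheduler.

```python
from keras.callbacks import Callback
import keras.backend as K

class NoamSchedule(Callback):
    """Per-step learning-rate schedule from the Transformer paper (sketch)."""
    def __init__(self, d_model=512, warmup_steps=4000):
        super(NoamSchedule, self).__init__()
        self.d_model = d_model
        self.warmup_steps = warmup_steps
        self.step_num = 0

    def on_batch_begin(self, batch, logs=None):
        self.step_num += 1
        lr = self.d_model ** -0.5 * min(self.step_num ** -0.5,
                                        self.step_num * self.warmup_steps ** -1.5)
        K.set_value(self.model.optimizer.lr, lr)
```

Pass it to training as, e.g., `model.fit(..., callbacks=[NoamSchedule(d_model=256)])`.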
In pinyin_main.py, I tried another method to train the deep network: first train the embedding layer and the first layer, then train a 2-layer model, then a 3-layer model, and so on. It works for this task; a rough sketch of the idea follows.
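The sketch below shows one way such a growing-depth schedule can be written, reusing the weights of the shallower model via Keras name-based weight loading. The `build_model` factory and its arguments are hypothetical; pinyin_main.py implements this differently.

```python
# Illustrative only: train progressively deeper models, reusing the
# embedding and lower layers of the previous stage.
def train_progressively(build_model, x, y, max_layers=6, epochs_per_stage=5):
    weights_file = None
    for n_layers in range(1, max_layers + 1):
        model = build_model(layers=n_layers)   # fresh model with n_layers encoder/decoder layers
        if weights_file is not None:
            # layers sharing names with the shallower model pick up its weights
            model.load_weights(weights_file, by_name=True)
        model.fit(x, y, epochs=epochs_per_stage)
        weights_file = 'stage_%d.h5' % n_layers
        model.save_weights(weights_file)
    return model
```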
Upgrades
Refactored some classes.
It is now easier to reuse the components in other models: just import transformer.py (see the sketch below).
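For example, a hedged usage sketch; the constructor arguments and the token-index objects are assumptions, so check transformer.py and the *_main.py scripts for the actual API:

```python
# Assumed usage pattern; argument names and values are illustrative and
# not verified against transformer.py.
from keras.optimizers import Adam
from transformer import Transformer

def build_seq2seq(itokens, otokens):
    """itokens / otokens: token-index objects built from your corpus."""
    s2s = Transformer(itokens, otokens, len_limit=70,
                      d_model=256, n_head=4, layers=2, dropout=0.1)
    s2s.compile(Adam(0.001, 0.9, 0.98, epsilon=1e-9))
    return s2s
```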
A fast step-by-step decoder has been added, including an upgraded beam search, but both still need to be modified to be reusable.
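For reference, a generic step-by-step beam-search loop looks roughly like the sketch below. `decode_step` is a hypothetical function returning next-token log-probabilities for a given target prefix; this is not the decoder actually shipped in this repository.

```python
import numpy as np

def beam_search(decode_step, start_id, end_id, beam_size=5, max_len=70):
    """Generic beam-search sketch.  decode_step(prefix) must return a
    log-probability vector over the vocabulary for the next token."""
    beams = [([start_id], 0.0)]            # (token sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            log_probs = decode_step(seq)   # shape: (vocab_size,)
            for tid in np.argsort(log_probs)[-beam_size:]:
                candidates.append((seq + [int(tid)], score + float(log_probs[tid])))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            (finished if seq[-1] == end_id else beams).append((seq, score))
        if not beams:                      # every surviving beam has emitted the end token
            break
    return max(finished + beams, key=lambda c: c[1])[0]
```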