aft-pytorch

Unofficial PyTorch implementation of the Attention Free Transformer (AFT) layers by Zhai et al. [abs, pdf] from Apple Inc.

I'd like to thank the primary author, Dr. Shuangfei Zhai, for his informal guidance and feedback as I built this package!

Installation

You can install aft-pytorch via pip:

pip install aft-pytorch

Usage

You can import the AFT-Full, AFT-Simple, or AFT-Local layer (as described in the paper) from the package like so:

AFTFull

import torch
from aft_pytorch import AFTFull

layer = AFTFull(
    max_seqlen=20,
    dim=512,
    hidden_dim=64
)

# a batch of 32 sequences, each with 10 timesteps of dimension 512
x = torch.rand(32, 10, 512)
y = layer(x) # [32, 10, 512]
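Under the hood, AFT-Full computes Y_t = sigmoid(Q_t) * (sum over t' of exp(K_t' + w_{t,t'}) * V_t') / (sum over t' of exp(K_t' + w_{t,t'})), where w is a learned matrix of pairwise position biases. Here is a minimal sketch of that computation, purely for intuition; it is not this package's actual implementation, and MiniAFTFull is a made-up name:

import torch
import torch.nn as nn

class MiniAFTFull(nn.Module):
    # Hypothetical minimal version for illustration; not the package's code.
    def __init__(self, max_seqlen, dim, hidden_dim):
        super().__init__()
        self.to_q = nn.Linear(dim, hidden_dim)
        self.to_k = nn.Linear(dim, hidden_dim)
        self.to_v = nn.Linear(dim, hidden_dim)
        self.project = nn.Linear(hidden_dim, dim)
        # learned pairwise position biases w_{t,t'}
        self.wbias = nn.Parameter(torch.zeros(max_seqlen, max_seqlen))

    def forward(self, x):
        B, T, _ = x.shape
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        w = self.wbias[:T, :T].unsqueeze(0)          # (1, T, T)
        # exp(K + w) factorises into exp(w) * exp(K); subtract maxes for
        # numerical stability (the shifts cancel in the num/den ratio below)
        exp_w = torch.exp(w - w.amax(dim=-1, keepdim=True))
        exp_k = torch.exp(k - k.amax(dim=1, keepdim=True))
        num = exp_w @ (exp_k * v)                    # position-weighted sum of values
        den = exp_w @ exp_k                          # normaliser
        return self.project(torch.sigmoid(q) * num / den)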

AFTSimple

import torch
from aft_pytorch import AFTSimple

layer = AFTSimple(
    max_seqlen=20,
    dim=512,
    hidden_dim=64
)

# a batch of 32 sequences, each with 10 timesteps of dimension 512
x = torch.rand(32, 10, 512)
y = layer(x) # [32, 10, 512]
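AFT-Simple is the special case with no position biases (w = 0): exp(K + w) collapses to a softmax over the time axis, so every position shares a single context vector. A rough sketch of that core step, with stand-in tensors:

import torch

B, T, d = 32, 10, 64
q, k, v = torch.rand(B, T, d), torch.rand(B, T, d), torch.rand(B, T, d)

# with w = 0, the weighting reduces to a softmax over the time axis,
# and every query position shares one context vector
ctx = (torch.softmax(k, dim=1) * v).sum(dim=1, keepdim=True)  # (B, 1, d)
y = torch.sigmoid(q) * ctx                                    # broadcast over T -> (B, T, d)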

AFTLocal

import torch
from aft_pytorch import AFTLocal

layer = AFTLocal(
    max_seqlen=20,
    dim=512,
    hidden_dim=64
)

# a batch of 32 sequences, each with 10 timesteps of dimension 512
x = torch.rand(32, 10, 512)
y = layer(x) # [32, 10, 512]
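AFT-Local (as described in the paper) keeps the learned bias only within a window of size s around each position and zeroes it outside, so distant tokens still contribute, just with zero bias. A sketch of that masking step, with a hypothetical window size:

import torch

T, s = 20, 4                      # hypothetical sequence length and window size
w = torch.randn(T, T)             # stand-in for the learned pairwise biases
idx = torch.arange(T)
inside = (idx[None, :] - idx[:, None]).abs() < s
w_local = w * inside              # bias is zeroed outside the local window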

These layers are 'plug-and-play' with your existing networks / Transformers: you can swap out a self-attention layer for any of the layers in this package with minimal changes, as in the sketch below.
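For example, here is a minimal sketch of a pre-norm Transformer block that uses AFTFull where multi-head self-attention would normally sit. The block layout is generic, not something this package prescribes:

import torch
import torch.nn as nn
from aft_pytorch import AFTFull

class AFTBlock(nn.Module):
    # Hypothetical Transformer block with AFTFull in place of self-attention.
    def __init__(self, dim=512, max_seqlen=20, hidden_dim=64, ff_mult=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.mix = AFTFull(max_seqlen=max_seqlen, dim=dim, hidden_dim=hidden_dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, ff_mult * dim),
            nn.GELU(),
            nn.Linear(ff_mult * dim, dim),
        )

    def forward(self, x):
        x = x + self.mix(self.norm1(x))   # token mixing, formerly self-attention
        x = x + self.ff(self.norm2(x))    # position-wise feed-forward
        return x

block = AFTBlock()
x = torch.rand(32, 10, 512)
y = block(x)  # [32, 10, 512]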

TODO

  • Add full AFT architecture
  • Add variants like AFTConv
  • Benchmark using Karpathy's minGPT

Contributing

If you like this repo, please leave a star! If you have any amendments or suggestions, feel free to raise a PR/issue.

Credits

@misc{attention-free-transformer,
  title  = {An Attention Free Transformer},
  author = {Shuangfei Zhai and Walter Talbott and Nitish Srivastava and Chen Huang and Hanlin Goh and Ruixiang Zhang and Josh Susskind},
  year   = {2021},
  url    = {https://arxiv.org/pdf/2105.14103.pdf}
}

License

MIT
