
Megatron-LM & Megatron Core

GPU-optimized library for training transformer models at scale


🚨 DEVELOPMENT BRANCH

⚠️ EXPERIMENTAL FEATURES - This is the dev branch, containing experimental features that may change or be removed without notice.

→ For releases and comprehensive documentation, visit the main branch

⚡ Quickstart

```bash
# Clone the dev branch
git clone -b dev https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM

# Install from source with dev dependencies (includes transformer_engine)
pip install -e .[mlm,dev]
```
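To sanity-check the editable install before launching any training, a minimal import probe can be used (a sketch, not part of the repository; it assumes only that `megatron.core` and its `parallel_state` module are importable after the install):

```python
# Smoke test (illustrative sketch): confirm Megatron Core is importable
# after `pip install -e .[mlm,dev]`. The import itself does not require
# an initialized distributed environment.
import megatron.core
from megatron.core import parallel_state  # model-parallel process-group state

print("Megatron Core loaded from:", megatron.core.__file__)
# Returns False here, since no model-parallel groups have been set up yet.
print("Model parallel initialized:", parallel_state.model_parallel_is_initialized())
```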
Table of Contents

Getting Started

For Complete Documentation: Main Branch | Official Docs

Dev Branch Philosophy

Fast Iteration

  • Streamlined Review: 1 code owner + 1 dev approver (can delegate review) + CI/CD

Feature Lifecycle (Coming Soon)

  • 6-Month Timeline: Experimental features must graduate to stable or be deprecated
  • Migration Support: Assistance provided for feature transitions

Stability Expectations

  • Experimental Nature: Features may change or be removed as development progresses
  • Testing: All features must pass convergence and performance validation before inclusion
  • Support: Dev branch issues should include the [DEV] prefix

Performance & Benchmarking

Community & Support

Getting Help

Contributing

We ❤️ contributions! Ways to contribute:

  • 🐛 Report bugs - Help us improve reliability
  • 💡 Suggest features - Shape the future of Megatron Core
  • 📝 Improve docs - Make Megatron Core more accessible
  • 🔧 Submit PRs - Contribute code improvements

Contributing Guide

Citation

```bibtex
@article{megatron-lm,
  title={Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
  author={Shoeybi, Mohammad and Patwary, Mostofa and Puri, Raul and LeGresley, Patrick and Casper, Jared and Catanzaro, Bryan},
  journal={arXiv preprint arXiv:1909.08053},
  year={2019}
}
```