
Coding Llama 3 from scratch in PyTorch - Part 1

Prince Canuma

In this video series, you will learn how to train and fine-tune a Llama 3 model from scratch.

The goal is to code Llama 3 from scratch in PyTorch to create models with 3B, 6B, 35B, and 45B params. In this first video, you'll learn about upcycling, downcycling, and Infini-attention.
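As a rough illustration of the downcycling idea (initializing a smaller model by inheriting a subset of transformer blocks from a larger one), here is a minimal PyTorch sketch. The layer counts and dimensions are illustrative assumptions, not values from the video:

```python
import torch.nn as nn

# Illustrative sizes only -- not the video's actual configuration.
d_model, n_heads = 64, 4

# A "large" stack of 8 transformer blocks standing in for a dense checkpoint.
big_layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
    for _ in range(8)
)

# Downcycling sketch: keep the first half of the blocks (weights included)
# to seed a smaller model, rather than training it from random init.
small_layers = nn.ModuleList(big_layers[i] for i in range(4))
```

Which blocks to keep (first k, every other, etc.) is a design choice explored in the "Pre-training Small Base LMs with Fewer Tokens" paper linked below.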

Papers:
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints: https://arxiv.org/abs/2212.05055
Pre-training Small Base LMs with Fewer Tokens: https://arxiv.org/abs/2404.08634
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention: https://arxiv.org/abs/2404.07143
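The sparse-upcycling idea from the first paper — initializing every expert of a Mixture-of-Experts layer from a dense checkpoint's FFN — can be sketched in a few lines of PyTorch. All names and dimensions here are hypothetical, chosen only to show the weight-copying step:

```python
import torch
import torch.nn as nn

# Illustrative sizes only.
d_model, d_ff, n_experts = 64, 256, 4

# Dense FFN block, standing in for one layer of a pretrained dense model.
dense_ffn = nn.Sequential(
    nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
)

# MoE layer: n_experts FFNs of the same shape, plus a router.
experts = nn.ModuleList(
    nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
    for _ in range(n_experts)
)
router = nn.Linear(d_model, n_experts)

# Upcycling step: copy the dense FFN's weights into every expert,
# so the sparse model starts from the dense checkpoint.
for expert in experts:
    expert.load_state_dict(dense_ffn.state_dict())

# Sanity check: every expert now computes exactly what the dense FFN did.
x = torch.randn(2, d_model)
with torch.no_grad():
    for expert in experts:
        assert torch.allclose(expert(x), dense_ffn(x))
```

After this initialization, the router and experts are trained further, letting the experts diverge from the shared dense starting point.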


To follow along you can use this colab notebook:
https://github.com/Blaizzy/CodingLLM...

Coding Llama 2 from scratch video series
Part 1: https://youtube.com/live/XHmag4damTg
Part 2: https://youtube.com/live/LSWDpFmbE90
Part 3: Coding Llama 2 from scratch in PyTorc...
