Why is Llama-3-8B 8 billion parameters instead of 7?

Chris Hay

Llama 3 has ditched its old tokenizer and instead uses the same tokenizer as GPT-4 (tiktoken, created by OpenAI); it even uses the same first 100K-token vocabulary.

In this video, Chris walks through why Meta switched tokenizers and the implications for model size, the embeddings layer, and multilingual tokenization.
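
The parameter jump is easiest to see as back-of-the-envelope arithmetic on the embedding layer. The sketch below assumes Llama-2-style dimensions (a 32,000-token vocabulary and hidden size 4096) versus a Llama-3-style ~128K-token vocabulary with untied input and output embeddings; these figures are assumptions for illustration, not numbers taken from the video.

# Rough sketch: how much a larger vocabulary inflates the embedding
# table and the LM head, assuming hidden size 4096 and untied weights.

HIDDEN_SIZE = 4096          # model (embedding) dimension, assumed
LLAMA2_VOCAB = 32_000       # SentencePiece vocab used by Llama 2
LLAMA3_VOCAB = 128_256      # tiktoken-style vocab used by Llama 3

def embedding_params(vocab_size: int, hidden: int, tied: bool = False) -> int:
    """Parameters in the token-embedding table plus the output LM head.

    If the embedding and output head are untied, both matrices count.
    """
    table = vocab_size * hidden
    return table if tied else 2 * table

old = embedding_params(LLAMA2_VOCAB, HIDDEN_SIZE)
new = embedding_params(LLAMA3_VOCAB, HIDDEN_SIZE)

print(f"Llama-2-style embeddings: {old / 1e9:.2f}B params")
print(f"Llama-3-style embeddings: {new / 1e9:.2f}B params")
print(f"Difference: {(new - old) / 1e9:.2f}B params")
# ~0.26B vs ~1.05B: the bigger vocabulary alone adds roughly 0.8B
# parameters, which accounts for most of the jump from "7B" to "8B".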

He also runs his tokenizer benchmark and shows how the new tokenizer is more efficient in languages such as Japanese (see the sketch below).
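
As a rough illustration of what such a benchmark measures (not the exact script in the linked repo), the snippet below uses OpenAI's tiktoken package to count how many tokens an older GPT-3-era vocabulary and the GPT-4-style cl100k vocabulary need for the same Japanese sentence; the sample text and choice of encodings are assumptions for demonstration.

# Requires: pip install tiktoken
import tiktoken

sample_ja = "今日はとても良い天気ですね。"        # "The weather is very nice today."
sample_en = "The weather is very nice today."

for name in ("p50k_base", "cl100k_base"):      # GPT-3-era vs GPT-4-style vocab
    enc = tiktoken.get_encoding(name)
    print(f"{name:12s}  ja={len(enc.encode(sample_ja)):3d} tokens"
          f"  en={len(enc.encode(sample_en)):3d} tokens")
# A larger, better-trained vocabulary typically needs fewer tokens per
# Japanese sentence, which is what "more efficient" means here.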

Repos:

https://github.com/chrishayuk/embeddings
https://github.com/chrishayuk/tokeniz...
