
What makes LLM tokenizers different from each other? GPT4 vs. FlanT5 Vs. Starcoder Vs. BERT and more (10 months ago)






Tokenizers are one of the key components of Large Language Models (LLMs). One of the best ways to understand what they do is to compare the behavior of different tokenizers. In this video, Jay takes a carefully crafted piece of text (containing English, code, indentation, numbers, emoji, and other languages) and passes it through different trained tokenizers to reveal what each one succeeds and fails at encoding, the design choices behind each tokenizer, and what those choices say about the respective models.

---
Contents:
0:00 Introduction
1:25 The carefully polished text to test tokenizers
2:19 BERT Uncased
3:59 BERT Cased
4:29 GPT-2
6:00 FLAN-T5
7:00 GPT-4
9:24 Starcoder
21:31 Galactica
---
Twitter: / jayalammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/

Access the Early Release version of the book with a 30-day free trial of the O'Reilly learning platform: https://learning.oreilly.com/get-lear...
[The formatting for the tokenization chapter is still a work-in-progress, but the video gives you a better look at the approach]
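
The comparison described above, running one test string through several tokenizers and looking at what survives, can be sketched with toy stand-ins. This is only an illustration under stated assumptions: `SAMPLE` is a hypothetical string (not the text used in the video), and `toy_uncased` and `toy_whitespace_preserving` are made-up functions that imitate two design choices (lowercasing and dropping whitespace versus keeping case and whitespace). They are not the real BERT, GPT, or Starcoder tokenizers, which use learned subword vocabularies.

```python
import re

# Hypothetical test string mixing capitals, indentation, numbers, and an emoji.
SAMPLE = "English and CAPITALS\n    two_tabs = 12.0*50  🎵"


def toy_uncased(text):
    """Toy 'uncased' tokenizer: lowercases, discards all whitespace,
    and maps any non-ASCII token to a placeholder [UNK]."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return [tok if tok.isascii() else "[UNK]" for tok in tokens]


def toy_whitespace_preserving(text):
    """Toy tokenizer that keeps case and emits each run of whitespace
    as its own token, so the input can be rebuilt exactly."""
    return re.findall(r"\s+|\w+|[^\w\s]", text)


if __name__ == "__main__":
    for name, tokenize in [
        ("toy_uncased", toy_uncased),
        ("toy_whitespace_preserving", toy_whitespace_preserving),
    ]:
        print(name, tokenize(SAMPLE))
```

Printing both token lists side by side shows the kind of contrast the video looks for: the first toy function loses capitalization, indentation, and the emoji, while the second keeps enough information to reconstruct the original string by joining its tokens.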
