Introduction to Embeddings and Tokenization
- Invited Talks
- March 8, 2024
Joel Kowalewski, PhD
Finetuning LLMs for Specific Tasks

At the core of modern large language models (LLMs) such as GPT-3, GPT-4, and BERT are the concepts of embeddings and tokenization. Tokenization began simply as a way to split text into smaller units that are easier to process, which in turn yields better predictive models. As LLMs steadily developed alongside …
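
To make the two ideas concrete, here is a minimal sketch of tokenization followed by an embedding lookup. It assumes the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint, which are illustrative choices rather than anything specified in the talk.

```python
# Minimal sketch: tokenization and embedding lookup (assumes the Hugging Face
# `transformers` library and the `bert-base-uncased` checkpoint, chosen only
# for illustration).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

text = "Embeddings turn tokens into vectors."

# Tokenization: split the raw string into subword tokens, then map each
# token to an integer id from the model's vocabulary.
tokens = tokenizer.tokenize(text)
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
print(tokens)      # subword pieces; the exact split depends on the vocabulary
print(input_ids)   # tensor of vocabulary ids, with [CLS]/[SEP] added

# Embeddings: each id indexes a row of the model's learned embedding matrix,
# producing a dense vector the network can operate on.
with torch.no_grad():
    embeddings = model.get_input_embeddings()(input_ids)
print(embeddings.shape)  # (1, sequence_length, hidden_size); 768 for BERT-base
```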
