Introduction to Large Language Models and their Adaptation

We cordially invite you to our Large Language Models workshop organized by Saarland University. This hands-on workshop will take place at Saarland University, Campus Saarbrücken on September 21st, 10:00 - 17:00.

Registration closed.

This course provides an overview of language models (LMs), with a specific focus on large transformer-based language models. Participants will learn about the various types, applications, and limitations of LMs, as well as, on a more practical note, how to train their own LMs, adapt them to specific domains or languages, and apply them to real-world tasks. We also cover the computational challenges associated with training LMs and provide an overview of how the NHR can help overcome these challenges.

The course covers the following topics:

  • Introduction to LMs
  • Types of LMs
  • Applications of LMs
  • Limitations and future directions
  • Training your own LM from scratch

By the end of this course, participants will have a solid understanding of LMs and the skills to create and apply their own LMs to various sequence processing tasks.

Introduction session

by Marius Mosbach

The first part of this course provides a basic introduction to neural language models, explaining how they work and how they are different from traditional statistical language models. We will discuss the data on which language models can be trained, present different types of neural language models, and compare their strengths and weaknesses. After establishing a basic understanding, we will explore a wide range of applications of language models and demonstrate how they can be used to solve real-world problems. Finally, we will discuss the limitations of language models and introduce some of the current research directions in the field.
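To make the contrast with traditional statistical language models concrete, here is a minimal sketch of a count-based bigram model, the kind of approach that preceded neural LMs. The toy corpus and function names are our own illustration, not material from the session; neural LMs replace these raw counts with learned continuous representations.

```python
from collections import Counter, defaultdict

# Toy corpus; real statistical LMs were estimated from large text collections.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigrams: how often each word follows each context word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(context):
    """P(next | context) from raw counts (maximum-likelihood estimate)."""
    counts = bigram_counts[context]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

# "on" is always followed by "the" in this corpus, so P(the | on) = 1.0
print(next_word_probs("on"))
```

A model like this can only assign probability to word pairs it has literally seen, which is exactly the sparsity problem that motivates the neural approaches discussed in the session.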

Adaptation session

by Jesujoba Alabi

This talk delves into adapting language models to data from specialized domains. As language models have revolutionized natural language processing, it has become crucial to extend their capabilities to a wide range of domains (for example, news, business, or medicine). This session will explore the challenges and strategies involved in adapting language models to new domains, and real-world examples will demonstrate the effectiveness and potential applications of such adaptation.
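The intuition behind domain adaptation can be illustrated with a toy experiment: a model estimated on general text assigns low probability to domain-specific text, and continuing to train on in-domain data raises it. The smoothed unigram model, corpora, and function names below are our own simplified illustration; the session itself concerns adapting neural LMs, where the same principle applies.

```python
import math
from collections import Counter

def avg_logprob(train_tokens, test_tokens):
    """Average log-probability of test_tokens under an add-one-smoothed
    unigram model estimated from train_tokens."""
    counts = Counter(train_tokens)
    vocab = set(train_tokens) | set(test_tokens)
    total = len(train_tokens) + len(vocab)  # add-one smoothing denominator
    return sum(math.log((counts[t] + 1) / total) for t in test_tokens) / len(test_tokens)

general = "the market opened higher today as shares rose".split()
medical = "the patient presented with acute myocardial infarction".split()

before = avg_logprob(general, medical)
after = avg_logprob(general + medical, medical)  # continued training on domain text
# Domain text becomes less surprising after adaptation:
assert after > before
```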

Keynote session

by Dr. Johannes Otterbach

With the release of ChatGPT in December 2022, Foundation and Large Language Models have entered dinner conversations around the globe. In this talk, we will go on a journey to understand the evolution of the GPT-family and what the future of Generative AI holds for us. We will cover topics such as the homogenization of algorithms, the shift from model- to data-centric AI, the merging of high-performance computing and consumer applications and the challenges ahead to enable broad access to this revolutionary technology.

Dr. Johannes Otterbach is the CTO of nyonic, a Berlin-based startup committed to building best-in-class, multilingual LLMs for European businesses and industry. During his career in Silicon Valley and Berlin, he held positions at OpenAI, Palantir, and Merantix, among others. At nyonic, he combines this experience to bring modern AI technology development and deployment to the heart of Europe.

Panel discussion

Panelists: Dr. Johannes Otterbach, Marius Mosbach, and Jesujoba Alabi

Recordings available

Practical session

by Israel A. Azime and Paloma García de Herreros García

During the practical session of this workshop, we will cover the entire process of language model training, with a specific focus on the medical domain, including:

  • Preprocessing medical text datasets for language model training.
  • Creating your own tokenizers to break down sentences and convert them into inputs for model training.
  • Understanding how language model training works.
  • Learning to use Hugging Face Transformers, an open-source library, to train transformer-based language models.
  • Delving into named entity recognition (NER) tasks for identifying disease names in texts using the language models we trained. NER models are a core component of medical document search and organization applications.
  • Gaining insight into how to publish your language models and take advantage of freely available ones.
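As a preview of the tokenizer step above: the practical session uses the Hugging Face ecosystem, but the core idea behind subword tokenizers such as byte-pair encoding (BPE) fits in a short pure-Python sketch. The toy word frequencies and helper names here are our own illustration, not the workshop's actual code.

```python
from collections import Counter

def most_frequent_pair(words):
    """Find the most frequent adjacent symbol pair, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Word frequencies from a tiny corpus, each word split into characters.
words = {tuple("low"): 5, tuple("lower"): 2, tuple("lowest"): 3}
for _ in range(2):  # learn two merge rules
    words = merge_pair(words, most_frequent_pair(words))
print(words)  # "low" is now a single symbol shared by all three words
```

Each learned merge rule becomes part of the tokenizer's vocabulary; running the merges in order on new text breaks it into the same subword units the model was trained on.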

We understand that training these models may raise some questions, so we have prepared Q&A sessions to address any concerns. Additionally, we will offer AI office hours to provide ongoing consulting support for your AI-related projects.

Join us to develop your skills and become more proficient in working with large language models.

Training notebooks