Newsletter
Get notified when new AI tools are added
Join the community.
Gandalf AI is an online game by Lakera that highlights the limits and security weaknesses of large language models like ChatGPT. Your task is to chat with “Gandalf” and trick him into revealing a secret password—despite built-in rules designed to prevent disclosure.
The game is built around common real-world attack patterns against LLMs, especially prompt injection, where a user tries to bypass developer-imposed restrictions through carefully crafted instructions.
Beyond the main challenge, side missions introduce other classes of LLM vulnerabilities, such as context substitution and manipulating the model’s input data. Gandalf AI is designed to be both a game and a practical way to understand how LLMs can be attacked—and how those attacks can be mitigated in real AI systems.