Scope the system, not just the model
LLM red teaming should test the full application: prompts, retrieved documents, tool permissions, memory, file uploads, output handling, external APIs, logging, and human handoff.
- Test direct prompt injection and indirect prompt injection inside retrieved content.
- Test whether model output can trigger unsafe downstream actions.
- Test access-control failures in RAG, memory, and tool results.