LLDF FRAMEWORK

Defending AI, One Word at a Time

Multi-layered defense against prompt injection attacks
Six-Layer Defense
LLDF Lab Coming Soon

L1. Input Validation: Filter and sanitize all incoming prompts.

L2. Context Protection: Secure system instructions and context.

L3. Behavioral Boundaries: Define and enforce AI behavior limits.

L4. Output Filtering: Validate responses before delivery.

L5. Monitoring & Detection: Identify threats in real time.

L6. Response & Recovery: Handle incidents and adapt defenses.
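To make the layering concrete, here is a minimal sketch of how the input (L1) and output (L4) layers might wrap a model call. All function names are hypothetical illustrations, not part of LLDF, and the single regex stands in for a real multi-pattern validator:

```python
import re

def l1_input_validation(prompt: str) -> str:
    # L1: reject prompts containing common override phrasing (toy heuristic;
    # a real validator would combine many patterns and a classifier)
    if re.search(r"ignore (all )?previous instructions", prompt, re.IGNORECASE):
        raise ValueError("blocked by input validation")
    return prompt

def l4_output_filtering(response: str, system_prompt: str) -> str:
    # L4: redact verbatim leakage of the system prompt before delivery
    return response.replace(system_prompt, "[REDACTED]")

def guarded_call(prompt, system_prompt, model_fn):
    # L2/L3 (context protection, behavioral boundaries) live inside model_fn;
    # this wrapper supplies the L1 and L4 checkpoints around it.
    safe_prompt = l1_input_validation(prompt)
    raw = model_fn(safe_prompt)
    return l4_output_filtering(raw, system_prompt)
```

The point of the sketch is the ordering: validation happens before the model ever sees the prompt, and filtering happens before the caller ever sees the response, so a failure in either layer fails closed.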


Language Layer Defense Framework: an operational mapping of prompt-level attack techniques, with plain-language definitions and corresponding defenses.


50 techniques
| ID | Technique | Tactic | Definition | Defense | Author |
|---|---|---|---|---|---|
| LLDF-T001 | Prompt Injection | Initial Access | Inserting malicious instructions into user prompts to override system behavior | Input validation, prompt templating, instruction hierarchy | LLDF Team |
| LLDF-T002 | Context Poisoning | Persistence | Injecting malicious context that persists across conversation turns | Context sanitization, session isolation, memory bounds | LLDF Team |
| LLDF-T003 | Jailbreak Techniques | Evasion | Bypassing safety guardrails through creative prompt engineering | Multi-layer filtering, behavioral analysis, output validation | LLDF Team |
| LLDF-T004 | Role-Play Exploitation | Execution | Using fictional scenarios to elicit prohibited responses | Intent classification, scenario detection, response filtering | LLDF Team |
| LLDF-T005 | Token Smuggling | Evasion | Hiding malicious content in token sequences that bypass filters | Token-level analysis, semantic validation, pattern detection | LLDF Team |
| LLDF-T006 | System Prompt Leakage | Reconnaissance | Extracting system instructions through targeted queries | Prompt isolation, response sanitization, instruction obfuscation | LLDF Team |
| LLDF-T007 | Chain-of-Thought Manipulation | Execution | Exploiting reasoning chains to reach unintended conclusions | Reasoning validation, logic bounds, output verification | LLDF Team |
| LLDF-T008 | Few-Shot Poisoning | Initial Access | Providing malicious examples to influence model behavior | Example validation, source verification, pattern analysis | LLDF Team |
| LLDF-T009 | Encoding Obfuscation | Evasion | Using alternative encodings to bypass content filters | Multi-encoding detection, normalization, semantic analysis | LLDF Team |
| LLDF-T010 | Instruction Hierarchy Bypass | Evasion | Overriding system instructions with user-level commands | Privilege separation, instruction priority, command validation | LLDF Team |
| LLDF-T011 | Memory Exploitation | Persistence | Manipulating conversation memory to maintain malicious state | Memory sanitization, state validation, session limits | LLDF Team |
| LLDF-T012 | Output Manipulation | Execution | Crafting inputs that produce specific malicious outputs | Output filtering, content validation, response monitoring | LLDF Team |
| LLDF-T013 | Semantic Drift | Evasion | Gradually shifting conversation context toward prohibited topics | Topic tracking, drift detection, context boundaries | LLDF Team |
| LLDF-T014 | Multi-Turn Attacks | Persistence | Building malicious payloads across multiple conversation turns | Cross-turn analysis, state tracking, cumulative filtering | LLDF Team |
| LLDF-T015 | Function Calling Abuse | Execution | Exploiting tool/function calling capabilities for unauthorized actions | Function whitelisting, parameter validation, execution monitoring | LLDF Team |
| LLDF-T016 | Retrieval Poisoning | Initial Access | Injecting malicious content into retrieval sources | Source validation, content sanitization, retrieval filtering | LLDF Team |
| LLDF-T017 | Adversarial Suffixes | Evasion | Appending crafted tokens that trigger unintended behaviors | Suffix detection, token analysis, behavioral monitoring | LLDF Team |
| LLDF-T018 | Prompt Leaking | Reconnaissance | Extracting training data or system prompts through queries | Data isolation, response filtering, leakage detection | LLDF Team |
| LLDF-T019 | Instruction Confusion | Execution | Creating ambiguous instructions that exploit parsing logic | Instruction clarification, parsing validation, ambiguity detection | LLDF Team |
| LLDF-T020 | Context Window Overflow | Evasion | Exceeding context limits to drop security instructions | Context management, instruction pinning, overflow detection | LLDF Team |
| LLDF-T021 | Delimiter Injection | Initial Access | Injecting special delimiters to break prompt structure | Delimiter escaping, structure validation, parsing hardening | LLDF Team |
| LLDF-T022 | Refusal Suppression | Evasion | Techniques to prevent the model from refusing requests | Refusal reinforcement, safety layer redundancy, response validation | LLDF Team |
| LLDF-T023 | Persona Injection | Execution | Forcing the model to adopt malicious personas or identities | Persona validation, identity constraints, behavior monitoring | LLDF Team |
| LLDF-T024 | Indirect Prompt Injection | Initial Access | Injecting prompts through external data sources | Source isolation, data sanitization, indirect detection | LLDF Team |
| LLDF-T025 | Gradient-Based Attacks | Reconnaissance | Using model gradients to craft adversarial inputs | Gradient masking, input perturbation, adversarial training | LLDF Team |
| LLDF-T026 | Tokenization Exploits | Evasion | Exploiting tokenization quirks to bypass filters | Tokenization normalization, boundary detection, semantic validation | LLDF Team |
| LLDF-T027 | Instruction Negation | Evasion | Using negation to reverse safety instructions | Negation detection, instruction reinforcement, logic validation | LLDF Team |
| LLDF-T028 | Multilingual Evasion | Evasion | Using non-English languages to bypass filters | Multilingual filtering, translation validation, language detection | LLDF Team |
| LLDF-T029 | Code Injection | Execution | Injecting executable code through prompts | Code detection, execution prevention, sandbox isolation | LLDF Team |
| LLDF-T030 | Metadata Manipulation | Initial Access | Exploiting metadata fields to inject instructions | Metadata validation, field sanitization, structure enforcement | LLDF Team |
| LLDF-T031 | Attention Manipulation | Execution | Crafting inputs that exploit attention mechanisms | Attention monitoring, pattern detection, mechanism hardening | LLDF Team |
| LLDF-T032 | Embedding Poisoning | Persistence | Poisoning vector embeddings to influence retrieval | Embedding validation, anomaly detection, source verification | LLDF Team |
| LLDF-T033 | Prompt Chaining | Execution | Chaining multiple prompts to achieve complex attacks | Chain detection, cumulative analysis, sequence validation | LLDF Team |
| LLDF-T034 | Safety Alignment Bypass | Evasion | Circumventing RLHF and safety fine-tuning | Alignment reinforcement, multi-layer safety, behavioral monitoring | LLDF Team |
| LLDF-T035 | Template Injection | Initial Access | Injecting malicious content into prompt templates | Template validation, variable sanitization, structure enforcement | LLDF Team |
| LLDF-T036 | Reasoning Exploitation | Execution | Exploiting chain-of-thought to reach harmful conclusions | Reasoning validation, logic bounds, conclusion filtering | LLDF Team |
| LLDF-T037 | Tool Misuse | Execution | Misusing integrated tools for unauthorized purposes | Tool authorization, usage monitoring, capability limits | LLDF Team |
| LLDF-T038 | Context Injection | Initial Access | Injecting malicious context through external sources | Context validation, source verification, injection detection | LLDF Team |
| LLDF-T039 | Behavioral Cloning | Persistence | Training the model to mimic malicious behaviors | Behavior monitoring, anomaly detection, training validation | LLDF Team |
| LLDF-T040 | Adversarial Examples | Evasion | Crafting inputs that cause misclassification | Adversarial training, input validation, robustness testing | LLDF Team |
| LLDF-T041 | Prompt Smuggling | Initial Access | Hiding prompts in seemingly benign content | Content analysis, hidden instruction detection, semantic validation | LLDF Team |
| LLDF-T042 | Output Steering | Execution | Steering model outputs toward specific harmful content | Output monitoring, steering detection, content validation | LLDF Team |
| LLDF-T043 | Instruction Injection | Initial Access | Injecting new instructions mid-conversation | Instruction tracking, injection detection, command validation | LLDF Team |
| LLDF-T044 | Capability Probing | Reconnaissance | Systematically testing model capabilities and limits | Probing detection, rate limiting, capability obfuscation | LLDF Team |
| LLDF-T045 | Reward Hacking | Evasion | Exploiting reward functions to bypass safety | Reward validation, objective alignment, behavior monitoring | LLDF Team |
| LLDF-T046 | Prompt Wrapping | Evasion | Wrapping malicious prompts in benign context | Context analysis, wrapping detection, intent classification | LLDF Team |
| LLDF-T047 | Instruction Overload | Evasion | Overwhelming the model with conflicting instructions | Instruction prioritization, conflict resolution, load management | LLDF Team |
| LLDF-T048 | Semantic Injection | Initial Access | Injecting malicious semantics through subtle phrasing | Semantic analysis, intent detection, phrasing validation | LLDF Team |
| LLDF-T049 | Model Extraction | Reconnaissance | Extracting model parameters or architecture details | Query monitoring, extraction detection, response limiting | LLDF Team |
| LLDF-T050 | Backdoor Activation | Execution | Triggering hidden backdoors in model behavior | Backdoor detection, trigger monitoring, behavioral analysis | LLDF Team |
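Several techniques in the table pair naturally in practice: Encoding Obfuscation is often used to slip Delimiter Injection past naive string matching (for example, full-width characters that a filter does not recognize as `system:`). A minimal sketch of the listed "normalization" defense, using only the standard library; the marker list is an illustrative assumption, not an LLDF-specified set:

```python
import unicodedata

# Hypothetical structural markers a filter might watch for; a real deployment
# would derive these from its own prompt format.
SUSPICIOUS_MARKERS = ("system:", "assistant:", "<|", "|>")

def normalize(text: str) -> str:
    # Counters Encoding Obfuscation: NFKC folds full-width and other
    # compatibility characters to their ASCII forms before matching
    return unicodedata.normalize("NFKC", text).lower()

def contains_delimiter_injection(text: str) -> bool:
    # Counters Delimiter Injection: check for role/structure markers
    # only after normalization, so obfuscated variants are caught too
    norm = normalize(text)
    return any(marker in norm for marker in SUSPICIOUS_MARKERS)
```

Without the NFKC step, a payload written as `ＳＹＳＴＥＭ：` (full-width characters) would sail past a plain substring check, which is exactly the gap these two table rows describe.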
© 2025 LLDF Framework