Compatibility Matrix

Auto-generated by docs/scripts/generate_compat_matrix.py; do not edit manually. Because it is based on rough dynamic evaluation, it may contain some errors, but it is useful for quick reference.

Each cell lists the concrete loss functions supported for the given optimizer-model pair, or shows Unsupported if the model does not satisfy the optimizer’s requirements. Optimizers with both token and text flows appear as two rows, one per flow.

| Optimizer | CLIPTextEncoderHFModel | ClassifierHFModel | EncoderGeminiModel | EncoderHFModel | EncoderOpenAIModel | EncoderVoyageModel | LMHFModel | LiteLLMModel |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ARCAOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| AutoPromptOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| BeamSearchOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| GASLITEOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| GASLITEPlusOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| GBDAOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| GCGOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| GCGPlusOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| HotFlipOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| PALOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| PEZOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |
| QCGOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| RASLITEPlusOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| RandomSearchOptimizer | SimilarityLoss | MisclassCELoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | SimilarityLoss | FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss | FirstTokenNLLLoss, ResponseHarmfulnessLoss |
| SoftPromptOptimizer | ExternalTriggerPerplexityLoss, SimilarityLoss | ExternalTriggerPerplexityLoss, MisclassCELoss | Unsupported | ExternalTriggerPerplexityLoss, SimilarityLoss | Unsupported | Unsupported | AttentionEnhLoss, ExternalTriggerPerplexityLoss, FirstTokenNLLLoss, PrefillCELoss, PrefillCWLoss, PrefillDistillationLoss, PrefillMellowMaxLoss, ResponseHarmfulnessLoss, SteeringActivationLoss, TriggerPerplexityLoss | Unsupported |

Legend

Access levels required by each optimizer

| Optimizer | Required Mixins | Flows | Access Level |
| --- | --- | --- | --- |
| ARCAOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| AutoPromptOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| BeamSearchOptimizer | LossTextAccessMixin | text | Black-box |
| GASLITEOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| GASLITEPlusOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| GBDAOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| GCGOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| GCGPlusOptimizer | LossTextAccessMixin | text | Black-box |
| HotFlipOptimizer | LossTokenAccessMixin, GradientTokenAccessMixin | token | White-box |
| PALOptimizer | LossTextAccessMixin | text | Black-box |
| PEZOptimizer | LossTokenAccessMixin, GradientEmbedAccessMixin | token | White-box (embedding) |
| QCGOptimizer | LossTextAccessMixin | text | Black-box |
| RASLITEPlusOptimizer | LossTextAccessMixin | text | Black-box |
| RandomSearchOptimizer | LossTextAccessMixin | text | Black-box |
| SoftPromptOptimizer | GradientEmbedAccessMixin | token | White-box (embedding) |
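
The relationship between the Required Mixins, Flows, and Access Level columns can be sketched as a small rule. This is only an illustration inferred from the rows listed above, not the logic used by the generator script. The function name `classify_optimizer` is hypothetical, and the mixin names are treated as plain strings rather than imported classes.

```python
def classify_optimizer(required_mixins):
    """Guess (flow, access level) from mixin names.

    Hypothetical helper: the rule below is inferred from the table rows,
    not taken from the library's source.
    """
    mixins = set(required_mixins)
    # Text-level loss access corresponds to the "text" flow; all others use tokens.
    flow = "text" if "LossTextAccessMixin" in mixins else "token"
    if "GradientEmbedAccessMixin" in mixins:
        access = "White-box (embedding)"   # needs gradients w.r.t. embeddings
    elif "GradientTokenAccessMixin" in mixins:
        access = "White-box"               # needs gradients w.r.t. tokens
    else:
        access = "Black-box"               # only needs loss values
    return flow, access


# Example rows from the table
print(classify_optimizer(["LossTokenAccessMixin", "GradientTokenAccessMixin"]))
print(classify_optimizer(["LossTextAccessMixin"]))
print(classify_optimizer(["LossTokenAccessMixin", "GradientEmbedAccessMixin"]))
```

Under this reading, any optimizer that needs a gradient mixin is white-box, and an optimizer that needs only loss values through text is black-box.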

Concrete loss functions

| Loss | Base Type | Required Parameters |
| --- | --- | --- |
| AttentionEnhLoss | AttentionBasedLoss | full_attentions, input_slices |
| ExternalTriggerPerplexityLoss | BaseLoss | input_trigger_strs |
| FirstTokenNLLLoss | TextBasedLoss | response_first_token_logprobs |
| InputFluencyLoss | BinaryLMJudgeLoss | input_texts |
| MisclassCELoss | ClassificationBasedLoss | output_class_logits |
| PrefillCELoss | PrefillBasedLoss | prefill_response_logits, target_response_toks |
| PrefillCWLoss | PrefillBasedLoss | prefill_response_logits, target_response_toks |
| PrefillDistillationLoss | PrefillBasedLoss | prefill_response_logits, target_response_logits |
| PrefillMellowMaxLoss | PrefillBasedLoss | prefill_response_logits, target_response_toks |
| ResponseHarmfulnessLoss | BinaryLMJudgeLoss | generated_response_strs |
| SimilarityLoss | EmbeddingBasedLoss | output_embeddings, target_vectors |
| SteeringActivationLoss | HiddenStateBasedLoss | full_hidden_states, target_directions |
| TriggerPerplexityLoss | TriggerLogitBasedLoss | full_logits, input_trigger_ids, input_slices |

Discovered fields per model (via source AST)

| Model | Flow | ModelOutput fields | ModelInput fields |
| --- | --- | --- | --- |
| CLIPTextEncoderHFModel | token | output_embeddings | input_attention_mask, input_embeds, input_prefix_cache_kwargs, input_slices, input_trigger_ids, input_trigger_strs, message_targets |
| CLIPTextEncoderHFModel | text | output_embeddings | (none) |
| ClassifierHFModel | token | output_class_logits | input_attention_mask, input_embeds, input_prefix_cache_kwargs, input_slices, input_trigger_ids, input_trigger_strs, message_targets |
| ClassifierHFModel | text | output_class_logits | (none) |
| EncoderGeminiModel | token | (none) | (none) |
| EncoderGeminiModel | text | output_embeddings | (none) |
| EncoderHFModel | token | output_embeddings | input_attention_mask, input_embeds, input_prefix_cache_kwargs, input_slices, input_trigger_ids, input_trigger_strs, message_targets |
| EncoderHFModel | text | output_embeddings | (none) |
| EncoderOpenAIModel | token | (none) | (none) |
| EncoderOpenAIModel | text | output_embeddings | (none) |
| EncoderVoyageModel | token | (none) | (none) |
| EncoderVoyageModel | text | output_embeddings | (none) |
| LMHFModel | token | full_attentions, full_hidden_states, full_logits, generated_response_ids, generated_response_logits, generated_response_strs, prefill_response_logits, response_first_token_logprobs | input_attention_mask, input_embeds, input_prefix_cache_kwargs, input_slices, input_trigger_ids, input_trigger_strs, message_targets |
| LMHFModel | text | full_ids, full_strs, generated_response_ids, generated_response_logits, generated_response_strs, prefill_response_logits, response_first_token_logprobs | (none) |
| LiteLLMModel | token | (none) | (none) |
| LiteLLMModel | text | generated_response_strs, response_first_token_logprobs | (none) |
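
As a rough way to read the legend tables together, the sketch below derives a matrix cell from a loss's required parameters and a model's discovered fields for one flow. It is an inferred rule, not the generator's actual algorithm: it assumes that parameters whose names start with `target_` are supplied by the caller rather than by the model, and that every other parameter must appear among the model's ModelOutput or ModelInput fields. The names `LOSS_PARAMS`, `MODEL_FIELDS`, and `supported_losses` are hypothetical, and only a few model rows are copied in to keep the example short.

```python
# Required parameters per loss, copied from the loss table.
LOSS_PARAMS = {
    "AttentionEnhLoss": ["full_attentions", "input_slices"],
    "ExternalTriggerPerplexityLoss": ["input_trigger_strs"],
    "FirstTokenNLLLoss": ["response_first_token_logprobs"],
    "InputFluencyLoss": ["input_texts"],
    "MisclassCELoss": ["output_class_logits"],
    "PrefillCELoss": ["prefill_response_logits", "target_response_toks"],
    "PrefillCWLoss": ["prefill_response_logits", "target_response_toks"],
    "PrefillDistillationLoss": ["prefill_response_logits", "target_response_logits"],
    "PrefillMellowMaxLoss": ["prefill_response_logits", "target_response_toks"],
    "ResponseHarmfulnessLoss": ["generated_response_strs"],
    "SimilarityLoss": ["output_embeddings", "target_vectors"],
    "SteeringActivationLoss": ["full_hidden_states", "target_directions"],
    "TriggerPerplexityLoss": ["full_logits", "input_trigger_ids", "input_slices"],
}

# ModelInput fields shared by the Hugging Face token flows in the fields table.
HF_TOKEN_INPUTS = {
    "input_attention_mask", "input_embeds", "input_prefix_cache_kwargs",
    "input_slices", "input_trigger_ids", "input_trigger_strs", "message_targets",
}

# Union of ModelOutput and ModelInput fields for a few (model, flow) pairs.
MODEL_FIELDS = {
    ("EncoderHFModel", "token"): {"output_embeddings"} | HF_TOKEN_INPUTS,
    ("LMHFModel", "text"): {
        "full_ids", "full_strs", "generated_response_ids",
        "generated_response_logits", "generated_response_strs",
        "prefill_response_logits", "response_first_token_logprobs",
    },
    ("LiteLLMModel", "text"): {"generated_response_strs", "response_first_token_logprobs"},
    ("LiteLLMModel", "token"): set(),
}


def supported_losses(model, flow):
    """Return the cell text for one model and flow under the assumed rule."""
    available = MODEL_FIELDS.get((model, flow), set())
    names = [
        loss
        for loss, params in LOSS_PARAMS.items()
        # Assumption: "target_*" parameters come from the caller, not the model.
        if all(p.startswith("target_") or p in available for p in params)
    ]
    return ", ".join(sorted(names)) or "Unsupported"


for key in MODEL_FIELDS:
    print(key, "->", supported_losses(*key))
```

For the rows included here, this rule reproduces the corresponding matrix cells, for example the LiteLLMModel column for text-flow optimizers. Because the page itself notes that the matrix comes from a rough evaluation, treat any mismatch between this sketch and the real library as expected.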