Explanations
phrases indicating potential threats or significant events
Negative Logits
Token       Logit
reembols    -0.54
TAINER      -0.50
outchouc    -0.50
krivning    -0.49
tanleria    -0.49
LANTA       -0.48
الاطلاع     -0.48
ICTS        -0.48
anium       -0.48
Kür         -0.47
Positive Logits
Token       Logit
spur        1.14
trigger     1.11
prompt      1.11
prompting   1.08
spurs       1.07
motivate    1.05
inspire     1.03
prompts     1.02
stimulate   1.01
triggering  1.01
Activations Density: 0.645%