Explanations
phrases that indicate confrontation or opposition
Negative Logits
TransparentColor    -0.16
.sul                -0.14
iders               -0.14
ielding             -0.14
ิà¸ķร               -0.14
вад                 -0.13
villa               -0.13
gon                 -0.13
ĵåIJį               -0.13
.answers            -0.13
Positive Logits
uhan                0.16
ourt                0.15
pin                 0.15
DÄĽ                 0.14
sound               0.14
cáo                 0.14
aux                 0.14
lord                0.14
Modifiers           0.14
’                 0.14
Activations Density 0.030%