INDEX
Explanations
actions and concepts related to conflict or opposition
Negative Logits
Token     Logit
Forge     -0.16
CLS       -0.15
aises     -0.15
_flash    -0.15
actics    -0.14
eldo      -0.14
uns       -0.14
å¥ı       -0.14
лаÑĪ      -0.13
neau      -0.13
Positive Logits
Token     Logit
able      0.22
ables     0.19
ssi       0.18
ABLE      0.18
ingly     0.17
inand     0.17
erve      0.16
ÂŃing     0.15
inf       0.15
apel      0.15
Activations Density: 0.013%