Explanations
expressions of lack of alternatives or options
Negative Logits
Token         Logit
umbing        -0.18
inx           -0.16
Це            -0.15
Never         -0.14
ÃŃm           -0.14
946           -0.14
åİ            -0.14
ion           -0.14
æ°¸           -0.14
LY            -0.13
Positive Logits
Token         Logit
alternatives  0.33
alternative   0.32
Alternative   0.27
alternative   0.26
altern        0.26
Alternative   0.25
Altern        0.25
Altern        0.24
alternate     0.21
option        0.20
Activations Density 0.074%