INDEX
Explanations
negations and terms associated with conditions or states of being
Negative Logits
slack    -0.14
Nack     -0.14
ATTER    -0.14
useForm  -0.14
aron     -0.14
uos      -0.14
inar     -0.13
erna     -0.13
atter    -0.13
anda     -0.13
Positive Logits
pas      0.42
PAS      0.32
Pas      0.31
Pas      0.31
pas      0.30
_pas     0.27
pasa     0.22
pás      0.19
jamais   0.19
gu       0.19
Activations Density 0.009%