INDEX
Explanations
phrases that describe the state or condition of a subject
Negative Logits
OTHERWISE    -0.16
berman       -0.16
undy         -0.15
ände         -0.15
ertas        -0.15
rava         -0.15
pii          -0.14
idis         -0.14
lassen       -0.14
_advanced    -0.14
Positive Logits
fond          0.21
generally     0.18
creatures     0.17
increasingly  0.17
notorious     0.17
encouraged    0.17
taught        0.16
sensitive     0.16
ke            0.16
uman          0.16
Activations Density: 0.137%