INDEX
Explanations
actions involving recognition and participation in activities or events
Negative Logits
forall       -0.17
oneself      -0.16
obuf         -0.16
everyone     -0.15
everybody    -0.15
ów           -0.15
cludes       -0.15
something    -0.15
anes         -0.14
Everybody    -0.14
Positive Logits
some         0.35
some         0.29
Some         0.27
algun        0.26
.some        0.24
einige       0.24
Some         0.24
none         0.23
SOME         0.23
некоторые    0.23
Activations Density 0.016%