INDEX
Explanations
modal verbs indicating capability or possibility
Negative Logits
him         -0.19
eux         -0.19
neither     -0.18
arkin       -0.17
nowhere     -0.17
Neither     -0.16
onde        -0.15
Them        -0.15
gratuits    -0.15
Them        -0.15
Positive Logits
we          0.37
they        0.32
you         0.29
anyone      0.24
it          0.24
this        0.22
these       0.22
he          0.22
such        0.22
/do         0.22
Activations Density 0.083%
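
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes, without confirmation from the source, that density means the fraction of sampled token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard's exact definition is not given.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 83 of which activate.
sample = [0.0] * 99_917 + [1.5] * 83
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.083%
```

Under this reading, a density of 0.083% corresponds to the feature firing on roughly 83 of every 100,000 tokens.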