Explanations
verbs related to helping or assisting in various contexts
Negative Logits
their          -0.22
they           -0.22
sWith          -0.20
the            -0.19
that           -0.19
able           -0.19
swith          -0.18
those          -0.18
ox             -0.18
's             -0.18
Positive Logits
itself         0.31
heets          0.21
cales          0.20
ided           0.20
Ñģобой         0.18
’              0.17
/is            0.17
us             0.17
OwnProperty    0.17
heet           0.16
Activations Density 0.763%