INDEX
Explanations
structures related to possession or existence
Negative Logits
iek     -0.15
ior     -0.14
ocol    -0.14
517     -0.14
tah     -0.14
ubo     -0.14
their   -0.13
your    -0.13
že      -0.13
ikes    -0.13
Positive Logits
become  0.20
htags   0.20
htag    0.19
been    0.19
/h      0.19
/is     0.19
a       0.18
an      0.17
always  0.17
got     0.16
Activations Density: 0.081%