Explanations
personal pronouns and their usage in relationships and interactions
Negative Logits
allow         -0.20
允            -0.16
cht           -0.16
Allow         -0.16
Allow         -0.16
permit        -0.15
allow         -0.15
Alone         -0.15
superClass    -0.15
řev           -0.14
Positive Logits
with           0.31
through        0.28
along          0.25
understand     0.25
towards        0.25
navigate       0.24
with           0.23
toward         0.23
avoid          0.23
stay           0.22
Activations Density: 0.068%